{"id":793,"date":"2024-11-07T16:27:42","date_gmt":"2024-11-07T08:27:42","guid":{"rendered":"https:\/\/fwq.ai\/blog\/793\/"},"modified":"2024-11-07T16:27:42","modified_gmt":"2024-11-07T08:27:42","slug":"%e6%b2%a1%e5%81%9a%e8%bf%87python%e6%80%8e%e4%b9%88%e7%88%ac%e8%99%ab","status":"publish","type":"post","link":"https:\/\/fwq.ai\/blog\/793\/","title":{"rendered":"How to Scrape the Web Without Python Experience"},"content":{"rendered":"<blockquote><p>\n  If you have no Python experience, you can still scrape the web with alternative tools. Web scraping tools: WebHarvy (easy to use), Scrapy (requires Python knowledge, but online tutorials are available). No-code tools: Import.io, Octoparse, ParseHub. APIs and services: Google Search API, Webhose.io, Mozenda. The best choice depends on the complexity and volume of the data you need.\n<\/p><\/blockquote>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-1111\" src=\"https:\/\/fwq.ai\/blog\/wp-content\/uploads\/2024\/11\/2024101814491335656.jpg\" width=\"800\" height=\"320\" srcset=\"https:\/\/fwq.ai\/blog\/wp-content\/uploads\/2024\/11\/2024101814491335656.jpg 800w, https:\/\/fwq.ai\/blog\/wp-content\/uploads\/2024\/11\/2024101814491335656-300x120.jpg 300w, https:\/\/fwq.ai\/blog\/wp-content\/uploads\/2024\/11\/2024101814491335656-768x307.jpg 768w, https:\/\/fwq.ai\/blog\/wp-content\/uploads\/2024\/11\/2024101814491335656-670x268.jpg 670w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" title=\"Illustration for How to Scrape the Web Without Python Experience\" alt=\"Illustration for How to Scrape the Web Without Python Experience\" 
\/><\/p>\n<p><strong>You Can Scrape the Web Without Python Experience<\/strong><\/p>\n<p>A web crawler is software that automatically browses the web and extracts data from it. If you have no Python experience, the following alternatives are available:<\/p>\n<p><strong>1. Web scraping tools<\/strong><\/p>\n<ul>\n<li> <strong>WebHarvy:<\/strong> An easy-to-use, point-and-click scraping tool for extracting data from specific websites. It is a commercial product with a trial version rather than free software.<\/li>\n<li> <strong>Scrapy:<\/strong> A powerful Python framework for building web crawlers. It does require Python knowledge, but online tutorials and community support can help beginners get started.<\/li>\n<li> <strong>Beautiful Soup:<\/strong> A Python library for parsing HTML and XML documents and extracting data from them. It can be used on its own or together with Scrapy.<\/li>\n<\/ul>\n<p><strong>2. 
No-code tools<\/strong><\/p>\n<ul>\n<li> <strong>Import.io:<\/strong> A web-based platform that lets users build and run web scrapers without writing code.<\/li>\n<li> <strong>Octoparse:<\/strong> A commercial tool with a drag-and-drop interface that makes building scrapers easier.<\/li>\n<li> <strong>ParseHub:<\/strong> Another commercial tool that lets users set up scrapers visually.<\/li>\n<\/ul>\n<p><strong>3. APIs and services<\/strong><\/p>\n<ul>\n<li> <strong>Google Search API:<\/strong> Can be used to retrieve information about web page content and metadata through search queries.<\/li>\n<li> <strong>Webhose.io:<\/strong> An API that provides access to real-time web data.<\/li>\n<li> 
<strong>Mozenda:<\/strong> A hosted web scraping platform that extracts data without requiring any code.<\/li>\n<\/ul>\n<p><strong>Things to keep in mind when using these alternatives:<\/strong><\/p>\n<ul>\n<li>No-code tools tend to be more limited than Python-based solutions.<\/li>\n<li>APIs and services may require payment or be subject to rate limits.<\/li>\n<li>Complex scraping tasks may still require some basic programming knowledge.<\/li>\n<\/ul>\n<p>In short, people without Python experience can scrape the web using web scraping tools, no-code tools, or APIs and services. Choose the option that best fits the complexity and volume of the data you need.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you have no Python experience, you can still scrape the web with alternative tools. Web scraping tools: WebHarvy (easy to use), Scrapy (requires Python knowledge, but online tutorials are available). No-code tools: Import.io, Octoparse, ParseHub. APIs and services: Google Search 
API, Webhose.io, Mozenda. The best choice depends on the complexity and volume of the data you need. You Can Scrape the Web Without Python Experience A web crawler is software that automatically browses the web and extracts data from it. If you have no Python experience, the following alternatives are available: 1. Web scraping tools WebHarvy: An easy-to-use, point-and-click scraping tool for extracting data from specific websites. It is a commercial product with a trial version rather than free software. Scrapy: A powerful Python framework for building web crawlers. It does require Python knowledge, but online tutorials and community support can help beginners get started. Beautiful Soup: A Python library for parsing HTML and XML documents and extracting data from them. It can be used on its own or together with Scrapy. 2. 
No-code tools Import.io: A web-based platform that lets users build and run web scrapers without writing code. Octoparse: A commercial tool with a drag-and-drop interface that makes building scrapers easier. ParseHub: Another commercial tool that lets users set up scrapers visually. 3. APIs and services Google Search API: Can be used to retrieve information about web page content and metadata through search queries. Webhose.io: An API that provides access to real-time web data. Mozenda: A hosted web scraping platform that extracts data without requiring any code. Things to keep in mind when using these alternatives: No-code tools tend to be more limited than Python-based solutions. APIs and services may require payment or be subject to rate limits. Complex scraping tasks may still require some basic programming knowledge. In short, people without Python experience can scrape the web using web scraping tools, no-code tools, or APIs and services. Choose the option that best fits the complexity and volume of the data you need. 
<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[],"class_list":["post-793","post","type-post","status-publish","format-standard","hentry","category-16"],"_links":{"self":[{"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/posts\/793","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/comments?post=793"}],"version-history":[{"count":0,"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/posts\/793\/revisions"}],"wp:attachment":[{"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/media?parent=793"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/categories?post=793"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fwq.ai\/blog\/wp-json\/wp\/v2\/tags?post=793"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}