Question:

I am just beginning to explore the Scrapy framework.

I have read that Scrapy can be used to extract URLs, images, etc. from page content and to crawl.

My question: is there a way to extract or print all the network resources that a webpage loads, the way PhantomJS can print them? I want to get them directly from the network activity, at the moment each resource is requested or completed, rather than by extracting them from the page's HTML content.

Thanks

Answer:

Scrapy doesn't render the webpage.

Scrapy just fetches the webpage's HTML from the web server.

So when Scrapy fetches a webpage, the spider visits the server only once and does not request the page's resources, such as images and JavaScript files.
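Since the only thing Scrapy has is the HTML text, the closest it can get to a resource list is the set of URLs the markup references. Here is a minimal sketch of that idea, using only the Python standard library so it runs without Scrapy installed. The names `ResourceCollector`, `list_static_resources`, `RESOURCE_ATTRS`, and the sample HTML and URLs are all made up for illustration. In a spider you would presumably feed it `response.text` and `response.url` instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag -> attribute that names a sub-resource.
# Illustrative only; a real page can reference resources in other ways too.
RESOURCE_ATTRS = {"img": "src", "script": "src", "link": "href", "iframe": "src"}


class ResourceCollector(HTMLParser):
    """Collects sub-resource URLs that are written directly in the markup."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.resources.append(urljoin(self.base_url, value))


def list_static_resources(html, base_url):
    """Return the resource URLs referenced in `html`, in document order."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.resources


# Hypothetical page content standing in for what a spider would download.
SAMPLE_HTML = """
<html><head>
<link rel="stylesheet" href="/static/site.css">
<script src="app.js"></script>
</head><body><img src="https://cdn.example.com/logo.png"></body></html>
"""

resources = list_static_resources(SAMPLE_HTML, "https://example.com/page/")
for url in resources:
    print(url)
```

Note that this only finds what is written in the HTML. Anything a script requests at runtime never shows up, because nothing executes the script; to see those requests you need a tool that actually renders the page in a browser engine, which is what PhantomJS does.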
