问题描述:

I submit a query on a web page. The query takes several seconds before it is done. Only when it is done does it display an HTML table that I would like to get the information from. Let's say this query takes a maximum of 4 seconds to load. While I would prefer to get the data as soon as it is loaded, it would be acceptable to wait 4 seconds then get the data from the table.

The issue I have is when I make my urlread request, the page hasn't finished loading yet. I tried loading the page, then issuing a sleep command, then loading it again, but that does not work either.

My code is

import urllib.request

import time

uf = urllib.request.urlopen(urlname)

time.sleep(3)

uf.decode('UTF-8')

text = uf.read()

print (text)

The webpage I am looking at is http://bookscouter.com/prices.php?isbn=9781111835811 (feel free to ignore the interesting textbook haha)

And I am using Python 3.X on a Raspberry Pi

网友答案:

The prices you want are not in the page you're retrieving, so no amount of waiting will make them appear. Instead, the prices are retrieved by a JavaScript in that page after it has loaded. The urllib module is not a browser, so it won't run that script for you. You'll want to figure out what the URL is for the AJAX request (a quick look at the source code gives a pretty big hint) and retrieve that instead. It's probably going to be in JSON format so you can just use Python's json module to parse it.

相关阅读:
Top