I need an efficient, well-written PHP script that meets the following requirements:
1) Index through a list of websites and download key information to a MySQL database. The website URLs are sequential, so they can be generated in a loop (e.g. [url removed, login to view], [url removed, login to view], etc.). There are approximately 10,000 pages the script will need to parse (see the database sketch after this list).
2) Parse data from each page. Ten products are returned on each page in a standard <TR>/<TD> HTML table structure, and every page uses an identical table layout. The parser needs to extract the product name, ID#, price, image link, and product link; each product will be stored as its own row in the database.
3) The script will need to identify itself with a spoofed user-agent string (e.g. Googlebot).
4) The script should work on multiple URLs in parallel. I would like to be able to set in the code how many pages it processes at a time, with a default of 5 (see the concurrency sketch after this list).
5) The script should allow for "polite downloading": I want to be able to set a time delay between page downloads so the host does not mistake the traffic for an attack on their site.
6) The script should clear the database of all existing content each time it runs, before downloading begins.
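
To make the requirements concrete, here is a minimal sketch of the database setup, per-run truncation, and table parsing (requirements 1, 2, and 6). The table name `products`, its columns, the MySQL credentials, and the XPath row selector are all assumptions on my part, since the real URLs and markup have been removed from this posting; the winning bidder would adapt them to the actual site.

<?php
// Sketch only: table name, columns, credentials, and XPath selector are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=scraper', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

// Requirement 6: clear previous content before each run.
$pdo->exec('TRUNCATE TABLE products');

$insert = $pdo->prepare(
    'INSERT INTO products (name, product_id, price, image_link, product_link)
     VALUES (?, ?, ?, ?, ?)'
);

// Requirement 2: parse one downloaded page and insert its ~10 product rows.
function parsePage(string $html, PDOStatement $insert): void
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html);          // suppress warnings from imperfect markup
    $xpath = new DOMXPath($dom);

    // Assumed selector: each product occupies one <tr> of the results table.
    foreach ($xpath->query('//table[@class="results"]//tr') as $row) {
        $cells = $xpath->query('.//td', $row);
        if ($cells->length < 3) {
            continue;                 // skip header or malformed rows
        }
        $insert->execute([
            trim($cells->item(0)->textContent),                 // product name
            trim($cells->item(1)->textContent),                 // ID#
            trim($cells->item(2)->textContent),                 // price
            $xpath->evaluate('string(.//img/@src)', $row),      // image link
            $xpath->evaluate('string(.//a/@href)', $row),       // product link
        ]);
    }
}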
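And here is a sketch of the download loop using PHP's curl_multi API, covering requirements 3, 4, and 5: it spoofs a user-agent string, fetches a configurable number of pages in parallel (default 5), and pauses between batches for polite downloading. The base URL, page count, and delay are placeholder assumptions, since the real URLs were removed; it reuses parsePage() and $insert from the sketch above.

<?php
// Sketch only: base URL, page count, user agent, and delay are assumed/configurable.
$baseUrl     = 'http://example.com/products?page=';  // placeholder for the removed URL
$totalPages  = 10000;
$concurrency = 5;            // requirement 4: pages fetched in parallel
$delaySecs   = 2;            // requirement 5: polite pause between batches
$userAgent   = 'Googlebot/2.1 (+http://www.google.com/bot.html)';  // requirement 3

for ($start = 1; $start <= $totalPages; $start += $concurrency) {
    $multi   = curl_multi_init();
    $handles = [];

    // Queue one batch of handles.
    for ($page = $start; $page < $start + $concurrency && $page <= $totalPages; $page++) {
        $ch = curl_init($baseUrl . $page);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_USERAGENT      => $userAgent,
            CURLOPT_TIMEOUT        => 30,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$page] = $ch;
    }

    // Run the batch until every transfer completes.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running) {
            curl_multi_select($multi);
        }
    } while ($running && $status === CURLM_OK);

    // Collect results and hand each page to the parser from the previous sketch.
    foreach ($handles as $page => $ch) {
        $html = curl_multi_getcontent($ch);
        if (!empty($html)) {
            parsePage($html, $insert);
        }
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    sleep($delaySecs);       // requirement 5: polite delay before the next batch
}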