I have a PHP script that scrapes product prices from three source websites. After the data is scraped, I have to compare the prices against another website by hand. I would like a script, in either PHP or Python, that does the following and can run all the scraping in parallel:
1) From a list of product ID numbers in my MySQL database, scrape the three websites and gather the products that meet my criteria (I already have the search URLs set up, so the script only needs to insert each product code into the URL and download the page). I would like to be able to set the number of parallel threads that are working.
2) Using the data from step 1, compare the selling prices to the maximum buying prices at a comparison website (the comparison site is fairly intuitive, though it will require cookies to work with a script; I have some Python code started that might help). The comparison site may ban my IP if I overload it with requests, so I would like to be able to specify the number of threads, including the option of a single thread with a delay between requests.
3) Output the results to a CSV file, along with a few calculated values.
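
To make the three steps above more concrete, here is a rough Python sketch of the structure I have in mind. It is only an illustration under assumptions: the function names, the `fetch` and `lookup` callables (which would do the real page download and parsing), and the `margin` column are all hypothetical placeholders, and the demo at the bottom uses made-up prices instead of real sites or my database.

```python
import csv
import io
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_all(product_ids, fetch, workers=4):
    """Step 1: call fetch(product_id) for every ID on `workers` threads.

    `fetch` is a hypothetical callable that downloads one product page and
    returns the selling price, or None if the product fails the criteria.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        prices = list(pool.map(fetch, product_ids))  # map keeps input order
    return {pid: p for pid, p in zip(product_ids, prices) if p is not None}


def compare_all(selling, lookup, workers=1, delay=0.0):
    """Step 2: get the maximum buying price for each scraped product.

    Each thread sleeps `delay` seconds after every request, so workers=1
    gives strictly sequential requests with a pause between them.
    `lookup` is a hypothetical callable that queries the comparison site.
    """
    def throttled(pid):
        result = lookup(pid)
        time.sleep(delay)
        return result

    ids = list(selling)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        buying = list(pool.map(throttled, ids))
    return dict(zip(ids, buying))


def write_csv(selling, buying, out):
    """Step 3: write one row per product, plus one calculated column."""
    writer = csv.writer(out)
    writer.writerow(["product_id", "selling_price", "max_buying_price", "margin"])
    for pid, sell in selling.items():
        buy = buying.get(pid)
        if buy is None:  # no buying price found on the comparison site
            continue
        writer.writerow([pid, sell, buy, round(buy - sell, 2)])


# Demo with made-up data standing in for the real sites.
fake_selling = {101: 10.0, 102: None, 103: 7.5}
fake_buying = {101: 12.5, 103: 7.0}

selling = scrape_all([101, 102, 103], fake_selling.get, workers=3)
buying = compare_all(selling, fake_buying.get, workers=1, delay=0.01)
buf = io.StringIO()
write_csv(selling, buying, buf)
print(buf.getvalue())
```

In a real version, `lookup` could probably share a single `requests.Session` so that the comparison site's cookies are kept between requests, and the product IDs would come from a MySQL query rather than a hard-coded list.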