I need a site scraped that hosts 100,000 PDF files; every file should be downloaded.
To access the files you must complete a simple user registration form.
The site actively discourages automated access.
About 6% of file requests trigger a captcha that must be solved before the file can be accessed.
If an IP address overuses the site or shows a usage pattern that suggests a bot, that IP (and possibly its IP block or even its ISP) will be blocked.
Bidders should have captcha-solving capability and access to many IP addresses on different IP blocks and ISPs.
The winning bidder will be paid for each batch of 10,000 files delivered.
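
For sizing a bid, the figures in this posting imply a rough captcha count and number of payments. The following is only a back-of-the-envelope sketch in Python. It assumes each file is requested exactly once and that the 6% captcha rate applies uniformly, neither of which is guaranteed. The function name `estimate_job` is made up for illustration.

```python
TOTAL_FILES = 100_000        # file count stated in the posting
CAPTCHA_RATE = 0.06          # "about 6%" of file requests, per the posting
FILES_PER_PAYMENT = 10_000   # payment batch size stated in the posting


def estimate_job(total_files=TOTAL_FILES,
                 captcha_rate=CAPTCHA_RATE,
                 files_per_payment=FILES_PER_PAYMENT):
    """Return (expected_captchas, full_payment_batches).

    Assumes one request per file and a uniform captcha rate;
    retries or re-downloads would raise the captcha count.
    """
    expected_captchas = round(total_files * captcha_rate)
    full_payment_batches = total_files // files_per_payment
    return expected_captchas, full_payment_batches


if __name__ == "__main__":
    captchas, payments = estimate_job()
    print(f"Expected captchas: about {captchas}")
    print(f"Payment batches: {payments}")
```

Under those assumptions, the job involves on the order of 6,000 captchas and 10 payment batches.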