
Completed
Posted
Paid on delivery
Some government portals publish public records but do not offer bulk-download options. I need an automated solution that can search by number on the target page and download each file in its native PDF form.

Here is what I am after:
• A repeatable scraper, written in Python, capable of searching a specific domain, following pagination, and collecting every accessible PDF link.
• The script should save the PDFs locally in a clear folder structure (site / year / category).
• A simple log or CSV report listing the URL, document title, and download status for every file processed.

Acceptance criteria
1. All public records published in the specified date span are present as intact PDFs.
2. The log matches the count of files actually downloaded.

Please make sure the code is well commented and easy for me to rerun whenever new records are released. If you have dealt with anti-bot mechanisms on government sites before, let me know: some domains may throttle or deploy basic captchas, and I want the scraper to handle those gracefully while remaining compliant with each site’s terms of use.
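As a rough illustration of the folder structure and CSV report requested above, here is a minimal Python sketch. The helper names (`safe_name`, `target_path`, `append_log_row`), the column names, and the example site name are assumptions made for illustration only; the posting does not prescribe them.

```python
import csv
import re
from pathlib import Path

# Column names are an assumption based on the three fields the brief asks for.
LOG_FIELDS = ["url", "title", "status"]


def safe_name(text: str) -> str:
    """Reduce free text to a filesystem-friendly name."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", text.strip())
    return cleaned.strip("_") or "untitled"


def target_path(root: Path, site: str, year: int, category: str, title: str) -> Path:
    """Build root/site/year/category/title.pdf and create the folders."""
    folder = root / safe_name(site) / str(year) / safe_name(category)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"{safe_name(title)}.pdf"


def append_log_row(log_path: Path, url: str, title: str, status: str) -> None:
    """Append one row to the CSV report, writing the header on first use."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"url": url, "title": title, "status": status})
```

Appending one row per processed document, including failures, keeps the report usable for the count check in acceptance criterion 2.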
Project ID: 40376108
54 proposals
Remote project
Active 23 days ago

Hi, I understand you need an automated solution to scrape public records PDFs from government portals, handling searches, pagination, and saving files in a structured folder system, while producing a log of URLs, titles, and download status. The scraper must reliably capture all records within a specified date range and handle potential throttling or basic captchas without violating site terms. With solid experience in Python and web scraping, including handling anti-bot mechanisms on government sites, I can build a repeatable scraper that collects accessible PDF links, organizes them by site/year/category, and maintains a clear CSV log of all processed files. The code will be modular, fully commented, and easy to rerun whenever new records are released. My approach is to first implement the search and pagination system for the target domain, ensure accurate PDF detection and download, then set up the structured folder system and logging. I will include safeguards for throttling or basic captchas to ensure smooth and compliant operation. To proceed, I will clarify the specific domains and date ranges to target so the scraper can be precisely configured. Ihor
$20 USD in 1 day
1.7
54 freelancers are bidding on average $29 USD for this job

Hello there, I am experienced in web scraping and in building scripts or Windows desktop applications using Python. I am also experienced in large-scale data scraping from a given website, including handling IP blocks, captchas, and anti-bot or Cloudflare protection. Please message me to discuss this project in detail. Best Regards Enamul
$30 USD in 2 days
8.2

Hi, I have expertise in web automation and can develop a Python scraper for you that automates searching and downloading PDF files from your targeted government websites. I'm available to discuss details in chat. Abdul H.
$50 USD in 1 day
7.8

Hello Honor Good day, I came across your project and read your project requirements. I've worked on similar projects with excellent results. The field requires 10 years of professionalism and experience, enabling me to efficiently complete the project and deliver high-quality notes within tight deadlines. You can see an example of one of those projects in my portfolio here: www.freelancer.com/u/oadsmedia I can discuss your project in more detail at your convenience and provide a more detailed quote once I understand the full scope of your requirements. Thank you for your time and consideration. I look forward to the opportunity to work with you. Sincerely, Joni
$30 USD in 1 day
7.0

As an experienced 15-year professional, I have an impressive portfolio that aligns perfectly with your government public record scraping project. Not only am I skilled in Python web scraping, but I have a comprehensive knowledge of technologies like Selenium, BeautifulSoup, Scrapy, and Requests – all needed for your project's success. I understand the importance of accuracy and well-commented code for future continuation, and I can guarantee this will be delivered to you. Dealing expressly with anti-bot mechanisms on government sites in the past positions me as a compelling fit for your needs. I've surpassed tailored obstacles like site throttling and basic captchas while maintaining strict adherence to each platform’s terms of use. You needn't worry, my work won't breach any regulations. Moreover, my data management capabilities dovetail with the log creation you require. I will ensure all important details like URL, document title, and download status are meticulously recorded to validate the count of files downloaded. Besides these techni
$30 USD in 1 day
7.0

I can build a reliable Python scraper to search, paginate, and download all PDFs into an organized folder structure. It will include logging (CSV), retry handling, and be easy to rerun. Experienced with government sites, I’ll ensure stable, compliant scraping with clear documentation.
$120 USD in 3 days
7.4

Hi, I can do this project right now with 100% accuracy. If you need any sample please let me know. Thanks
$10 USD in 1 day
6.4

I can help you. Government portals frequently use session-locked search results and non-descriptive PDF filenames, which I will handle by mapping the metadata directly from the search table to your specific local folder structure. To bypass throttling and anti-bot measures, I will implement request-pacing with randomized headers and session persistence. The Python script will include robust error-handling for intermittent timeouts, ensuring the CSV log accurately flags failed downloads for easy manual verification without needing to restart the entire scrape.
$20 USD in 7 days
5.7
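The request pacing and timeout handling mentioned in the proposal above could be sketched roughly as follows. This is a sketch under assumptions, not the bidder's code: `fetch_with_retries` and its parameters are hypothetical names, and the `fetch` callable stands in for whatever HTTP client is actually used. It covers only polite pacing and retries, so a failed document gets a "failed" status for the CSV log instead of stopping the run.

```python
import random
import time
from typing import Callable, Optional, Tuple


def fetch_with_retries(
    fetch: Callable[[str], bytes],
    url: str,
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[bytes], str]:
    """Call fetch(url) up to `attempts` times, pausing longer after each failure.

    Returns (content, status); status is "downloaded" or "failed: <reason>".
    """
    last_error = "no attempts made"
    for attempt in range(attempts):
        try:
            return fetch(url), "downloaded"
        except OSError as exc:
            # TimeoutError and ConnectionError are both subclasses of OSError.
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt < attempts - 1:
                # Exponential backoff plus up to one second of random jitter.
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return None, f"failed: {last_error}"
```

Passing `sleep` as a parameter is a design choice that lets the pacing be tested without real waiting; the returned status string can be written straight into the log.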

Hi, I can build a reliable Python scraper that searches the target government portal, navigates pagination, and downloads all available PDFs within your specified date range. My approach is to create a clean and repeatable script using requests and BeautifulSoup or Scrapy, depending on the site structure. The script will perform number-based searches, follow all relevant pages, and extract direct PDF links accurately. Each file will be saved locally in a structured format such as site/year/category, so the data remains organized and easy to manage. I will also generate a CSV log that records the source URL, document title, and download status for every file. This ensures that the log matches exactly with the downloaded files. To handle real-world conditions, I will include rate limiting, retry logic, and proper session handling to avoid throttling and ensure stability. If the site uses basic anti-bot protections, I will implement compliant strategies such as delays and header management. The final deliverable will include a fully commented script, a sample run, and a clear README explaining how to set up, run, and rerun the scraper when new records are released. Best regards, Doan
$20 USD in 1 day
5.8
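The claim that the log should match the downloaded files exactly can be checked mechanically. The sketch below assumes a CSV with a `status` column whose success value is the string "downloaded", and an archive folder containing only the scraper's PDFs; both are illustrative assumptions, as is the function name `reconcile`.

```python
import csv
from pathlib import Path


def reconcile(log_path: Path, archive_root: Path) -> dict:
    """Compare rows marked "downloaded" in the CSV log with PDFs on disk."""
    with log_path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    # Count only rows the scraper reported as successful.
    logged_ok = sum(1 for row in rows if row["status"] == "downloaded")
    # Count every .pdf file anywhere under the archive root.
    on_disk = sum(1 for _ in archive_root.rglob("*.pdf"))
    return {"logged_ok": logged_ok, "on_disk": on_disk, "match": logged_ok == on_disk}
```

Running a check like this at the end of each run gives a direct yes/no answer for the second acceptance criterion in the brief.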

Hi there, this is Suganya; I hope you are doing well. My skills are a precise match for your requirements, and I am confident in delivering impactful results. I am available immediately to scrape the public website. I have more than 6 years of experience in data processing. I handle digital documents in Excel, PDF, Acrobat, Word, etc. easily and quickly, and I have extensive experience in cleansing, formatting, and organizing large volumes of data. Please consider me for this project; I have the time and patience to do it carefully with 100% quality. I look forward to discussing the project with you. Please check my efficiency and effort in all completed projects here: https://www.freelancer.com/u/suganyav12 Thanks, Suganya. Similar experience: https://www.freelancer.com/projects/selenium/Conference-Sponsor-Data-Scraper/proposals https://www.freelancer.com/projects/beautifulsoup/Trade-Fair-Data-Web-Scraping https://www.freelancer.com/projects/data-scraping/Directory-Contact-Data-Extraction https://www.freelancer.com/projects/data-management/Compile-Outside-School-Hours-Care https://www.freelancer.com/projects/excel/Google-Maps-Lat-Long-Extraction
$15 USD in 1 day
5.6

I can develop an automated Python scraper to search by number on government portals, follow pagination, and download PDFs in their native format. The script will organize files into a clear folder structure (site/year/category) and generate a log or CSV report with URL, document title, and download status. This solution will be repeatable and efficient for ongoing data collection.
$30 USD in 1 day
6.2

Hello, I have done many Python projects. Please send the details. Looking forward to your reply. Regards, Aditya
$30 USD in 3 days
5.5

With my diverse skill set and over 6 years of experience in web development, software architecture, and web scraping, I am confident in my ability to deliver a robust, well-commented Python scraper that will meet all your project requirements. I have extensive knowledge in scraping PDFs and creating an organized folder structure - characteristics that are fundamental to your project's success. My expertise includes handling anti-bot mechanisms and complying with site terms of use, which are essential for navigating government sites. I am adept at developing efficient and clean codes, ensuring your scraping tasks are performed without limitations or being blocked. Moreover, as an experienced Fullstack Developer, I am familiar with different database systems and CMS, skills that would add value to storing the collected data. Communication is key to delivering high-quality solutions that meet clients' needs. My fluency in English guarantees effective and transparent dialogue throughout the project duration. My commitment to exceeding expectations and my proven track record of generating dynamic and responsive solutions make me the ideal freelancer for your project. Let's team up and revolutionize how you handle public records!
$30 USD in 1 day
5.5

Hi, let's connect over chat. I have more than 9 years of experience in building custom platforms in Python. I will walk you through my work samples as well. I am online right now. Thanks Ali
$10 USD in 1 day
5.3

Hi — the real job here is not just collecting links, but building a repeatable pipeline that can search, paginate, download native PDFs, and prove exactly what succeeded and what did not. A common failure in public-record scrapers is when the script counts result pages as success, but some PDFs are redirected, incomplete, or throttled, so the log looks finished while the archive is missing documents. I’d structure this around three layers: search/discovery, PDF validation/download, and audit logging with deterministic folder naming by site, year, and category. The hardest decision early is whether document identity is based on search number, title, or source URL, because that affects deduplication and safe reruns. I’d keep the scraper compliant, well commented, and built so future releases can be pulled without reprocessing the entire archive.
$40 USD in 7 days
5.4
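The PDF validation layer and the document-identity question raised in the proposal above might be approached along these lines. This is a heuristic sketch only: the header and end-of-file marker check catches HTML error pages and truncated downloads but does not prove a file is fully valid, and building the identity from record number plus source URL is just one of the options the proposal lists. The function names are hypothetical.

```python
import hashlib


def looks_like_pdf(data: bytes) -> bool:
    """Cheap integrity check: PDF header at the start, %%EOF marker near the end."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


def document_key(record_number: str, source_url: str) -> str:
    """Stable identity for deduplication across reruns (assumed: number + URL)."""
    raw = f"{record_number.strip()}|{source_url.strip()}".encode("utf-8")
    # A short hex digest is enough to use as a lookup key in a set or log column.
    return hashlib.sha256(raw).hexdigest()[:16]
```

Storing the key alongside each log row would let a rerun skip documents it has already saved instead of reprocessing the whole archive.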

Hi, I can build a clean, repeatable Python scraper for public-record portals that searches by number, follows pagination, downloads accessible PDFs, and logs every result clearly for reruns. My focus is on reliability, compliant collection, organized file storage, and well-documented code that is easy to maintain and update as new records are published. Looking forward to your response. Best regards Shujoy
$20 USD in 2 days
4.9

Hi, I’m a full‑stack developer with over 5 years of custom coding experience, and I’ve built a few reliable Python scrapers for government portals that use pagination, search parameters and occasional captchas. I understand you need a repeatable script that can search by record number, walk through pages, grab the PDF links and store each file in a clear “site / year / category” layout, plus a CSV log of URL, title and status. My approach would be to use requests with session handling and BeautifulSoup to parse the results, while Selenium (headless Chrome) can step in if a simple captcha or throttling appears. After each download I’ll write an entry to the CSV and verify the file size to confirm integrity. The folder structure will be created on the fly based on the metadata you provide, and the code will be heavily commented so you can rerun it whenever new records appear. Do you have a preferred way to supply the list of record numbers (CSV, API, manual entry), or should the script pull them from the search page itself? Looking forward to delivering a robust, easy‑to‑maintain solution. Thanks
$40 USD in 2 days
4.7
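The file-size verification mentioned in the proposal above could look something like the sketch below. It assumes the expected size comes from the server's Content-Length header when one is sent (it may be absent, or may describe a compressed body), so `expected_bytes` is optional; the `size_check` name and the 1024-byte minimum are illustrative choices, not values from the posting.

```python
from pathlib import Path
from typing import Optional


def size_check(path: Path, expected_bytes: Optional[int], min_bytes: int = 1024) -> str:
    """Return a status string comparing the saved file size with expectations."""
    if not path.exists():
        return "missing"
    actual = path.stat().st_size
    # Compare against the advertised size only when the server provided one.
    if expected_bytes is not None and actual != expected_bytes:
        return f"size mismatch: expected {expected_bytes}, got {actual}"
    # Very small files are often error pages saved with a .pdf name.
    if actual < min_bytes:
        return f"suspiciously small: {actual} bytes"
    return "ok"
```

The returned string can go directly into the status column of the CSV, so size problems show up in the same report as failed downloads.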

Hi there, It looks like you need an automated solution to scrape public records from government portals that don’t allow bulk downloads. I can create a Python scraper that searches the specified domain, navigates pagination, and downloads PDF files into a well-organized folder structure. With my 4+ years of experience in web scraping and data extraction, I can ensure the script handles any anti-bot measures gracefully and complies with the site's terms. I’ll also implement a logging system to keep track of the URLs, document titles, and download statuses, so you can easily review the results. My goal is to make sure all records are captured accurately and that you have clear documentation for future use. Could you clarify if there are specific government sites or categories you want the scraper to focus on? Best regards, Arslan Shahid
$10 USD in 1 day
4.3

I read your project requirements and would be thrilled to collaborate with you. With expertise in Web Scraping and Data Extraction using Python, I specialize in navigating complex data structures and deliver efficient results and scalable solutions. Let’s connect to discuss further
$20 USD in 2 days
4.2

I will develop a Python scraper that searches the target portal by record number, handles pagination, and downloads each PDF in its original format. Files will be organized into a structured folder system (site/year/category), with a CSV log capturing URL, title, and download status to ensure accuracy. The code will be clean, well-commented, and reusable for future updates, with built-in retries and respectful request pacing to remain compliant with site policies.
$20 USD in 1 day
4.2

Hi, I can build a reliable Python scraper to automate searching, pagination, and downloading public records as original PDFs from your target government portal. The script will extract all available documents within your specified date range and save them in a clean structure (site/year/category). It will include a CSV log with URL, title, and download status to ensure full tracking and verification. I’ll implement retry logic, rate limiting, and session handling so the scraper runs smoothly and avoids failures. It will also support reruns for future updates. I have experience handling dynamic pages and basic anti-bot limits (throttling, headers, delays) while staying compliant with site terms. If captchas appear, I’ll design a safe fallback approach. Deliverables include clean, well-commented code, sample output, and clear instructions so you can reuse it. Timeline: 2–4 days Budget: $120 (flexible depending on complexity) Ready to start once you share the portal details.
$10 USD in 1 day
4.2

Skopje, Macedonia
Payment method verified
Member since Feb 8, 2018