
Completed
Posted
Paid on delivery
We are looking for an experienced developer to build a robust web scraping solution capable of extracting structured data from a login-protected medical/drug repository website. The platform contains a large database of drug information (potentially hundreds of thousands to over a million pages). The scraper should be able to navigate through the website after login, systematically extract relevant drug data, and store it in a structured format.
Scope of Work:
• Develop a scraper that can log into a protected website.
• Navigate through the drug repository pages.
• Extract structured information from each drug page.
• Handle pagination and large-scale crawling.
• Implement mechanisms to prevent crashes or interruptions during long scraping runs.
• Store extracted data in a structured format such as JSON, CSV, or a database.
Data to Extract (Example Fields):
• Drug name
• Active ingredients
• Indications
• Dosage information
• Contraindications
• Side effects
• Drug interactions
• Pharmacology details
• Any other structured medical information available on the page
Technical Requirements:
• Experience with large-scale web scraping.
• Ability to handle login/session-based websites.
• Familiarity with tools such as Selenium, Playwright, Puppeteer, Scrapy, or similar frameworks.
• Knowledge of handling dynamic JavaScript-rendered pages.
• Experience with data parsing and structured data storage.
• Ability to implement error handling and logging.
Deliverables:
• Fully functional scraping script or application.
• Clean, well-structured dataset.
• Documentation explaining how to run and maintain the scraper.
• Optional: automated scheduling or update mechanism.
Preferred Skills:
• Python (Scrapy, Selenium, BeautifulSoup) or Node.js (Playwright, Puppeteer).
• Experience scraping large datasets.
• Experience with MongoDB or similar databases is a plus.
Project Size: Medium to large.
Please Include in Your Proposal:
• Your experience with similar scraping projects.
• Technologies you would use.
• Estimated timeline.
• Examples of previous work.
We are looking for someone reliable who can build a scalable solution capable of handling large volumes of data efficiently.
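To make the requested "structured format" concrete, here is a minimal sketch of how one drug page could be represented and exported. The class name `DrugRecord`, its field names, and the helpers `to_json_line` and `to_csv` are assumptions chosen to mirror the example fields in the posting; they are not taken from the target site, whose actual page layout is unknown.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class DrugRecord:
    """One scraped drug page. Field names are illustrative assumptions."""
    drug_name: str
    active_ingredients: list[str] = field(default_factory=list)
    indications: str = ""
    dosage: str = ""
    contraindications: str = ""
    side_effects: list[str] = field(default_factory=list)
    interactions: list[str] = field(default_factory=list)
    pharmacology: str = ""


def to_json_line(record: DrugRecord) -> str:
    """Serialize one record as a single JSON Lines row."""
    return json.dumps(asdict(record), ensure_ascii=False)


def to_csv(records: list[DrugRecord]) -> str:
    """Render records as CSV text, joining list fields with '; '."""
    buffer = io.StringIO()
    names = [f.name for f in fields(DrugRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = {
            key: "; ".join(value) if isinstance(value, list) else value
            for key, value in asdict(record).items()
        }
        writer.writerow(row)
    return buffer.getvalue()
```

Writing one JSON object per line keeps a partially finished run readable, since each completed row is valid on its own; the CSV form flattens list fields and is better suited to spreadsheet review.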
Project ID: 40291703
35 proposals
Remote project
Active a month ago
35 freelancers are bidding an average of ₹6,495 INR for this project

Hi, I am an experienced developer with a strong background in large-scale web scraping, especially for healthcare and pharmaceutical platforms. Please review my previous similar project. I can build a robust scraper that securely logs in, navigates through the repository, and extracts structured drug data into JSON, CSV, or a database. Using Python (Scrapy, Selenium, Playwright) with proper error handling, pagination support, and crash recovery, I’ll ensure the solution is scalable and reliable for millions of records. I’ve delivered similar projects handling complex, dynamic websites and can provide clean datasets along with clear documentation. I estimate delivery within 3–4 weeks and can also add optional scheduling for automated updates. You’ll get a dependable solution tailored to your needs. Regards, Mukesh
₹10,000 INR in 7 days
4.3

With over 5 years of experience in automation and large-scale data handling, I am a perfect candidate for your web scraping project. My expertise in Python (including Scrapy and BeautifulSoup) complements your requirements for navigating login-based websites, handling structured data, and implementing error handling and logging. I’m comfortable with databases such as MongoDB, which come in handy for projects of this size. I've proven my ability to successfully complete earlier web-scraping projects, managing efficient algorithms while keeping page rendering dynamics in check. Alongside this, my understanding of Selenium and Puppeteer will be invaluable in handling the complexity of your task. Moreover, considering the magnitude of your project, I understand the importance of devising a scalable and robust solution. I ensure clean architecture, security, performance, and most importantly good communication on deliverables at every step. With my milestone-first approach and track record of on-time delivery, you can be assured that I will not let you down. Let's embark on this exciting journey together! Allow me to demonstrate my skills, passion for automation and commitment by contributing to your web scraping project. Discussing it further will help me provide an estimated timeline reflecting the efficient delivery you deserve. Reach out anytime!
₹7,000 INR in 7 days
1.4

Hi Client, I can do this. I have reviewed your post carefully. I specialize in Python-based web scraping using Scrapy, Selenium, Playwright, and BeautifulSoup for large-scale data extraction. I have experience building login-based and high-volume scraping systems (hundreds of thousands to millions of pages), including handling sessions, pagination, dynamic JavaScript content, and anti-bot protections, while ensuring stability with retry logic, logging, and crash recovery. I also use AI-assisted tools like GitHub Copilot and ChatGPT to speed up development while keeping code consistent and maintainable. The solution will extract all required drug data fields and store them in a clean structured format (JSON/CSV or database like PostgreSQL/MongoDB) with a scalable architecture. Estimated timeline: 5–10 days depending on site complexity. Please message me so we can discuss the details and align on your exact requirements. Looking forward to working with you. Best regards, Oleksandr
₹5,000 INR in 2 days
0.9

Hi, Scraping a login-protected repository with hundreds of thousands of pages requires more than a basic crawler — the key is building a stable pipeline that can run for long periods without crashing or losing progress. I would approach this using Python with Scrapy + Playwright. Scrapy handles large-scale crawling efficiently, while Playwright manages login sessions and JavaScript-rendered pages. After authentication, the crawler would systematically traverse the drug repository, extract structured fields (drug name, ingredients, dosage, interactions, etc.), and store them in a structured database such as PostgreSQL or MongoDB. To handle large datasets, I would include request throttling, retry logic, checkpointing, and detailed logging so the scraper can resume automatically if interrupted. The system can also export the cleaned dataset into JSON/CSV and optionally run scheduled updates if the repository changes. You’ll receive the full scraping script, structured dataset, and clear documentation on how to run or extend the crawler. — Vishal
₹4,000 INR in 1 day
0.7
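The retry logic mentioned in the bid above could look roughly like the following. This is a minimal sketch using only the standard library; `fetch_with_retry` and its parameters are hypothetical names, and `fetch` stands in for whatever page-loading call the real crawler would make.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A production crawler would likely catch only network-related exceptions rather than every `Exception`.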

Hi there, good morning! I’ve carefully checked your requirements and am really interested in this job. I’m a full-stack Node.js developer working on large-scale apps as a lead developer with U.S. and European teams. I offer the best quality and highest performance at the lowest price. I can complete your project on time, and you will experience great satisfaction working with me. I’m well versed in React/Redux, AngularJS, Node.js, Ruby on Rails, HTML/CSS, as well as JavaScript and jQuery. I have rich experience in Data Mining, Automation, Elasticsearch, Python, JSON, Scrapy, PostgreSQL, BeautifulSoup, Data Extraction and Web Scraping. For more information about me, please refer to my portfolio. I’m ready to discuss your project and start immediately. I look forward to hearing back from you, discussing all the details, and serving you.
₹7,770 INR in 3 days
0.0

As an experienced Full-Stack Developer with advanced skills in PostgreSQL and Python, I believe I'm uniquely equipped to handle your large-scale web scraping project. Over the past nine years in the industry, I've successfully crafted numerous scalable and efficient data-driven solutions. Your need for structured data storage and my familiarity with MongoDB create a perfect synergy. In terms of the technologies I’d employ, you can be confident that I’ll leverage the power of Scrapy, Selenium, BeautifulSoup and other appropriate tools such as Node.js's Playwright or Puppeteer based on our mutual preferences. It’s worth mentioning that my proficiency and hands-on experience with these technologies are demonstrated by the numerous similar projects I’ve completed successfully. With such an extensive database to scrape through and navigate, errors can arise if not handled properly. I assure you, error handling is part of my core competencies. And as a bonus, my skill set extends to automated scheduling and updating mechanisms which could positively impact this project. Lastly, my communication skills and commitment to providing comprehensive documentation ensure effective collaboration during development while facilitating maintenance once the project concludes. Let me put my expertise to work for you; together we'll build a solution that excels on all fronts: efficiency, reliability, and scalability.
₹12,000 INR in 7 days
0.0

I will build a reliable and scalable web scraping solution to extract structured drug data from the protected repository. Using Python with Playwright/Scrapy, the scraper will securely log in, navigate through pages, handle pagination, and collect all required fields such as drug name, ingredients, dosage, side effects, and interactions. The system will include error handling, logging, and resume capabilities to support large-scale scraping without interruption. Extracted data will be stored in a clean structured format (JSON/CSV or database). I will also provide clear documentation on how to run and maintain the scraper.
₹2,000 INR in 5 days
0.0

I can develop a robust, industrial-scale scraping solution to extract and structure your medical database. As a Software Developer specialized in Node.js and MySQL, I don't just "scrape" data; I build resilient data pipelines capable of handling sessions, pagination, and large-scale crawling (1M+ pages) without interruptions.
My Technical Approach:
• Session & Login Management: Using Playwright/Puppeteer, I’ll implement persistent session handling to navigate the protected repository securely.
• Large-Scale Architecture: I will use a headless browser cluster to handle dynamic JS-rendered content, ensuring high speed while staying under the radar.
• Database Integrity: I’ll store the extracted drug data (Dosage, Interactions, Pharmacology) in a structured MySQL/MongoDB schema, optimized for deep medical analysis.
• Resilience: Implementation of error-logging and "auto-resume" mechanisms to prevent data loss during long runs.
Experience: I recently built a Comprehensive Management System involving complex relational data. This background ensures that the million-row dataset I deliver will be clean, de-duplicated, and ready for use.
Timeline: 14–21 days for a fully audited, structured dataset.
Ready to build a scalable engine for your medical data. Shall we discuss the specific login security of the target site? Best regards, Santino Chibotta
₹1,500 INR in 4 days
0.0
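The bid above promises a de-duplicated dataset without saying how. One simple way to do it is sketched below, under the assumption that each record is a dictionary with a `drug_name` key and that names differing only in case or spacing refer to the same drug. The function name `dedupe` is hypothetical, and a real dataset might need a stricter key such as a site-specific identifier.

```python
def dedupe(records, key="drug_name"):
    """Keep the first record for each normalized key value."""
    seen = set()
    unique = []
    for record in records:
        # Collapse whitespace and ignore case before comparing.
        normalized = " ".join(str(record.get(key, "")).split()).casefold()
        if normalized in seen:
            continue
        seen.add(normalized)
        unique.append(record)
    return unique
```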

Based on your requirements, I can build a scalable web scraping system capable of logging into the protected repository, navigating the drug database, and extracting structured medical information reliably across a very large dataset. I have worked on similar large-scale scraping projects before, including login-protected platforms and dynamic websites with hundreds of thousands of records. I would implement the scraper in Python using Scrapy combined with Playwright or Selenium to handle authentication, session persistence, and JavaScript-rendered pages. The crawler will systematically traverse drug pages, manage pagination, and extract fields such as drug name, active ingredients, indications, dosage, contraindications, side effects, interactions, and pharmacology details. To ensure stability for long-running jobs, the scraper will include rate limiting, retry mechanisms, logging, and checkpoint-based progress tracking so the process can resume safely if interrupted. The data will be structured and stored in JSON, CSV, or a database such as MongoDB depending on your preference. I will deliver a fully functional scraper along with clean datasets and clear documentation for running and maintaining the system. Estimated timeline for the initial build and testing is approximately one week.
₹1,500 INR in 1 day
0.0
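Checkpoint-based progress tracking, as described in the bid above, can be sketched with the standard-library `sqlite3` module. The `Checkpoint` class, its method names, and the `pages` table are illustrative assumptions rather than part of any proposal.

```python
import sqlite3


class Checkpoint:
    """Tracks which URLs are done so an interrupted run can resume."""

    def __init__(self, path: str = "checkpoint.db") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pages ("
            "url TEXT PRIMARY KEY, done INTEGER NOT NULL DEFAULT 0)"
        )
        self.conn.commit()

    def add(self, urls) -> None:
        """Queue URLs; ones already recorded are left untouched."""
        self.conn.executemany(
            "INSERT OR IGNORE INTO pages (url) VALUES (?)",
            [(url,) for url in urls],
        )
        self.conn.commit()

    def mark_done(self, url: str) -> None:
        """Record that a page was scraped and stored successfully."""
        self.conn.execute("UPDATE pages SET done = 1 WHERE url = ?", (url,))
        self.conn.commit()

    def pending(self) -> list[str]:
        """Return unfinished URLs in the order they were first queued."""
        rows = self.conn.execute(
            "SELECT url FROM pages WHERE done = 0 ORDER BY rowid"
        )
        return [row[0] for row in rows]
```

On restart, the crawler would open the same file and loop over `pending()`, so pages already marked done are not fetched again. Committing after every page favors durability over speed; batching commits is a reasonable trade-off at a million pages.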

Hello Sir, I am a professional Python developer with over 7 years of experience. I have read your requirements and am interested in working with you. I have hands-on experience in Python automation, web scraping, and data handling. My skills include Python (Scrapy, Selenium, BeautifulSoup) for efficient data extraction, and I can store and manage data in CSV files and database systems such as MongoDB. I focus on delivering reliable, clean, and well-structured solutions. I am ready to start immediately and look forward to your response. Best regards, SoftNexus Technologies
₹9,000 INR in 2 days
0.0

I specialize in building production-grade Python scraping pipelines and this project is exactly in my wheelhouse. For this job I would use Selenium to handle the login session and JavaScript-rendered pages, then Scrapy or BeautifulSoup to systematically crawl and extract the drug data at scale. I'll implement pagination handling, crash recovery, and error logging so the scraper runs reliably even across hundreds of thousands of pages. Final data delivered in clean JSON and CSV format, with optional MongoDB storage if needed.
My approach:
- Login session management with cookie persistence
- Rotating delays to avoid detection/blocking
- Checkpoint system so scraping resumes if interrupted
- Structured output with all requested fields
I've built similar large-scale data extraction pipelines professionally and can start immediately. One question: is the target site publicly known or will you share the URL after hiring? Happy to assess any anti-scraping measures upfront so there are no surprises during delivery. Ready to start today.
₹7,000 INR in 7 days
0.0
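The "rotating delays" item in the bid above is not specified further. One plausible reading is a randomized pause between consecutive requests, sketched here with hypothetical names (`polite_delays`, `crawl`) and assumed bounds of 2 to 5 seconds; suitable values depend on the target site and its terms of use.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional


def polite_delays(
    min_s: float = 2.0,
    max_s: float = 5.0,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield an endless stream of random delays between min_s and max_s."""
    rng = rng or random.Random()
    while True:
        yield rng.uniform(min_s, max_s)


def crawl(
    urls: Iterable[str],
    handle: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    delays: Optional[Iterator[float]] = None,
) -> None:
    """Process each URL, pausing a random interval between requests."""
    if delays is None:
        delays = polite_delays()
    for index, url in enumerate(urls):
        # No pause is needed before the very first request.
        if index > 0:
            sleep(next(delays))
        handle(url)
```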

Hello, I can develop a scalable and reliable web scraping solution to extract structured drug data from the login-protected repository. I have experience building Python-based scraping systems using Scrapy, Selenium/Playwright, and BeautifulSoup capable of handling dynamic JavaScript pages and large datasets. The scraper will securely handle authentication, maintain sessions, and systematically crawl all drug pages with pagination support. I will implement error handling, retry mechanisms, logging, and checkpointing so the crawler can run for long periods without crashes. Extracted fields such as drug name, active ingredients, dosage, indications, contraindications, side effects, interactions, and pharmacology details will be parsed and stored in a structured format (JSON/CSV or database such as PostgreSQL/MongoDB) for easy analysis.
₹4,000 INR in 4 days
0.0

I have extensive experience building scalable web scrapers for login-protected sites using Python tools like Selenium and Scrapy, ensuring robust session management and error handling. For your project, I propose developing a Python-based scraper that logs in securely, navigates through pagination, extracts all specified drug data fields, and stores the data in MongoDB for efficient querying. The solution will include retry mechanisms to prevent crashes during long runs, with clear documentation and optional scheduling via cron. I estimate completion within 3-4 weeks. I’ve previously delivered similar large-scale medical data scraping projects with successful outcomes. I would love to chat more about your project! Regards, Adriaan Potgieter.
₹6,250 INR in 7 days
0.0

Hi, I have strong experience building large-scale web scrapers for authenticated and JavaScript-heavy websites. I can develop a reliable scraper that logs in, navigates the drug repository, handles pagination, and extracts structured medical data efficiently. I typically use Python with Playwright/Scrapy and implement robust error handling, logging, and resume mechanisms for long runs. Data can be delivered in JSON/CSV or a database like MongoDB. I can also include automated scheduling if needed. Happy to share similar scraping work and discuss timelines.
₹5,000 INR in 7 days
0.0

Hello, I can build a reliable and scalable web scraping solution to extract structured drug data from the login-protected medical repository. I have experience developing large-scale scrapers that handle authentication, dynamic pages, and long-running crawls without interruptions. The scraper will automatically log in, navigate through the drug repository, handle pagination, and extract the required fields such as drug name, active ingredients, indications, dosage information, contraindications, side effects, drug interactions, pharmacology details, and other structured data available on each page. The system will include retry logic, error handling, logging, and checkpointing to ensure the process continues smoothly even during long runs. For technology, I would use **Python with Playwright/Scrapy** (or Node.js with Playwright if preferred) to handle JavaScript-rendered pages and session management. The extracted data can be stored in **JSON, CSV, or a database such as MongoDB/PostgreSQL** for easy analysis and future updates. You will receive a fully functional scraping script, a clean structured dataset, and clear documentation explaining how to run and maintain the tool. Automated scheduling can also be added if regular updates are required. Estimated timeline: **2–3 weeks**, depending on the size and complexity of the repository. I would be happy to review the target site and discuss the exact data fields to ensure everything is captured accurately.
₹7,000 INR in 7 days
0.0

Hello, I have strong experience building large-scale web scraping systems using Python (Selenium, Scrapy, BeautifulSoup) for login-protected and dynamic websites. I have already completed a similar project where I scraped complete drug information from Tata 1mg, extracting detailed medical data for thousands of medicines. I can build a robust, scalable scraper with login handling, pagination, error recovery, and structured storage (JSON/CSV/DB). Estimated timeline for the initial working scraper is around 5–7 days depending on the website structure. I can also share examples and ensure clean, well-documented code.
₹1,500 INR in 5 days
0.0

Hello, I can help develop a reliable web scraping solution to extract structured data from your login-protected drug repository. For projects like this, I usually build a scalable scraper that can handle authentication, large datasets, and long scraping runs without interruptions.
My approach would include:
• Automated login handling with session management
• Navigating through repository pages and pagination
• Extracting structured drug information such as name, ingredients, dosage, side effects, and interactions
• Handling dynamic JavaScript pages if needed
• Implementing error handling, logging, and retry mechanisms for long scraping runs
• Exporting the collected data into structured formats such as JSON, CSV, or a database
Technologies I would likely use: Python with Selenium / Scrapy / BeautifulSoup depending on the site structure. These tools are reliable for handling login sessions, dynamic content, and large-scale data extraction.
The final deliverables will include:
• A fully functional scraping script
• Clean structured dataset
• Documentation explaining how to run and maintain the scraper
Quick question: Does the website use heavy JavaScript rendering or is the data available directly in the page source? Looking forward to learning more about the platform. Best regards, Bannu
₹7,000 INR in 7 days
0.0

Hello, I can build a robust web scraper for the login-based drug repository website to extract large volumes of structured data efficiently. I have experience working on web scraping and data extraction projects, where I developed scripts that scrape website data and store the extracted information in CSV and Excel sheets for reporting and analysis. For this project, I will use Python with Scrapy/Selenium (or Playwright) to handle login sessions, dynamic JavaScript content, and large-scale crawling. The scraper will systematically navigate the repository pages and extract structured drug data.
The solution will include:
• Secure login session handling
• Automated crawling of drug repository pages
• Extraction of fields such as drug name, ingredients, dosage, side effects, interactions, etc.
• Handling of pagination and large datasets
• Error handling and logging to prevent interruptions
• Export of data into JSON / CSV / database formats
• Clean, well-documented source code for easy maintenance
The scraper will be designed to run reliably for large datasets and avoid crashes during long scraping processes. Estimated delivery time: 5–7 days. I would be happy to discuss the website structure and ensure the scraper collects all required drug data accurately. Best regards, Vaidehi Panchal
₹7,000 INR in 7 days
0.0

Hello, I’m Mpumelelo Mabena, and I am confident in delivering a reliable, scalable web scraping solution tailored to your login-protected medical/drug repository. My skill set positions me well to execute this successfully. I understand the need for a clean, professional scraper that efficiently handles session-based login, dynamic JavaScript pages, and large-scale data extraction with error handling built-in. With expertise in AI automation, web development, and digital solutions, I can create a seamless, automated scraping tool that stores structured data in your preferred format. While I am new to Freelancer, I have strong real-world experience and have completed multiple successful projects off the platform. Could you share your preferred timeline and any priorities for maintaining or updating the scraper post-delivery?
₹6,650 INR in 14 days
0.0

As an experienced and detail-oriented developer, I understand the challenges you face with web scraping large-scale, login-protected sites. My extensive knowledge in Python, involving Scrapy, Selenium, and BeautifulSoup combined with my proficiency in Node.js using Playwright and Puppeteer has equipped me with the expertise to build a robust web scraper for your needs. I have been successful in extracting vast amounts of data from similar challenging sites before, and assure you I can deliver a fully functional scraping solution that navigates through your medical/drug repository website systematically and efficiently. Moreover, my strong understanding of handling dynamic JavaScript-rendered pages guarantees that no information will be missed from each drug page during the extraction process. I also possess excellent error handling and logging skills that will help prevent undesired crashes and interruptions during your long scraping runs. Complementing these abilities is my familiarity with storing structured data in formats like JSON, CSV, or databases (including MongoDB) - ensuring your extracted data is clean, well-structured, and easy to analyze. Our dedication to delivering high-quality projects means we do not compromise on scalability or efficiency. We understand your need for a solution capable of handling large volumes of data without significant setbacks.
₹7,000 INR in 7 days
0.0

New Delhi, India
Payment method verified
Member since Jul 18, 2019