We want to scrape real estate listings from 3 different websites. More websites will be added later, ideally through configuration alone (for example, by configuring the regex for each variable). We will provide a briefing with all values marked on the respective listing pages.
The goal of the tool is to scrape the newest listings so we can review them and, if needed, post them on our own platform (with a reference to the source, of course) instead of copying and pasting them by hand.
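As a rough illustration of "one regex per variable, per website", the configuration could be a plain hash that a single generic extractor reads. The site name, field names, and patterns below are made up for the example and are not taken from the briefing.

```ruby
# Hypothetical per-site configuration: one regex per variable.
# Site names, field names, and patterns are illustrative assumptions.
SITE_CONFIG = {
  "example_site" => {
    price: /<span class="price">\s*([\d.,]+)\s*<\/span>/,
    rooms: /(\d+)\s*rooms/i,
    title: /<h1[^>]*>(.*?)<\/h1>/m
  }
}.freeze

# Apply every configured regex to the page HTML and return the first
# capture group for each variable (nil when a pattern does not match).
def extract_fields(site, html)
  SITE_CONFIG.fetch(site).transform_values do |pattern|
    match = pattern.match(html)
    match && match[1].strip
  end
end

html = '<h1>Bright flat</h1><span class="price"> 250,000 </span> 3 rooms'
fields = extract_fields("example_site", html)
```

With this shape, adding a website would mean adding one more entry to the hash rather than writing new scraping code. In a real setup the patterns would more likely live in the database and be edited through the admin interface.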
The scraper should have the following behavior:
- 1st we scrape the index/result pages to find newly added listings
- 2nd we visit and scrape individual listings
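The two steps above can be sketched as follows, assuming listing links can be found with a per-site pattern and that already-scraped URLs are tracked somewhere (a set here, a database table in practice). All method names and the link pattern are hypothetical.

```ruby
require "set"

# Phase 1: pull listing URLs out of an index/result page.
# The link pattern is a hypothetical example; each site would have its own.
def listing_urls(index_html, link_pattern)
  index_html.scan(link_pattern).flatten.uniq
end

# Keep only URLs that have not been scraped before.
def new_listings(urls, seen)
  urls.reject { |url| seen.include?(url) }
end

# Phase 2: visit each new listing. `fetch` is any callable returning the
# page body (for example a wrapper around Net::HTTP), injected here so the
# flow can be exercised without network access.
def scrape_new(index_html, link_pattern, seen, fetch)
  new_listings(listing_urls(index_html, link_pattern), seen).map do |url|
    seen << url
    { url: url, body: fetch.call(url) }
  end
end

index_html = '<a href="/listing/1">A</a><a href="/listing/2">B</a>'
seen = Set.new(["/listing/1"])
fake_fetch = ->(url) { "page for #{url}" }
results = scrape_new(index_html, /href="(\/listing\/\d+)"/, seen, fake_fetch)
```

Here only `/listing/2` is visited, because `/listing/1` is already in the seen set.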
The scraper should not scrape more than 5 listings per hour per website, to avoid unnecessary load and to keep the scraper from being banned.
The scraper has to be developed in Ruby (or RoR), and we will use Active Admin as the CRUD interface for data management. The database will be PostgreSQL.
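One possible way to enforce the 5-listings-per-hour limit is a sliding-window counter kept per website. This is a minimal in-memory sketch with a hypothetical class name; a production version would presumably persist the timestamps, for example in PostgreSQL.

```ruby
# Sliding-window limiter: at most `limit` listing fetches per `window`
# seconds for each website. Class and method names are hypothetical.
class ListingRateLimiter
  def initialize(limit: 5, window: 3600)
    @limit = limit
    @window = window
    @history = Hash.new { |hash, site| hash[site] = [] }
  end

  # Returns true and records the fetch if the site is under its limit;
  # returns false otherwise so the caller can postpone the listing.
  def allow?(site, now = Time.now)
    recent = @history[site]
    recent.reject! { |time| now - time >= @window }
    return false if recent.size >= @limit

    recent << now
    true
  end
end

limiter = ListingRateLimiter.new
start = Time.at(0)
first_five = Array.new(5) { |i| limiter.allow?("site_a", start + i) }
sixth = limiter.allow?("site_a", start + 10)
other_site = limiter.allow?("site_b", start + 10)
after_hour = limiter.allow?("site_a", start + 3600)
```

The clock is passed in as an argument so the behavior can be checked without waiting an hour. Each website has its own window, so a busy site does not block the others.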
You should also be able to set this up on Amazon AWS (EC2/RDS/S3). We might also integrate the tool you develop into an existing setup.
In addition, you should know how to use proxies and be able to implement proxy support.
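For the proxy requirement, a minimal sketch using Ruby's standard `Net::HTTP`, which accepts a proxy host and port as its third and fourth constructor arguments. The proxy hosts below are placeholders, and round-robin rotation is only one possible strategy.

```ruby
require "net/http"
require "uri"

# Hypothetical proxy pool; hosts and ports are placeholders.
PROXIES = [
  { host: "proxy1.example.com", port: 8080 },
  { host: "proxy2.example.com", port: 8080 }
].freeze

# Build a Net::HTTP client for `url` that routes through the next proxy
# in round-robin order. No connection is opened until a request is made.
def http_client_for(url, proxy_cycle)
  uri = URI(url)
  proxy = proxy_cycle.next
  http = Net::HTTP.new(uri.host, uri.port, proxy[:host], proxy[:port])
  http.use_ssl = uri.scheme == "https"
  http
end

proxy_cycle = PROXIES.cycle
first = http_client_for("https://www.example.com/listing/1", proxy_cycle)
second = http_client_for("https://www.example.com/listing/2", proxy_cycle)
```

Each call takes the next proxy from the pool, so consecutive requests to the same website leave from different addresses.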
Finally, please add the result of 6+5=? to your bid, so we know you're not a bid bot.
25 freelancers are bidding an average of $591 for this job
You can count on my expertise in scrapers and crawlers. Having delivered many similar projects in the past, I look forward to discussing and starting the project in line with your requirements.
6+5=11. I have done this before for an Australian client. This is not trivial, and if you are not careful your IP will eventually get banned from their servers.
6+5=11. Hi, I'm good at web scraping. Last year I created a private Ruby gem for scraping a specific site. I am experienced with the ActiveAdmin gem and PostgreSQL databases. I can also deploy it on EC2. Thanks, Mezbah Alam