We are seeking a computer programmer to build a custom web scraper for the HeinOnline law database. The scraper would take a custom list of international treaties as input and output the number of hits (the number of unique articles that mention the treaty) for each treaty, by year, from 1945 to 2014. Ideally, we would want the results in a list format as well as a total for each treaty in each year. We would prefer that the programmer use a publicly available, open source scraping framework, such as Scrapy, so that the scraper/crawler could be adjusted if our needs or the HeinOnline database change. We would provide access to the HeinOnline database.
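As a rough illustration of the output described above, the sketch below counts unique articles per treaty per year. It assumes the scraper yields `(treaty, year, article_id)` tuples; that record shape, the function name `tally_hits`, and the sample data are hypothetical and not part of any HeinOnline interface.

```python
from collections import defaultdict

FIRST_YEAR, LAST_YEAR = 1945, 2014  # year range named in the posting


def tally_hits(records):
    """Count unique articles per (treaty, year).

    `records` is an iterable of (treaty, year, article_id) tuples.
    This shape is an assumption made for illustration only.
    """
    seen = defaultdict(set)
    for treaty, year, article_id in records:
        if FIRST_YEAR <= year <= LAST_YEAR:
            # A set drops repeat mentions of the same article.
            seen[(treaty, year)].add(article_id)
    return {key: len(ids) for key, ids in seen.items()}


# Made-up sample data, not real search results.
sample = [
    ("Charter of the United Nations", 1945, "a1"),
    ("Charter of the United Nations", 1945, "a1"),  # duplicate article
    ("Charter of the United Nations", 1945, "a2"),
    ("Charter of the United Nations", 1946, "a3"),
    ("Charter of the United Nations", 1930, "a4"),  # outside 1945-2014
]

totals = tally_hits(sample)
for (treaty, year), count in sorted(totals.items()):
    print(treaty, year, count)
```

The per-year totals come from the same dictionary that could back the list-format output, so both views stay consistent.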
The scraper would require the following features:
• The ability to search for a list of user-specified keywords
• A GUI for the scraper so that we would be able to run the scraper at the time of our choosing and without reliance on the programmer
• The ability to search multiple query names for a single treaty
• A sustainable program that could be executed each year
• The scraper must be able to search for treaty names as exact phrases, with the words in the order they appear
(e.g., “Charter of the United Nations” rather than “Charter” … “United Nations”)
• The output should also include metrics for the scrape itself such as date/time
• The scraper must be able to access HeinOnline at varying intervals (1 second, 3 seconds, 7 seconds) so that we do not disturb the HeinOnline server
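Two of the requirements above, exact-phrase matching across several query names and spacing out requests, can be sketched in a few lines. This is only a minimal sketch under assumptions: the helper names (`phrase_pattern`, `mentions_treaty`, `paced`) are hypothetical, the matching runs on already-downloaded text rather than on HeinOnline's own search, and cycling through the 1, 3, and 7 second delays is one possible reading of "varying intervals".

```python
import itertools
import re
import time

DELAYS_SECONDS = (1, 3, 7)  # intervals listed in the posting


def phrase_pattern(name):
    """Regex matching the words of `name` in order, as one phrase."""
    words = [re.escape(word) for word in name.split()]
    # Lookarounds keep the phrase from matching inside longer words.
    return re.compile(r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)",
                      re.IGNORECASE)


def mentions_treaty(text, query_names):
    """True if any of the treaty's query names appears as a phrase."""
    return any(phrase_pattern(name).search(text) for name in query_names)


def paced(items, delays=DELAYS_SECONDS, sleep=time.sleep):
    """Yield items, pausing between them with cycling delays."""
    cycle = itertools.cycle(delays)
    for index, item in enumerate(items):
        if index:  # no pause before the first item
            sleep(next(cycle))
        yield item


names = ["Charter of the United Nations", "UN Charter"]
hit = mentions_treaty("The Charter of the United Nations was signed.", names)
miss = mentions_treaty("The Charter was praised by the United Nations.", names)

# A recording stand-in for time.sleep, so the demo does not actually wait.
waits = []
visited = list(paced(["q1", "q2", "q3", "q4", "q5"], sleep=waits.append))
print(hit, miss, waits)
```

In a Scrapy project the pacing would more likely be handled by the framework's own download-delay settings; the generator here only shows the idea.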
There is the possibility for additional work and the development of additional scraping tools depending on our needs and the performance of the developer. A sample of the query names from a previous scrape of the data is attached.
The developer should have proven experience in Python and web scraping. Data science expertise or previous experience with an open source scraping program is a plus. If you have any questions, please feel free to reach out.
Hi, I have great experience in website data extraction. I have done the extraction of many sites like [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view], [url removed, login to view] and many more. I have read th...
17 freelancers are bidding on average $240 for this job
Hi sir, I am a scraping expert and have done many similar projects; please check my feedback and you will see. Can you tell me more details? Then I will provide demo data for you. Thanks, Kimi
I am an expert in Python/scrapy, and have a lot of projects done here. I am interested in your project, please contact me to discuss more detailed requirements, thanks!
Hi there. I am an experienced web scraper. I work with an already developed API called Import.io. Please contact me to discuss further details of your project. Thanks, Daniel
I have already developed scrapers for one of my Indian clients and can show you the demo. Please let me know if you want it ASAP. -Subhasish