First, scrape the backlog of approximately 100,000-150,000 property listings over a suitable period (perhaps 2-4 weeks), and thereafter set the system to scrape new listings continually.
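To give a feel for the crawl rate the backlog implies, here is a rough back-of-envelope sketch. It assumes one request per listing and an even spread across the whole period; both are simplifications of ours, and the function name `pacing` is purely illustrative.

```python
def pacing(total_listings: int, days: int) -> tuple[float, float]:
    """Return (listings per day, seconds between requests) for an evenly spread crawl.

    Assumption: one request per listing, running 24 hours a day.
    """
    per_day = total_listings / days
    seconds_between = 86_400 / per_day  # 86,400 seconds in a day
    return per_day, seconds_between


# The slowest and fastest cases suggested by the figures in this brief.
for total, days in [(100_000, 28), (150_000, 14)]:
    per_day, gap = pacing(total, days)
    print(f"{total} listings over {days} days: {per_day:.0f}/day, one every {gap:.1f}s")
```

Photo downloads and search-result pages would add to the request count, so the real rate is likely to be higher than this estimate.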
We need the usual data such as URL, listing ID, listing date, price, location, beds, baths, description, and photos. More details will be shared via PM.
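As a starting point for discussion, one listing record might look something like the sketch below. The field names and types are assumptions on our part (for example, photos stored as a list of URLs and price as a plain number), and the sample values are made up; the final schema will follow the details shared via PM.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class Listing:
    # Field names and types are illustrative guesses, not a fixed schema.
    url: str
    listing_id: str
    listing_date: date
    price: float
    location: str
    beds: int
    baths: int
    description: str
    photo_urls: list[str] = field(default_factory=list)

    def to_document(self) -> dict:
        """Convert to a JSON-friendly dict suitable for a document store."""
        doc = asdict(self)
        doc["listing_date"] = self.listing_date.isoformat()
        return doc


# Made-up sample record, for illustration only.
example = Listing(
    url="https://example.com/listing/123",
    listing_id="123",
    listing_date=date(2024, 1, 15),
    price=250000.0,
    location="Example Town",
    beds=3,
    baths=2,
    description="Sample description.",
    photo_urls=["https://example.com/photos/123-1.jpg"],
)
print(json.dumps(example.to_document(), indent=2))
```

Keeping the record as a flat, JSON-serializable document should make the later import into MySQL more straightforward.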
This data will eventually be imported into our current system and MySQL database. For now, however, you may wish to store it in a NoSQL format, and we will deal with the import at a later date. All we ask is to be able to see how many listings have been scraped to date and how many are scraped each day, so we can see the system working. An email alert could be sent if no new listings have been added.
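The reporting and alert we have in mind could be as simple as the sketch below. It assumes each stored listing carries the date it was scraped and that "no new listings" means none on a given calendar day. The function names, the sample dates, and the recipient address are all hypothetical. The sketch only builds the alert message; actually sending it (for example via `smtplib`) is left out.

```python
from collections import Counter
from datetime import date
from email.message import EmailMessage
from typing import Iterable, Optional


def daily_counts(scraped_dates: Iterable[date]) -> Counter:
    """Count how many listings were scraped on each calendar day."""
    return Counter(scraped_dates)


def build_alert(counts: Counter, day: date, recipient: str) -> Optional[EmailMessage]:
    """Return an alert email if no listings were scraped on `day`, else None."""
    if counts.get(day, 0) > 0:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Scraper alert: no new listings on {day.isoformat()}"
    msg["To"] = recipient
    msg.set_content(f"No new listings were scraped on {day.isoformat()}.")
    return msg


# Made-up scrape dates, standing in for a query against the listing store.
scraped = [date(2024, 1, 15)] * 3 + [date(2024, 1, 16)] * 2
counts = daily_counts(scraped)

print("Total scraped to date:", sum(counts.values()))
for day in sorted(counts):
    print(day.isoformat(), counts[day])

alert = build_alert(counts, date(2024, 1, 17), "alerts@example.com")
if alert is not None:
    print(alert["Subject"])
```

A small daily job running something like this would cover both the running total and the per-day figures.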
Our preference would be to use Python.