Running Billions of Crawlers ($50)


Hello! Thank you for your interest in this topic.

The title might look strange, but it has an actual purpose.

If you have ever run a web crawler or scraper, I guess you have met one common dilemma.

It's about the speed of the crawler. When you run just a single instance of a script, a scraping job can take days to complete.

Or even longer than days: perhaps weeks, even months, when running a single-instance scraper.

To compensate for this, in most cases we set up multiple instances of the crawler.

But even with that method there is a certain limitation, especially when you are one person with one physical computer.

The number of instances is restricted by the computer's specs: what kind of processor it has and how much RAM it has.

To go beyond that limit, I came up with the idea that maybe I could use cloud computers and run the crawler instances across multiple cloud machines.
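
To make the idea concrete, here is a minimal sketch of one way the work could be divided so that each cloud machine crawls its own share of URLs without overlapping the others. The function name `shard_urls` and the example.com URLs are hypothetical, and round-robin splitting is only one of several possible schemes.

```python
def shard_urls(urls, num_instances):
    """Split a URL list into one shard per instance (round-robin).

    Each URL lands in exactly one shard, so no two instances
    crawl the same page.
    """
    shards = [[] for _ in range(num_instances)]
    for index, url in enumerate(urls):
        shards[index % num_instances].append(url)
    return shards


# Hypothetical input: 10 pages split across 3 cloud instances.
urls = [f"https://example.com/page/{i}" for i in range(10)]
shards = shard_urls(urls, 3)
for number, shard in enumerate(shards, start=1):
    print(f"instance {number} gets {len(shard)} URLs")
```

Each shard would then be handed to the crawler running on the matching instance, for example as an input file or a command-line argument.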

I've already posted something similar in the past. Please take a look at the Pastebin link below.

[login to view URL]

But so far I only have the idea and have never seen whether it's actually possible. So I need a specific, detailed demonstration of it.

As you can see, the title talks about running billions of crawlers. But to be really honest, I don't need that many.

What I need is to see it actually happening and to understand how it could be done.

I'm 100% sure there must be a way to run billions of crawlers.

I'm fine even if it's not a cloud-based method. If you have seen even a single case of it,

I expect you'd know how it's done.

That is the knowledge sharing I am expecting in this topic.

[login to view URL]

[login to view URL]

^ Related articles worth a look :)

Question 1.

Let's say we have 3 instances running on a cloud computing service (with Windows as the OS),

named instance 1, instance 2, and instance 3.

And let's say we have a very simple Python script that opens Notepad in Windows and types words into it.

We then set up that Python script inside all 3 of these instances.

Can we have one overseeing script that decides to write '123' in instance 1, '456' in instance 2, and '789' in instance 3?

Hopefully without entering each instance and editing the script one by one, just controlling everything from one script.

Would this be possible on a cloud computing service?
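
To illustrate what such an overseeing script might look like, here is a minimal sketch. It assumes the worker script on each instance (hypothetically named `type_text.py`) takes the text to type as a command-line argument, so the script itself never has to be edited per instance. The `run_remote` function is only a stand-in: in a real setup it would be replaced by whatever remote-execution mechanism the cloud provider offers (for example SSH), which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical assignment of text to each instance.
ASSIGNMENTS = {
    "instance-1": "123",
    "instance-2": "456",
    "instance-3": "789",
}


def build_command(text):
    # Same worker script everywhere; only the argument differs.
    # "type_text.py" is a hypothetical name for the Notepad-typing script.
    return ["python", "type_text.py", text]


def run_remote(instance, command):
    # Stand-in for real remote execution (an assumption, not a real API).
    # Here it only reports what would be run on which instance.
    return f"{instance}: {' '.join(command)}"


def dispatch(assignments, runner=run_remote):
    # Send every instance its own command in parallel and collect results.
    with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
        futures = {
            name: pool.submit(runner, name, build_command(text))
            for name, text in assignments.items()
        }
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    for line in dispatch(ASSIGNMENTS).values():
        print(line)
```

The design point is that the per-instance difference lives in the controller's `ASSIGNMENTS` table rather than in the worker script, so changing what each instance types means editing one place only.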

Question 2.

If you have the knowledge, please let me know which programming languages are most commonly used for this kind of parallel work across cloud computers.

Essential Note 1.

Please record your screen showcasing your answers to the questions above, to demonstrate them.

I would appreciate it if this could be recorded as if it were a tutorial. :)

Thank you for reading! Have a good day, and bid any time if you think you can complete this.

I am willing to respond very quickly to explain in more detail what exactly I need.

Also, if you do really well on this project, I promise to carry out further projects with you. Thanks a lot.

Skills: Cloud Computing, Google Cloud Platform, Virtual Machines, Web Crawling, Web Scraping


About the employer:
(13 reviews) Incheon, Korea, Republic of

Project ID: #19426380

6 freelancers have bid an average of $90 for this job


Hi. I perfectly understand your problem and we already work on a solution for this. Soon we will have a new service and will go public with it. Shortly we want to build a network of personal computers connected and d...

$111 USD in 10 days
(42 reviews)

Hi Employer, I have checked this link [login to view URL] can extract any data from any [login to view URL] I have scraped the data from these websites & stored scraped data into CSVs and excels as per my c...

$77 USD in 3 days
(8 reviews)

Wow, that's a huge project description. So yes, it is possible to run multiple instances of the same script in the same or different VMs without overriding each other's work. How? Let's assume you have 1 million urls to sc...

$50 USD in 3 days
(1 review)

Hi, I'm an experienced Golang programmer & have been awarded as an Alibaba Cloud MVP. I have a lot of experience in writing scrapers and programming the cloud. I will help you to write scrapers. Relevant Skills and Ex...

$50 USD in 3 days
(0 reviews)

Hello sir, I have worked on a similar problem where more than 200 generic crawlers crawl around 5000 websites. This can be achieved by using Docker (please google it to know more about Docker containers) and maintainin...

$111 USD in 3 days
(0 reviews)

Allow us to take this opportunity to be introduced to you. I represent Advance bots, the company that provides custom software solutions and all kinds of web automation bots. Also we have developed software for: • Ti...

$140 USD in 7 days
(0 reviews)