549190 Meta Tag and WHOIS Scraper

In progress
Posted over 12 years ago

Paid on delivery
We're looking to do some research on a list of domain names. For each domain name we want to know the following: Domain, Company, Industry, Country, State, City, Zip/Postal, title tag, meta description, meta keywords. That's it.

The input will be the list of domain names pasted into a text area on a web page. The output should be a downloadable CSV or TAB delimited file I can load into Excel. There should also be visible output on the web page while the script runs so we can see progress.

Your script will have a list of lists that contains the industry information, formatted as specified below or in whatever way is easiest for you (though we should be able to add, edit, and delete entries in this list as much as we want; hardcoding it within the program is OK). The INDUSTRY "list of lists" could look like this:

Industry,tag1,tag2,tag3,...

Basically, if the domain name, home page title, meta description, or meta keywords ("the data fields") contain either the industry name or any of its tags, then that is the industry the domain should be assigned. Here's an example of what the industry lists might look like, but you can format them any way that works best for you.

$legalwords = array("legal", "law", "lawyer", "attorney", "advoca");
$consultantwords = array("consultant", "consult", "advisor");
$medicalwords = array("medical", "medicine", "doctor", "surgic", "stem cell", "scienc", "research", "laborat");
$contractorwords = array("contractor", "construction");

Since a company can match more than one industry, I'd like some logic that determines the most appropriate industry, maybe by counting how many matches there are in the data fields for each category.

The location information should come from the WHOIS database.

We need error checking built into the program so that if a domain no longer exists, or if it redirects elsewhere, the script does not crash but continues to the next URL.
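The match-counting idea for choosing the most appropriate industry could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name pickIndustry, the $industries array layout (industry name as key, tags as values), and the tie-breaking rule (first listed industry wins) are hypothetical choices, not part of the spec.

```php
<?php
// Hypothetical industry list: industry name => tags (from the examples in the spec).
$industries = array(
    "legal"      => array("legal", "law", "lawyer", "attorney", "advoca"),
    "consultant" => array("consultant", "consult", "advisor"),
    "medical"    => array("medical", "medicine", "doctor", "surgic", "stem cell",
                          "scienc", "research", "laborat"),
    "contractor" => array("contractor", "construction"),
);

// Count case-insensitive substring matches of each industry's name and tags
// across the data fields (domain, title, meta description, meta keywords).
// Returns the industry with the most matches, or "" if nothing matched.
// Assumption: on a tie, the industry listed first wins.
function pickIndustry(array $fields, array $industries)
{
    $haystack = strtolower(implode(" ", $fields));
    $best = "";
    $bestScore = 0;
    foreach ($industries as $name => $tags) {
        // The industry name itself counts as a tag; avoid counting it twice.
        $terms = array_unique(array_merge(array($name), $tags));
        $score = 0;
        foreach ($terms as $term) {
            $score += substr_count($haystack, strtolower($term));
        }
        if ($score > $bestScore) {
            $best = $name;
            $bestScore = $score;
        }
    }
    return $best;
}
```

Because matching is by substring, overlapping tags such as "law" and "lawyer" both count when the text contains "lawyer", so longer words are weighted more heavily. Whether that weighting is desirable is a design choice the spec leaves open.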
The output should be a TAB delimited file that we can easily load into Excel to do some analysis. That's the whole project.

Once the project is awarded to you, I will send you a list of sample domains and a more complete list of industries and tags.

When you reply, put the word "orange" in the subject line of your PM or BID. If you don't do that, I'll know you didn't read this spec completely, and I won't read your bid or PM.

I need this done in the next 12 hours, but that should be easy, as it's an extremely small and simple project for someone who knows PHP even reasonably well. And if you're an expert, this is probably an hour or less.

Thanks.
Mark
Project ID: 2295134

About the project

Remote project
Active 12 years ago

About the client

Winnetka, United States
5.0
6
Joined April 19, 2010

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)