We have an email scraper that can find emails on the web based on URLs or company names. A screenshot of it is attached.
Source: <[login to view URL]> (8.68 MB)
If we import a list of company names, the scraper searches the search engines for each company's URL, then looks for emails associated with that URL. It then extracts names from the emails using a list of common names I have. For example, "Jennifer" would be extracted from "jennifersmith@[login to view URL]".
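The name-extraction logic described above might look roughly like this; the names list here is only a tiny illustrative sample, and the matching rule (longest common name that starts the local part of the address) is an assumption about how the real list is applied:

```python
import re

# Illustrative sample; the real scraper uses a much larger common-names list.
COMMON_NAMES = {"jennifer", "michael", "sarah", "david"}

def extract_name(email):
    """Return the first common name found at the start of the email's local part."""
    local = email.split("@", 1)[0].lower()
    # Strip digits and punctuation so "jennifer.smith99" still matches.
    local = re.sub(r"[^a-z]", "", local)
    # Try longer names first so "jennifer" wins over a shorter prefix.
    for name in sorted(COMMON_NAMES, key=len, reverse=True):
        if local.startswith(name):
            return name.capitalize()
    return ""
```

With this sketch, `extract_name("jennifersmith@example.com")` yields `"Jennifer"`, and an address like `info@example.com` yields an empty string.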
We want to make this a bit more complex:
1) we import a list of company names or URLs (this part is done)
2) system looks for URLs based on company names if needed (this part is done)
3) system scans website files (Web option) and looks for emails there. See how it is done in this software: [login to view URL] It would be good if the system could scan XML files, not just HTML
4) we want the system to detect which XML files a website tries to load and analyze those XML / TXT files. Is this possible? See illustration: <[login to view URL]>
5) if the system is not able to find any emails on the Web, it tries to look for them on Google (this part is done)
6) system writes all found emails as separate records to a text file on the computer
7) system extracts names from the emails and records them in a separate column (this part is done and we will give you the name-extraction code / logic)
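Step 3 above (scanning page source for addresses) usually comes down to running an email regex over the downloaded HTML or XML. A minimal sketch, with error handling and crawling of multiple pages left out:

```python
import re
import urllib.request

# A simple, permissive email pattern; it will catch most plain-text addresses
# in HTML and XML source but not obfuscated ones like "name [at] example.com".
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails(url):
    """Download one page (HTML or XML) and return the unique emails found in it."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        text = resp.read().decode("utf-8", errors="ignore")
    return sorted(set(EMAIL_RE.findall(text)))
```

Because the regex works on raw text, the same function covers both HTML and XML pages, which is the behavior requested in step 3.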
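For step 4, the XML / TXT files a page references can be pulled out of its static HTML by scanning `href` and `src` attributes; this is a sketch of that idea, not the tool's actual behavior:

```python
import re

# Matches href/src attribute values that point at .xml or .txt resources.
RESOURCE_RE = re.compile(
    r"""(?:href|src)\s*=\s*["']([^"']+\.(?:xml|txt))["']""",
    re.IGNORECASE,
)

def linked_data_files(html):
    """Return the .xml / .txt resources a page references in href/src attributes."""
    return sorted(set(RESOURCE_RE.findall(html)))
```

One caveat worth noting for the "Is this possible?" question: files a page loads only at runtime via JavaScript will not appear in the static HTML, so catching those would require driving a real browser (e.g. with a headless-browser tool). It is possible, but it is noticeably more work than static parsing.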
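Steps 6 and 7 together amount to writing one record per line with the extracted name in a second column. The tab-separated layout and file name here are assumptions for illustration, since the brief does not specify an output format:

```python
def save_records(records, path="emails.txt"):
    """Append each (email, name) pair as a tab-separated line; return the lines written."""
    lines = [f"{email}\t{name}" for email, name in records]
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
    return lines
```

Appending (mode `"a"`) rather than overwriting lets the scraper add records across multiple runs; a tab-separated file also opens cleanly in Excel if the columns need to be reviewed.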