I need a website crawler. I have some base code (it scans for meta tags one site at a time, and you have to enter each site manually). It would need to:
1. Scan websites, and log them in the database.
2. Pick up links from sites and index those.
3. Never stop unless it's out of links to index.
4. If the site has no meta tags, take content from the page.
5. Copy some content from each page (about 150 words) and place that in the database.
6. Allow me to change a setting for whether or not a site can be scanned twice (a check box).
I'm guessing roughly 60 lines of code. I'd love to have a pause button, but it's not necessary.
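
To make the requirements concrete, here is a minimal sketch of how they might fit together, assuming Python and only its standard library (the language of the base code is not stated, so that is an assumption). The names PageParser, fetch, crawl, the pages table layout, and the allow_rescan flag are all hypothetical; allow_rescan stands in for the check box, and the 150-word limit is the snippet_words parameter.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects the meta description, outgoing links, and visible words."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.links = []
        self.words = []
        self._skip = 0  # depth inside <script>/<style>, whose text is ignored

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.split())


def fetch(url):
    """Download a page as text (real network call; replaceable for testing)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_urls, db, allow_rescan=False, fetch=fetch, snippet_words=150):
    """Crawl until the queue of links is empty; return how many pages were stored."""
    db.execute("CREATE TABLE IF NOT EXISTS pages "
               "(url TEXT PRIMARY KEY, description TEXT, snippet TEXT)")
    queue = deque(start_urls)
    seen = set()  # URLs handled in this run, so link cycles cannot loop forever
    stored = 0
    while queue:  # requirement 3: stop only when there are no links left
        url = urldefrag(queue.popleft()).url  # drop any "#fragment" part
        if url in seen or not url.startswith(("http://", "https://")):
            continue
        seen.add(url)
        known = db.execute("SELECT 1 FROM pages WHERE url = ?", (url,)).fetchone()
        if known and not allow_rescan:  # requirement 6: the "scan twice" setting
            continue
        try:
            html = fetch(url)
        except Exception:
            continue  # unreachable or broken page: skip it and keep going
        parser = PageParser()
        parser.feed(html)
        snippet = " ".join(parser.words[:snippet_words])  # requirement 5
        description = parser.description or snippet  # requirement 4: fallback
        db.execute("INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
                   (url, description, snippet))
        db.commit()
        stored += 1
        queue.extend(urljoin(url, link) for link in parser.links)  # requirement 2
    return stored
```

A run could look like `crawl(["https://example.com/"], sqlite3.connect("crawl.db"))`, where the start URL and database file name are placeholders. In this sketch a page that is already stored is skipped entirely when allow_rescan is off, so its links are not followed again. Nothing here limits the crawl to one domain or honours robots.txt, and a pause button would need extra work, for example checking a flag at the top of the loop.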