I need you to develop some software for me. I would like it developed for Mac using Java or Python, whichever is cheaper; it doesn't have to be Anthelion. I need an open-source crawler/scraper bot for semantic domain searches. An Apache platform is fine. Anthelion, set up for my needs, would work.
I need a semantic vector domain search for the URLs of e-magazines I use for research. The schema is simple at a high level: input all magazine titles/seed URLs used, followed by phrases taken directly from the cited article (good for plagiarism detection), author name, topic, etc. Remove any proprietary data, and output the magazine and web page indexed URLs, search domain, and timestamp.
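As a rough illustration only (not a spec), the input and output records described above might look something like this in Python. All class, function, and field names here are placeholders I made up for the sketch, and taking the search domain from the page URL's host name is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List
from urllib.parse import urlparse


@dataclass
class SearchQuery:
    """Input record for one search (hypothetical field names)."""
    seed_urls: List[str]                               # magazine titles/seed URLs
    phrases: List[str] = field(default_factory=list)   # phrases quoted from the article
    author: str = ""
    topic: str = ""


@dataclass
class SearchResult:
    """Output record for one indexed page (hypothetical field names)."""
    magazine_url: str
    page_url: str
    search_domain: str
    timestamp: str


def make_result(magazine_url: str, page_url: str) -> SearchResult:
    # Assumption: the search domain is the host part of the indexed page URL.
    domain = urlparse(page_url).netloc
    # UTC timestamp in ISO 8601 form, recorded when the result is built.
    stamp = datetime.now(timezone.utc).isoformat()
    return SearchResult(magazine_url, page_url, domain, stamp)
```

For example, `make_result("https://example.com/mag", "https://example.com/mag/article-1")` would give a record whose search domain is `example.com`. The example.com URLs are stand-ins, not real magazines.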
That's it. The UI will run from the web. I haven't done this kind of setup before, so if there are hidden costs, let me know.