Overview

In this project you will read in a large set of data taken from the web and then perform searches on this data. You will develop an interactive program that responds with a list of websites that match your query. You will use different methods for searching.

Introduction -- Web search

In this assignment you will write a number of different algorithms for solving the same problem. The problem is to search a set of documents (web pages in this case) for a given word or words. You will be given one or more directories as input. In each directory there will be a set of files. Each file is a single web page and will be named fxxxxx, where each x is a numeric digit (there are exactly 5 digits). The one exception is a "special" file found in each directory, named index. It contains a mapping of the file number (the xxxxx part) to the actual URL for that file. Your program must be able to find all files/URLs that match a specific query. You will build a structure similar to an inverted index in order to do your search.

When your program is started, the user will specify the set of directories to consider. Your program is to read the data in from those directories and then prompt the user for a request. Your program will then parse the user's request and respond accordingly. The user's command will be one of the following: search the files to find matches to a set of keywords, add a directory to the list of directories to be considered, or quit the program. The search and add commands have a number of different options associated with them.
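To make the inverted-index idea concrete, here is a minimal C++ sketch. It is an illustration under stated assumptions, not the required design. All names (InvertedIndex, tokenize, addDocument, search, parseUrlMap) are hypothetical. The sketch assumes that a word is a run of letters or digits, that matching is case-insensitive, that a multi-word query must match every keyword, and that each line of the index file holds a file number followed by a URL. None of these details are fixed by the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types: each word maps to the set of file numbers containing it.
typedef std::set<int> Postings;
typedef std::map<std::string, Postings> InvertedIndex;

// Split text into lowercase alphanumeric words (an assumed definition of "word").
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Record every word of one page under its file number (the xxxxx in fxxxxx).
void addDocument(InvertedIndex& index, int fileNumber, const std::string& text) {
    assert(fileNumber >= 0 && fileNumber <= 99999);  // file names carry five digits
    std::vector<std::string> words = tokenize(text);
    for (std::size_t i = 0; i < words.size(); ++i) {
        index[words[i]].insert(fileNumber);
    }
}

// Return the sorted file numbers of pages that contain every keyword
// (AND semantics is an assumption; the brief does not define the options).
std::vector<int> search(const InvertedIndex& index, const std::string& query) {
    std::vector<std::string> words = tokenize(query);
    std::vector<int> result;
    for (std::size_t i = 0; i < words.size(); ++i) {
        InvertedIndex::const_iterator it = index.find(words[i]);
        if (it == index.end()) return std::vector<int>();  // keyword never seen
        if (i == 0) {
            result.assign(it->second.begin(), it->second.end());
        } else {
            std::vector<int> narrowed;
            std::set_intersection(result.begin(), result.end(),
                                  it->second.begin(), it->second.end(),
                                  std::back_inserter(narrowed));
            result.swap(narrowed);
        }
    }
    return result;
}

// Parse a directory's index file, ASSUMING one "xxxxx URL" pair per line.
std::map<int, std::string> parseUrlMap(std::istream& in) {
    std::map<int, std::string> urls;
    int number;
    std::string url;
    while (in >> number >> url) urls[number] = url;
    return urls;
}
```

In use, the program would call addDocument once per fxxxxx file in each directory, call parseUrlMap on that directory's index file, and then translate the file numbers returned by search into URLs before printing them. A std::map keeps the sketch short; a hash table would serve the same role.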
** Please note that I have some functional code that implements a hash table and a file vector. You may choose to build on that code or write everything from scratch yourself.

1) Complete and fully functional working program(s) in executable form, as well as complete source code of all work done, by MONDAY April 14th 2003, 10 pm EST.
Sun Solaris 8, g++ compiler, command-line program