Write a Hadoop MapReduce program that computes the frequency (number of occurrences) of each bigram
that appears in the lines of a collection of documents.
A bigram consists of two consecutive words in a line of text. For example, consider the line
"srm cs it students". This line has exactly three bigrams: "srm cs", "cs it"
and "it students", each with a count of 1 (you do not need to consider the
reverse order). You need to count the frequencies of bigrams across the whole collection of documents. Your
two output files should look like the following picture (bigrams and frequencies):
1. hadoop jar [login to view URL] BigramCount input-directory output-directory
2. input-directory : any folder name that contains a number of text files
3. output-directory : any folder name that contains your output results
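
The map and reduce logic the task asks for can be sketched locally before writing the actual Hadoop job. The following is a minimal Python simulation, not Hadoop code: the function names (map_line, reduce_counts, bigram_count) are hypothetical, and it assumes words are separated by whitespace, matching is case-sensitive, punctuation is left attached to words, and bigrams never span two lines. The posting does not specify any of these details, so adjust them as needed.

```python
from itertools import groupby


def map_line(line):
    """Map phase: emit (bigram, 1) for each pair of consecutive words in one line."""
    words = line.split()  # assumption: whitespace tokenization, no normalization
    for first, second in zip(words, words[1:]):
        yield (f"{first} {second}", 1)


def reduce_counts(bigram, counts):
    """Reduce phase: sum all the 1s emitted for a single bigram."""
    return (bigram, sum(counts))


def bigram_count(lines):
    """Run map, then a sort that stands in for the shuffle, then reduce."""
    pairs = [pair for line in lines for pair in map_line(line)]
    pairs.sort(key=lambda kv: kv[0])  # group identical keys together
    return dict(
        reduce_counts(key, (count for _, count in group))
        for key, group in groupby(pairs, key=lambda kv: kv[0])
    )


if __name__ == "__main__":
    sample = ["srm cs it students", "cs it students"]
    # Print one "bigram<TAB>count" record per line.
    for bigram, count in bigram_count(sample).items():
        print(f"{bigram}\t{count}")
```

In a real job, map_line would correspond to the Mapper and reduce_counts to the Reducer, with the framework doing the sort and grouping between them. Because the reduce step is a plain sum, the same logic could also serve as a combiner.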
I have 2 years of working experience with Spark and distributed computing. I have solved many problems using Spark, and several of them contained sub-problems like this.