I have a data set of 10,000 txt files that I have already cleaned and preprocessed in R (I will provide the code I have written; the data are in the "corpus" format). I now need to merge them into a single csv file (where 1 row = 1 file). Then I need to convert that file (*.csv) to the following output format (see [login to view URL]):
[NumTerm1] [term_1]:[count] [term_2]:[count] ... [term_N]:[count]
[NumTerm2] [term_1]:[count] [term_2]:[count] ... [term_N]:[count]
where [NumSent] is the number of sentences in the document, [NumTerm1] is the number of unique terms in sentence 1, and each [term]:[count] pair gives the term id and the number of times that term appears in the sentence.
The word id information is in the "[login to view URL]" file.
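To make the target format concrete, here is a rough Python sketch of the conversion for a single document. It rests on several assumptions that are not confirmed above: the word id file has one term per line and the id is the zero-based line number, sentences are still separated by ".", "!" or "?" after cleaning, and the [NumSent] line comes first in each document block. The names load_vocab and encode_document are made up for illustration.

```python
import re
from collections import Counter


def load_vocab(lines):
    """Build a term -> id map. ASSUMPTION: one term per line, id = zero-based line number."""
    return {line.strip(): idx for idx, line in enumerate(lines) if line.strip()}


def encode_document(text, vocab):
    """Return the output lines for one document.

    ASSUMPTIONS: sentences end with '.', '!' or '?'; the first line is [NumSent];
    terms missing from the vocabulary are skipped, and so are sentences that
    contain no known terms.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lines = []
    for sentence in sentences:
        # Count how often each known term id occurs in this sentence.
        counts = Counter(vocab[t] for t in sentence.lower().split() if t in vocab)
        if not counts:
            continue
        pairs = " ".join(f"{term_id}:{n}" for term_id, n in sorted(counts.items()))
        # len(counts) is the number of unique terms in the sentence ([NumTermK]).
        lines.append(f"{len(counts)} {pairs}")
    return [str(len(lines))] + lines


vocab = load_vocab(["cat", "sat", "mat"])
for line in encode_document("cat sat cat. mat sat.", vocab):
    print(line)
```

If the cleaning step in R has already removed punctuation, the sentence boundaries would have to be kept some other way (for example one sentence per line) and the split pattern adjusted accordingly.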
The outputs of the project will be the .csv file and the .docs file.
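For the merge step (1 row = 1 file), something along these lines would be enough. It is only a sketch in Python and assumes the cleaned texts have been exported from R as plain UTF-8 .txt files in one folder. The function name merge_txt_to_csv and the column names "file" and "text" are placeholders, not requirements.

```python
import csv
from pathlib import Path


def merge_txt_to_csv(txt_dir, csv_path):
    """Write one CSV row per .txt file: file name, then the full text on a single line.

    ASSUMPTION: files are UTF-8 plain text and all sit directly in txt_dir.
    """
    files = sorted(Path(txt_dir).glob("*.txt"))
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["file", "text"])
        for path in files:
            # Collapse line breaks and repeated spaces so each file fits in one row.
            text = " ".join(path.read_text(encoding="utf-8").split())
            writer.writerow([path.name, text])
    return len(files)
```

Calling merge_txt_to_csv("cleaned_txt", "corpus.csv") (both paths are hypothetical) returns the number of files written, which should come to 10,000 for the full data set.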