In progress

Extract Sentences from HTML using the R framework

Qualifications

- You need to be highly proficient with R

- You need an understanding of text mining

- The project has to be done with R (not PHP or another programming language)

Project goals

- Read static HTML files

- Extract the meta title of each HTML file

- Remove the HTML markup and keep only the plain text

- Search the plain text for given keywords (search terms)

- Extract the sentence where the keyword occurs

- Extract the sentence before

- Extract the sentence after

- Build a text passage out of these three sentences
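
The steps above could be sketched in base R roughly as follows. This is only a minimal illustration under stated assumptions, not a finished solution: all function names (`read_html_file`, `extract_title`, `strip_html`, `split_sentences`, `keyword_context`) are hypothetical, HTML is handled with regular expressions rather than a real parser, HTML entities are not decoded, and sentences are assumed to end with `.`, `!` or `?` followed by whitespace.

```r
# Hypothetical helpers; base R only. Regex-based HTML handling is a
# simplification and will not cope with every real-world page.

# Read a static HTML file into a single string.
read_html_file <- function(path) {
  paste(readLines(path, warn = FALSE, encoding = "UTF-8"), collapse = "\n")
}

# Return the contents of the first <title> element, or NA if there is none.
extract_title <- function(html) {
  m <- regmatches(html, regexec("(?is)<title[^>]*>(.*?)</title>", html, perl = TRUE))[[1]]
  if (length(m) < 2) return(NA_character_)
  trimws(gsub("\\s+", " ", m[2]))
}

# Drop head/script/style blocks, then all remaining tags, and collapse whitespace.
strip_html <- function(html) {
  text <- gsub("(?is)<(script|style|head)\\b.*?</\\1>", " ", html, perl = TRUE)
  text <- gsub("(?s)<[^>]+>", " ", text, perl = TRUE)
  trimws(gsub("\\s+", " ", text))
}

# Naive sentence splitter: break after ., ! or ? when followed by whitespace.
split_sentences <- function(text) {
  parts <- strsplit(text, "(?<=[.!?])\\s+", perl = TRUE)[[1]]
  parts[nzchar(parts)]
}

# For every sentence containing the keyword (case-insensitive, literal match),
# return the sentence before, the matching sentence, and the sentence after,
# joined into one passage.
keyword_context <- function(sentences, keyword) {
  hits <- grep(tolower(keyword), tolower(sentences), fixed = TRUE)
  vapply(hits, function(i) {
    window <- max(1, i - 1):min(length(sentences), i + 1)
    paste(sentences[window], collapse = " ")
  }, character(1))
}

# Usage on an inline example document (made up for illustration).
html <- paste0(
  "<html><head><title>Demo page</title></head><body><p>",
  "R is a language. It is used for text mining. Many packages exist.",
  "</p></body></html>"
)
title     <- extract_title(html)
sentences <- split_sentences(strip_html(html))
context   <- keyword_context(sentences, "text mining")
```

For real files, `html` would come from `read_html_file()` instead of the inline string. A dedicated HTML parser and a proper sentence tokenizer would likely be more robust than the regular expressions used here, for example around abbreviations such as "e.g.".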

Example of output

- Title of HTML file

- Intro text

- Keyword #1 with 3 sentences (before, the one with the keyword and after)

- Keyword #2 with 3 sentences (before, the one with the keyword and after)

- Keyword #3 with 3 sentences (before, the one with the keyword and after)
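
One possible way to assemble this output in R is sketched below. It is an assumption-laden illustration: `format_report` is a hypothetical helper, the intro text is assumed to be supplied by the caller (the posting does not say where it comes from), and `contexts` is assumed to be a named list holding the three-sentence passages found for each keyword.

```r
# Hypothetical helper: builds the report lines for one HTML file.
# `contexts` is a named list; each element is a character vector of
# three-sentence passages for that keyword (possibly empty).
format_report <- function(title, intro, contexts) {
  lines <- c(paste("Title:", title), paste("Intro:", intro))
  for (keyword in names(contexts)) {
    passages <- contexts[[keyword]]
    if (length(passages) == 0) passages <- "(keyword not found)"
    lines <- c(lines, paste0("Keyword \"", keyword, "\": ", passages))
  }
  lines
}

# Example data, made up for illustration.
report <- format_report(
  title = "Demo page",
  intro = "R is a language.",
  contexts = list(
    "text mining" = "R is a language. It is used for text mining. Many packages exist.",
    "python" = character(0)
  )
)
cat(report, sep = "\n")
```

Returning a character vector rather than printing directly keeps the formatting step easy to redirect, for instance to `writeLines()` when one output file per HTML page is wanted.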

Skills: information systems architecture


About the Employer:
(0 reviews) Vienna, Austria

Project ID: #1717962

Awarded to:

danilonqueiroz

Hi, I have over 2 years of experience with the R language and machine learning, and with the development and implementation of academic projects on clustering analysis and data mining.

$275 USD in 5 days
(0 reviews)
0.0