Hello. Here is the task:
[login to view URL] in English and Russian
2. HTML texts, UTF-8 encoded
What I need:
1. A text (HTML) analysis module. It needs to find keywords from the dictionaries in the texts and assign a topic name to each text according to the dictionaries.
2. A topic can be assigned only if at least 2 words from the dictionary/dictionaries (in total) are found in the text.
This program (the text analysis module) should be built as a "black box" with an API.
Here is how it should work: I send HTML texts from a database to the analysis module in a request. The module analyses each text and returns the name of its topic (if a topic is found according to the keywords from the dictionaries). The analysis module should be fast.
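To make the expected behaviour concrete, here is a rough sketch of the matching rule in standard C++17. It is only an illustration under several assumptions that are not part of the task: the names `Dictionaries`, `stripTags`, `uniqueWords` and `classify` are made up, all dictionaries of one topic are merged into a single keyword set, keywords are stored in lowercase, the topic with the most distinct hits wins, and only ASCII letters are lowercased (Russian words are matched byte for byte as written). Stemming, HTML entities and the transport layer of the API are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical layout: topic name -> lowercase keywords of that topic
// (all dictionaries of one topic merged into one set).
using Dictionaries = std::map<std::string, std::set<std::string>>;

// Replace every <...> tag with a space and keep the text between tags.
std::string stripTags(const std::string& html) {
    std::string text;
    bool inTag = false;
    for (char c : html) {
        if (c == '<') { inTag = true; text += ' '; }
        else if (c == '>') { inTag = false; }
        else if (!inTag) { text += c; }
    }
    return text;
}

// Split text into distinct words. Bytes >= 0x80 are treated as word
// characters so multi-byte UTF-8 letters (e.g. Cyrillic) stay intact.
// Assumption: only ASCII letters are lowercased.
std::set<std::string> uniqueWords(const std::string& text) {
    std::set<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (c >= 0x80 || std::isalnum(c)) {
            current += static_cast<char>(c < 0x80 ? std::tolower(c) : c);
        } else if (!current.empty()) {
            words.insert(current);
            current.clear();
        }
    }
    if (!current.empty()) words.insert(current);
    return words;
}

// Return the topic with the most distinct keyword hits, provided it has
// at least minHits of them; otherwise return an empty optional.
// On a tie, the topic that sorts first by name is kept.
std::optional<std::string> classify(const std::string& html,
                                    const Dictionaries& dicts,
                                    std::size_t minHits = 2) {
    const std::set<std::string> words = uniqueWords(stripTags(html));
    std::optional<std::string> best;
    std::size_t bestHits = 0;
    for (const auto& [topic, keywords] : dicts) {
        std::size_t hits = 0;
        for (const auto& keyword : keywords) hits += words.count(keyword);
        if (hits >= minHits && hits > bestHits) {
            best = topic;
            bestHits = hits;
        }
    }
    return best;
}
```

With a dictionary such as `{"sports", {"football", "goal", "match"}}`, calling `classify("<p>The Football match ended with a late goal.</p>", dicts)` would yield `"sports"`, while a text containing only one keyword yields no topic. Each text is scanned once and each keyword costs one set lookup, which should be fast enough for a first version.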
C++/Qt would be preferred.
This should be an easy task for a data scientist.