The purpose of this project is to build a document similarity retrieval app in Flask, running on Docker. The app should vectorize text documents using as many unsupervised embedding algorithms as possible (word2vec, fastText, GloVe, SIF-word2vec, etc.) and expose a REST API that accepts requests and returns responses. At this point I'm not pursuing TensorFlow-based embeddings, but depending on how the project proceeds, some may be implemented later, e.g. Universal Sentence Encoder.
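To make the retrieval step concrete, here is a minimal sketch of what I have in mind, not a specification. It embeds each document as the plain average of its word vectors and ranks documents by cosine similarity to the query. The vector table, the corpus, the `handle_similarity_request` function and the `query` / `top_k` request fields are all hypothetical names invented for illustration. In the real app the vectors would come from a trained model (for example one loaded through Gensim) and the handler would sit behind a Flask route.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real deployment would load vectors from a trained word2vec/fastText/GloVe model.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "pet": [0.7, 0.3, 0.1],
    "stock": [0.0, 0.9, 0.4],
    "market": [0.1, 0.8, 0.5],
    "price": [0.0, 0.7, 0.6],
}

# Hypothetical document store: id -> raw text.
CORPUS = {
    "d1": "the cat and the dog",
    "d2": "stock market price",
}


def embed(text, vectors=TOY_VECTORS):
    """Average the vectors of all known tokens; return None if none are known."""
    tokens = [t for t in text.lower().split() if t in vectors]
    if not tokens:
        return None
    dim = len(next(iter(vectors.values())))
    return [sum(vectors[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def handle_similarity_request(payload, corpus=CORPUS, vectors=TOY_VECTORS):
    """Hypothetical handler: takes a dict shaped like a JSON request body
    and returns a dict shaped like a JSON response body."""
    query_vec = embed(payload["query"], vectors)
    if query_vec is None:
        return {"results": []}
    scored = []
    for doc_id, text in corpus.items():
        doc_vec = embed(text, vectors)
        if doc_vec is not None:
            scored.append({"id": doc_id, "score": cosine(query_vec, doc_vec)})
    # Highest similarity first, truncated to the requested number of hits.
    scored.sort(key=lambda r: r["score"], reverse=True)
    return {"results": scored[: payload.get("top_k", 5)]}
```

Calling `handle_similarity_request({"query": "pet", "top_k": 1})` ranks `d1` first with these toy vectors. Plain averaging is only the simplest baseline. Weighted schemes such as SIF would change `embed` while leaving the ranking code untouched, which is why the embedding step is kept as a separate function.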
Please download a detailed project specification from here (remove asterisks *): https://www.*mediafire*.com/file/l0y12pbge2oqndc/[login to view URL]
Please provide a list of all the embedding algorithms that you are able to implement in C/C++/Python. These may already be implemented in Gensim, or you can take the code from GitHub repositories that accompany published papers.
You are free to propose any modifications that seem fit and increase efficiency based on your experience, including the price.
Requirements: I would like the developer to know C/C++/Python, Docker, nginx, Gensim, Elasticsearch, Ubuntu, GPU programming, and MySQL, and to have deep experience in NLP and semantic textual similarity.
If the project is developed smoothly and to a high standard, an extension of the project will be offered.
If you have experience in this area, please suggest 5 unsupervised embedding algorithms to implement in your proposal.
15 freelancers have bid an average of $615 for this job
Hi, I'm very interested in your post. As a senior Python developer, I can help you with this. I have rich experience with Python/Django/Flask/ML/AI/scraping. Let's discuss in more detail via private chat. Thanks.
I am an ML/NLP research scientist. I have worked with tf-idf, word2vec, GloVe, ELMo, fastText, BERT, and XLNet, and have used most of the tech you mentioned. Let's discuss the rest in a chat.