Brief description of my current application:
I have a running application called news aggregator. The code run's every 5 minutes to collect news from different sources (rss feeds) and i save them into a mysql database table.
Now what i wan't to achieve is to perform a Hierarchical Agglomerative Clustering/Classification of those articles every time i run the code (5 minutes) and detecting the category of the articles as well.
Description of the task:
- Classification/Clustering service
For example a python class: class ClusteringService(object):
* The object is the array (that will contain the articles for processing)
- The service should be able to detect the category of each article from a trained model/machine learning.
- The service should be able to cluster similar articles from a trained model/machine learning.
Example news: Twitter releases a new app
For the same topic, different media has reported in different times:
BBC - 22 Hours ago, Yahoo News - 15 Hours ago, CNN - 4 Hours ago
Note: see the attached picture.
So, because the aggregator runs every 5 minutes to collect news, the service should be able to handle articles max 2 days old.
For example: Twitter releases a new app - BBC - 22 Hours ago, Yahoo News - 15 Hours ago
For this topic a cluster (cluster id: 342) has been created 15 hours ago, but when the service runs today, CNN has reported for the same topic so the service should put the same article to the previously created cluster (cluster id: 342) and avoid creating second cluster for the same topic.
In the end the service should return an array of clusters from where i can insert/save them into a database.
The cluster table in my mysql database looks like (schema):
id , article_ids, size, score
Please do not hesitate if you have any suggestions or better solution/idea.
32 freelanceria on tarjonnut keskimäärin 263$ tähän työhön
HI, I am data scientist and have good experience in python and R programming. My area of interest is statistical Analysis of dataset and apply ML/deep learning algorithm. I can intern your tasks. Kind Regards
Hi I'm an expert. If you hire me, i won't let you down. I can provide you all these things with unlimited revisions till the satisfactorily completion of [login to view URL] you
Hi, I am sure that I can perfectly fulfill the ML algorithm as your requirement. - category detect - article cluster I 'd like to discuss it together and work for you. Thanks