Clustering from an email corpus

There is a corpus of emails exchanged among different senders and receivers within an organization. A server stores all these emails along with their timestamps and sender and recipient names.

When a query string is submitted, the system should compute similarity measures for the entities (individuals in the organization, i.e. senders or receivers) by analyzing the context (the query string) against the email corpus.

The corpus is modeled as an undirected graph whose nodes represent entities within the organization. A link between two nodes represents e-mail correspondence between the two individuals.
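As a rough illustration of the graph described above, a minimal Java sketch might look like the following. The class name `CorrespondenceGraph` and its methods are hypothetical and not part of any existing code for this project. It also assumes that self-addressed emails can be ignored.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: undirected graph of who has corresponded with whom.
final class CorrespondenceGraph {
    // Each entity maps to the set of entities it has exchanged email with.
    private final Map<String, Set<String>> neighbors = new TreeMap<>();

    // Records one email; direction is discarded because the graph is undirected.
    void addEmail(String sender, String recipient) {
        if (sender.equals(recipient)) {
            return; // assumption: self-addressed emails carry no link information
        }
        neighbors.computeIfAbsent(sender, k -> new TreeSet<>()).add(recipient);
        neighbors.computeIfAbsent(recipient, k -> new TreeSet<>()).add(sender);
    }

    // True if the two entities have exchanged at least one email.
    boolean linked(String a, String b) {
        return neighbors.getOrDefault(a, Collections.emptySet()).contains(b);
    }

    // All entities that appear in at least one recorded email.
    Set<String> nodes() {
        return Collections.unmodifiableSet(neighbors.keySet());
    }

    public static void main(String[] args) {
        CorrespondenceGraph g = new CorrespondenceGraph();
        g.addEmail("alice", "bob");
        g.addEmail("carol", "dave");
        System.out.println(g.nodes() + " alice-bob linked: " + g.linked("bob", "alice"));
    }
}
```

Storing each link in both directions keeps lookups symmetric, so `linked("alice", "bob")` and `linked("bob", "alice")` always agree.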

Each link is assigned a weight proportional to a similarity measure: how closely the query being searched (the context) matches the content of the emails exchanged between the two individuals.
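One possible way to compute such link weights is sketched below. It is only an illustration under stated assumptions: the posting does not specify a similarity measure, so the sketch swaps in plain term-frequency cosine similarity between the query and the combined text of all emails exchanged by a pair. The class `QueryWeightedGraph`, the `"a|b"` pair key, and the `{sender, recipient, body}` array layout are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: weight each sender/recipient pair by query similarity.
final class QueryWeightedGraph {

    // Lower-cases the text and counts alphanumeric tokens.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity of two term-count vectors; 0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            double va = e.getValue();
            normA += va * va;
            Integer vb = b.get(e.getKey());
            if (vb != null) {
                dot += va * vb;
            }
        }
        for (int v : b.values()) {
            normB += (double) v * v;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    // Order-independent key so that a->b and b->a fall on the same undirected link.
    static String pairKey(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    // Each email is {sender, recipient, body} (assumed layout).
    static Map<String, Double> edgeWeights(List<String[]> emails, String query) {
        Map<String, Map<String, Integer>> pairCounts = new TreeMap<>();
        for (String[] e : emails) {
            Map<String, Integer> counts =
                pairCounts.computeIfAbsent(pairKey(e[0], e[1]), k -> new HashMap<>());
            termCounts(e[2]).forEach((term, c) -> counts.merge(term, c, Integer::sum));
        }
        Map<String, Integer> queryCounts = termCounts(query);
        Map<String, Double> weights = new TreeMap<>();
        pairCounts.forEach((pair, counts) -> weights.put(pair, cosine(queryCounts, counts)));
        return weights;
    }

    public static void main(String[] args) {
        List<String[]> emails = Arrays.asList(
            new String[] {"alice", "bob", "budget review for the quarter"},
            new String[] {"bob", "alice", "the budget looks fine"},
            new String[] {"carol", "dave", "lunch on friday"});
        System.out.println(edgeWeights(emails, "budget"));
    }
}
```

A real prototype would probably want TF-IDF weighting or another measure in place of raw term counts; the structure of the computation would stay the same.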

In addition, entities need to be clustered or grouped based on the query context against the email corpus.

This is a machine learning and AI problem. Spectral clustering can also be applied. Anyone with an extensive background in AI, data mining, and machine learning is welcome to post a bid. A report on the technique and the statistics/mathematics, along with prototype code, needs to be developed. I will provide existing code and a report to work from.
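To make the spectral clustering idea concrete, here is a simplified Java sketch that splits the entities into two groups only. It assumes a symmetric, non-negative weight matrix has already been built from the link weights and that every node has at least one positive-weight link. The class `SpectralBipartition` is hypothetical. The sketch approximates the second eigenvector of the normalized Laplacian by power iteration and splits nodes by sign; a full solution would more likely use several eigenvectors from a linear algebra library followed by k-means.

```java
import java.util.Arrays;

// Hypothetical sketch: two-way spectral split of a weighted undirected graph.
final class SpectralBipartition {

    // w must be symmetric and non-negative; returns a 0/1 label per node.
    static int[] cluster(double[][] w, int iterations) {
        int n = w.length;
        double[] invSqrtDeg = new double[n];
        double[] v0 = new double[n]; // known eigenvector D^{1/2} * 1, normalized below
        double totalDegree = 0;
        for (int i = 0; i < n; i++) {
            double d = 0;
            for (int j = 0; j < n; j++) {
                d += w[i][j];
            }
            if (d <= 0) {
                throw new IllegalArgumentException("node " + i + " has no weighted links");
            }
            invSqrtDeg[i] = 1.0 / Math.sqrt(d);
            v0[i] = Math.sqrt(d);
            totalDegree += d;
        }
        double v0Norm = Math.sqrt(totalDegree);
        for (int i = 0; i < n; i++) {
            v0[i] /= v0Norm;
        }

        // Deterministic, non-constant start vector.
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = Math.sin(i + 1.0);
        }

        // Power iteration on M = I + D^{-1/2} W D^{-1/2}, with v0 projected out.
        // The dominant remaining eigenvector of M is the second-smallest
        // eigenvector of the normalized Laplacian I - D^{-1/2} W D^{-1/2}.
        for (int it = 0; it < iterations; it++) {
            double[] y = new double[n];
            for (int i = 0; i < n; i++) {
                double s = x[i];
                for (int j = 0; j < n; j++) {
                    s += invSqrtDeg[i] * w[i][j] * invSqrtDeg[j] * x[j];
                }
                y[i] = s;
            }
            double dot = 0;
            for (int i = 0; i < n; i++) {
                dot += y[i] * v0[i];
            }
            double norm = 0;
            for (int i = 0; i < n; i++) {
                y[i] -= dot * v0[i];
                norm += y[i] * y[i];
            }
            norm = Math.sqrt(norm);
            if (norm == 0) {
                break; // nothing left orthogonal to v0 (for example, a single node)
            }
            for (int i = 0; i < n; i++) {
                y[i] /= norm;
            }
            x = y;
        }

        // Split nodes by the sign of their eigenvector entry.
        int[] labels = new int[n];
        for (int i = 0; i < n; i++) {
            labels[i] = x[i] >= 0 ? 0 : 1;
        }
        return labels;
    }

    public static void main(String[] args) {
        // Two tightly linked triangles joined by one weak link (illustrative data).
        double[][] w = {
            {0, 1, 1, 0, 0, 0},
            {1, 0, 1, 0, 0, 0},
            {1, 1, 0, 0.05, 0, 0},
            {0, 0, 0.05, 0, 1, 1},
            {0, 0, 0, 1, 0, 1},
            {0, 0, 0, 1, 1, 0}};
        System.out.println(Arrays.toString(cluster(w, 200)));
    }
}
```

Which group receives label 0 and which receives label 1 is arbitrary; only the grouping itself is meaningful.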

## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

Java, C# (.NET)

Skills: C Programming, C# Programming, Engineering, Java, MySQL, PHP, Software Architecture, Software Testing, Technical Writing


About the employer:
( 112 reviews ) United States

Project ID: #3242975

1 freelancer has bid on this job


See private message.

$85 USD in 10 days
(11 reviews)