High Volume, Batch Processing of Obituary Text

In progress. Posted Jan 18, 2012. Paid on delivery.

We need a software developer with excellent English language skills and experience in Natural Language Processing and Text Analytics to create a parsing program for us that can read and extract the key data elements from the text of an American obituary.

The input file will consist only of a unique ID and a paragraph of obituary text (see attached sample). The input file will be in CSV format.

Your program needs to read the input file and create a CSV output text file that contains:

- unique ID (link to the input)

- deceased name

- locations (city, state)

- Birth Year

- Birth Month

- Birth Day
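As a rough illustration of the input and output shape described above, here is a minimal Python sketch. It assumes the input CSV has no header row and exactly two columns (ID, text), and it only handles birth dates written as "born [on] Month D, YYYY". The output column names and the regular expression are hypothetical, and name and location extraction are left blank because they need real NLP work beyond this sketch.

```python
import csv
import io
import re

# Month names mapped to month numbers (1-12).
MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Hypothetical pattern: matches e.g. "born on March 3, 1931" or "born March 3 1931".
BORN_RE = re.compile(
    r"\bborn\s+(?:on\s+)?([A-Z][a-z]+)\.?\s+(\d{1,2}),?\s+(\d{4})")


def extract_birth(text):
    """Return (year, month, day) if a birth date is found, else None."""
    match = BORN_RE.search(text)
    if not match:
        return None
    month = MONTHS.get(match.group(1).lower())
    if month is None:
        return None
    return int(match.group(3)), month, int(match.group(2))


def process(infile, outfile):
    """Read (id, text) rows from infile and write one output row per record."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    # Assumed column names; the posting only lists the fields, not their labels.
    writer.writerow(["id", "name", "locations",
                     "birth_year", "birth_month", "birth_day"])
    for row in reader:
        if len(row) < 2:
            continue  # skip malformed rows
        rec_id, text = row[0], row[1]
        birth = extract_birth(text)
        year, month, day = birth if birth else ("", "", "")
        # Name and locations are left empty in this sketch.
        writer.writerow([rec_id, "", "", year, month, day])


if __name__ == "__main__":
    sample = io.StringIO(
        '1,"John Smith, 80, of Dayton, Ohio, was born on March 3, 1931, in Akron."\n'
        '2,"No date here."\n')
    result = io.StringIO()
    process(sample, result)
    print(result.getvalue())
```

A real solution would need many more date formats (abbreviated months, numeric dates, age-based inference) than this single pattern covers.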

Input files may contain anywhere from a few thousand records to as many as 100,000, so your program must be reasonably fast (capable of processing a minimum of 50,000 records per hour).
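To put the speed requirement in concrete terms, 50,000 records per hour leaves 3600 / 50,000 = 0.072 seconds per record on average. The sketch below shows one way a bidder might check a parser against that budget. The function names are hypothetical, and `parse` stands in for whatever per-record routine is being timed.

```python
import time

TARGET_PER_HOUR = 50_000  # minimum throughput stated in the posting


def seconds_per_record(records_per_hour):
    """Average time budget per record for a given hourly throughput."""
    return 3600.0 / records_per_hour


def measure_records_per_hour(parse, texts):
    """Time parse() over texts and extrapolate to records per hour."""
    start = time.perf_counter()
    for text in texts:
        parse(text)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return len(texts) / elapsed * 3600.0


if __name__ == "__main__":
    # str.lower is only a placeholder for a real parsing function.
    rate = measure_records_per_hour(str.lower, ["Born on March 3, 1931."] * 10_000)
    print(f"budget per record: {seconds_per_record(TARGET_PER_HOUR):.3f} s")
    print(f"meets target: {rate >= TARGET_PER_HOUR}")
```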

We're flexible on your choice of technology. However, we use Linux, Python and SQLite internally (as well as Windows systems), so those technologies are familiar to us. (But since this will be a "black box" to us, it doesn't really matter what you write it in so long as we can execute the program on our systems.)

We have larger sample datasets available for you to review if necessary (however there are lots of obituaries on the Internet you can look at too - they're all pretty much the same).

If you're interested in bidding on this, we're offering a payment agreement that recognizes your program's success. Some obituaries cannot be parsed because the data isn't available (or the text is corrupt). You first need to identify these and flag them as incomplete. Of the rest, we consider anything better than 95% parsed successfully as complete and eligible for full payment.

% Parsed Successfully    % of Fee Earned
95%                      100%
90%                      90%
85%                      75%
80%                      65%
75%                      50%
65%                      30%
50%                      25%

* The above is based on a 10,000-record sample file, which your program must process in under 15 minutes.
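The payment schedule above can be read as a simple threshold lookup. The sketch below is one interpretation, not the client's stated formula. It assumes each listed percentage is a minimum for its tier (so 95% itself earns the full fee), that the success rate is computed only over records not flagged as incomplete, and that anything under 50% earns nothing, which the posting does not specify.

```python
# (minimum % parsed successfully, % of fee earned), highest tier first.
FEE_TIERS = [(95, 100), (90, 90), (85, 75), (80, 65),
             (75, 50), (65, 30), (50, 25)]


def fee_percent(parsed_ok, total, incomplete=0):
    """Return the % of the fee earned under the assumed tier rules.

    parsed_ok:  records parsed successfully
    total:      records in the sample file
    incomplete: records flagged as unparseable, excluded from the rate
    """
    eligible = total - incomplete
    if eligible <= 0:
        raise ValueError("no eligible records")
    for threshold, fee in FEE_TIERS:
        # Integer comparison avoids floating-point edge cases at tier boundaries.
        if parsed_ok * 100 >= threshold * eligible:
            return fee
    return 0  # assumption: below the lowest listed tier earns nothing


if __name__ == "__main__":
    # 10,000-record sample, 500 flagged incomplete, 9,000 parsed correctly:
    # 9000 / 9500 is about 94.7%, which falls in the 90% tier.
    print(fee_percent(9000, 10_000, incomplete=500))
```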

Please bid the cost to complete this project to the 100% level. If we select your bid, you will be paid in milestone payments beginning at the 50% level, as per the table above.

Include with your bid a brief description in the PMB of the technology and process you plan to use.

Thanks for your interest in this project!

Skills: Data Processing, Natural Language, Python, Software Architecture

Project ID: #1395939

About the project

13 proposals. Remote project. Active Jan 21, 2012

13 freelancers have bid an average of $1375 for this job

Blender3D

I have a bit of personal experience with natural language processing (mostly with NLTK and Python), so I've seen this sort of data before. If you don't mind, I'll try processing the data you provided and PM you a sampl…

$800 USD in 14 days
(20 reviews)
5.4
jdavisp3

I have extensive experience in Python and Linux, and my communication skills in English are excellent. I write clean Python code with unit tests.

$2500 USD in 30 days
(5 reviews)
5.1
hegazy

I am good at text parsing and processing tools, and my PhD was in NLP.

$2500 USD in 30 days
(6 reviews)
4.3
EngineerCat

Hello, I'm sure I can make the program you need. Please check my PM.

$1500 USD in 10 days
(5 reviews)
3.9
MiguelLam

Hi, I can help you do this job in Python, but it would be faster in C since it's faster at processing logic. Please check your PMB. Kind regards.

$500 USD in 6 days
(1 review)
3.2
Viqtor

Will provide info and description via PM.

$500 USD in 0 days
(1 review)
1.5
bolochka

Hello, I can do this project. I have won one project before. Please send me a reply. Best regards, Bolochka

$1999 USD in 5 days
(0 reviews)
0.0
MAJIDALIKHAN15

I have four years of experience in data entry. I offer my services to you for this project. If you are interested, then email me. I will give you my contact details. I am working for Interesting in Data Entry Projects f…

$500 USD in 15 days
(0 reviews)
0.0
jkbbwr

I think I could take this project on and complete it on time. Thanks for the chance.

$700 USD in 9 days
(0 reviews)
0.0
rulsyah

Hi sir. I've seen this kind of pattern before, and I believe the tool or programming language doesn't matter, as long as it can perform text processing fast. I do have some experience parsing English text into tables and fields with NLP. I co…

$775 USD in 12 days
(0 reviews)
0.0
sweetatyagiaz

Hi, we are very good developers. We are ready to do this work. Kindly don't go by the bid price; see the private message for details, then decide. Thanks.

$4000 USD in 30 days
(0 reviews)
0.0
vegard1992

Hello denali, willing to take on this project :) I'm an experienced Python programmer, and I honestly have some experience with NLP. I speak English (US English, that is) perfectly well.

$600 USD in 15 days
(0 reviews)
0.0
sahebjade

I understand your requirement and shall not disappoint you.

$1000 USD in 15 days
(0 reviews)
0.0