Convert PDF to text - repost

We need to process existing PDF files - they must be OCRed, converted to text, and then converted to clean HTML. Please find enclosed a sample of what you will be dealing with. If you are interested in this job, please submit a processed result of the attached sample (2 pages).

Please also state which OCR software you use and which version. You MUST use OCR software; no manual typing is allowed.

The result MUST NOT contain headers, footers, or page numbers from the magazine, just clean text with only very basic formatting suitable for publishing on a web site. Bold, italics, underline, paragraphs, headings, basic tables, uppercase, lowercase, superscript, and subscript should be preserved; all other formatting (text flow, columns, unusual fonts, page numbers, etc.) should be discarded. The desired result is clean, readable text with clean formatting.
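As an illustration only, the formatting rule above can be read as a tag whitelist applied to whatever HTML the OCR software exports. The sketch below uses Python's standard `html.parser`; the tag set `ALLOWED_TAGS`, the class `BasicFormattingFilter`, and the function `clean_html` are assumptions made for this example, not part of the project requirements. It does not remove running headers, footers, or page numbers, which still need a separate pass.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist, derived from the formatting the posting asks to preserve.
ALLOWED_TAGS = {"p", "h1", "h2", "h3", "b", "i", "u", "sup", "sub",
                "table", "tr", "th", "td"}


class BasicFormattingFilter(HTMLParser):
    """Keep text and whitelisted tags; drop all other tags and all attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes (fonts, styles, classes) are discarded on purpose.
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so characters such as & and < stay valid HTML.
        self.parts.append(escape(data, quote=False))


def clean_html(raw_html):
    """Return raw_html reduced to the whitelisted basic formatting."""
    parser = BasicFormattingFilter()
    parser.feed(raw_html)
    parser.close()
    return "".join(parser.parts)
```

For example, `clean_html('<p style="font-family:Weird"><span><b>Title</b></span></p>')` keeps the paragraph and the bold text but drops the `span` wrapper and the `style` attribute.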

The result must be submitted in clean HTML (such as that produced by [url removed, login to view] - you can use that if you wish, or anything else you prefer) and separated into individual, concise articles.
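One possible way to separate cleaned HTML into articles is sketched below. It assumes, purely for illustration, that every article opens with a plain `<h1>` heading; the posting does not specify how article boundaries are marked, and the function name `split_articles` is hypothetical.

```python
import re


def split_articles(html):
    """Split cleaned HTML into chunks, each starting at an <h1> heading."""
    # The zero-width lookahead keeps each heading with the article it opens.
    # Splitting on empty matches requires Python 3.7 or later.
    chunks = re.split(r"(?=<h1>)", html)
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

Any text before the first heading comes back as its own chunk, so stray leftovers are easy to spot during proofreading.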

The most important consideration - the result should be as typo-free as humanly possible. You are not supposed to check for grammatical correctness, just that the recognized text is the same as in the source file. You MAY correct an obvious typo in the original text, but don't have to.


1. Bid as if for 1000 pages of text.

2. Send your processed file ([url removed, login to view]) as ONE clean HTML file, free of any errors and processed according to the requirements of the project.

3. There is no need to write to us about what you can do and how much experience you have. We are NOT INTERESTED in any of that. Just deliver the processed sample if you are interested in this job.

Please note that the whole point of this exercise is to evaluate YOUR PERFORMANCE, so do not send uncorrected garbage straight out of your OCR software - that is NOT HELPFUL, and your bid and any messages will most likely be ignored. We need to see what kind of RESULT you can deliver (and your ability to stick to the instructions provided), so try hard to impress us if you are interested in this job - not by WORDS, but by your WORK.

We will select several providers (if any suitable ones can be found) for this job, as we have many files to process. This can turn into a long-term job for you if you can deliver.

Thank you for looking!

Skills: Copy Typing, Data Entry, Data Processing, OCR, PDF

About the Employer:
( 0 reviews ) Ahmedabad, India

Project ID: #4227746

Awarded to:


Let's transform the pdf into html

$40 USD in 5 days
(0 reviews)

I can do this job.

$200 USD in 10 days
(0 reviews)