Create software or a script to convert multi-column PDFs to text
Paid on delivery
I am looking for a skilled developer to create software or a script that converts multi-column PDF documents into text. The input will always be a multi-column PDF. The number of columns varies from page to page within the same file: some pages have 2 columns, others 3 or 4, with no predictable pattern. Sample PDFs are attached.
Table structures and images can be ignored.
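To illustrate the kind of per-page handling this implies, here is a minimal sketch of one possible approach, not a required design. It assumes each word on a page is already available as a bounding box `(x0, x1, top, text)` in page units (for example from a PDF library's word extraction, such as pdfplumber's `extract_words()`). The names `find_gutters` and `page_to_text` and the `min_gap` and `line_tol` defaults are hypothetical. The sketch looks for vertical bands of whitespace between columns, so it works whether a page has 2, 3 or 4 columns.

```python
from typing import List, Tuple

# Assumed input format: (x0, x1, top, text) for every word on one page.
Word = Tuple[float, float, float, str]


def find_gutters(words: List[Word], page_width: float,
                 min_gap: float = 15.0) -> List[float]:
    """Return the x midpoints of empty vertical bands at least min_gap wide."""
    size = int(page_width) + 1
    covered = [False] * size
    for x0, x1, _top, _text in words:
        for x in range(max(0, int(x0)), min(size - 1, int(x1)) + 1):
            covered[x] = True

    gutters = []
    start = None
    for x, is_covered in enumerate(covered):
        if not is_covered and start is None:
            start = x
        elif is_covered and start is not None:
            # A run starting at x=0 is the left margin, not a gutter.
            if start > 0 and x - start >= min_gap:
                gutters.append((start + x) / 2)
            start = None
    # A run still open here is the right margin, so it is ignored.
    return gutters


def page_to_text(words: List[Word], page_width: float,
                 min_gap: float = 15.0, line_tol: float = 3.0) -> str:
    """Emit text column by column, left to right, top to bottom."""
    gutters = find_gutters(words, page_width, min_gap)
    columns: List[List[Word]] = [[] for _ in range(len(gutters) + 1)]
    for word in words:
        center = (word[0] + word[1]) / 2
        columns[sum(center > g for g in gutters)].append(word)

    lines = []
    for column in columns:
        column.sort(key=lambda w: (w[2], w[0]))
        current, current_top = [], None
        for x0, _x1, top, text in column:
            # Start a new line when the vertical position jumps.
            if current_top is not None and abs(top - current_top) > line_tol:
                lines.append(" ".join(t for _, t in sorted(current)))
                current, current_top = [], None
            if current_top is None:
                current_top = top
            current.append((x0, text))
        if current:
            lines.append(" ".join(t for _, t in sorted(current)))
    return "\n".join(lines)
```

A known limitation of this sketch: a headline or footer that spans the full page width covers the gutters and would hide the columns, so such lines would need to be detected and handled separately.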
- The software/script can be written in any programming language, such as Python, Java, or C++.
- The software/script must run on Linux (any distribution, Debian/Ubuntu preferred).
Please provide examples of similar projects you have worked on and your proposed approach for this project.
I am aware that 100% accuracy cannot be achieved, but a minimum success rate of 90% is required.
The script will be tested against PDF files very similar to those provided here.
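The exact way the success rate will be measured is not fixed here. As one possible illustration only, the sketch below compares the extracted text with a hand-made reference text word by word, using Python's standard `difflib.SequenceMatcher`. The function name `word_accuracy` and the choice of a word-level similarity ratio are assumptions, not the agreed acceptance test.

```python
from difflib import SequenceMatcher


def word_accuracy(extracted: str, reference: str) -> float:
    """Similarity between two texts as a ratio in [0, 1], compared word by word."""
    a = extracted.split()
    b = reference.split()
    if not a and not b:
        return 1.0
    # autojunk=False keeps frequent words from being ignored in long texts.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()
```

Under this assumed metric, a page would pass if `word_accuracy(output, reference) >= 0.90`. Text read across columns instead of down them scores low because the word order no longer matches the reference.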
Project ID: #37328066