Write a parser script for a big JSON file

WikiData is a project that attempts to collect data about everything in our world: people, cities, countries, foods, atoms, stars, and so on.

You can download their entire database from here in JSON format:

[login to view URL]:Database_download#JSON_dumps_(recommended)

Every entry in this database is identified by a Q code, a unique index number for that specific item. For example, [login to view URL] is the entry for a famous person.

We are interested in WikiData's data on people, for example that of [login to view URL]

Your job is to write a script, in any language you want, that analyzes the downloaded WikiData database file(s) and outputs an SQL insert file.

The script must extract 6 things from each person entry:
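As a rough sketch of the reading step: the Wikidata JSON dump is, as far as its documentation describes it, one very large JSON array with one entity object per line, so it can be streamed line by line instead of being loaded into memory at once. The helper names `open_dump` and `iter_entities` below are hypothetical, and the one-entity-per-line layout is an assumption that should be checked against the actual file.

```python
import bz2
import gzip
import json


def open_dump(path):
    """Open a dump file as text, handling .gz and .bz2 compression."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def iter_entities(lines):
    """Yield one entity dict per line, skipping the array brackets.

    Assumes each entity sits on its own line, followed by a comma.
    """
    for line in lines:
        line = line.strip().rstrip(",")
        if not line or line in ("[", "]"):
            continue
        yield json.loads(line)
```

A caller could then loop with `for entity in iter_entities(open_dump(path)):` and only ever hold one entity in memory at a time.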

1) Person's full name,

2) Person's given name (= first name),

3) Person's family name,

4) Person's gender,

5) Person's country of citizenship,

6) Person's native language.
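The six fields above could be read from an entity roughly as follows. This is a sketch under several assumptions: that the full name is taken from the entity's English label, that the other five fields correspond to the Wikidata properties P735 (given name), P734 (family name), P21 (sex or gender), P27 (country of citizenship) and P103 (native language), and that only the first statement of each property is used. The function names are hypothetical.

```python
# Property IDs as used on Wikidata (assumed mapping for this task).
FIELD_PROPERTIES = {
    "first_name": "P735",   # given name
    "family_name": "P734",  # family name
    "gender": "P21",        # sex or gender
    "country": "P27",       # country of citizenship
    "language": "P103",     # native language
}


def first_item_id(entity, prop):
    """Return the Q code of the first item-valued statement for prop, or None."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # "somevalue" / "novalue" snaks carry no datavalue
        value = snak.get("datavalue", {}).get("value")
        if isinstance(value, dict) and "numeric-id" in value:
            return "Q%d" % value["numeric-id"]
    return None


def extract_raw_fields(entity, lang="en"):
    """Collect the label as full name and the Q codes of the other five fields."""
    label = entity.get("labels", {}).get(lang, {}).get("value")
    fields = {"q_code": entity.get("id"), "full_name": label}
    for name, prop in FIELD_PROPERTIES.items():
        fields[name] = first_item_id(entity, prop)
    return fields
```

Note that the five property values come back as Q codes, not as text. Turning them into "Manuel", "Colombia" or "Spanish" would need a label lookup, for example a first pass over the dump that builds a Q code to label table. For gender, Q6581097 (male) and Q6581072 (female) could be mapped to "M" and "F".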

So, for the same person, [login to view URL], these values would be:

1) Full name = Manuel José Bonnet Locarno

2) Given name = Manuel

3) Family name = Bonnet

4) Gender = M

5) Country of citizenship = Colombia

6) Native language = Spanish

And this data would be added to the output SQL file as follows:

INSERT IGNORE INTO wikidata (q_code, full_name, first_name, family_name, gender, country, language) VALUES ("Q5993357", "Manuel José Bonnet Locarno", "Manuel", "Bonnet", "M", "Colombia", "Spanish");

The next person found in the WikiData database file(s) would generate a new line in the output SQL file, and so on.
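A line in this format could be produced by a small formatter like the one below. It is only a sketch: the helper names `sql_quote` and `insert_line` are hypothetical, and the escaping assumes MySQL in its default SQL mode, where double quotes delimit strings and the backslash is the escape character.

```python
COLUMNS = ("q_code", "full_name", "first_name", "family_name",
           "gender", "country", "language")


def sql_quote(text):
    """Wrap text in double quotes, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def insert_line(row):
    """Format one row dict as a single INSERT IGNORE statement."""
    values = ", ".join(sql_quote(row[col]) for col in COLUMNS)
    return "INSERT IGNORE INTO wikidata (%s) VALUES (%s);" % (
        ", ".join(COLUMNS), values)
```

Escaping matters here because names in the dump can contain quote characters, which would otherwise break the generated SQL file.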

Please have the script show some kind of progress indication, for example the number of rows or entries in the database and the index of the row currently being analyzed, so that the progress is visible while the script runs.
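One possible shape for the progress indication is a thin wrapper around the entity stream that prints a running count. This is a minimal sketch with a hypothetical name, `with_progress`. The total number of entries is not known up front when streaming, so it reports only the current index and rate; showing a total would need a separate counting pass over the file.

```python
import sys
import time


def with_progress(items, every=100000, out=sys.stderr):
    """Yield items unchanged, printing a running count every `every` items."""
    start = time.time()
    count = 0
    for count, item in enumerate(items, 1):
        if count % every == 0:
            elapsed = max(time.time() - start, 1e-9)
            out.write("\r%d entries read (%.0f/s)" % (count, count / elapsed))
            out.flush()
        yield item
    out.write("\r%d entries read, done\n" % count)
```

Writing the counter to stderr keeps it out of the SQL output if the script prints its INSERT lines to stdout.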

The script must ignore all entries other than persons. Also, if a person's data is missing any of the 6 data fields (full name, given name, family name, gender, country of citizenship, or native language), skip that person.

The purpose of this task is to have a script that, when run, generates a huge SQL file which inserts the person data (name / gender / country / language) into a database.

In your bid, please state which language you would use to write this. A scripting language such as Perl, PHP or Python would be preferred.
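The two filtering rules could be sketched as below. The check for persons assumes that a person is an entity whose "instance of" property (P31 on Wikidata) points at the item Q5 (human). The function names are hypothetical.

```python
INSTANCE_OF = "P31"      # Wikidata property "instance of"
HUMAN_NUMERIC_ID = 5     # numeric part of Q5, the Wikidata item "human"


def is_human(entity):
    """True if any 'instance of' statement of the entity points at Q5."""
    for statement in entity.get("claims", {}).get(INSTANCE_OF, []):
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, dict) and value.get("numeric-id") == HUMAN_NUMERIC_ID:
            return True
    return False


def is_complete(row, required):
    """True only if every required field is present and non-empty."""
    return all(row.get(name) for name in required)
```

Checking `is_human` first is the cheaper order, since most entries in the dump are not people and can be dropped before any field extraction.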

Skills: JSON, Perl, PHP, Python

About the employer:
(615 reviews) Turku, Thailand

Project ID: #17802874

40 freelancers have bid on this job


Hi there, this is a couple of days' worth of work. I can begin immediately. I have a 99% project completion rate and a 4.99 reputation (out of a maximum of 5.0) from more than 1140 projects over a period of 12 years. Thank ...

$777 USD in 10 days
(924 reviews)

Hi there, I can build this script to import the people data into the database. I can make it using PHP. Looking forward to working with you. Thanks, Rinsad

$220 USD in 3 days
(997 reviews)

Hi, I'd like to work with you on this task. I have 15 years of experience with transforming or manipulating very large bodies of data. Also, I am familiar with Wikidata. I can use any of the three languages; among the ...

$200 USD in 1 day
(127 reviews)

Hello sir, I am a qualified Python developer with 8 years of professional experience in web scraping. I can download all the JSON files and make a parser for them. I am interested in this project and can start th ...

$250 USD in 3 days
(82 reviews)

I have worked with huge (> 100 GB) files before, which is why I'm sure you'll be impressed with my work. I can provide you with a Perl or Python script that will parse the JSON and generate an SQL file.

$140 USD in 2 days
(504 reviews)

Greetings. I'm an expert web developer with 8 years of experience. I'm very interested in your project, and I have extensive experience in JSON parsing. I've checked your requirements carefully and feel confident w ...

$111 USD in 2 days
(60 reviews)

Hello, how are you? I have read your description carefully. I have a lot of good experience with JSON and PHP, so I can finish your project quickly and perfectly. I am going to develop it with PHP. If you hire me, I will do my best ...

$155 USD in 3 days
(54 reviews)

Hello, I have a couple of years of experience with Python and about 1.5 years of experience with PHP. I have ideas on how to write a parser for a big JSON file.

$76 USD in 3 days
(273 reviews)

Hi, nice to meet you. I'm a Python expert, so I will use Python to parse your large JSON file. I have written a parsing script for such a JSON file (7 GB) before, so this is no problem and I understand what you mean. Just let me know if you are inter ...

$250 USD in 3 days
(42 reviews)

Hi Sir, I'm a Python programmer. I have created many bots to scrape websites like Alibaba (suppliers data), LinkedIn, Facebook, Trulia (real estate, homes), [login to view URL], Amazon (using API), AliExpress, yellowpages (au, ...

$250 USD in 3 days
(43 reviews)

Hello, I can do this in PHP. I just need to clarify a few things. The script must accept the JSON file(s), then decode the content and output an SQL file with insert statements? Do you have a sample JSON file for referenc ...

$180 USD in 5 days
(27 reviews)

I can do this for you without any problems. I'm an expert in data extraction and performance applications.

$250 USD in 3 days
(24 reviews)

I will use PHP with the JsonDumpReader library. I can do the job in a day. Thank you.

$100 USD in 2 days
(58 reviews)

Hello, my name is Wolfgang Backhaus. I am a software developer and system engineer from Germany. I have read your interesting project offer and want to apply for it. I am a seasoned Perl developer (20+ years) ...

$100 USD in 3 days
(7 reviews)

Hey my dear friend, I will build your script in Perl. I have read your requirement to build a script which parses JSON data, builds a CSV file and stores that same data in a MySQL table, and I am app ...

$222 USD in 1 day
(24 reviews)

I propose to implement a solution in the Perl language. My tentative plan would be as follows: 1) Create a Linux virtual machine (with a large disk, e.g. > 40 GB) for this task, then download and extract the JSON dump. ...

$60 USD in 3 days
(8 reviews)

Hi, I can do it for you. My approach will be: 1) download the Wikidata JSON dump, 2) use wikidata-filter to extract only human data, 3) write a parser of the provided data in Perl to extract the required data and format ...

$150 USD in 5 days
(9 reviews)

Hi dear, I am an expert in parsing JSON and XML. I will provide a great result within your deadline. Regards, Mi.

$250 USD in 3 days
(27 reviews)

I have now read your brief and feel very confident about it. I will simply use PHP or Python to do your job. If you award me the project, I promise a faithful result. Kind regards

$155 USD in 3 days
(10 reviews)

Hello, after reading your offer, this looks like a perfect fit for my skill set. I have built a large number of creative designs and developments for different businesses. My name is Gopal V and I am an Indian web De ...

$155 USD in 3 days
(4 reviews)