Closed

Create a PySpark cluster and execute a job

Hi,

We have ~1,000 gz files containing JSON objects of ~6 KB each; the average file is 250 MB compressed and about 2.5 GB unzipped. In total there are approximately 700M-1B JSON objects that we have to move to S3 and MongoDB.

We need to set up a PySpark processing pipeline that will process these files and move the data.
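A minimal sketch of what such a pipeline could look like, assuming Spark 3.x (e.g. on EMR) with the MongoDB Spark connector 10.x on the classpath; the bucket paths, Mongo URI, and Parquet output format are placeholder assumptions, not requirements stated in the posting.

    # Sketch only: bucket names and the Mongo URI below are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("gz-json-to-s3-and-mongo")
        # Placeholder connection string; includes target database.collection
        # as expected by the MongoDB Spark connector 10.x.
        .config("spark.mongodb.write.connection.uri",
                "mongodb://user:pass@host:27017/mydb.myCollection")
        .getOrCreate()
    )

    # Spark decompresses .gz transparently, but each gzip file is a single
    # non-splittable partition, so ~1,000 files yield ~1,000 input tasks.
    df = spark.read.json("s3a://source-bucket/raw/*.gz")

    # Land a queryable copy on S3 (Parquet is a common choice at this scale).
    df.write.mode("overwrite").parquet("s3a://dest-bucket/processed/")

    # Write the same records to MongoDB via the connector's "mongodb" format.
    df.write.format("mongodb").mode("append").save()

Because gzip is not splittable, parallelism is capped at the file count until the data is re-read from the Parquet copy, which is one reason to write S3 first and feed Mongo from there in a heavier variant of this job.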

Skills: Python, PySpark, ETL, Data Processing, Data Extraction


About the employer:
(41 reviews) Mumbai, India

Project ID: #32291933

1 freelancer is bidding on average ₹2500 INR for this job

salvatemarty

I've worked for 2 years as part of the team in charge of developing, deploying, and supporting a prospecting-solution project, working fully on Amazon EMR clusters written in Python/PySpark. The pipeline had a scheduled ETL …

₹2500 INR in 7 days
(0 reviews)
0.0