Processing 15 GB text file using GCP Dataproc with Spark (fixed width files)

Closed, posted 1 year ago, paid on delivery

₹1500-12500 INR

I have PySpark code that processes fixed-width files on a GCP Dataproc cluster. It works for small files, but when I read a 15 GB gzip-compressed text file, saving or loading the data into a BigQuery table takes a very long time, and I have been unable to fix this. I need someone to identify the root cause and resolve the issue.
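One possible cause, offered as a sketch rather than a diagnosis of this particular job: gzip is not a splittable format, so Spark reads a single .gz file in one task and the DataFrame starts out as a single partition. Unless it is repartitioned, the fixed-width parsing and the BigQuery write then run on one core as well. The example below assumes a hypothetical three-column layout (`LAYOUT`); the helper names, the bucket, and the table name are illustrative and not taken from the original code. The BigQuery write assumes the spark-bigquery connector is available on the cluster.

```python
from typing import Dict, List, Tuple

# Hypothetical fixed-width layout: (column name, width in characters).
LAYOUT: List[Tuple[str, int]] = [("id", 10), ("name", 20), ("amount", 12)]


def layout_to_slices(layout: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Turn (name, width) pairs into (name, 1-based start, width) triples."""
    slices = []
    start = 1  # Spark's substring() uses 1-based positions
    for name, width in layout:
        slices.append((name, start, width))
        start += width
    return slices


def parse_line(line: str, layout: List[Tuple[str, int]]) -> Dict[str, str]:
    """Pure-Python reference parser for checking the layout on a sample line."""
    return {
        name: line[start - 1:start - 1 + width].strip()
        for name, start, width in layout_to_slices(layout)
    }


def load_fixed_width(spark, path: str, layout, num_partitions: int):
    """Read a gzip text file, spread the rows out, then slice the columns."""
    # Imported lazily so the helpers above can be used without Spark installed.
    from pyspark.sql import functions as F

    raw = spark.read.text(path)            # one .gz file -> one partition
    raw = raw.repartition(num_partitions)  # redistribute before parsing
    cols = [
        F.trim(F.substring("value", start, width)).alias(name)
        for name, start, width in layout_to_slices(layout)
    ]
    return raw.select(*cols)


def write_to_bigquery(df, table: str, temp_bucket: str) -> None:
    """Append to BigQuery through the spark-bigquery connector (assumed installed)."""
    (df.write.format("bigquery")
        .option("temporaryGcsBucket", temp_bucket)  # staging bucket, hypothetical name
        .mode("append")
        .save(table))
```

A call might look like `write_to_bigquery(load_fixed_width(spark, "gs://my-bucket/input.txt.gz", LAYOUT, 200), "my_dataset.my_table", "my-temp-bucket")`, where every name and the partition count are placeholders. Repartitioning does not speed up the decompression itself, which still runs in a single task; it only parallelizes the parsing and the write. If the read remains the bottleneck, decompressing the file or splitting it into several smaller files before the Spark job is worth considering.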

Google Cloud Platform, PySpark

Project ID: #35815330

About the project

1 proposal, remote project, active 1 year ago

1 freelancer has bid on this job

sriharivaila2000

I have posted the solution below, but please let me know if you want me to solve your problem. There could be several reasons why your PySpark code is taking a long time to process a 15 GB gzip file on a Dataproc cluster...

₹3000 INR in 3 days

(0 reviews)

0.0
