
Closed
Posted
I need an experienced Python engineer who works confidently with AWS Glue to build and manage a small suite of data-integration jobs for a Hyderabad-based project. The core of the work is to design and automate Glue ETL pipelines that pull data from our production databases, catalog it accurately, and transform it into analytics-ready tables.

Here is what I expect from the engagement:
• Develop, test, and deploy Glue ETL jobs in Python.
• Populate and maintain the Glue Data Catalog so new tables are discoverable and properly version-tracked.
• Implement efficient transformation logic that cleans, enriches, and partitions data for downstream reporting.
• Optimise job performance and cost by selecting the right worker types, job parameters, and database connections.

Our source systems are relational databases, so experience configuring Glue connections, crawlers, and dynamic frames against JDBC endpoints is essential. Familiarity with ancillary AWS services, such as IAM for fine-grained permissions, S3 for staging, and CloudWatch for logging, will help keep the pipelines rock-solid.

Deliverables will be accepted when:
1. All Glue jobs run from start to finish without manual intervention.
2. Transformed data lands in the specified S3 buckets and matches the target schema.
3. The Data Catalog reflects every table, column, and partition produced by the pipelines.
4. Code is organised in a Git repo with a clear README and parameterisation for dev, test, and prod.

I prefer someone already in Hyderabad so we can schedule occasional in-person whiteboarding sessions, but day-to-day work can remain remote. If this sounds like a good fit, please share examples of previous Glue jobs or ETL code you have delivered.
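The dev/test/prod parameterisation in acceptance criterion 4 can be sketched in plain Python. All connection and bucket names below are hypothetical; in a real Glue job these values would arrive as job arguments (for example via awsglue.utils.getResolvedOptions) rather than a hard-coded dict:

```python
# Minimal sketch of dev/test/prod parameterisation for a Glue job.
# Every name here (connections, buckets) is a made-up placeholder.

BASE = {
    "source_connection": "jdbc-prod-read-replica",  # hypothetical Glue connection name
    "target_bucket": None,                          # filled in per environment
    "worker_type": "G.1X",
    "num_workers": 2,
}

ENV_OVERRIDES = {
    "dev":  {"target_bucket": "s3://analytics-dev/curated/"},
    "test": {"target_bucket": "s3://analytics-test/curated/"},
    "prod": {"target_bucket": "s3://analytics-prod/curated/",
             "worker_type": "G.2X", "num_workers": 10},
}

def resolve_config(env: str) -> dict:
    """Merge the base settings with environment-specific overrides."""
    if env not in ENV_OVERRIDES:
        raise ValueError(f"unknown environment: {env}")
    cfg = dict(BASE)
    cfg.update(ENV_OVERRIDES[env])
    return cfg
```

Keeping one base config plus small per-environment overrides is what makes "runs unchanged in dev, test, and prod" achievable without code edits between deployments.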
Project ID: 40233830
12 proposals
Remote project
Active 14 days ago
Set your budget and timeframe
Get paid for your work
Describe your proposal
Signing up and bidding on jobs is free
12 freelancers are bidding an average of ₹996 INR/hour for this job

Your Glue pipelines will fail silently if you don't configure job bookmarks correctly; I've seen teams lose weeks of data because they assumed incremental loads were working. The second risk is cost: poorly tuned DPU settings can burn through your AWS budget when processing large datasets.

Before architecting the solution, I need clarity on two things. First, what's the daily data volume you're ingesting from your relational sources: are we talking gigabytes or terabytes? Second, do your production databases support CDC, or will we need to implement full-table scans with watermark columns?

Here's the architectural approach:
- AWS GLUE + PYTHON: Build parameterised ETL jobs using dynamic frames with pushdown predicates to minimise data transfer from JDBC sources, reducing job runtime by 50-60%.
- DATA CATALOG + CRAWLERS: Configure partition projection and schema evolution policies so new columns don't break downstream Athena queries when your source schema changes.
- S3 + PARQUET: Implement columnar storage with Snappy compression and partition pruning strategies that cut query costs by 70% compared to raw CSV dumps.
- CLOUDWATCH + SNS: Set up custom metrics and failure alerts so you know within 5 minutes if a job stalls, not after your analysts complain about stale dashboards.
- GIT + CI/CD: Structure the repo with separate config files for dev/test/prod environments and implement automated deployment using AWS CodePipeline.

I've built similar data integration platforms for 4 clients in fintech and healthcare that process 200GB+ daily. I'm based in Hyderabad and available for in-person architecture sessions. Let's schedule a 20-minute call to walk through your current database schema and discuss partition strategies before we commit to the build.
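The pushdown-predicate point above can be made concrete. Glue's `create_dynamic_frame.from_catalog` accepts a `push_down_predicate` string so that non-matching partitions are never even read; a sketch of building that string, assuming Hive-style `year`/`month`/`day` partition columns (the column names are an assumption, not something from the posting):

```python
from datetime import date, timedelta

def daily_partition_predicate(start: date, end: date) -> str:
    """Build a push_down_predicate covering every day in [start, end],
    e.g. "(year='2024' AND month='01' AND day='31') OR (...)".
    Partition column names year/month/day are assumed, not given."""
    clauses = []
    d = start
    while d <= end:
        clauses.append(f"(year='{d:%Y}' AND month='{d:%m}' AND day='{d:%d}')")
        d += timedelta(days=1)
    return " OR ".join(clauses)
```

The resulting string would be passed as the `push_down_predicate` argument when reading from the Data Catalog, which is where the claimed runtime savings come from: partitions outside the window are pruned before any data is scanned.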
₹900 INR in 30 days
5.3

I’m a Python data engineer with 8+ years of experience delivering production-grade ETL pipelines on AWS, including multiple AWS Glue implementations for relational-source integration and analytics lakes. I can design, automate, and optimise your Glue jobs to run reliably and cost-efficiently.

Deliverables:
• Python-based Glue ETL jobs (dev/test/prod parameterised)
• Configured crawlers, JDBC connections, and Data Catalog
• Cleaned, enriched, partitioned datasets in S3
• Monitoring via CloudWatch + automated scheduling
• Complete Git repo with README and deployment steps

Core skills: Python, PySpark, AWS Glue, S3, IAM; relational DB integration (JDBC); data warehousing and ETL optimisation; Git and CI/CD.

Why hire me: Hyderabad-based, a strong communicator, and focused on zero-touch, production-ready pipelines that scale with your analytics needs.
₹1,200 INR in 40 days
3.4

With my 13+ years of experience in the web development domain, I bring a wealth of knowledge to the table when it comes to building scalable and robust applications. In addition, I have extensive experience working with AWS, including AWS Glue. Having worked with various clients on similar projects, I know exactly what it takes to design, build, and automate Glue ETL pipelines effectively.

What sets me apart is not just my technical skills but my strategic approach to problem-solving. While managing your project, I place special emphasis on eliminating technical debt and building sustainable, future-proof systems. My proficiency in Git will ensure that your code is organised in a way that is easy to understand and manage, with the clear structure and environment parameterisation you mentioned.

This project requires meticulous attention to detail and the ability to handle complex database structures; here too, my proficiency with relational databases and Glue connections will come in handy. Additionally, being already based in Hyderabad gives us the advantage of occasional in-person brainstorming sessions while maintaining a remote workflow for day-to-day operations.

I'm really excited about this opportunity to work with you! Please review my portfolio for examples of previous projects with strong ETL components.
₹1,000 INR in 40 days
1.3

Hello, I’m Ankur, a freelance developer with a dedicated team of professionals. I have read all your requirements for your project, and I assure you that I will provide high-quality work delivered on time. Additionally, we provide three months of support from our side.

As a Full Stack Developer, I specialize in web and app development, with a portfolio of projects featuring top-notch UI/UX design. My expertise spans Flutter (for both Android and iOS), PHP, and WordPress, and I bring over 7 years of experience to the table. Whether it’s websites, applications, or e-commerce platforms, I’ve got you covered.

But I’m not limited to just coding. My skill set extends to graphic design and logo creation, offering you a one-stop solution for all your project needs. With a track record of over 500 completed projects, I am committed to delivering nothing short of excellence. My ultimate goal is your complete satisfaction.

Thank you for considering me for your project. I’m ready to transform your vision into a reality that stands out in today’s competitive landscape.

Best regards,
Ankur Hardiya
₹1,000 INR in 40 days
0.2

As a Hyderabad-based AWS Glue Python developer with several years of experience, I would be an ideal candidate for your project. Having worked with AWS Glue extensively, I have built and managed numerous ETL pipelines that closely align with your described needs, enabling smooth data integration from various databases into analytics-ready tables. My previous projects required the same skills you’re looking for: developing ETL jobs, managing Data Catalogs effectively, implementing efficient data transformation logic, and optimizing performance to minimize costs.

In addition to my expertise in AWS Glue, I am highly proficient with related AWS services such as IAM, S3, and CloudWatch. This understanding allows me to build robust pipelines by leveraging the full potential of all relevant services. To ensure the pipelines are highly reliable and require minimal manual intervention, I design well-parameterized code structures that can be easily deployed across different environments while maintaining clear documentation.

I also believe effective communication is essential to successful project completion. While I value occasional in-person whiteboarding sessions for deeper understanding and brainstorming, I’m comfortable executing day-to-day tasks remotely as well. Finally, to demonstrate my suitability for this role, I will gladly provide examples of previous Glue jobs or ETL code that I have successfully delivered.
₹750 INR in 45 days
0.0

As an experienced software engineer with 20 years in the field, I have the knowledge and proficiency to be an ideal candidate for your AWS Glue Python project. Having worked extensively with Amazon Web Services and Python, I am fully adept at using these tools to design and automate Glue ETL pipelines. Furthermore, my advanced understanding of relational databases and their integration aligns perfectly with your project's requirements. I bring specialized expertise in configuring AWS Glue connections, crawlers, and dynamic frames against JDBC endpoints, which is key to the functionality of your project. My cataloguing skills ensure proper version tracking for quick data discoverability, and I guarantee high-quality performance through optimal job parameterization.
₹1,000 INR in 40 days
0.0

With 4+ years in IT across cloud, DevOps, and AWS-based data workflows, we design and manage scalable ETL pipelines using Python and infrastructure best practices. We focus on automation, cost optimization, and production-ready architecture. For your AWS Glue engagement:

Glue ETL development
• Design, build, and deploy Glue jobs in Python
• Use DynamicFrames + Spark transformations
• Parameterized jobs for dev/test/prod environments
• Automated execution (no manual triggers required)

Data Catalog & schema management
• Configure crawlers for JDBC sources
• Maintain accurate Glue Data Catalog entries
• Partitioned datasets aligned with analytics needs
• Version-aware schema handling

Data processing & optimization
• Clean, enrich, and partition relational data
• Optimize worker types, DPUs, and job parameters
• Efficient S3 staging architecture
• Cost-aware pipeline tuning

AWS integration
• JDBC connections to relational DBs
• IAM role configuration (least privilege)
• S3 structured landing zones
• CloudWatch logging & monitoring

Delivery standards
• End-to-end tested pipelines
• Data validated against target schema
• Clean Git repo with README & env configs

We are comfortable collaborating remotely with structured communication and can align with Hyderabad-based schedules as required. Ready to share relevant ETL architecture examples and begin pipeline design.
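"Optimize worker types, DPUs, and job parameters" is where Glue cost tuning actually lives, and the arithmetic is simple enough to sketch: each worker type maps to a fixed number of DPUs, and billing is per DPU-hour. The hourly rate varies by region, so it is left as a parameter here rather than a hard-coded figure; this estimator also deliberately ignores billing minimums:

```python
# DPUs per worker for Glue worker types (per AWS documentation).
WORKER_DPU = {"G.025X": 0.25, "G.1X": 1.0, "G.2X": 2.0, "G.4X": 4.0, "G.8X": 8.0}

def estimate_job_cost(worker_type: str, num_workers: int,
                      runtime_minutes: float, rate_per_dpu_hour: float) -> float:
    """Approximate one run's cost: DPUs-per-worker x workers x hours x rate.
    A rough sketch only: ignores per-run billing minimums and overhead."""
    dpu_hours = WORKER_DPU[worker_type] * num_workers * runtime_minutes / 60
    return dpu_hours * rate_per_dpu_hour
```

For example, 10 G.2X workers running 30 minutes at a hypothetical $0.44/DPU-hour would be 2 × 10 × 0.5 × 0.44 = $4.40 per run; doubling the workers to halve the runtime leaves the cost roughly unchanged, which is why right-sizing matters more than raw worker count.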
₹750 INR in 40 days
0.0

I bring extensive experience building production-grade AWS Glue ETL pipelines using Python for data integration projects. I can design and implement the complete Glue infrastructure you need for your Hyderabad-based project.

My approach:
• Design automated Glue ETL jobs with proper error handling and retry logic
• Configure the Data Catalog with crawlers for automatic schema discovery and version tracking
• Implement transformation logic using PySpark for efficient data cleansing, enrichment, and partitioning
• Optimize worker types and job parameters for cost-effective performance
• Set up connections to relational databases using JDBC with proper authentication

Technical stack:
• Python/PySpark for Glue job development
• Git for version control with comprehensive documentation
• IAM roles and S3 bucket policies for fine-grained permissions
• CloudWatch integration for monitoring pipeline health and performance metrics
• DynamoDB/RDS experience for endpoint configuration

Deliverables:
• Well-structured Git repository with parameterized code for dev/test/prod environments
• Complete Data Catalog reflecting all table schemas with proper partitioning
• Automated Glue jobs running end-to-end without manual intervention
• Clear README and deployment documentation

Available for remote collaboration with flexibility for occasional in-person sessions in Hyderabad. Can start immediately.
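The "error handling and retry logic" point above can be sketched as a generic exponential-backoff wrapper. Nothing here is Glue-specific (Glue also has its own job-level MaxRetries setting); the injected sleep function is just what keeps the sketch testable:

```python
import time

def with_retries(fn, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(); on failure, wait base_delay * 2**(attempt-1) seconds
    before retrying, and re-raise once max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

In a pipeline this would wrap transient-failure-prone steps such as a JDBC read or an S3 write, while genuinely fatal errors still surface after the final attempt.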
₹1,250 INR in 40 days
0.0

Hi, I reviewed your requirement for an AWS Glue Python developer. I’m a Python backend developer with hands-on experience building data pipelines, automation scripts, and Flask-based backend systems. I’ve worked with Python ETL workflows, SQL databases, and API integrations, and I’m comfortable writing clean Glue jobs using PySpark/Python.

I recently built backend systems involving data processing, automation, and database integration, and I can help you with:
• AWS Glue ETL jobs using Python
• Data transformation and validation
• Integration with S3 / databases
• Error handling and logging
• Clean, maintainable code

I’m available to start immediately. Could you please confirm:
1. What data sources are involved (S3, RDS, etc.)?
2. Is this a one-time job or ongoing support?

Looking forward to working with you.
Thanks, Abdul Wahab
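The posting's acceptance criteria require that transformed data "matches the target schema", which is what the "data transformation and validation" item above would have to deliver. A minimal sketch of such a check, with an entirely made-up example schema (column names and types are assumptions, not from the posting):

```python
# Hypothetical target schema: column name -> required Python type.
TARGET_SCHEMA = {"order_id": int, "amount": float, "region": str}

def invalid_rows(rows, schema=TARGET_SCHEMA):
    """Return the indices of rows that fail the schema check, either
    because their column set differs or because a value has the
    wrong type. Rows are plain dicts in this sketch."""
    bad = []
    for i, row in enumerate(rows):
        if set(row) != set(schema):
            bad.append(i)
        elif any(not isinstance(row[col], typ) for col, typ in schema.items()):
            bad.append(i)
    return bad
```

Running a check like this after transformation and before the S3 write is what turns "matches the target schema" from a hope into a gate the job can fail on.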
₹750 INR in 40 days
0.0

Hello,

Your requirement aligns closely with the AWS Glue–based data engineering solutions I’ve been working on recently, especially around building automated ETL pipelines from relational databases into analytics-ready S3 data lakes. I can help you design and deploy Glue jobs that:
• Extract data efficiently from JDBC sources (MySQL/PostgreSQL/SQL Server) using optimized connections
• Transform and enrich datasets using Glue DynamicFrames and PySpark logic
• Partition and store curated data in S3 for downstream analytics
• Maintain accurate Glue Data Catalog entries with crawlers and schema versioning
• Implement IAM roles and CloudWatch monitoring for secure and reliable execution

For performance and cost optimization, I typically configure the right worker types, job bookmarks, push-down predicates, and partition strategies so jobs run faster while controlling AWS costs.

Deliverables will include:
• Fully automated Glue pipelines (dev/test/prod parameterized)
• Data landing in target S3 buckets matching the defined schema
• Updated Glue Catalog with partitions and metadata
• Git repository with a clean code structure and README documentation
• Deployment and testing support

Although I am flexible with remote collaboration, I’m open to occasional on-site sessions if required. Let’s connect to discuss your data sources and expected data volume so I can propose the optimal architecture.

Best regards,
Sachin Kumar
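Where job bookmarks are not an option for a given source, the watermark-column approach mentioned earlier in this listing can be sketched as a query builder. Table and column names below are illustrative, and a real implementation would persist the watermark between runs and use bound parameters instead of string interpolation:

```python
def incremental_query(table: str, watermark_col: str, last_value) -> str:
    """Build an extract query pulling only rows newer than the last
    watermark seen; a None watermark means a full initial load."""
    if last_value is None:
        return f"SELECT * FROM {table}"
    # NOTE: interpolation is for illustration only; production code
    # should use bound parameters to avoid SQL injection.
    return (f"SELECT * FROM {table} "
            f"WHERE {watermark_col} > '{last_value}' "
            f"ORDER BY {watermark_col}")
```

After each successful run the job would store the maximum watermark value it saw (for instance in S3 or a parameter store) and feed it back in as `last_value` on the next run, which is exactly the bookkeeping job bookmarks do natively for supported sources.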
₹1,100 INR in 40 days
0.0

Hi, I have 4+ years of experience with AWS Glue and related Amazon services, Python, SQL, and PySpark, and I also know Git. I am currently in Nagpur, but if you want to meet up I can travel, as it is only a 7-8 hour trip. I am looking for good opportunities and problem statements to work on; I miss the thrill of building big data processing and optimization pipelines. I have hands-on experience processing 50 GB to 100 GB of data.

Regards,
Akshay Mokalwar
₹1,000 INR in 40 days
0.0

Ranchi, India
Joined January 4, 2026
₹750-1250 INR/hour