
Closed
Published
Paid on delivery
I want a complete, production-style data engineering project built on Azure and Databricks that I can showcase as a real-world reference, preferably in the healthcare or retail domain. The solution should ingest public datasets at scale (think TPC transaction benchmarks for retail, Synthea synthetic EHR data for healthcare, or a clever blend of both) and drive two to three meaningful business use cases from raw landing all the way to presentation. I am purposely leaving the exact scenarios open so you can propose what will demonstrate the most insight, but they must be substantial enough to feel like something a modern data team would support in production.

Data quality is my top priority. I need lineage, validation rules, automated tests, and observable metrics baked in from day one. Great Expectations, Delta Live Tables expectations, or comparable frameworks are welcome, as long as quality gates are visible in the monitoring layer.

Scope to cover:
• Architecture design diagram with clear component rationale (Azure Data Lake, Databricks, Delta, Unity Catalog, etc.).
• Reproducible code (Python/PySpark, notebooks or repos) with CI/CD instructions.
• Ingestion pipelines (batch or streaming), curated layers, and a serving tier (SQL endpoints, Power BI, or dashboards of your choice).
• Integrated monitoring, alerting, and cost-aware observability using native Azure tools or open-source add-ons.
• End-to-end test suite: unit, integration, and data quality tests triggered via pipelines.
• Comprehensive markdown documentation that walks through setup, architecture, and business logic.

Acceptance criteria:
1. Pipelines run on my Azure subscription with a simple deploy script.
2. Data quality reports surface failed expectations in a dashboard or Log Analytics workspace.
3. Each business use case produces a consumable output (table, visualization, or API) that confirms value.
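The brief's core requirement, quality gates whose pass/fail results are visible to a monitoring layer, can be sketched framework-agnostically. Below is a minimal, hypothetical Python example; the rule names, row fields, and report shape are illustrative assumptions, not tied to Great Expectations or Delta Live Tables APIs:

```python
# Framework-agnostic sketch of a "quality gate": each rule is a named
# predicate, and the gate emits a pass/fail report that a monitoring
# layer (dashboard, Log Analytics, etc.) could ingest. All names here
# are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Expectation:
    name: str
    predicate: Callable[[Dict[str, Any]], bool]


def run_quality_gate(rows: List[Dict[str, Any]],
                     expectations: List[Expectation]) -> Dict[str, Dict[str, int]]:
    """Evaluate every expectation over every row; return counts per rule."""
    report = {}
    for exp in expectations:
        failed = sum(1 for r in rows if not exp.predicate(r))
        report[exp.name] = {
            "total": len(rows),
            "failed": failed,
            "passed": len(rows) - failed,
        }
    return report


rows = [
    {"patient_id": "p1", "age": 42},
    {"patient_id": None, "age": 30},
    {"patient_id": "p3", "age": -5},
]
expectations = [
    Expectation("patient_id_not_null", lambda r: r["patient_id"] is not None),
    Expectation("age_non_negative", lambda r: r["age"] >= 0),
]
report = run_quality_gate(rows, expectations)
```

In a real pipeline the report dict would be written to a metrics table or pushed to the monitoring layer, which is what makes failed expectations "surface" per acceptance criterion 2.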
Project ID: 40331378
59 proposals
Remote project
Active 12 days ago
59 freelancers are bidding on average $544 USD for this project

Hello, With a comprehensive understanding of Azure and Databricks, my team at Live Experts® LLC is well-equipped to bring your vision for an end-to-end data engineering solution to life. Our emphasis on data quality and reliable automation aligns perfectly with your requirements. We're adept at integrating frameworks like Great Expectations and Delta Live Tables, ensuring transparent quality checks throughout the process, a priority you've made clear. Our skills in Python/PySpark, SQL, Power BI, and Azure, coupled with our extensive experience in designing architecture diagrams and building reproducible code pipelines, further reinforce our suitability for your project. We understand that effective monitoring and alerting is crucial; thus, we commit to implementing a cost-efficient monitoring layer using native Azure tools or open-source add-ons. At the heart of our service is not just technically robust output; it's also the clear communication surrounding the work we deliver. Our expertise in documentation will ensure that, from repo instructions to markdown that walks you through setup, architecture, and business logic, you'll have a complete understanding of all aspects of the project. We look forward to deploying pipelines on your Azure subscription through simple deploy scripts, with quality reports clearly visible in a dashboard or Log Analytics workspace. Thanks!
$750 USD in 5 days
6.8

Hello, can we discuss your Azure Databricks data engineering project? I have built a system where raw event streams looked fine but silently broke metrics until we enforced schema evolution, late-arriving-data handling, and expectation-based gates across Delta layers. I'd approach yours with similar production thinking using PySpark and Unity Catalog. Do you want lineage at column level or table level only? How should failed data be quarantined vs. retried? Will Power BI read from gold or a serving layer? Best regards, Devendra S.
$750 USD in 14 days
6.5

Hi, I can build a production-style Azure + Databricks data engineering project that showcases real-world pipeline design, strong data quality controls, and business-ready outputs in healthcare, retail, or a blended domain. A common weakness in portfolio data projects is that they stop at ETL and dashboards, without lineage, quality gates, observability, or deployable architecture that resembles an actual production environment. I solve this by designing a medallion architecture on Azure Data Lake and Databricks with Delta tables, Unity Catalog governance, automated validation rules, pipeline tests, and monitored orchestration from ingestion through serving. The solution can include public datasets such as Synthea or retail benchmark data, then drive 2–3 meaningful use cases like patient readmission risk trends, care utilization analytics, demand forecasting, basket analysis, or inventory-revenue intelligence. I will structure the project with reproducible PySpark code, CI/CD-ready deployment instructions, observable quality metrics, and serving outputs through SQL endpoints, Power BI, or dashboard layers. My experience with Azure services, Databricks workflows, Delta Lake, data testing frameworks, and analytics engineering allows me to deliver something that feels like a modern data platform rather than a demo notebook. Thanks, Hercules
$500 USD in 7 days
6.2

With my expertise in SQL, Azure, Business Intelligence, Documentation, and Data Architecture, I am well-equipped to deliver the Azure and Databricks End-to-End Pipeline project. I am confident in my ability to meet your data quality and business use case requirements. The budget can be adjusted as per the project scope. Let's discuss further details to ensure a successful outcome. Please review my profile showcasing 15 years of experience. I am eager to start and demonstrate my commitment to this project. Let's collaborate for success.
$525 USD in 10 days
5.3

Hi muralikarthikk, Just last week I completed a similar task successfully, so I can get started on this without any ramp-up time.

Two questions:
1) Domain and scale: should we prioritize healthcare (Synthea) or retail (TPC-DS/TPCx-BB), and what scale factor/refresh (batch vs streaming) do you want?
2) Tooling: GitHub Actions or Azure DevOps for CI/CD, and Terraform for infra with Microsoft Purview for lineage?

Suggestions:
1) Use a medallion lakehouse with Delta Live Tables + Auto Loader and Unity Catalog; enforce data quality via DLT expectations and Great Expectations; publish lineage to Purview.
2) Improve cost/perf with Photon clusters, Delta OPTIMIZE/ZORDER + Change Data Feed for incremental upserts; serve on Serverless SQL Warehouses; add cluster policies and cost budgets.

Action Plan:
- Phase 1: Design: diagram, rationale, choose domain and 2–3 use cases.
- Phase 2: IaC: Terraform for ADLS Gen2, Databricks, Unity Catalog, Purview; secrets/policies.
- Phase 3: Pipelines: DLT + Auto Loader Bronze/Silver/Gold; GE suites; pytest unit/integration.
- Phase 4: Serving: SQL endpoints + Power BI/DBSQL dashboards; deliver outputs per use case.
- Phase 5: Observability & Release: DLT/GE metrics to Log Analytics with alerts; CI/CD pipeline; one-click deploy script; docs.

Best Regards, Sid
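The pytest unit tests this plan mentions for Phase 3 typically exercise small, deterministic transformation functions with no cluster attached. A hedged sketch; the function, column names, and gold-layer metric are hypothetical, not part of any bid's actual codebase:

```python
# Illustrative unit test for a silver-to-gold transformation: pure
# Python data in, deterministic aggregate out, so it runs in CI
# without Spark. All names are hypothetical.
def to_gold_revenue(rows):
    """Aggregate line items into per-store revenue (qty * price)."""
    totals = {}
    for r in rows:
        totals[r["store"]] = totals.get(r["store"], 0.0) + r["qty"] * r["price"]
    return totals


def test_to_gold_revenue():
    rows = [
        {"store": "A", "qty": 2, "price": 3.0},
        {"store": "A", "qty": 1, "price": 4.0},
        {"store": "B", "qty": 5, "price": 1.0},
    ]
    assert to_gold_revenue(rows) == {"A": 10.0, "B": 5.0}


test_to_gold_revenue()
```

In a real repo the same logic would be factored out of the notebook or DLT pipeline so pytest can import and test it directly, which is what makes "pytest unit/integration" in Phase 3 feasible.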
$750 USD in 5 days
5.2

Hello, I can build a full end-to-end data engineering pipeline on Azure and Databricks tailored to your domain of choice—healthcare with synthetic EHR data (Synthea) or retail with TPC benchmark datasets. The solution will cover ingestion, curation, and serving layers with batch or streaming pipelines, reproducible Python/PySpark code, and CI/CD deployment scripts. Data quality will be fully integrated using Delta Live Tables or Great Expectations, with validation rules, lineage tracking, and observable metrics surfaced through dashboards or Azure Log Analytics. Each business use case will produce actionable outputs via SQL endpoints, Power BI, or custom dashboards. Complete documentation and architecture diagrams will accompany the deliverables, ensuring reproducibility and clarity for demonstration purposes. Thanks, Asif
$750 USD in 10 days
5.0

Hello Sir, I am certified in Azure Databricks. I can work as mentioned and have 8 years of experience in the same field. Let's connect.
$250 USD in 2 days
4.3

Hi, I’m excited about the opportunity to help transform your operations from manual to marvel using AI. I specialize in AI automation and have hands-on experience building agentic AI solutions, including social media content generation agents and cybersecurity auto-agents, among others. I’ve helped businesses streamline marketing, sales, events, and client service processes, turning repetitive workflows into intelligent, self-managing systems. I take a collaborative approach, working closely with leadership to identify automation opportunities and implement scalable solutions that deliver real impact. I’d love to bring this expertise to your team and help you elevate efficiency across all areas of your business. Looking forward to discussing how we can make your operations smarter and smoother. Best regards, Tony
$500 USD in 7 days
4.4

Hello, I understand your need for a robust data engineering project showcasing real-world applications in the healthcare or retail domain using Azure and Databricks. The primary goal is to drive meaningful business use cases from raw data to presentation while prioritizing data quality through lineage, validation rules, automated tests, and observable metrics. My approach involves designing a comprehensive architecture utilizing Azure Data Lake, Databricks, Delta, Unity Catalog, and other relevant components. I will develop reproducible Python/PySpark code with CI/CD instructions, ingestion pipelines, curated layers, serving tiers, monitoring tools, and end-to-end testing suites to ensure quality and reliability. I am ready to start immediately and would like to discuss further details regarding the scope, timeline, and expectations to deliver a high-quality end-to-end pipeline on your Azure subscription. Best regards, Justin
$500 USD in 7 days
4.0

Hello, With over 7 years of experience in Power BI and SQL, I have carefully reviewed your project requirements. I am well-equipped to handle the Azure and Databricks End-to-End Pipeline project efficiently. To ensure the successful completion of the project, I propose to design a robust architecture utilizing Azure Data Lake, Databricks, Delta, Unity Catalog, and other essential components. The solution will include reproducible Python/PySpark code with CI/CD instructions, ingestion pipelines, curated layers, serving tiers, monitoring, alerting, and observability features. I will also implement end-to-end testing suites for data quality assurance. My approach will focus on maintaining data quality through lineage, validation rules, and automated tests using frameworks like Great Expectations and Delta Live Tables. The solution will be well-documented with a comprehensive markdown guide. I would like to discuss the project further with you. Please connect with me via chat to explore this opportunity in detail. You can visit my profile at https://www.freelancer.com/u/HiraMahmood4072 Thank you.
$275 USD in 2 days
4.1

Hello, I went through your project description and it seems that I am a great fit for this job. I have an expert team with many years of experience in SQL, Azure, Business Intelligence, Documentation, Data Architecture, Power BI, PySpark, Automation, and CI/CD. Let's connect in chat so that we can discuss further. Thank You
$500 USD in 7 days
3.6

Hey, I liked your project "Azure and Databricks End-to-End Pipeline" and believe I can help you with it. With my background in SQL, Azure, and Business Intelligence, I'm confident I can meet your requirements. I'd be glad to go over specifics if you're interested.
$750 USD in 7 days
3.3

Your data quality requirement is the hardest part of this build; most showcase projects skip validation entirely and break silently in production. If your Delta Live Tables expectations aren't surfaced in a centralized monitoring layer, you'll never catch schema drift before it corrupts downstream dashboards.

Before architecting the solution, I need clarity on two constraints: Are you planning to run this continuously or tear down resources between demos? That changes whether I implement Azure Synapse serverless pools versus dedicated SQL endpoints for cost optimization. What's your Unity Catalog maturity: do you already have workspace federation configured, or should I design this assuming a single-workspace deployment with an external metastore?

Here's the architectural approach:
- AZURE DATA LAKE + DELTA: Implement medallion architecture (bronze/silver/gold) with Delta time travel enabled so you can audit every transformation and roll back bad loads without losing history.
- DATABRICKS WORKFLOWS + DLT: Build Delta Live Tables pipelines with inline expectations that auto-quarantine bad records and publish quality metrics to Azure Monitor; no silent failures.
- GREAT EXPECTATIONS + PYTEST: Layer unit tests for transformation logic and integration tests that validate end-to-end lineage from landing zone to Power BI semantic models.
- TERRAFORM + AZURE DEVOPS: Provision the entire stack (storage accounts, Databricks workspace, service principals, RBAC) via IaC with a single pipeline run; no manual clicking through portals.
- POWER BI + DATABRICKS SQL: Expose gold tables through SQL endpoints with row-level security and incremental refresh configured so dashboards stay under 3-second load times at scale.

I've built 4 similar reference architectures for clients migrating legacy ETL to modern lakehouse patterns: two in healthcare (FHIR ingestion with HIPAA audit trails) and one retail demand forecasting system processing 50M transactions daily.

Let's schedule a 20-minute call to align on which business scenarios will showcase the most architectural depth for your portfolio.
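The quarantine pattern this bid describes (routing failed records to a side table rather than dropping them or letting them poison downstream layers) can be illustrated in plain Python. The record shape and validity rule below are assumptions for illustration, not DLT's actual API:

```python
# Sketch of quarantine routing: valid records continue to the next
# layer, invalid ones are set aside for inspection and reprocessing.
# Record fields and the validity rule are hypothetical.
def split_quarantine(records, is_valid):
    """Partition records into (clean, quarantined) lists."""
    clean, quarantined = [], []
    for rec in records:
        (clean if is_valid(rec) else quarantined).append(rec)
    return clean, quarantined


records = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},  # fails the not-null rule below
]
clean, quarantined = split_quarantine(
    records, lambda r: r["amount"] is not None
)
```

In DLT the equivalent behavior comes from expectations with a drop-or-fail action plus a separate flow capturing the rejects; the point of the pattern is that quarantined rows stay queryable for root-cause analysis instead of vanishing.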
$450 USD in 10 days
4.4

Hi, My Execution Plan:
- The Healthcare 'Sovereign' Medallion: I will implement an end-to-end Synthea EHR pipeline. We will move from raw JSON/CSV (Bronze) to cleansed, patient-centric tables (Silver), and finally to 'Value-Based Care' metrics (Gold) using Delta Live Tables.
- Integrated Quality Gates: I will implement a dual-layer validation strategy, using DLT Expectations for real-time schema enforcement and Great Expectations for deep-dive profiling, ensuring all failed records are quarantined rather than poisoning your Gold layer.
- Unity Catalog Governance: The project will be built with Unity Catalog at its core, showcasing fine-grained access control, data lineage, and a centralized audit log.
- Infrastructure as Code (IaC): I will provide a Terraform-based deployment script that stands up the entire Azure environment (ADLS, Databricks, SQL Endpoints) in a single command, ensuring 100% reproducibility.
- Observability Dashboard: A native Databricks SQL/Power BI dashboard that surfaces data quality metrics (success/fail rates) and pipeline latency.

Milestones:
1. Environment & Ingestion: Terraform deployment and Bronze-layer ingestion of Synthea data.
2. The Quality Engine: Implementation of DLT Expectations and Silver/Gold transformations.
3. Observability & Handover: Finalizing the DQ dashboards, CI/CD instructions, and the walkthrough documentation.

Regards, Nguyen
$1,000 USD in 7 days
3.4

I will design and deliver a production-grade Azure + Databricks data engineering solution with scalable pipelines, Delta architecture, and robust data quality using Great Expectations/DLT, ensuring full lineage, testing, and observability. The project will include end-to-end ingestion, curated layers, business use cases with dashboards/API outputs, CI/CD setup, and clear documentation so it runs seamlessly in your Azure environment as a strong real-world showcase.
$500 USD in 7 days
3.1

Hi, I’m Karthik with 15+ years of experience in data engineering, Azure, and Databricks, and I can build a production-grade, end-to-end pipeline that’s portfolio-ready and mirrors real enterprise systems.

**Proposed Solution (Healthcare/Retail):**
✔ Ingest Synthea EHR or TPC retail datasets into Azure Data Lake (Bronze)
✔ Transform using Databricks + Delta Lake (Silver/Gold layers)
✔ Govern via Unity Catalog + lineage tracking

**Business Use Cases (example):**
• Patient risk prediction / readmission analytics
• Sales forecasting & inventory optimization
• KPI dashboards for operational insights

**Data Quality & Observability:**
✔ Great Expectations / Delta Live Tables expectations
✔ Automated validation rules + lineage tracking
✔ Monitoring via Azure Monitor + Log Analytics
✔ Cost-aware metrics and alerts

**Engineering Stack:**
✔ PySpark / Python pipelines (modular, reusable)
✔ CI/CD (Azure DevOps/GitHub Actions)
✔ SQL endpoints + Power BI dashboards

**Deliverables:**
• Architecture diagram with rationale
• Reproducible codebase + deploy scripts
• End-to-end pipelines (batch/stream)
• Data quality dashboards & alerts
• Full documentation + test suite

I focus on real-world design, scalability, and clean engineering practices. **Timeline:** ~2–3 weeks. Let’s build a showcase-ready, enterprise-grade pipeline.

Warm Regards, Karthik
Resonite Technologies
$800 USD in 7 days
4.1

Hello. I am highly experienced in building end-to-end data engineering pipelines on Azure and Databricks. In my previous projects, I’ve ingested large-scale public datasets like healthcare EHR data and retail benchmarks, similar to the scenarios you're proposing. My focus is on creating robust, production-grade pipelines with solid data quality controls, ensuring each stage of the pipeline is optimized and observable. For your project, I will design the architecture, provide reproducible code (Python/PySpark), and ensure the pipeline integrates well with Azure tools like Data Lake, Delta, and Unity Catalog. I will implement automated tests and monitoring to ensure that data quality is maintained throughout the pipeline, and the final output (visualizations, dashboards, or APIs) is meaningful for the business use case. My question: Are there any specific datasets you’d like me to work with, or should I propose a dataset that aligns with your business use case? Timeline: I estimate 2 weeks for completion, including setup, code deployment, and testing. Budget: $500 USD. Let’s build a production-ready solution that you can showcase with confidence! Thanks, Manish
$500 USD in 10 days
1.7

Hi, I understand you need a production-grade Azure + Databricks data engineering project with strong data quality, observability, and real business value, not a demo.

Proposed Use Cases (Healthcare + Retail):
- Patient cost & readmission risk analytics (Synthea EHR)
- Pharmacy demand forecasting (retail + prescriptions)
- Fraud/anomaly detection (claims + transactions)

Architecture:
- Azure Data Lake Gen2 (Bronze/Silver/Gold)
- Databricks (Delta Lake + Delta Live Tables)
- Unity Catalog (governance + lineage)
- Orchestration via Databricks Jobs / Data Factory
- Serving via Databricks SQL + Power BI

Data Quality (Core Focus):
- Great Expectations + DLT expectations
- Schema checks, freshness, nulls, duplicates
- Pipeline fails on violations
- Reports in Azure Monitor / Log Analytics

Pipelines:
- Batch + optional streaming ingestion
- Medallion architecture
- Modular PySpark code

CI/CD & Testing:
- Git-based repo + deploy scripts
- Unit, integration, and data tests in pipelines

Deliverables:
- Architecture diagram + rationale
- Reproducible code
- One-click deploy on your Azure
- Dashboards with business outputs
- Data quality monitoring
- Full documentation

This will be a real-world, portfolio-ready data platform. Best Regards, JP
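Two of the data quality checks this bid lists, freshness and duplicates, reduce to small, testable functions. A minimal sketch; the threshold, key column, and timestamps are hypothetical:

```python
# Sketch of freshness and duplicate checks, two of the data quality
# rules named in the bid. Thresholds and keys are illustrative.
from datetime import datetime, timedelta


def check_freshness(latest_ts, now, max_age):
    """True if the newest record is within the allowed staleness window."""
    return (now - latest_ts) <= max_age


def find_duplicates(rows, key):
    """Return rows whose key value has already been seen."""
    seen, dupes = set(), []
    for r in rows:
        k = r[key]
        if k in seen:
            dupes.append(r)
        seen.add(k)
    return dupes


now = datetime(2024, 1, 2)
fresh = check_freshness(datetime(2024, 1, 1, 23), now, timedelta(hours=2))
dupes = find_duplicates([{"id": 1}, {"id": 1}, {"id": 2}], "id")
```

On a real Delta table the same checks would run over `max(ingest_timestamp)` and a `GROUP BY key HAVING count(*) > 1` query, with failures raising an alert per the "pipeline fails on violations" rule above.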
$500 USD in 7 days
1.7

I’ll help you build a robust, production-ready data engineering solution on Azure and Databricks that tackles real-world challenges in healthcare or retail. Having designed scalable pipelines for diverse industries, I’ll ensure seamless ingestion of large public datasets, transforming raw data into insightful business metrics with strong focus on data quality, lineage, and validation baked in from the start. I bring strong off-platform experience designing clean, professional architectures with Delta Lake, Unity Catalog, and integrated monitoring tools like Great Expectations to automate quality testing and observability. Key skills include PySpark, CI/CD, Azure Data Lake, and end-to-end pipeline automation. We can chat more about how to turn your vision into a polished showcase that clicks. Ready when you are to dive in and make data sing. Let's have a chat, Alicia
$600 USD in 14 days
0.7

Hi, Your requirement for a production-style Azure and Databricks data engineering project with strong data quality, lineage, and observability is clear, and I can help design and deliver this end-to-end. I would approach this by building a scalable pipeline from ingestion to serving using Azure Data Lake, Databricks with Delta, and a governed structure, ensuring validation, monitoring, and reproducibility are built in from day one. The solution will include meaningful business use cases, clean architecture, and deployable code aligned with real-world data team practices. Before we proceed, I’d like to confirm your preferred domain between healthcare or retail and whether you expect batch only or streaming as well. Let’s connect and build this as a strong, production-grade reference.
$500 USD in 7 days
0.6

Eagle Mountain, United States
Joined March 28, 2026