
My immediate goal is to develop robust predictive models that can meaningfully inform cardiovascular research and clinical decision-making. I have aggregated multiple datasets (ranging from structured EHR extracts to imaging-derived variables and device telemetry) and now need a data scientist who can turn these raw inputs into clinically relevant risk scores and outcome forecasts.

Scope of work
• Clean, integrate, and document the disparate datasets I will share (CSV, SQL dump, and optional imaging features in HDF5).
• Engineer features, test several algorithms (e.g., gradient boosting, random forests, neural nets), and iterate toward an interpretable solution.
• Provide model performance metrics (AUROC, calibration plots, and decision-curve analysis) so clinicians can easily gauge utility.
• Package the final model as a reproducible Python notebook / script with clear inline comments, environment file, and concise README.

Acceptance criteria
1. AUROC ≥0.80 on the held-out test set.
2. Code executes end-to-end with `conda env create -f [login to view URL]`.
3. All steps, from preprocessing through validation, are traceable in a single notebook or Markdown report.

When you respond, attach a detailed project proposal outlining: your planned workflow, preferred libraries (scikit-learn, XGBoost, PyTorch, etc.), anticipated timeline with milestones, and any relevant prior cardiovascular or biomedical work you can publicly reference. I am fully open to alternative techniques or supplementary data sources you may recommend, provided they enhance predictive power and remain explainable to a clinical audience.
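To make the first acceptance criterion and the decision-curve requirement concrete, here is a minimal sketch in plain Python. The function names (`auroc`, `net_benefit`, `meets_acceptance`) and the toy labels and scores are illustrative assumptions, not part of the brief; a real pipeline would more likely call `sklearn.metrics.roc_auc_score` on the held-out test set.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as 0.5. O(n_pos * n_neg); fine for a sketch."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit at one risk threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def meets_acceptance(y_true: Sequence[int], y_score: Sequence[float],
                     minimum: float = 0.80) -> bool:
    """Check the brief's criterion: AUROC >= 0.80 on the held-out test set."""
    return auroc(y_true, y_score) >= minimum


# Toy held-out labels and predicted risks (made up for illustration).
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, risks))
print(net_benefit(labels, risks, threshold=0.3))
print(meets_acceptance(labels, risks))
```

Plotting `net_benefit` across a range of thresholds, alongside the "treat all" and "treat none" strategies, gives the decision curve the brief asks for.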
Project ID: 40427834
19 proposals
Remote project
19 freelancers are bidding on average ₹1,034 INR/hour for this job

Your cardiovascular dataset will fail validation if we don't address class imbalance and temporal leakage upfront. Most EHR-based models I've audited show inflated AUROC scores because they leak future diagnoses into training features or ignore the fact that adverse events occur in only 3-8% of patients.

Before architecting the pipeline, I need clarity on two things: What's the prevalence rate of your primary outcome in the dataset, and are your imaging features time-stamped relative to the index event? If we're predicting 30-day readmission but the model sees discharge summaries written after readmission, we'll hit 0.95 AUROC in dev and 0.62 in production.

Here's the technical approach:
- PYTHON + SCIKIT-LEARN + XGBOOST: Build an ensemble pipeline with SMOTE for minority class upsampling, stratified k-fold cross-validation, and SHAP values for clinical interpretability so cardiologists can see which biomarkers drive each prediction.
- HADOOP + SPARK: If your EHR dumps exceed 50GB, I'll parallelize feature engineering using PySpark to handle joins across patient timelines without memory crashes.
- HDF5 + PYTORCH: Extract imaging embeddings using a pretrained ResNet backbone, then fuse them with tabular features in a multi-input neural network to capture non-linear interactions between echo measurements and lab values.
- CALIBRATION + DECISION CURVES: Deliver Brier scores and net benefit plots at multiple risk thresholds so you can justify clinical deployment to an IRB or hospital committee.

I've built 4 FDA-submission-ready models for cardiology startups, including a heart failure readmission predictor that outperformed the LACE index by 18 points AUROC. I don't take on projects where the ground truth labels are ambiguous. Let's schedule a 20-minute call to walk through your data dictionary and confirm we can hit 0.80 AUROC without overfitting.
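The temporal leakage concern raised in this proposal can be illustrated with a small guard that discards any record time-stamped after a patient's index event. This is only a sketch: the `Record` type, its field names, and the sample data are hypothetical, since the actual data dictionary has not been shared.

```python
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple


class Record(NamedTuple):
    # Hypothetical schema; the real EHR extract may look different.
    patient_id: str
    recorded_at: datetime
    name: str
    value: float


def drop_future_records(records: Iterable[Record],
                        index_times: Dict[str, datetime]) -> List[Record]:
    """Keep only records captured at or before each patient's index time.

    Records for patients without a known index time are dropped as well,
    because their position relative to the prediction point is unknown.
    """
    kept = []
    for record in records:
        index_time = index_times.get(record.patient_id)
        if index_time is not None and record.recorded_at <= index_time:
            kept.append(record)
    return kept


# Illustrative data: patient "p1" is discharged (index event) on 10 March.
index_times = {"p1": datetime(2024, 3, 10, 12, 0)}
records = [
    Record("p1", datetime(2024, 3, 9, 8, 0), "troponin", 0.04),
    Record("p1", datetime(2024, 3, 20, 9, 0), "readmission_note_flag", 1.0),
    Record("p2", datetime(2024, 3, 1, 9, 0), "troponin", 0.02),
]
safe = drop_future_records(records, index_times)
print([r.name for r in safe])
```

Only features that pass a filter like this should reach feature engineering; anything recorded after the index event would be unavailable at prediction time in production.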
₹900 INR in 30 days
5.4

With a decade of experience in data science, I am well equipped to handle your ambitious project. I specialize in **data-engineering & analytics**, **machine learning**, and **statistical analysis**: everything you need for a project of this magnitude. My skills in **Python** and libraries such as **Scikit-learn, XGBoost, and PyTorch** will allow me to efficiently clean and integrate the disparate datasets you share and engineer the features that matter.

I also focus on giving clear insight into model performance. I have experience producing ROC curves, calibration plots, and decision-curve analyses, so your clinicians will get easy-to-understand measures of utility. My practice of building reproducible models matches your requirement to package the final model as a Python notebook / script with clear comments.

On timeline management, I pride myself on efficiency. Given the complexity and potential challenges of combining datasets from sources such as EHR extracts and imaging features, I have prepared a detailed project plan. My experience working with healthcare clients will also be useful here.
₹1,000 INR in 40 days
3.8

Hi there,

I have read your project requirement carefully. You need a complete data science pipeline to clean and integrate cardiovascular datasets, build predictive models, evaluate them with clinical metrics (AUROC, calibration, decision curves), and deliver a reproducible, well-documented Python solution.

We can develop an end-to-end workflow using Python (scikit-learn, XGBoost, PyTorch if needed), focusing on data quality, feature engineering, model performance, and clinical interpretability (SHAP, calibration analysis). The final output will be a clean notebook/script with environment setup, ensuring reproducibility and easy extension for future research.

Questions:
=========
What is the primary prediction target (e.g., mortality, readmission, specific condition)?
What is the approximate dataset size and number of features?
Are the datasets already cleaned/anonymised, or do we handle preprocessing fully?
Do you prefer interpretable models only, or are you open to more complex models if performance improves?

Best Regards,
Srashtasoft Team
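As an illustration of the calibration analysis mentioned in this proposal, the following sketch bins predicted risks and compares the mean prediction in each bin with the observed event rate. The function name `calibration_table`, the equal-width binning, and the toy data are assumptions made for the example, not something the proposal specifies.

```python
from typing import Dict, List, Sequence


def calibration_table(y_true: Sequence[int], y_prob: Sequence[float],
                      n_bins: int = 5) -> List[Dict[str, float]]:
    """Group predictions into equal-width probability bins and report, per
    non-empty bin, the mean predicted risk and the observed event rate."""
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 belongs in the last bin.
        i = min(int(p * n_bins), n_bins - 1)
        bins[i].append((y, p))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        rows.append({
            "bin": i,
            "n": len(members),
            "mean_predicted": sum(p for _, p in members) / len(members),
            "observed_rate": sum(y for y, _ in members) / len(members),
        })
    return rows


# Toy outcomes and predicted risks (made up for illustration).
outcomes = [0, 0, 1, 1, 1]
predicted = [0.1, 0.15, 0.5, 0.9, 0.95]
for row in calibration_table(outcomes, predicted, n_bins=2):
    print(row)
```

A well-calibrated model has `mean_predicted` close to `observed_rate` in every bin; plotting the two against each other gives the usual calibration plot.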
₹750 INR in 40 days
3.0

Good morning/evening, Sir. I am ready to do your analysis and build your machine learning model. This is a one-day job, two days at most, and I will complete it with results that satisfy you. Just give me the go-ahead to start.
₹1,000 INR in 20 days
1.8

I’m Gurpreet Singh, a professional freelance developer based in New Delhi, with 10+ years of experience in delivering secure, scalable, and high-performance digital solutions. I help startups and businesses turn their ideas into powerful, market-ready products.

What I Can Do for You
Mobile App Development (Android & iOS)
Desktop Software Development (C#, Java, .NET)
Custom Software & Web Application Development
Website Design & Development (WordPress, Joomla, Drupal)
Laravel, React JS & Node JS Development
Game Design & Development
Blockchain Solutions
AI Automation & Custom Tools
Meta Trading Tools, Bot Scripting & Web Scraping
SEO, Digital Marketing & Branding
Video Editing & Multimedia Production

⚙️ Technologies I Work With
React JS, Node JS, MongoDB
Python (Django)
Android (Java/Kotlin), iOS (Swift)
Flutter & React Native

✨ Why Work With Me?
✔ 10+ years of proven industry experience
✔ Modern, scalable & cost-effective solutions
✔ Creative and experienced development approach
✔ Transparent communication & smooth workflow
✔ Secure, optimized & future-ready technology
✔ On-time delivery with dedicated support
✔ Flexible pricing (open to discussion)

Let’s Work Together
If you’re looking for a reliable freelancer who can bring your ideas to life and deliver high-quality results, I’m here to help. Let’s build something amazing together.
₹750 INR in 40 days
0.0

Hello, I have experience with Python, machine learning, predictive modeling, statistical analysis, biomedical datasets, and data integration. I can help clean and integrate your datasets, engineer features, test multiple models, and build an interpretable cardiovascular prediction pipeline with clear validation metrics and documented code. I am comfortable working with tools such as scikit-learn, XGBoost, PyTorch, pandas, and Jupyter notebooks, and I always focus on reproducibility, model performance, and clean documentation. I am new to this platform and currently a student, so I am offering a discounted rate to build my profile and earn trusted reviews. I will complete the project carefully, professionally, and with close attention to detail. Please give me a chance to prove my work. Thank you very much.
₹1,000 INR in 40 days
0.0

Hi there,

As a Computer Engineering student specializing in Medical AI, I don't just build models; I build clinically interpretable solutions. Having previously developed a high-accuracy VGG16 model for brain tumor detection, I am deeply familiar with the nuances of processing complex biomedical datasets and multi-modal features (EHR & Imaging).

To hit your ≥0.80 AUROC target, here is my dedicated workflow:
1. Robust Data Integration: Using Pandas and SQLAlchemy, I will consolidate your CSV/SQL sources, while utilizing h5py to extract high-dimensional features from the HDF5 imaging variables.
2. Multi-Algorithm Iteration: I will benchmark Gradient Boosting (XGBoost/LightGBM) against Deep Neural Networks (PyTorch) to identify the optimal balance between predictive power and clinical utility.
3. Explainability-First Approach: Beyond AUROC, I will implement SHAP values and Decision Curve Analysis. This ensures clinicians don’t just see a score, but understand the exact physiological features driving the forecast.
4. Reproducible Delivery: You will receive a clean, modular Python pipeline with a Conda environment file, ensuring a "single-click" execution for your team.

I have one technical question regarding the telemetry data: Are the time-series logs already windowed, or would you like me to handle the temporal feature engineering?

Looking forward to contributing to your research!
Rahma
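The question about windowing telemetry can be made concrete with a small sketch of fixed-width temporal aggregation. The `(seconds, value)` sample format, the 60-second window, and the `window_features` name are assumptions for illustration; the actual telemetry layout is not described in the brief.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def window_features(samples: Sequence[Tuple[float, float]],
                    window_seconds: float) -> List[Dict[str, float]]:
    """Aggregate (time_in_seconds, value) samples into fixed-width windows.

    Returns one row per non-empty window with count, mean, min, and max.
    Windows with no samples are omitted rather than imputed.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    buckets: Dict[int, List[float]] = {}
    for t, value in samples:
        buckets.setdefault(int(t // window_seconds), []).append(value)
    return [
        {
            "window_start": index * window_seconds,
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for index, values in sorted(buckets.items())
    ]


# Illustrative heart-rate samples: (seconds since start, beats per minute).
heart_rate = [(0, 60.0), (30, 62.0), (61, 70.0), (119, 80.0), (240, 65.0)]
for row in window_features(heart_rate, window_seconds=60):
    print(row)
```

Each row can then be joined to the tabular EHR features by patient and window start; whether empty windows should be imputed is a modelling decision left open here.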
₹1,000 INR in 40 days
0.0

I’m a Data Scientist and Python Developer with experience in predictive analytics, machine learning, data preprocessing, and model evaluation. I can build an end-to-end cardiovascular risk prediction pipeline using Python, Scikit-learn, XGBoost, and PyTorch with interpretable and clinically explainable outputs. My workflow will include data cleaning, feature engineering, model comparison, AUROC evaluation, calibration analysis, and reproducible deployment-ready notebooks with proper documentation. I focus on clean code, traceable workflows, and reliable predictive performance for real-world clinical applications.
₹750 INR in 40 days
0.0

Hi, I am interested in this job and would like to work with you. Please visit my website, www.vanecus.com. Thanks and regards, Md. Abdul Latif, Dhaka, Bangladesh.
₹1,000 INR in 40 days
0.0

Bengaluru, India
Member since Dec 26, 2021