Welcome to the Association for Diagnostics & Laboratory Medicine (ADLM) Data Science Program repository. This collection provides resources, tools, and educational materials for applying data science methods to laboratory medicine and diagnostics.
- Getting Started
- Laboratory Data Science
- Datasets and Code
- R
- SQL
- Open-Source Tools & Platforms
- Journals
- Getting Help
- Contributing
- Acknowledgments
Start here if you have laboratory experience but are new to data science.
- Practical Machine Learning in the Clinical Laboratory
- Applications of Machine Learning in Routine Laboratory Medicine
- Programming Basics:
  - Introduction to Python
  - Python Tutorial
  - Kaggle
  - R for Data Science
- What Is Data Analytics in Lab Medicine Anyway?
- Machine learning algorithms for predicting urinary tract infections: Integration of demographic data and dipstick reflectance results
- Advancing patient care with data science in clinical laboratories
- On-Demand Webinars: Data Analytics Webinars
- Data Science in Laboratory Medicine certificate program (coming soon)
Articles published in Clinical Chemistry:
| Citation | ML Method | Code/Data Access | Application |
|---|---|---|---|
| Hu H et al. "Expert-Level Immunofixation Electrophoresis Image Recognition." Clin Chem 2023;69(2):130-139. DOI: 10.1093/clinchem/hvac190 | CNN ensemble (VGG-16, ResNet-18, MobileNet-V2) | Zenodo: 10.5281/zenodo.7123624 | Monoclonal protein detection |
| Schipper A et al. "Machine Learning-Based Prediction of Hemoglobinopathies." Clin Chem 2024;70(8):1064-1075. DOI: 10.1093/clinchem/hvae081 | XGBoost, Logistic regression | GitHub; FigShare: 10.6084/m9.figshare.25765302 | CBC-based hemoglobinopathy screening |
| Steinbach D et al. "Applying Machine Learning to Blood Count Data Predicts Sepsis." Clin Chem 2024;70(3):506-515. DOI: 10.1093/clinchem/hvae001 | Boosted random forest | sbcdata (R), sbcmodel (MATLAB); Zenodo: 10.5281/zenodo.6922968 | Early sepsis warning |
| Spies NC et al. "Automating Detection of IV Fluid Contamination Using Unsupervised ML." Clin Chem 2024;70(2):444-452. DOI: 10.1093/clinchem/hvad207 | UMAP (unsupervised) | GitHub; FigShare: 10.6084/m9.figshare.23805456 | Preanalytical error detection |
| Spies NC et al. "Validating, Implementing, and Monitoring ML Solutions." Clin Chem 2024;70(11):1334-1343. DOI: 10.1093/clinchem/hvae126 | XGBoost tutorial | Tutorial Site; FigShare: 10.6084/m9.figshare.23805456 | Educational resource |
Articles published in The Journal of Applied Laboratory Medicine (JALM):
| Citation | ML Method | Code/Data Access | Application |
|---|---|---|---|
| Ammer T et al. "refineR Algorithm for Reference Intervals." JALM 2023;8(1):84-91. DOI: 10.1093/jalm/jfac101 | Box-Cox + MLE | CRAN: refineR; GitHub mirror | Reference interval estimation |
| Mobini M et al. "End-to-End SARS-CoV-2 Data Automation." JALM 2023;8(1):41-52. DOI: 10.1093/jalm/jfac109 | Random forest | Supplementary materials | Lab automation pipeline |
| Spies NC et al. "Data-Driven Anomaly Detection Review." JALM 2023;8(1):162-179. DOI: 10.1093/jalm/jfac114 | Review of methods | Supplementary materials | Methods overview |
| Walke D et al. "SBC-SHAP: Accessibility and Interpretability of ML." JALM 2025;10(5):1226-1240. DOI: 10.1093/jalm/jfaf091 | SHAP explainability | GitHub | Explainable sepsis prediction |
| Boerman AW et al. "Predicting Urine Culture Outcomes." JALM 2025;10(6):1439-1452. DOI: 10.1093/jalm/jfaf131 | XGBoost | Supplementary materials | Urine culture stewardship |
- Zenodo - Repository for sharing datasets.
- Laboratory Data Package - Transforms raw EHR laboratory records into datasets.
- Reference Intervals - Calculates age-dependent reference intervals.
- Journal of Applied Laboratory Medicine - Publishes helpful articles on data science.
- Clinical Chemistry Journal
- Join our Data Science and Informatics Community: artery.myadlm.org/communities
We welcome contributions! This is a community-driven resource.
- Star this repository to show support
- Submit pull requests for:
  - New tools and resources
  - Updated links or descriptions
  - New datasets or tutorials
  - Corrections or improvements
- Open issues for:
  - Broken links
  - Outdated information
  - Suggestions for new sections
- GitHub Issues: Report problems or suggest improvements
- Pull Requests: Contribute directly
When contributing, keep these guidelines in mind:
- Quality over quantity: Focus on resources that are actively maintained and well-documented
- Educational value: Prioritize resources that help people learn
- Clinical relevance: Ensure tools address real laboratory medicine challenges
- Open access preferred: Public repositories and free resources first
- Include context: Explain what tools do and who they’re for
This repository is maintained by the ADLM Data Science Program with contributions from laboratory professionals, data scientists, and researchers. We thank our community for their commitment to advancing data science in laboratory medicine.
Stay Connected
- LinkedIn: Follow @myADLM for the latest news and resources
- Annual Data Science Symposium: Join us for hands-on data science workshops and networking. Stay tuned for 2026 registration details.
Last Updated: December 2025