OpenProblems provides us with a living benchmark, but the results of that benchmark are often difficult to interpret. In this thesis we aim to define better ways of interpreting benchmark results from the batch integration task.
- literature search
- understand the OpenProblems infrastructure
- add new integration methods to the batch integration task (see the sketch below)
  - DRVI, (sysVI, scPoli)
  - other methods from other labs, e.g. scMerge
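A minimal sketch of what adding a method involves, assuming the batch-integration contract of an AnnData input with a batch key and an integrated embedding written back to `.obsm`. scVI (scvi-tools) stands in here because its API is well known; DRVI, sysVI, scPoli and scMerge would follow the same pattern with their own packages, and the `X_emb` output key is an assumption.

```python
import anndata as ad
import scvi

def integrate(adata: ad.AnnData, batch_key: str = "batch") -> ad.AnnData:
    # register the batch covariate and fit the model
    scvi.model.SCVI.setup_anndata(adata, batch_key=batch_key)
    model = scvi.model.SCVI(adata)
    model.train()
    # integrated embedding, written where downstream metrics can find it
    adata.obsm["X_emb"] = model.get_latent_representation()
    return adata
```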
- investigate metrics
  - add existing published metrics
    - improved ASW
    - CellMixS (Lütge et al.)
    - (kSIM)
  - check how scIB metrics differ from these on the existing OpenProblems datasets (batch-ASW sketch below)
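Before comparing metric families, it helps to have the simplest ones written out. Below is a small sketch of the batch-ASW idea used in scIB (per cell type, 1 − |silhouette| with batch labels, then averaged), implemented with scikit-learn only; treat it as my reading of the published definition rather than the reference implementation.

```python
import numpy as np
from sklearn.metrics import silhouette_samples

def batch_asw(emb: np.ndarray, batches: np.ndarray, labels: np.ndarray) -> float:
    scores = []
    for ct in np.unique(labels):
        mask = labels == ct
        # silhouette needs at least two batches within the cell type
        if len(np.unique(batches[mask])) < 2:
            continue
        s = silhouette_samples(emb[mask], batches[mask])
        # values near 1 mean batches are well mixed within this cell type
        scores.append(np.mean(1 - np.abs(s)))
    return float(np.mean(scores))
```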
- analyse 1 dataset in detail
  - HLCA, mouse pancreas - look at biology beyond cell type
- Add 1 method and 1 metric to OP
- prototype investigation of integration results on 1 dataset (see the sketch after this list)
- written project proposal of the research plan of the thesis
- intermediate presentation
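For the prototype investigation on one dataset, a rough scanpy workflow like the one below is probably enough to start: visualise the integrated embedding and cluster at a higher resolution to look for structure beyond the annotated cell types. The file name and the `X_emb` / `batch` / `cell_type` keys are assumptions.

```python
import scanpy as sc

adata = sc.read_h5ad("integrated.h5ad")   # assumed file with the embedding in .obsm["X_emb"]
sc.pp.neighbors(adata, use_rep="X_emb")
sc.tl.umap(adata)
# higher-resolution clustering to surface sub-cell-type structure
sc.tl.leiden(adata, resolution=2.0, key_added="fine_clusters")
# batch mixing, author annotation, and finer clusters side by side
sc.pl.umap(adata, color=["batch", "cell_type", "fine_clusters"])
```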
- Main questions:
- How can OpenProblems results be translated into best practices?
- How can we interpret OpenProblems results and their generalization to unseen use cases (datasets)?
- How do we interpret differences in metric-based rankings?
- Can we predict how a method will work on a new dataset?
- Scope: case study on the batch integration task
- add ArchMap datasets & existing integrations from the HCA integration team
- conceptualize which dataset characteristics to log as predictors of integration performance (see the sketch below)
- target: 10 more datasets
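As a starting point for the characteristics to log, the sketch below collects a few obvious candidates from an AnnData object; the exact list (and the `batch` / `cell_type` column names) is an assumption to be refined.

```python
import numpy as np

def dataset_characteristics(adata, batch_key="batch", label_key="cell_type"):
    batch_freq = adata.obs[batch_key].value_counts(normalize=True)
    label_freq = adata.obs[label_key].value_counts(normalize=True)
    return {
        "n_cells": adata.n_obs,
        "n_genes": adata.n_vars,
        "n_batches": int(adata.obs[batch_key].nunique()),
        "n_cell_types": int(adata.obs[label_key].nunique()),
        # evenness of batch sizes (higher = more balanced)
        "batch_entropy": float(-(batch_freq * np.log(batch_freq)).sum()),
        # fraction of cell types that make up < 1% of the data
        "rare_cell_type_fraction": float((label_freq < 0.01).mean()),
    }
```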
- scIB metrics
  - how do they correspond to our data? get this info from the papers directly - e.g. ASW, but we have nested batches
  - e.g. cell-type-based metrics - do high scores also reflect good cell-type separation and rare cell types?
  - range of scIB metrics
  - similar to the feature selection metrics comparison https://doi.org/10.1038/s41592-025-02624-3
  - correlation of metrics, their ranges, how useful each metric is compared to the others (see the sketch below)
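In the spirit of the feature-selection benchmark linked above, a first look at metric redundancy could be a pairwise correlation across all (dataset, method) runs; the sketch below assumes the scores have been exported to a table named `scib_scores.csv`.

```python
import pandas as pd

# one row per (dataset, method) run, one column per metric score
scores = pd.read_csv("scib_scores.csv", index_col=[0, 1])

# which metrics carry redundant vs. complementary information?
metric_corr = scores.corr(method="spearman")
print(metric_corr.round(2))

# metrics with very small spread across runs carry little information
print(scores.std().sort_values())
```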
- How can we improve interpretability of the benchmark results?
- Improved documentation of metrics
- Case study of integrated object vs metrics (Work package 3)
- Show limitations of x metrics
- follow up with a potential improvement
- don’t reinvent the wheel
- Build a predictor on dataset features → can we predict model performance from dataset features? (see the sketch after this list)
  - check Robrecht's trajectory paper on the order of testing datasets
  - small-N, large-K problem (few datasets, many candidate features)
  - Which characteristics do we want to use?
    - correlated characteristics
  - What aspects of the datasets are most important for the integration?
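One way to set up the predictor under these constraints is a regularised linear model evaluated with leave-one-dataset-out cross-validation, which keeps the small-N problem honest and tolerates correlated characteristics; the sketch below assumes a `runs.csv` table with one row per (dataset, method) run.

```python
import pandas as pd
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# assumed table: dataset characteristics as feature columns,
# the metric score to predict in "score"
runs = pd.read_csv("runs.csv")
X = runs.drop(columns=["dataset", "method", "score"])
y = runs["score"]

model = make_pipeline(StandardScaler(), RidgeCV(alphas=[0.1, 1.0, 10.0]))
cv = LeaveOneGroupOut()  # hold out all runs from one dataset at a time
r2 = cross_val_score(model, X, y, cv=cv, groups=runs["dataset"])
print(r2)
```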
- Evaluate integration performance by analysing a real dataset (e.g. CxG datasets already considered in the OpenProblems benchmark)
  - show that the ranking via scIB metrics corresponds to what a biologist would expect in their data
- show whether there is an improvement of new metrics with regard to
- show if there is an improvement in the clustering of the top-performing method vs the worst-performing method (see the sketch below)
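For the clustering comparison, something like the sketch below would do: cluster the embeddings of the best- and worst-ranked methods and compare each clustering to the author annotation with ARI/NMI. File names and the `X_emb` / `cell_type` keys are assumptions.

```python
import scanpy as sc
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

for name in ["top_method.h5ad", "worst_method.h5ad"]:
    adata = sc.read_h5ad(name)
    sc.pp.neighbors(adata, use_rep="X_emb")
    sc.tl.leiden(adata, key_added="leiden")
    # agreement between unsupervised clusters and the author annotation
    ari = adjusted_rand_score(adata.obs["cell_type"], adata.obs["leiden"])
    nmi = normalized_mutual_info_score(adata.obs["cell_type"], adata.obs["leiden"])
    print(name, f"ARI={ari:.2f}", f"NMI={nmi:.2f}")
```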