- In Stephenson dataset:
  - column `Smoker` has a value `Not_known`, should be NA
  - column `Status_on_day_collection` has levels with 1 and 2 observations, should be merged or set to NA (alternatively, this column can be dropped as there is `Status_on_day_collection_summary`)
  - column `Outcome` has level `unknown`, should be set to NA
- In COPD dataset:
  - Column `Bronchodilator_Use` has small categories, should be renamed or merged
- In HLCA dataset:
  - Column `cause_of_death` has a lot of small categories, should be grouped
  - Column `lung_condition` has some small categories
  - Column `lung_condition` has category `Healthy (tumor adjacent)`, which should be renamed to `Healthy` with tumor information contained elsewhere
  - Column `lung_condition` contains COVID severity levels, might be better to group them together (done in `disease`) and move severity to a separate column
  - Column `sequencing_platform` has small categories and one mixture of two categories
  - Column `smoking_status` has a small category `hist of marijuana use`, which should probably be merged with `active`
  - Column `assay` has small categories
  - Column `sex` has category `unknown`, should be set to NA or predicted
  - Column `self_reported_ethnicity` has small categories, might make sense not to use it at all
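
For reference, a minimal sketch of how the three kinds of fixes above (placeholder to NA, merging one level into another, collapsing small categories) could look in pandas. This assumes the metadata is available as a pandas DataFrame; the helper names `set_to_na` and `merge_small_categories`, the `min_count` threshold, the `other` label and the toy table are all made up for illustration and are not taken from the actual datasets.

```python
import pandas as pd


def set_to_na(series, placeholders):
    """Replace placeholder strings (e.g. "unknown") with missing values."""
    return series.mask(series.isin(placeholders))


def merge_small_categories(series, min_count, other="other"):
    """Collapse levels with fewer than `min_count` observations into `other`."""
    series = series.astype(object)  # avoid errors on categorical dtype
    counts = series.value_counts()  # missing values are not counted
    small = counts[counts < min_count].index
    return series.where(~series.isin(small), other)


# Toy metadata table (hypothetical values, only the column names and the
# levels "Not_known", "active" and "hist of marijuana use" come from the issue).
obs = pd.DataFrame(
    {
        "Smoker": ["Smoker", "Not_known", "Non-smoker", "Non-smoker"],
        "smoking_status": ["active", "hist of marijuana use", "never", "active"],
        "assay": ["A", "A", "A", "B"],
    }
)

# Placeholder value -> NA
obs["Smoker"] = set_to_na(obs["Smoker"], ["Not_known"])

# Merge one specific small level into an existing one
obs["smoking_status"] = obs["smoking_status"].replace(
    {"hist of marijuana use": "active"}
)

# Collapse all levels below an (arbitrary) size threshold
obs["assay"] = merge_small_categories(obs["assay"], min_count=2)
```

What counts as a "small" category, and whether such levels should be merged, renamed or set to NA, still needs to be decided per column.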