Add support for generating taxprofiler/funcscan input samplesheets for preprocessed FASTQs/FASTAs #688
vagkaratzas wants to merge 19 commits into nf-core:dev
CarsonJM left a comment:
Great work, I think getting this blueprint in place will be very helpful in the future!
    // Generate downstream samplesheets
    generate_downstream_samplesheets = false
    generate_pipeline_samplesheets = "funcscan,taxprofiler"
Should this default to null so that users have to opt-in to samplesheet generation?
Yes, that's a good idea! I will set that in createtaxdb too
What's the difference? With either null or false as the default, it would be opt-in for the user, no?
Ah maybe I misunderstood Carson... not sure :D
It looks like @jfy133 used only one workflow, which will selectively generate samplesheets based on params.generate_pipeline_samplesheets. Do you think it would be best to keep that consistent?
Also, since FastQ files are being pulled from the publishDir, it might be a good idea to include options that override the user's settings for params.publish_dir_mode (so that it is always 'copy' if a samplesheet is generated) and for params.save_clipped_reads, params.save_phixremoved_reads, etc., so that the preprocessed FastQ files are published to params.outdir whenever a downstream samplesheet is generated.
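A minimal sketch of the override suggested above, written in plain Groovy rather than as real Nextflow config. The helper name `enforcePublishing` and the use of an ordinary map in place of `params` are assumptions for illustration; only the parameter names come from the comment, and the actual pipeline may need to handle this differently.

```groovy
// Hypothetical helper (not part of the PR): returns a copy of the settings with
// publishing forced on whenever downstream samplesheets are requested.
def enforcePublishing(Map settings) {
    def effective = new LinkedHashMap(settings)
    if (settings.generate_downstream_samplesheets) {
        // Samplesheets point at published files, so they must be real copies.
        effective.publish_dir_mode = 'copy'
        // Make sure the preprocessed reads actually end up in the output directory.
        effective.save_clipped_reads = true
        effective.save_phixremoved_reads = true
    }
    return effective
}

// Example: the user asked for samplesheets but left publishing options off.
def userParams = [
    generate_downstream_samplesheets: true,
    publish_dir_mode                : 'symlink',
    save_clipped_reads              : false,
    save_phixremoved_reads          : false,
]
def effective = enforcePublishing(userParams)
println effective
```

The sketch leaves the user's original settings untouched and only changes the copy, so the override applies solely when samplesheet generation is switched on.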
    ch_assemblies

    main:
    def downstreampipeline_names = params.generate_pipeline_samplesheets.split(",")
I've also implemented the same system in createtaxdb now, but with an additional input validation thing that you should also adopt here (i.e., to check that someone doesn't add an unsupported pipeline, or makes a typo).
Check the utils_nfcore_createtaxdb_pipeline file there
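A rough sketch of what such a check could look like, in plain Groovy. This is not the code from createtaxdb: the function name `validatePipelineNames`, the `SUPPORTED` list, and the use of a plain exception in place of Nextflow's `error()` are all assumptions for illustration.

```groovy
// Assumed list of supported downstream pipelines (taken from the default value above).
def SUPPORTED = ['funcscan', 'taxprofiler']

// Hypothetical helper: splits the comma-separated parameter value and rejects
// any name that is not in the supported list (e.g. a typo).
def validatePipelineNames(String requested, List<String> supported) {
    def names = requested.split(',').collect { it.trim() }.findAll { it }
    def unknown = names.findAll { !(it in supported) }
    if (unknown) {
        String msg = "Unsupported pipeline(s) in --generate_pipeline_samplesheets: " +
            "${unknown.join(', ')}. Supported: ${supported.join(', ')}"
        throw new IllegalArgumentException(msg)
    }
    return names
}

// A valid request passes through as a list of names.
println validatePipelineNames('funcscan,taxprofiler', SUPPORTED)
```

Trimming each entry means a value such as `funcscan, taxprofiler` (with a space) is still accepted, while a misspelled name fails early with a message listing the supported options.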
    ch_assemblies

    main:
    ch_list_for_samplesheet = ch_assemblies
The next thing, which I don't think will be so complicated, is to add another input channel for bins, and to add an if/else statement here for whether they want to send just the raw assemblies (all contigs) or the binned contigs to the samplesheet.
It will need another pipeline-level parameter too, though: --generate_samplesheet_funcscan_seqtype or something.
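The if/else described above could be sketched roughly as follows, in plain Groovy with lists standing in for the Nextflow channels. The function name `selectFuncscanSequences`, the accepted values `'assemblies'` and `'bins'`, and the example file names are assumptions; the comment only proposes the parameter name tentatively.

```groovy
// Hypothetical helper: picks which sequences go into the funcscan samplesheet,
// based on a value such as that of --generate_samplesheet_funcscan_seqtype.
def selectFuncscanSequences(String seqtype, List assemblies, List bins) {
    switch (seqtype) {
        case 'assemblies':
            return assemblies   // raw assemblies (all contigs)
        case 'bins':
            return bins         // binned contigs only
        default:
            String msg = "Unknown seqtype '${seqtype}'; expected 'assemblies' or 'bins'"
            throw new IllegalArgumentException(msg)
    }
}

// Illustrative inputs only; the file names are made up.
def assemblies = [[id: 'sample1', fasta: 'sample1.contigs.fa.gz']]
def bins = [
    [id: 'sample1', fasta: 'sample1.bin.1.fa'],
    [id: 'sample1', fasta: 'sample1.bin.2.fa'],
]
def selected = selectFuncscanSequences('bins', assemblies, bins)
println selected
```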
Co-authored-by: James A. Fellows Yates <jfy133@gmail.com>
    // Validate samplesheet generation parameters
    if (params.generate_downstream_samplesheets && !params.generate_pipeline_samplesheets) {
        error('[nf-core/createtaxdb] If supplying `--generate_downstream_samplesheets`, you must also specify which pipeline to generate for with `--generate_pipeline_samplesheets`! Check input.')
    @@ -25,6 +25,7 @@ include { ANCIENT_DNA_ASSEMBLY_VALIDATION } from '../subworkflows/local/ancient_
    include { DOMAIN_CLASSIFICATION } from '../subworkflows/local/domain_classification'
    include { DEPTHS } from '../subworkflows/local/depths'
    include { LONGREAD_PREPROCESSING } from '../subworkflows/local/longread_preprocessing'
…correct assemblies files (Funcscan)
Something somewhere is not liking my new channel of gzipped assemblies...
Almost there, though includes a file
Closes #687 and #686
This adds the local subworkflow (and other relevant code and docs) for generating samplesheets for the downstream pipelines funcscan and taxprofiler.
PR checklist
- … (nf-core lint).
- … (nextflow run . -profile test,docker --outdir <OUTDIR>).
- … (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
- docs/usage.md is updated.
- docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).