Open
Description
Future
- Duplication statistics: do duplicates reflect high coverage or PCR duplication? Are they spread over the transcriptome or localized to a set of genes? How are they distributed at the gene scale?
- Add a column listing the genes corresponding to each enriched GO term (as already present for KEGG)
- lncRNA analysis
  https://www.tandfonline.com/doi/full/10.1080/15476286.2021.1899673
- CircRNA analysis
  https://www.sciencedirect.com/science/article/pii/S1672022921000292
- tRNA abundance/modification
  https://www.sciencedirect.com/science/article/pii/S1097276521000484?via%3Dihub
- Gene fusion detection
  https://genome.cshlp.org/content/31/3/448.short?rss=1
- WGCNA and meta analysis
  https://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1008976&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+ploscompbiol%2FNewArticles+%28PLOS+Computational+Biology+-+New+Articles%29
- Include String ?
  https://string-db.org/
Oct 2021
- use new sequana-wrappers
Those requested features are for the rnadiff analysis, not sequana_rnaseq:
- if possible, provide results with and without independent filtering
- when using --force (rnadiff), we should remove previous DGE results; otherwise they will be added to the HTML reports
- design: add an 'alias name' column
April/May/June 2021
- if pvalue == 0, set a small non-zero value so that the point remains visible in the volcano plot
- fastp tool to complement existing cutadapt trimming tool
- add html entry point for the enrichment (if several comparisons) or several enrichments
- refactor sequana enrichment, perhaps to support a syntax such as "sequana enrichment panther"
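The pvalue == 0 item above can be sketched as follows. This is a minimal illustration, not the rnadiff implementation: the function name `neg_log10_pvalues` and the choice of floor (one tenth of the smallest non-zero p-value, or 1e-300 if there is none) are assumptions made for the example.

```python
import math


def neg_log10_pvalues(pvalues, floor=None):
    """Return -log10(p) for each p-value, substituting a floor for exact zeros.

    Hypothetical helper: the default floor (one tenth of the smallest
    non-zero p-value) is an arbitrary choice made for this sketch.
    """
    positive = [p for p in pvalues if p > 0]
    if floor is None:
        floor = min(positive) / 10 if positive else 1e-300
    # Exact zeros would give an infinite -log10 and fall outside the plot,
    # so they are replaced by the floor before taking the logarithm.
    return [-math.log10(floor if p == 0 else p) for p in pvalues]


y_values = neg_log10_pvalues([0.01, 0.0, 1e-5])
```

With this choice the zero p-value ends up one unit above the most significant non-zero point on the volcano plot's y axis, so it stays visible without distorting the rest of the scale.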
March 2021
- better filtering for multiqc
- main summary.html should have more features/summary/plots
- check rnaseqc gtf input [catch missing GTF in the main.py and rnaseq.rules]. added a converter in sequana
- gtf input (from GFF) for the prokaryotes case
- gtf input (from GFF) for the eukaryotes case
- salmon for eukaryotes tested on mm10
- check rnaseqc multiqc module. No need for the biomics fork anymore.
Jan 2021
- BUG fix switch mark duplicates correctly for the qc and others
- Better GFF handling with custom gff able to handle several feature types, sanity checks of user's choice on attribute and feature
- Checked rna_sqc functionality and provided a gff2gtf parser in sequana.
Dec 2020
- Fix issue of seg fault for bacterial genomes with star aligner
- fastq_screen should work now. The only contaminant searched for is phix. Other genomes should be handled by the users (meaning they must build the index themselves); the fastq_screen search for phix is now the default behaviour, since the code should work out of the box
- fix missing workflow image in the report.
- add strandness plot in ./outputs directory and add the image in the summary plot
- bowtie1/star/bowtie2 indexing are now stored in their own sub-directories
- provide way to disable rRNA search
- fix issue related to star index rule bug in sequana
- rnadiff option is now set automatically to one_factor
- add option --run to execute the pipeline without manual checking (batch mode)
Oct-Nov 2020
- star index: we may get the following warning:
  --genomeSAindexNbases 14 is too large for the genome size=4456448, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 10
- a more generic title in the multiqc_config
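The value recommended in the STAR warning above follows the scaling rule given in the STAR manual for small genomes, min(14, log2(GenomeLength)/2 - 1). A minimal sketch of that rule (the function name `star_sa_index_nbases` is hypothetical, and this is not necessarily how the pipeline sets the option):

```python
import math


def star_sa_index_nbases(genome_length):
    """Suggested --genomeSAindexNbases for a genome of the given length (bases).

    Applies the STAR manual's rule of thumb min(14, log2(length)/2 - 1),
    truncated to an integer.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be a positive number of bases")
    return min(14, int(math.log2(genome_length) / 2 - 1))


# Genome size quoted in the warning above.
suggested = star_sa_index_nbases(4456448)
```

For the genome size quoted in the warning (4456448 bases), this gives 10, which matches the value STAR recommends.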
Sept 2020
- Add tolerance for feature_counts in the pipeline and config file after fixing sequana featurecounts functions (v0.9.17)
Aug 2020
- do_indexing option is now pre-filled when instantiating the pipeline.
- salmon option validateMappings is deprecated and should be removed
- salmon indexing included
- refactor the way feature counts are handled: no longer in the onsuccess section, but with simpler code from @khourhin, now included in sequana and in this pipeline as of version 0.9.16.
June/July 2020
- Fix R1/R2 issue for rRNA
- add mark duplicates in cluster config and set to False by default
- add paired option for feature counts when paired data is provided.
- add option to skip the fastqc on the raw data. This will be the default; the fastqc on the filtered data is kept by default.
- cleanup the multiqc option to exclude fastqc_samples (to not clash with fastqc_filtered)
April-May 2020
- if the input genome is larger than about 4 billion bases, the bowtie2 index files have the extension .bt2l (not .bt2); therefore the sequana rule bowtie2_mapping should be updated, and this pipeline as well.
- add input to the rnadiff analysis in ./rnadiff
- a faster --help option
- a --from-project option to import existing pipeline
- a HTML custom front page
- add feature counts as a single file
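For the .bt2 versus .bt2l item above, one way to avoid hard-coding the extension is to check which index files actually exist on disk. The sketch below is an illustration only: the helper name `bowtie2_index_suffix` is hypothetical and is not part of sequana or bowtie2.

```python
from pathlib import Path


def bowtie2_index_suffix(index_prefix):
    """Return '.bt2' or '.bt2l' depending on which index files exist, else None.

    bowtie2-build writes <prefix>.1.bt2 for a small index and
    <prefix>.1.bt2l for a large one, so the first index file is
    enough to tell the two cases apart.
    """
    for suffix in (".bt2", ".bt2l"):
        if Path(f"{index_prefix}.1{suffix}").exists():
            return suffix
    return None
```

A rule could call this on the index prefix and build its expected file names from the returned suffix. If neither file is present the helper returns None, which the caller can treat as "index not built yet".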
Jan 2020 - April 2020
- integrate the biomix scripts to make the link with the differential analysis
- add feature counts in separate directory ready to use by rnadiff
- integrate salmon
Dec 2019 - Jan 2020
- fix the RNAseQC rule, which is broken at the moment
- check for rRNA feature name presence in the GFF
- check for feature count type provided by the user
- check config with schema
- fix read tag
- possibility to switch off cutadapt
- fixing the bowtie2 config/pipeline conflict name (see explanation of the naming convention in the config and pipeline when using bowtie2_mapping rule #3)
- Fixing indexing issue: indexing is done even though not asked for, or vice versa: when we set indexing to False, the pipeline fails with a cryptic message. We will provide better handling of the check on whether or not indexing is done.
- include the schema file
- parameter output-directory should be renamed output_directory in the multiqc section
- handle the stdout correctly in the fastqc, bowtie2 and bowtie1 rules
- allow rRNA feature and/or files with meaningful error message if the 2 options conflict
- better multiconfig report (text/title)