Releases · danielzmbp/remag

05 Mar 21:51

github-actions

v0.4.0

def25ee

v0.4.0 Latest


release: bump version to 0.4.0
style: apply formatting-only updates across modules and tests
remove automated bioconda PR, generate recipe as artifact only; fix recipe (Python >=3.9, add scipy)
clean up duplicate imports, stale comments, and misplaced import re
remove unused imports and stale comments in hyenadna files
remove dead variables and unused import in clustering.py
fix: --filter-only flag was silently ignored (missing from args Namespace)
remove dead _leiden_clustering_on_graph() function
remove dead PathManager class from utils.py
docs: update README to match current pipeline and fix cli OPTION_GROUPS ghost reference
remove dead greedy_min_score parameter (F1 scores are always >= 0)
Revert "fix: tighten greedy clustering contamination cap to 5%"
fix: tighten greedy clustering contamination cap to 5%
fix: block rescue merges above 10% duplication
feat: expose rescue duplication limits in CLI and enforce max total duplication
Revert "refactor: update encoder architecture to be dynamic and balanced"
refactor: update encoder architecture to be dynamic and balanced
fix: rate limit unconditional fusion debug logs
fix: rate limit debug logging for gating weights
fix: switch debug prints to logger.debug
feat: add debug prints for gating weights
Revert to state after 'feat: add short-reads/sr mode' (77f1ea5)
Update rescue threshold: 0.9 for single sample, 0.7 for coassemblies
Add 0.05 to default Leiden resolutions for larger clusters
Fix indentation bug in rescue loop causing duplicate merges
Update rescue strategy: use global SCG count for contamination and enforce 10% ceiling
feat: add short-reads/sr mode to enforce 1000bp min contig length
feat: add contamination filter to greedy clustering
tune: implement F1-score based bin quality metric inspired by SemiBin2
tune: add 0.1 to greedy clustering resolutions
tune: score bins by total core genes (unique families) - 5*dups
tune: set miniprot query coverage 0.6 identity 0.4, revert global -N/-p
fix: use query coverage instead of target coverage for miniprot filtering
Revert "Increase contamination penalty from 5 to 7 in bin quality scoring"
Revert "Remove Singleton Rescue step"
Revert "Increase contamination penalty from 7 to 10"
Revert "Add 0.1 to default greedy clustering resolutions"
Revert "tune: tighten miniprot thresholds"
Revert "chore: remove agents instructions"
Revert "tune: lower contamination penalty"
Revert "tune: score bins by total core genes"
Revert "tune: relax miniprot filters"
Revert "tune: restore miniprot outs"
tune: restore miniprot outs
tune: relax miniprot filters
tune: score bins by total core genes
tune: lower contamination penalty
chore: remove agents instructions
tune: tighten miniprot thresholds
Add 0.1 to default greedy clustering resolutions
Increase contamination penalty from 7 to 10
Remove Singleton Rescue step
Increase contamination penalty from 5 to 7 in bin quality scoring
Add debug logging for singleton rescue
Restore exact logging format
Revert greedy clustering parallelization
Replace joblib with concurrent.futures
Parallelize greedy clustering resolution search
Reduce rescue duplication tolerance to 3%
Implement Singleton Rescue step
Remove refinement step and related CLI options
Restore missing _leiden_clustering_on_graph function
Replace adaptive resolution with greedy Leiden clustering
fix: handle duplicate alignment filenames by using parent directory for disambiguation
Fix: Remove unexpected 'use_header_cache' argument from check_core_gene_duplications calls
feat: Log command line arguments for reproducibility
test: add reproduction test for issue verification
feat: implement bin rescue strategy
chore: update .gitignore
test: Add tests for CLI default parameters
fix(tests): Resolve existing test suite failures
feat: Update coassembly defaults for learning rate and lambda
Remove standalone directory
Tune coassembly resolution
chore: remove AGENTS documentation
Full Changelog: v0.3.4...v0.4.0
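
As an illustration of the "total core genes (unique families) - 5*dups" scoring named in the list above, here is a minimal sketch. Only the formula comes from the commit subject. The function name, the input format, and the way duplications are counted are assumptions made for this example, not remag's actual code. The list also records a later F1-score based metric, which this sketch does not cover.

```python
from collections import Counter

DUPLICATION_PENALTY = 5  # the "5*dups" factor named in the commit subject


def bin_score(core_gene_hits):
    """Score a bin as (unique core-gene families) - 5 * (duplicated copies).

    `core_gene_hits` is a hypothetical list holding one family ID per
    core-gene hit found on the bin's contigs.
    """
    counts = Counter(core_gene_hits)
    unique_families = len(counts)
    # Assumption: every copy beyond the first counts as one duplication.
    duplications = sum(n - 1 for n in counts.values())
    return unique_families - DUPLICATION_PENALTY * duplications


clean_bin = ["rpoB", "recA", "gyrB", "rpsC"]
mixed_bin = ["rpoB", "rpoB", "recA", "gyrB", "gyrB", "rpsC"]
print(bin_score(clean_bin))  # 4
print(bin_score(mixed_bin))  # 4 - 5 * 2 = -6
```

Under this formula a bin with two duplicated families scores well below a clean bin with the same number of unique families, so the penalty dominates quickly.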



01 Dec 15:38

github-actions

v0.3.4

afa91ba

v0.3.4

chore: place license-files under project section
chore: fix license files config
fix: make license field build-compatible
Prepare v0.3.4 release
Merge branch '2modelapproach-wip'
Lower coassembly auto barlow lambda to 0.005
Add filter-only CLI mode
Skip euk filter in single-cell mode
Set coassembly min contig length to 4096
Raise min contig length for coassemblies
Dial back single-cell k-NN default
Increase single-cell k-NN default
Fix single-cell resolution to 0.01
Lower single-cell resolution sweep to <=0.5
Cap single-cell resolution sweep at 1.0
Add mode presets and single-cell clustering behavior
Seed training for reproducibility
Adjust LR and auto barlow defaults
Raise miniprot alignment thresholds
Loosen completeness early-stop threshold to 50%
Drop resolution 3.0 from coassembly sweep
Prefer cluster count when duplications tie
Drop over-split resolutions and prioritize completeness in selection
Add completeness-driven early stop in resolution sweep
Refine resolution selection tie-breakers
Raise coassembly resolution floor to 0.6
Quiet cluster sizes to debug level
Adjust resolution sweep for coassemblies
Test higher Leiden resolution for coassemblies
Limit high-resolution testing to coassemblies
Filter singleton bins from clustering summary log
Add auto default for Barlow lambda based on sample count
Add CLI knob for Barlow Twins lambda
Refine bin splitting with Leiden and update defaults
Switch refinement to k-means
Seed training setup deterministically
Simplify duplication checks
Seed HyenaDNA predictor randomness
Refactor feature calculations
Write bins after filtering
Vectorize clustering utilities
Update CLI defaults
Fix adaptive resolution grid
Prune unused helpers in utils
Vectorize cluster contig mapping
Clean HyenaDNA model comments
Update plot features import
Remove AGENTS.md
feat: change default refinement threshold to conservative (min_dup=2)
wip: save current 2-model approach experiments
refactor: show training progress bar only in verbose mode
fix: revert test_resolutions to v0.3.0 values (3 resolutions)
revert: restore v0.3.0 fusion layer architecture
revert: restore v0.3.0 adaptive refinement strategy
fix: restore v0.3.0 coverage encoder layer sizes
refactor: revert to duplication minimization for resolution selection
feat: auto-adjust batch size when dataset is smaller
feat: add LRU cache for k-mer mapping and save-bins-before-refinement option
feat: improve reproducibility with seeded random number generation
fix: add missing losses.py module
perf: reuse cached feature tensor in SequenceDataset
refactor: remove redundant eukaryotic classification logging from clustering
refactor: remove obsolete reclustering check log message
refactor: remove unused _leiden_clustering import
fix: use hashlib for deterministic fragment generation
refactor: remove unused extract_base_contig_name import
refactor: remove unused tqdm import from output module
refactor: replace transformers tokenizer with standalone implementation
refactor: remove transformers dependency from core requirements
feat: add version number to params.json output
fix: correct BUSCO gene family parsing and lower quality thresholds
perf: add deterministic fragments and k-NN graph disk caching
fix: inline gene count extraction to avoid missing import
fix: restore batch size default to 2048
fix: enable refinement to work without -k flag
perf: cache k-NN graph in adaptive resolution + log cleanup
fix: remove logic that assigns unembedded contigs to largest cluster
perf: add graph caching helper for refinement
refactor: count only single-copy genes for completeness metric
refactor: round resolution value in Leiden log message
refactor: remove redundant resolution log message
fix: add --save-filtered-contigs to Filtering & Processing section
refactor: simplify CLI by removing parameter range restrictions
chore: remove unused dependencies (psutil, biopython, joblib)
refactor: remove 3 redundant refinement log messages
refactor: merge graph info messages into one
refactor: round resolution values in Leiden clustering message
refactor: remove 'Adaptive resolution determination' message
refactor: remove duplicate message and increase batch size
perf: optimize miniprot execution with caching
refactor: remove redundant filtered FASTA message
refactor: clean up output logging
refactor: move fragment augmentation message to debug
refactor: simplify miniprot logging message
refactor: clean up BAM processing progress display
refactor: improve BAM processing logging and progress display
feat: improve logging and add save-filtered-contigs option
feat: update HyenaDNA classifier to use improved model
refactor: improve CLI UX with positional args and optional output
fix: add run_exports to bioconda recipe template
fix: update bioconda recipe generation to fix test failures
fix: resolve deprecation warnings in workflows and package config
refactor: move calculate_raw_coverage.py to scripts directory
refactor: centralize miniprot thresholds and adjust to 0.30/0.50
chore: remove unused code from refinement and utils modules
feat: auto-adjust batch size when dataset is smaller
chore: add large data files to .gitignore
docs: prepare release 0.3.0
refactor: clean up unused adaptive strategy parameters
refactor: reduce console output verbosity
docs: add repository guidelines for development workflow
feat: add standalone HyenaDNA predictor
feat: add Barlow Twins training diagnostics
feat: add adaptive resolution and performance optimizations
Fix xgboost import and classification column names
Replace XGBoost classifier with HyenaDNA LLM-based model
Prepare 0.2.5 release
chore: drop unused setup_logging import
fix: retain best checkpoint state
fix: guard zero-read coverage normalization
chore: require python 3.9+
Always write embeddings.csv regardless of --keep-intermediate flag
Update fusion layer architecture
Release v0.2.4
Merge branch 'main' of https://github.com/danielzmbp/remag
refinement: skip bins without duplication data; remove '(conservative approach)' from warnings
Merge branch 'main' of https://github.com/danielzmbp/remag
feat: implement conservative refinement strategy to preserve completeness
Merge branch 'main' of https://github.com/danielzmbp/remag
Fix critical edge cases and test infrastructure issues
Merge branch 'main' of https://github.com/danielzmbp/remag
Eliminate code duplication in DataFrame column initialization
Fix security vulnerability and optimize performance
Full Changelog: v0.3.3...v0.3.4
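
The entry "fix: use hashlib for deterministic fragment generation" most likely addresses a known Python behaviour: the built-in `hash()` of a string is salted per process unless `PYTHONHASHSEED` is set, so seeds derived from it change between runs. The sketch below shows one way to derive a stable seed with `hashlib`. The function names, parameters, and the sampling scheme are hypothetical and are not taken from remag.

```python
import hashlib
import random


def stable_seed(contig_name, base_seed=0):
    """Derive a 32-bit seed from a contig name that is identical across runs."""
    digest = hashlib.sha256(f"{base_seed}:{contig_name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def fragment_starts(contig_name, contig_length, fragment_length, n_fragments):
    """Pick fragment start positions reproducibly for one contig."""
    rng = random.Random(stable_seed(contig_name))
    # Clamp to 0 so contigs shorter than one fragment still yield a start.
    max_start = max(contig_length - fragment_length, 0)
    return [rng.randint(0, max_start) for _ in range(n_fragments)]


first = fragment_starts("contig_1", 10_000, 1_000, 4)
second = fragment_starts("contig_1", 10_000, 1_000, 4)
print(first == second)  # True
```

Because the seed depends only on the contig name and `base_seed`, the same contig yields the same fragments in every process, which is also what makes disk caching of downstream results safe.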



04 Nov 13:06

github-actions

v0.3.3

94fd01c

v0.3.3


Full Changelog: https://github.com/danielzmbp/remag/commits/v0.3.3


30 Oct 22:47

github-actions

v0.3.2

f024908

v0.3.2

chore: bump version to 0.3.2
fix: suppress pandas FutureWarning for fillna downcasting
refactor: remove obsolete reclustering check log message
refactor: quality-aware resolution selection and improved refinement
fix: use hashlib for deterministic fragment generation
tune: adjust learning rate, coverage threshold, and refinement rounds
refactor: remove unused tqdm import from output module
refactor: replace transformers tokenizer with standalone implementation
refactor: remove transformers dependency from core requirements
feat: add version number to params.json output
fix: correct BUSCO gene family parsing and lower quality thresholds
perf: add deterministic fragments, k-NN caching, and simplified logging
refactor: improve resolution testing and fix refinement without -k
perf: cache k-NN graph in adaptive resolution + improve metrics
fix: remove logic that assigns unembedded contigs to largest cluster
perf: cache k-NN graph during refinement + extend resolution range + find most conservative solution
fix: remove minimum resolution threshold in refinement
refactor: simplify refinement with fixed resolution steps
refactor: fix k-neighbors and threshold during refinement
refactor: count only single-copy genes for completeness metric
refactor: round resolution value in Leiden log message
refactor: fix clustering parameters during resolution testing
refactor: widen auto-resolution testing range
refactor: increase refinement resolution steps for faster convergence
chore: increase default minimum bin size to 300kb
chore: change default base learning rate to 0.001
refactor: remove redundant resolution log message
refactor: improve auto-resolution metrics and refinement
fix: add --save-filtered-contigs to Filtering & Processing section
refactor: simplify CLI by removing parameter range restrictions
refactor: remove experimental SCG loss functionality
chore: remove unused dependencies (psutil, biopython, joblib)
refactor: remove 3 redundant refinement log messages
refactor: merge graph info messages into one
refactor: round resolution values in Leiden clustering message
refactor: remove 'Adaptive resolution determination' message
refactor: remove duplicate message and increase batch size
perf: optimize miniprot execution with caching
refactor: move SCG feature matrix message to debug
refactor: remove redundant filtered FASTA message
refactor: clean up output logging
refactor: move fragment augmentation message to debug
refactor: move SCG gene mappings message to debug
refactor: simplify miniprot logging message
refactor: clean up BAM processing progress display
refactor: improve BAM processing logging and progress display
feat: improve logging and add save-filtered-contigs option
feat: update HyenaDNA classifier to use improved model
refactor: improve CLI UX with positional args and optional output
refactor: move calculate_raw_coverage.py to scripts directory
feat: add SCG-aware contrastive learning with consolidated miniprot execution
fix: add run_exports to bioconda recipe template
Full Changelog: v0.3.1...v0.3.2



21 Oct 09:40

github-actions

v0.3.1

c597605

v0.3.1

chore: bump version to 0.3.1
fix: update bioconda recipe generation to fix test failures
fix: resolve deprecation warnings in workflows and package config
Full Changelog: v0.3.0...v0.3.1



21 Oct 09:04

github-actions

v0.3.0

0b37502

v0.3.0


Full Changelog: https://github.com/danielzmbp/remag/commits/v0.3.0


17 Oct 11:42

github-actions

v0.2.5

c169a3d

v0.2.5

Prepare 0.2.5 release
chore: drop unused setup_logging import
fix: retain best checkpoint state
fix: guard zero-read coverage normalization
chore: require python 3.9+
Always write embeddings.csv regardless of --keep-intermediate flag
Update fusion layer architecture
Full Changelog: v0.2.4...v0.2.5
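
The entry "fix: guard zero-read coverage normalization" describes a common failure mode: normalizing a coverage vector by its total divides by zero when no reads mapped. The release notes do not say which normalization remag uses, so the sketch below assumes a simple sum-to-one scaling. The function name and input format are hypothetical.

```python
def normalize_coverage(per_sample_coverage):
    """Scale per-sample coverage values so they sum to 1.

    Returns all zeros when no reads mapped, instead of dividing by zero.
    """
    total = sum(per_sample_coverage)
    if total == 0:
        # Guard: a contig with zero mapped reads keeps an all-zero profile.
        return [0.0] * len(per_sample_coverage)
    return [value / total for value in per_sample_coverage]


print(normalize_coverage([2.0, 6.0]))  # [0.25, 0.75]
print(normalize_coverage([0.0, 0.0]))  # [0.0, 0.0]
```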



14 Sep 16:14

github-actions

v0.2.4

e2d0a87

v0.2.4

Release v0.2.4
Merge branch 'main' of https://github.com/danielzmbp/remag
refinement: skip bins without duplication data; remove '(conservative approach)' from warnings
Merge branch 'main' of https://github.com/danielzmbp/remag
feat: implement conservative refinement strategy to preserve completeness
Merge branch 'main' of https://github.com/danielzmbp/remag
Fix critical edge cases and test infrastructure issues
Merge branch 'main' of https://github.com/danielzmbp/remag
Eliminate code duplication in DataFrame column initialization
Fix security vulnerability and optimize performance
remove comment
README: remove explicit conda install of miniprot; clarify it’s an automatic dependency
Full Changelog: v0.2.3...v0.2.4



20 Aug 09:45

github-actions

v0.2.3

2557783

v0.2.3

Prepare v0.2.3 release - Bug fixes and dependency updates
Fix undefined variable error in clustering and clean up imports
Update recipe dependencies for bioconda
Update README.md for v0.2.2 - Remove obsolete k-means references and update version examples
Full Changelog: v0.2.2...v0.2.3



14 Aug 09:33

github-actions

v0.2.2

a3df926

v0.2.2

Update CHANGELOG.md for v0.2.2 release
Prepare v0.2.2 release - Final cleanup and version bump
Fix undefined variable error in clustering
Refactor codebase for improved maintainability and reduced complexity
clean up redundant code and comments
update dependencies for bioconda
remove small euk db
add args parameter to _construct_knn_graph function
add --outs=0.95 parameter to miniprot command
remove unnecessary noise handling in leiden clustering
limit bins.csv to contig and cluster columns
fix: remove duplicate v prefix in Zenodo title
Full Changelog: v0.2.1...v0.2.2


