Releases: ilya16/SyMuPe
SyMuPe v1.1.0
This is a major update to the SyMuPe library, introducing a unified Inference API for generation, classification, and embedding models, the official release of the Refined Alignment for Scores and Performances (RAScoP) pipeline, and significant architectural refactoring for better readability and maintainability.
This release implements the models and algorithms reported in the article:
"PianoCoRe: Combined and Refined Piano MIDI Dataset" (TISMIR / arXiv / GitHub)
Highlights
Unified Inference API (#7)
Starting with v1.1.0, SyMuPe supports a unified inference API. All models are now grouped into three task-specific categories:
- Generators (e.g. PerformanceGenerator)
- Classifiers (e.g. MusicClassifier)
- Embedders (e.g. MusicEmbedder)
The models can be loaded using the corresponding AutoFactory classes (AutoGenerator, AutoClassifier, or AutoEmbedder). By default, an interactive progress bar (controlled via show_progress) is now displayed for all inference tasks.
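For readers unfamiliar with the auto-factory pattern, the sketch below shows one way such a class could map a checkpoint's declared model type to a task-specific wrapper. It is a self-contained illustration and not SyMuPe's implementation: GENERATOR_REGISTRY, register_generator, ToyFlowGenerator, ToyAutoGenerator, and the config dictionary are all hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a model type name to a wrapper class.
GENERATOR_REGISTRY: Dict[str, type] = {}


def register_generator(model_type: str) -> Callable[[type], type]:
    """Class decorator that records a wrapper under a model type name."""
    def decorator(cls: type) -> type:
        GENERATOR_REGISTRY[model_type] = cls
        return cls
    return decorator


@register_generator("cfm")
class ToyFlowGenerator:
    """Stand-in for a task-specific wrapper (not a SyMuPe class)."""

    def __init__(self, config: dict, show_progress: bool = True):
        self.config = config
        self.show_progress = show_progress


class ToyAutoGenerator:
    """Chooses the wrapper class from the config so the caller does not have to."""

    @staticmethod
    def from_config(config: dict, **kwargs):
        model_type = config["model_type"]
        try:
            wrapper_cls = GENERATOR_REGISTRY[model_type]
        except KeyError:
            raise ValueError(f"Unknown model type: {model_type!r}") from None
        return wrapper_cls(config, **kwargs)


generator = ToyAutoGenerator.from_config({"model_type": "cfm"}, show_progress=False)
```

The caller names only the factory and the checkpoint; the concrete wrapper class is an internal detail, which is what makes a single entry point per task possible.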
All trained models are available and documented on the Hugging Face Hub.
Simplified Initialization
Inference modules can now be built without manually managing models and tokenizers. Furthermore, methods now accept file paths directly.
v1.0.0:
from symusic import Score
from symupe.data.tokenizers import SyMuPe
from symupe.inference import AutoGenerator, perform_score, save_performances
from symupe.models import AutoModel
model = AutoModel.from_pretrained("SyMuPe/PianoFlow-base").to(device)
tokenizer = SyMuPe.from_pretrained("SyMuPe/PianoFlow-base")
generator = AutoGenerator.from_model(model, tokenizer, device=device)
score_midi = Score("score.mid")
gen_results = perform_score(generator, score_midi)
v1.1.0:
from symupe import AutoGenerator
generator = AutoGenerator.from_pretrained("SyMuPe/PianoFlow-base", device=device)
gen_results = generator.perform_score("score.mid")
New Models
The SyMuPe HF Hub includes two new models presented in the "PianoCoRe: Combined and Refined Piano MIDI Dataset" article:
- SyMuPe/MIDI-Quality-Classifier: a model trained to automatically classify MIDI files into four quality categories: score (inexpressive), high quality, low quality, and corrupted.
- SyMuPe/Aria-MIDI-MLM: a 12-layer Transformer encoder designed for symbolic piano music feature extraction and trained on the deduplicated subset of the Aria-MIDI dataset.
The Quick Start code is available on the model pages.
RAScoP: Refined Alignment for Scores and Performances (#5)
This release introduces the documented implementation of the RAScoP pipeline, the engine used to create the note-aligned score and performance MIDI files in the PERiScoPe and PianoCoRe datasets. It provides a four-stage refinement process:
- (H): Alignment Hole Processing
- (O): Onset Cleaning and Temporal Refinement
- (I): Note Interpolation
- (S): Performance-to-Score Synchronization
The algorithm is described in the article.
Enhanced Data Processing (#3, #6, #8)
The documentation and reliability of data-related modules (symupe.data.alignments, symupe.data.tokenizers, symupe.data.midi, and symupe.data.partitura) were significantly improved:
- Compressed Alignments: Added support for .npz compressed Alignment storage
- Tokenizer Aliases: Added AutoTokenizer and SyMuPeTokenizer aliases for MusicTokenizer and SyMuPe for a more idiomatic developer experience
- Bug Fixes:
  - Fixed a critical bug in cut_overlapping_notes regarding previous note processing
  - Fixed incorrect handling of the last note in Alignment.match_with_midi
  - Resolved duplicate pairs produced by ParangonarAligner
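To make the overlap fix easier to picture, here is a minimal sketch of the kind of cleanup an overlap-cutting step performs. It is not the code of cut_overlapping_notes: the Note class and trim_same_pitch_overlaps are hypothetical, and the sketch assumes that an overlap means a note still sounding when the next note of the same pitch begins.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Note:
    pitch: int
    start: float  # onset time in seconds
    end: float    # offset time in seconds


def trim_same_pitch_overlaps(notes: List[Note]) -> List[Note]:
    """Shorten a note when the next note of the same pitch starts before it ends."""
    ordered = sorted(notes, key=lambda n: (n.start, n.pitch))
    last_index: Dict[int, int] = {}  # pitch -> index of the previous note in `result`
    result: List[Note] = []
    for note in ordered:
        prev_idx = last_index.get(note.pitch)
        if prev_idx is not None and result[prev_idx].end > note.start:
            # Cut the previous note at the onset of the current one.
            result[prev_idx] = replace(result[prev_idx], end=note.start)
        last_index[note.pitch] = len(result)
        result.append(note)
    return result


notes = [Note(60, 0.0, 1.5), Note(60, 1.0, 2.0), Note(64, 0.5, 3.0)]
cleaned = trim_same_pitch_overlaps(notes)
```

The previous note must be tracked per pitch: the note at pitch 64 sits between the two notes at pitch 60 in onset order, yet it must not affect how the first one is trimmed. The sketch does not treat notes of the same pitch with identical onsets specially.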
Code Quality & Styling (#4)
The entire codebase is now linted and formatted using ruff. Documentation and type annotations have been standardized across most of the modules.
Breaking Changes
- Base inference classes moved from symupe.models.base to symupe.inference.base (#7)
- Across all inference methods, disable_tqdm has been replaced with show_progress (with inverted logic) for consistency (#7)
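Downstream code that still passes disable_tqdm needs to flip the value when migrating. The helper below is a hypothetical shim for calling code and is not part of SyMuPe; it only illustrates the inverted logic and lets an explicitly passed show_progress take precedence.

```python
import warnings


def resolve_show_progress(show_progress=None, disable_tqdm=None, default=True):
    """Return the effective show_progress value, accepting the old flag."""
    if disable_tqdm is not None:
        warnings.warn(
            "disable_tqdm is replaced by show_progress (inverted logic)",
            DeprecationWarning,
            stacklevel=2,
        )
        if show_progress is None:
            # Inverted logic: disabling the bar means not showing progress.
            show_progress = not disable_tqdm
    return default if show_progress is None else show_progress
```

With this shim, an old call site using disable_tqdm=True resolves to show_progress=False, and omitting both flags keeps the progress bar enabled, matching the default described in these notes.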
Deprecations
- Standalone functions perform_score and save_performances are deprecated and will be removed in the next release (#7). Use the generator methods instead:
  # Old
  gen_results = perform_score(generator, score)
  save_performances(gen_results, out_dir="samples")
  # New
  gen_results = generator.perform_score(score)
  generator.save_performances(gen_results, out_dir="samples")
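A common way to keep such standalone functions working for one more release is to turn them into thin wrappers that warn and then delegate to the method. The sketch below shows that pattern in isolation; deprecated_in_favor_of and ToyGenerator are hypothetical, and how SyMuPe implements its own deprecation is not stated in these notes.

```python
import functools
import warnings


def deprecated_in_favor_of(method_name: str):
    """Turn a standalone function into a warning shim that calls obj.<method_name>."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(obj, *args, **kwargs):
            warnings.warn(
                f"{func.__name__}() is deprecated; call .{method_name}() instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return getattr(obj, method_name)(*args, **kwargs)
        return wrapper
    return decorator


class ToyGenerator:
    """Stand-in object exposing the new method-based API."""

    def perform_score(self, score, num_samples=1):
        return [f"{score}#{i}" for i in range(num_samples)]


@deprecated_in_favor_of("perform_score")
def perform_score(generator, score, num_samples=1):
    """Deprecated standalone entry point; the decorator delegates to the method."""
```

Calling perform_score(ToyGenerator(), "score.mid") emits a DeprecationWarning and returns the same result as the method call, so old and new call sites behave identically during the transition.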
Links
- PianoCoRe Paper: TISMIR / arXiv
- Models: huggingface.co/SyMuPe
- PyPI: pypi.org/project/symupe
Full Changelog: v1.0.0...v1.1.0
SyMuPe v1.0.0
This is the first public release of SyMuPe, a framework for Symbolic Music Performance modeling!
This release corresponds to the models and results reported in the paper:
"SyMuPe: Affective and Controllable Symbolic Music Performance" (ACM DL / arXiv)
Installation
SyMuPe can be installed directly from PyPI:
pip install symupe
Highlights
SyMuPe Tokenizer
A universal configurable tokenizer for score and performance MIDI, supporting time-only and score-aligned performance encodings. The tokenizer outputs both real-valued features and discrete tokens, which can be used for regression and categorical prediction of score and performance features.
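The dual output can be pictured as follows: each feature keeps its real value, usable as a regression target, and also receives a bin index, usable as a class label. The sketch assumes equal-width bins over a fixed range; make_bin_edges, encode_feature, and the velocity range are hypothetical, and the actual SyMuPe vocabularies may be built differently.

```python
from bisect import bisect_right
from typing import List, Tuple


def make_bin_edges(low: float, high: float, num_bins: int) -> List[float]:
    """Interior edges of `num_bins` equal-width bins spanning [low, high]."""
    step = (high - low) / num_bins
    return [low + step * i for i in range(1, num_bins)]


def encode_feature(value: float, edges: List[float]) -> Tuple[float, int]:
    """Return the raw value together with the index of the bin it falls into."""
    return value, bisect_right(edges, value)


# 32 bins of width 4 over the MIDI velocity range (illustrative choice).
velocity_edges = make_bin_edges(0.0, 128.0, num_bins=32)
real_value, token = encode_feature(64.0, velocity_edges)
```

A model head trained with a regression loss would consume real_value, while a categorical head would predict token.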
SyMuPe Models
The package enables the construction of encoder-only, decoder-only, and encoder-decoder transformer-based models for unconditional and conditional expressive music performance generation. The supported generative modeling paradigms include:
- Causal Language Modeling (CLM)
- Masked Language Modeling (MLM)
- Conditional Flow Matching (CFM)
- Discrete Flow Matching (DFM)
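Of these paradigms, flow matching may be the least familiar. The sketch below illustrates one common form of CFM, the straight-line path between a noise sample and a data sample, where the regression target for the velocity field is their difference and sampling integrates that field from t=0 to t=1. The exact path and parameterization used by the SyMuPe models are not given in these notes, so this is an assumption; all function names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def interpolate(x0: Vector, x1: Vector, t: float) -> Vector:
    """Point on the straight path from x0 (noise) to x1 (data) at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: Vector, x1: Vector) -> Vector:
    """Regression target for the velocity field on the straight path: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def euler_sample(
    x0: Vector, velocity_fn: Callable[[Vector, float], Vector], num_steps: int
) -> Vector:
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x, dt = list(x0), 1.0 / num_steps
    for step in range(num_steps):
        v = velocity_fn(x, step * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


noise, data = [0.0, 1.0], [2.0, -1.0]
# Stand-in for a trained network: returns the exact conditional target.
sample = euler_sample(noise, lambda x, t: target_velocity(noise, data), num_steps=10)
```

In training, a network would be fit to predict target_velocity at points produced by interpolate; here the exact target replaces the network, so integrating from the noise sample lands on the data sample up to floating-point error.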
The v1.0.0 release provides three score-only models described in the paper and available on the Hugging Face Hub:
- PianoFlow-base: Flagship model based on CFM for high-fidelity piano performance rendering
- EncDec-base: Encoder-decoder token-based baseline trained using the CLM objective for the decoder
- MLM-base: Fast single-step token-based baseline trained using the MLM objective
The models were trained on the PERiScoPe (Piano Expression Refined Score and Performance MIDI) dataset, also available on Hugging Face.
Inference Generators
Models are accompanied by generator wrappers that allow expressive performance rendering in just a few lines:
from symusic import Score
from symupe.data.tokenizers import SyMuPe
from symupe.inference import AutoGenerator, perform_score, save_performances
from symupe.models import AutoModel
model = AutoModel.from_pretrained("SyMuPe/PianoFlow-base")
tokenizer = SyMuPe.from_pretrained("SyMuPe/PianoFlow-base")
generator = AutoGenerator.from_model(model, tokenizer=tokenizer)
score_midi = Score("score.mid")
gen_results = perform_score(generator, score=score_midi, num_samples=1)
save_performances(gen_results, out_dir="samples")
Package Structure
The SyMuPe package is a collection of classes and functions for symbolic music performance analysis and modeling:
- symupe.data:
  - alignments: score-performance alignment tools based on Nakamura's AlignmentTool and Parangonar
  - midi: MIDI processing utilities, using symusic for fast MIDI processing
  - tokenizers: OctupleM (a modified OctupleMIDI), SPMuple, and SyMuPe tokenizers built using MidiTok
  - datasets: token sequence datasets for training
  - collators: multi-mask and multi-task batch preparation
- symupe.modules:
  - model building blocks
  - training metrics
  - sampling utilities
- symupe.models:
  - base Model class and AutoModel
  - MusicTransformer (used for MLM baseline)
  - CFMMusicTransformer (PianoFlow)
  - Seq2SeqMusicTransformer (used for EncDec baseline)
  - ScorePerformer
  - Classifier models
- symupe.experiments:
  - universal Trainer
  - training utilities (TrainerConfig, optimizers, callbacks)
- symupe.inference:
  - AutoGenerator
  - performance rendering utilities
- symupe.utils:
  - unified Config class
  - I/O functions
  - Python and PyTorch helper functions
Links
- Paper: arXiv:2511.03425
- PyPI: pypi.org/project/symupe
- Models: huggingface.co/SyMuPe
Feedback and contributions are welcome!