Releases: ilya16/SyMuPe

SyMuPe v1.1.0

08 May 17:01
13cc57d

This is a major update to the SyMuPe library, introducing a unified Inference API for generation, classification, and embedding models, the official release of the Refined Alignment for Scores and Performances (RAScoP) pipeline, and significant architectural refactoring for better readability and maintainability.

This release implements the models and algorithms reported in the article:
"PianoCoRe: Combined and Refined Piano MIDI Dataset" (TISMIR / arXiv / GitHub)

Highlights

Unified Inference API (#7)

Starting with v1.1.0, SyMuPe supports a unified inference API. All models are now grouped into three task-specific categories:

  1. Generators (e.g. PerformanceGenerator)
  2. Classifiers (e.g. MusicClassifier)
  3. Embedders (e.g. MusicEmbedder)

The models can be loaded using the corresponding AutoFactory classes (AutoGenerator, AutoClassifier, or AutoEmbedder). By default, an interactive progress bar (controlled via show_progress) is now displayed for all inference tasks.

All trained models are available and documented on the Hugging Face Hub.

Simplified Initialization

Inference modules can now be built without manually managing models and tokenizers. Furthermore, methods now accept file paths directly.

v1.0.0:

from symusic import Score
from symupe.data.tokenizers import SyMuPe
from symupe.inference import AutoGenerator, perform_score, save_performances
from symupe.models import AutoModel

model = AutoModel.from_pretrained("SyMuPe/PianoFlow-base").to(device)
tokenizer = SyMuPe.from_pretrained("SyMuPe/PianoFlow-base")
generator = AutoGenerator.from_model(model, tokenizer, device=device)

score_midi = Score("score.mid")
gen_results = perform_score(generator, score=score_midi)

v1.1.0:

from symupe import AutoGenerator

generator = AutoGenerator.from_pretrained("SyMuPe/PianoFlow-base", device=device)

gen_results = generator.perform_score("score.mid")

New Models

The SyMuPe HF Hub includes two new models presented in the "PianoCoRe: Combined and Refined Piano MIDI Dataset" article:

  • SyMuPe/MIDI-Quality-Classifier: a model trained to automatically classify MIDI files into four quality categories: score (inexpressive), high quality, low quality, and corrupted.
  • SyMuPe/Aria-MIDI-MLM: a 12-layer Transformer encoder designed for symbolic piano music feature extraction and trained on the deduplicated subset of the Aria-MIDI dataset.

The Quick Start code is available on the model pages.

RAScoP: Refined Alignment for Scores and Performances (#5)

This release introduces the documented implementation of the RAScoP pipeline, the engine used to create the note-aligned score and performance MIDI files in the PERiScoPe and PianoCoRe datasets. It provides a four-stage refinement process:

  1. (H): Alignment Hole Processing
  2. (O): Onset Cleaning and Temporal Refinement
  3. (I): Note Interpolation
  4. (S): Performance-to-Score Synchronization

The algorithm is described in the article.
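
To give a feel for stage (I), the sketch below shows one generic way to estimate performance onsets for score notes that have no aligned performance note: linear interpolation between the nearest matched neighbours. This is not the RAScoP implementation. The function name interpolate_onsets and the data layout are assumptions made for illustration, and the article remains the reference for the actual algorithm.

```python
from __future__ import annotations

from bisect import bisect_left
from typing import Optional, Sequence


def interpolate_onsets(
    score_onsets: Sequence[float],
    perf_onsets: Sequence[Optional[float]],
) -> list[Optional[float]]:
    """Fill missing performance onsets by linear interpolation (illustrative only).

    score_onsets: note onsets in score time (e.g. beats), sorted ascending.
    perf_onsets:  aligned performance onsets in seconds, None where unmatched.
    Notes before the first or after the last matched note stay None.
    """
    matched = [i for i, p in enumerate(perf_onsets) if p is not None]
    filled = list(perf_onsets)
    for i, p in enumerate(perf_onsets):
        if p is not None:
            continue
        k = bisect_left(matched, i)
        if k == 0 or k == len(matched):
            continue  # no matched neighbour on one side, nothing to interpolate
        lo, hi = matched[k - 1], matched[k]
        span = score_onsets[hi] - score_onsets[lo]
        if span == 0:
            # Neighbours share a score position (e.g. a chord): reuse the onset.
            filled[i] = perf_onsets[lo]
        else:
            frac = (score_onsets[i] - score_onsets[lo]) / span
            filled[i] = perf_onsets[lo] + frac * (perf_onsets[hi] - perf_onsets[lo])
    return filled
```

For example, with score onsets [0, 1, 2] and performance onsets [0.0, None, 1.0], the middle note is placed halfway between its neighbours.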

Enhanced Data Processing (#3, #6, #8)

The documentation and reliability of data-related modules (symupe.data.alignments, symupe.data.tokenizers, symupe.data.midi, and symupe.data.partitura) were significantly improved:

  • Compressed Alignments: Added support for .npz compressed Alignment storage
  • Tokenizer Aliases: Added AutoTokenizer and SyMuPeTokenizer as aliases for MusicTokenizer and SyMuPe, respectively, for a more idiomatic developer experience
  • Bug Fixes:
    • Fixed a critical bug in cut_overlapping_notes regarding previous note processing
    • Fixed incorrect handling of the last note in Alignment.match_with_midi
    • Resolved duplicate pairs produced by ParangonarAligner
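
As a rough illustration of what compressed .npz storage involves, the sketch below round-trips two small alignment-like index arrays through NumPy's np.savez_compressed and np.load. The helper names (save_pairs, load_pairs) and the array names (score_idx, perf_idx) are hypothetical and do not reflect the fields that SyMuPe's Alignment class actually writes.

```python
import os
import tempfile

import numpy as np


def save_pairs(path, score_idx, perf_idx):
    """Write two index arrays into a single compressed .npz archive."""
    np.savez_compressed(path, score_idx=score_idx, perf_idx=perf_idx)


def load_pairs(path):
    """Read the two index arrays back from the archive."""
    with np.load(path) as data:
        return data["score_idx"], data["perf_idx"]


# Round trip through a temporary file (hypothetical toy data).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "alignment.npz")
    save_pairs(path, np.array([0, 1, 2]), np.array([0, 2, 3]))
    restored_score, restored_perf = load_pairs(path)
```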

Code Quality & Styling (#4)

The entire codebase is now linted and formatted using ruff. Documentation and type annotations have been standardized across most modules.

Breaking Changes

  • Base inference classes moved from symupe.models.base to symupe.inference.base (#7)
  • Across all inference methods, disable_tqdm has been replaced with show_progress (with inverted logic) for consistency (#7)
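
For callers migrating existing code, the inverted logic means that disable_tqdm=True becomes show_progress=False. A small compatibility helper along the following lines can translate old keyword arguments before they are passed on. migrate_progress_kwargs is a hypothetical name and is not part of SyMuPe.

```python
def migrate_progress_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with disable_tqdm translated to show_progress."""
    new_kwargs = dict(kwargs)
    if "disable_tqdm" in new_kwargs:
        disable = new_kwargs.pop("disable_tqdm")
        # An explicit show_progress wins over the legacy flag.
        new_kwargs.setdefault("show_progress", not disable)
    return new_kwargs
```

With this helper, legacy options such as {"disable_tqdm": True} are rewritten to {"show_progress": False}, and any other keys pass through unchanged.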

Deprecations

  • Standalone functions perform_score and save_performances are deprecated and will be removed in the next release (#7). Use the generator methods instead:
    # Old
    gen_results = perform_score(generator, score)
    save_performances(gen_results, out_dir="samples")
    
    # New
    gen_results = generator.perform_score(score)
    generator.save_performances(gen_results, out_dir="samples")

Links

Full Changelog: v1.0.0...v1.1.0

SyMuPe v1.0.0

16 Mar 20:14
4f0bb98

This is the first public release of SyMuPe, a framework for Symbolic Music Performance modeling!

This release corresponds to the models and results reported in the paper:
"SyMuPe: Affective and Controllable Symbolic Music Performance" (ACM DL / arXiv)

Installation

SyMuPe can be installed directly from PyPI:

pip install symupe

Highlights

SyMuPe Tokenizer

A universal configurable tokenizer for score and performance MIDI, supporting time-only and score-aligned performance encodings. The tokenizer outputs both real-valued features and discrete tokens, which can be used for regression and categorical prediction of score and performance features.
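
The pairing of real-valued features with discrete tokens can be pictured as binning: the continuous value is kept for regression, while its bin index serves as a categorical token. The sketch below assumes a hypothetical set of velocity bin edges and helpers named to_token and encode. The actual SyMuPe vocabulary and binning scheme may differ.

```python
from bisect import bisect_right

# Hypothetical bin edges for MIDI velocity (0-127): 8 bins of width 16.
VELOCITY_EDGES = [16, 32, 48, 64, 80, 96, 112]


def to_token(value, edges=VELOCITY_EDGES):
    """Map a real value to the index of the bin it falls into."""
    return bisect_right(edges, value)


def encode(values):
    """Return (real-valued features, discrete tokens) for a sequence of values."""
    return list(values), [to_token(v) for v in values]
```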

SyMuPe Models

The package enables the construction of encoder-only, decoder-only, and encoder-decoder transformer-based models for unconditional and conditional expressive music performance generation. The supported generative modeling paradigms include:

  1. Causal Language Modeling (CLM)
  2. Masked Language Modeling (MLM)
  3. Conditional Flow Matching (CFM)
  4. Discrete Flow Matching (DFM)
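
Of these paradigms, Conditional Flow Matching may be the least familiar. In a commonly used variant with a straight-line probability path, a training example is built by interpolating between a noise sample x0 and a data sample x1 at a time t in [0, 1], and the model regresses the constant velocity x1 - x0. The sketch below builds such a training pair in plain Python. It illustrates the general technique under that linear-path assumption and is not taken from the SyMuPe code, which may use a different path or parameterization.

```python
def cfm_training_pair(x0, x1, t):
    """Return (x_t, target_velocity) for the path x_t = (1 - t) * x0 + t * x1."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]  # d(x_t)/dt along the straight path
    return x_t, velocity


def cfm_loss(predicted, target):
    """Mean squared error between predicted and target velocities."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)
```

In training, a model would receive x_t and t (plus any conditioning, such as score features) and its output would be compared with the target velocity using a loss like cfm_loss.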

The v1.0.0 release provides three score-only models described in the paper and available on the Hugging Face Hub:

  1. PianoFlow-base: Flagship model based on CFM for high-fidelity piano performance rendering
  2. EncDec-base: Encoder-decoder token-based baseline trained using the CLM objective for the decoder
  3. MLM-base: Fast single-step token-based baseline trained using the MLM objective

The models were trained on the PERiScoPe (Piano Expression Refined Score and Performance MIDI) dataset, also available on Hugging Face.

Inference Generators

Models are accompanied by generator wrappers that allow expressive performance rendering in just a few lines:

from symusic import Score
from symupe.data.tokenizers import SyMuPe
from symupe.inference import AutoGenerator, perform_score, save_performances
from symupe.models import AutoModel

model = AutoModel.from_pretrained("SyMuPe/PianoFlow-base")
tokenizer = SyMuPe.from_pretrained("SyMuPe/PianoFlow-base")
generator = AutoGenerator.from_model(model, tokenizer=tokenizer)

score_midi = Score("score.mid")
gen_results = perform_score(generator, score=score_midi, num_samples=1)

save_performances(gen_results, out_dir="samples")

Package Structure

The SyMuPe package is a collection of classes and functions for symbolic music performance analysis and modeling:

  • symupe.data:
    • alignments: score-performance alignment tools based on Nakamura's AlignmentTool and Parangonar
    • midi: MIDI processing utilities, using symusic for fast MIDI processing
    • tokenizers: OctupleM (a modified OctupleMIDI), SPMuple, and SyMuPe tokenizers built using MidiTok
    • datasets: token sequence datasets for training
    • collators: multi-mask and multi-task batch preparation
  • symupe.modules:
    • model building blocks
    • training metrics
    • sampling utilities
  • symupe.models:
    • base Model class and AutoModel
    • MusicTransformer (used for MLM baseline)
    • CFMMusicTransformer (PianoFlow)
    • Seq2SeqMusicTransformer (used for EncDec baseline)
    • ScorePerformer
    • Classifier models
  • symupe.experiments:
    • universal Trainer
    • training utilities (TrainerConfig, optimizers, callbacks)
  • symupe.inference:
    • AutoGenerator
    • performance rendering utilities
  • symupe.utils:
    • unified Config class
    • I/O functions
    • Python and PyTorch helper functions

Links

Feedback and contributions are welcome!