Welcome to the official repo for the ISMIR 2025 paper *Versatile Symbolic Music-for-Music Modeling via Function Alignment*!
Music-for-music modeling means representing both the input and output sequences (including labels such as chords, beats, textures, keys, and structures) in the music modality itself. This unifies music understanding (music -> labels) and conditional generation (labels -> music) into the same form (music -> music).
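As a toy illustration of this unification (all token names below are invented for the sketch and are not the repo's actual vocabulary):

```python
# Toy sketch: once labels are rendered as music-like token sequences,
# understanding and generation become the same sequence-to-sequence task.
melody = ["C4_on", "hold", "E4_on", "hold", "G4_on"]  # music tokens
chords = ["Cmaj_on", "hold", "hold", "hold", "hold"]  # labels in the same token form

understanding_pair = (melody, chords)  # music -> labels
generation_pair = (chords, melody)     # labels -> music
# Both pairs share the form (music tokens -> music tokens),
# so one model family covers both directions.
```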
Function alignment is a new theory of mind that attributes human-like intelligence to the dynamic synergy among interacting agents (i.e., language models).
We use function alignment to model music-for-music tasks in a unified manner. This includes:
- Melody to chord
- Chord to melody
- Drums to others
- Others to drums
- MIDI chord & metrical structure analysis
All these tasks share (1) the same foundation model and (2) the same adapter architecture, but are fine-tuned on different datasets.
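As a rough sketch of what this sharing looks like on disk (the loading code below assumes standard PyTorch/Lightning checkpoint files and is not the repo's actual API; see the inference scripts for the real loading logic):

```python
# Minimal sketch: one shared foundation checkpoint plus a small
# task-specific adapter checkpoint per task. The paths are the real ones
# from this README; everything else is an assumption for illustration.
import torch

FOUNDATION = "ckpt/cp_transformer_v0.42_size1_batch_48_schedule.epoch=00.fin.ckpt"
ADAPTER = ("ckpt/mel_to_chord/cp_transformer_yinyang_v5.1_lora_batch_8_"
           "nottingham_cp8_v2_chord_mel_rev_mask0.0-10-step1.epoch=last.ckpt")

# weights_only=False is needed for Lightning-style checkpoints on recent
# PyTorch (torch.load defaults to weights_only=True since 2.6).
foundation = torch.load(FOUNDATION, map_location="cpu", weights_only=False)
adapter = torch.load(ADAPTER, map_location="cpu", weights_only=False)

# Top-level keys of each checkpoint: the foundation weights are shared by
# every task, while each task contributes only its own adapter weights.
print(list(foundation.keys()))
print(list(adapter.keys()))
```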
All pretrained models are available on Google Drive.
It contains the following files:
- `cp_transformer_v0.42_size1_batch_48_schedule.epoch=00.fin.ckpt`: the foundation model (Roformer) trained on 16th-note quantized MIDI sequences.
- `mel_to_chord`: models for melody-to-chord generation.
- `chord_to_mel`: models for chord-to-melody generation.
- `drum_to_others`: models for drums-to-others generation.
- `others_to_drum`: models for others-to-drums generation.
- `midi_analysis`: models for MIDI analysis.
Download the models and put them in the `ckpt/` folder. Keep the subfolder structure.
You must download the foundation model to use other models.
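The expected layout looks like this (folder names from the list above; the individual adapter checkpoint filenames inside each subfolder are omitted here):

```
ckpt/
├── cp_transformer_v0.42_size1_batch_48_schedule.epoch=00.fin.ckpt
├── mel_to_chord/
├── chord_to_mel/
├── drum_to_others/
├── others_to_drum/
└── midi_analysis/
```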
- Download the pretrained models as mentioned above.
- Clone the repository and install the required packages:
```
conda create -n function-alignment python=3.13.2
conda activate function-alignment
pip install torch==2.7.1 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt
```

To use a downstream model, call the `cp_transformer_yinyang_inference.py` script. For example:
```
python cp_transformer_yinyang_inference.py ckpt/mel_to_chord/cp_transformer_yinyang_v5.1_lora_batch_8_nottingham_cp8_v2_chord_mel_rev_mask0.0-10-step1.epoch=last.ckpt
```

For the best results, use the models with `prompt_lora` in their names (our self-attentive adapters); `yinyang` refers to the cross-attention adapters. The other models are baselines.
The examples (input MIDI files and configs) for each model are hardcoded in the `cp_transformer_yinyang_inference.py` file; to run on your own inputs, edit those paths in the script.
The foundation model is a Roformer model trained on 16th-note quantized MIDI sequences.
To use the pretrained model for inference (e.g., continuation), you can directly run the following command:
```
python cp_transformer_inference.py ckpt/cp_transformer_v0.42_size1_batch_48_schedule.epoch=00.fin.ckpt
```

You may use your own MIDI files; just ensure the beats (deduced from tempo changes) are correct in the MIDI file.
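If you want to sanity-check a file first, a small script along these lines can help (it uses the third-party `pretty_midi` package, which is an assumption here and not necessarily a dependency of this repo):

```python
# Sketch: inspect the beat grid that pretty_midi derives from the file's
# tempo events before feeding the MIDI to the model.
# Assumes: pip install pretty_midi
import pretty_midi

pm = pretty_midi.PrettyMIDI("my_song.mid")  # replace with your file

change_times, tempi = pm.get_tempo_changes()
print("tempo changes (time, bpm):", list(zip(change_times, tempi)))

beats = pm.get_beats()
print("first beat times (s):", beats[:8])
# If these beat times do not line up with the actual music, fix the
# tempo events in a MIDI editor before running inference.
```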
One piece of future work, unsupervised MIDI chord and key estimation, is also open-sourced on GitHub.
That work uses the same foundation model as this repo, but is fine-tuned in a fully unsupervised way (with pseudo-labels).
Though it does not use function alignment, it is still a music-for-music modeling task and could be implemented via function alignment in the future.
We are currently working on function alignment for more tasks and cross-modality alignment, as well as a better foundation model for symbolic music.
If you are interested in helping us, please feel free to contact us at [email protected].