This is the codebase of the paper titled TransPlayer: Timbre Style Transfer with Flexible Timbre Control (ICASSP 2023).
Click me to listen to the demos.
- ./dataset: To train with your own data, you first need to preprocess the data using the scripts under ./dataset.
- ./dataset/crop_data.py: Crop your .wav files into chunks. By default, every chunk is 200 seconds long.
- ./dataset/preprocess_cqt.py: Extract 84-bin CQT features from the audio chunks (a minimal sketch follows below).
- ./dataset/stereo2mono.py: Only needed if your .wav files are stereo; training expects mono audio.
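
As a rough illustration of what the preprocessing produces, here is a minimal sketch (not the actual scripts): load a chunk as mono, compute an 84-bin CQT, and save its magnitude as a .npy next to the audio. The sample rate, hop length, and file naming below are assumptions; use whatever ./dataset/preprocess_cqt.py actually does.

```python
import librosa
import numpy as np

# Illustrative preprocessing sketch; sample rate, hop length, and naming are
# assumptions, not the values used by the scripts in ./dataset.
wav_path = "chunk_0000.wav"                           # a cropped chunk from crop_data.py
y, sr = librosa.load(wav_path, sr=16000, mono=True)   # mono, as stereo2mono.py ensures

# 84 bins at 12 bins per octave covers 7 octaves, the usual librosa CQT setup.
cqt = librosa.cqt(y, sr=sr, hop_length=256, n_bins=84, bins_per_octave=12)
np.save(wav_path.replace(".wav", ".npy"), np.abs(cqt).astype(np.float32))  # (84, frames)
```
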
- ./autoencoder: Directory containing the training code.
- How to train:

  ```
  cd autoencoder
  python -m train [--data_dir xxxx/directory_of_your_wav_and_npy_files] [--save_dir yyyy/directory_to_save_your_model]
  ```

- How to perform inference:

  ```
  cd autoencoder
  python -m inference [--feature_path xxxx/path_to_the_original_cqt.npy] [--org "instrument_name_A"] [--trg "instrument_name_B"] [--cp_path yyyy/path_to_your_model]
  ```
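
Before running inference, it can help to sanity-check that the file passed to --feature_path is really the CQT array saved during preprocessing. This snippet is only an illustrative check, assuming the 84 bins sit on the first axis as in the sketch above:

```python
import numpy as np

# Illustrative check only: the --feature_path file should be an 84-bin CQT;
# the (bins, frames) axis order is an assumption from the sketch above.
feature = np.load("xxxx/path_to_the_original_cqt.npy")
print(feature.shape)
assert feature.shape[0] == 84, "expected an 84-bin CQT along the first axis"
```
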
- How to generate the waveform:
- Diffwave implementation (can't redistribute it here for license reasons)
- To train DiffWave on CQT features, you will probably need to tweak the code a bit, in particular the dataset loading (there is no need to call torchaudio in every iteration; it does not provide a constant-Q transform anyway, so just load the precomputed CQT, as in the sketch below) and the input data parameters (the implementation is meant for Mel-spectrograms, so make sure the parameters match your CQT).
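
As one way to read that suggestion, here is a minimal sketch of a dataset that pairs each cropped .wav chunk with its precomputed CQT .npy instead of recomputing a spectrogram per iteration. The class name, file naming convention, hop length, and segment size are all illustrative assumptions, not the DiffWave implementation's actual API; its conditioning network would also need to accept 84 input bins rather than the Mel-band count it was written for.

```python
import glob
import os
import random

import numpy as np
import soundfile as sf
import torch
from torch.utils.data import Dataset


class CQTVocoderDataset(Dataset):
    """Sketch of a vocoder training set: each xxx.wav has a precomputed xxx.npy
    CQT of shape (84, frames), so no spectrogram is computed per iteration."""

    def __init__(self, data_dir, hop_length=256, segment_frames=62):
        self.wav_paths = sorted(glob.glob(os.path.join(data_dir, "*.wav")))
        self.hop_length = hop_length          # must match the hop used when saving the CQT
        self.segment_frames = segment_frames  # CQT frames per training example

    def __len__(self):
        return len(self.wav_paths)

    def __getitem__(self, idx):
        wav_path = self.wav_paths[idx]
        audio, _ = sf.read(wav_path, dtype="float32")           # (samples,), mono
        cqt = np.load(os.path.splitext(wav_path)[0] + ".npy")   # (84, frames), precomputed

        # Pick a random aligned segment: segment_frames CQT frames and the
        # corresponding hop_length * segment_frames audio samples.
        start = random.randint(0, cqt.shape[1] - self.segment_frames)
        cqt_seg = cqt[:, start:start + self.segment_frames]
        audio_seg = audio[start * self.hop_length:
                          (start + self.segment_frames) * self.hop_length]

        return {
            "audio": torch.from_numpy(audio_seg),
            "spectrogram": torch.from_numpy(cqt_seg),
        }
```
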