TransPlayer

This is the codebase for the paper TransPlayer: Timbre Style Transfer with Flexible Timbre Control (ICASSP 2023).

Click here to listen to the demos.

  • ./dataset: To train on your own data, first preprocess it with the scripts under ./dataset. (A rough sketch of this pipeline follows the list below.)

    • ./dataset/crop_data.py: Crops your .wav files into chunks. By default, each chunk is 200 seconds long.
    • ./dataset/preprocess_cqt.py: Extracts an 84-bin CQT from each audio chunk and saves it as a .npy file.
    • ./dataset/stereo2mono.py: Converts stereo .wav files to mono; training expects mono audio.
  • ./autoencoder: Directory containing the training and inference code.

    • How to train:

    ```
    cd autoencoder
    python -m train [--data_dir xxxx/directory_of_your_wav_and_npy_files] [--save_dir yyyy/directory_to_save_your_model]
    ```

    • How to perform inference:

    ```
    cd autoencoder
    python -m inference [--feature_path xxxx/path_to_the_original_cqt.npy] [--org "instrument_name_A"] [--trg "instrument_name_B"] [--cp_path yyyy/path_to_your_model]
    ```
    
  • How to generate the waveform:

    • Use a DiffWave implementation (we can't redistribute it here for license reasons).
    • To train DiffWave on CQT features, you will probably need to tweak the code a bit, particularly the dataset loading (there is no need to call torchaudio in every iteration, and it doesn't help with the constant-Q transform anyway) and the input data parameters (the implementation is meant for Mel spectrograms, so make sure the parameters match your CQT). A hedged sketch of such a dataset change also follows this list.
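
For orientation, here is a minimal sketch of what the ./dataset preprocessing pipeline amounts to, assuming librosa and soundfile are available. The sample rate, hop length, function name, and output file layout are assumptions for illustration, not a transcription of the actual scripts.

```python
# Hypothetical preprocessing sketch: stereo -> mono, crop into 200 s chunks,
# then extract an 84-bin magnitude CQT per chunk. Parameter values here
# (sr, hop_length) are assumptions; match them to your own setup.
import librosa
import numpy as np
import soundfile as sf

CHUNK_SECONDS = 200   # default chunk length, per crop_data.py
N_BINS = 84           # number of CQT bins, per preprocess_cqt.py

def preprocess(wav_path, out_prefix, sr=22050, hop_length=512):
    # mono=True collapses stereo to mono, like stereo2mono.py
    y, sr = librosa.load(wav_path, sr=sr, mono=True)
    chunk_len = CHUNK_SECONDS * sr
    for start in range(0, len(y) - chunk_len + 1, chunk_len):
        chunk = y[start : start + chunk_len]
        # magnitude CQT with 84 bins (7 octaves x 12 bins per octave)
        cqt = np.abs(librosa.cqt(chunk, sr=sr, hop_length=hop_length,
                                 n_bins=N_BINS, bins_per_octave=12))
        idx = start // chunk_len
        sf.write(f"{out_prefix}_{idx}.wav", chunk, sr)   # audio chunk
        np.save(f"{out_prefix}_{idx}.npy", cqt)          # paired CQT
```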
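
And here is a hedged sketch of the DiffWave-side change: a PyTorch Dataset that reads the precomputed CQT .npy files from disk instead of recomputing features with torchaudio on every iteration. The class name, file layout, and dictionary keys are assumptions about a generic DiffWave fork, not the actual code.

```python
# Hypothetical DiffWave dataset tweak: read precomputed CQT .npy files
# rather than calling torchaudio per iteration. Names and the .wav/.npy
# pairing convention are assumptions for illustration.
import numpy as np
import soundfile as sf
import torch
from torch.utils.data import Dataset

class CQTNpyDataset(Dataset):
    """Each item pairs an audio chunk with its precomputed 84-bin CQT."""
    def __init__(self, wav_paths):
        # assumes foo.wav sits next to foo.npy produced by preprocess_cqt.py
        self.wav_paths = wav_paths

    def __len__(self):
        return len(self.wav_paths)

    def __getitem__(self, idx):
        wav_path = self.wav_paths[idx]
        audio, _ = sf.read(wav_path, dtype="float32")
        cqt = np.load(wav_path.replace(".wav", ".npy"))  # shape: (84, frames)
        return {
            "audio": torch.from_numpy(audio),
            "spectrogram": torch.from_numpy(cqt.T),  # (frames, 84), Mel-like layout
        }
```

On the parameter side, the conditioner width the implementation sets for Mel spectrograms (typically 80 bands) would need to become 84 to match the CQT, and the hop length assumed for upsampling must equal the hop used when the CQT was computed.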
