Port of OpenAI's [Whisper](https://github.com/openai/whisper) model.
Clone the repository and instantiate it.
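A minimal sketch of this step, with the repository URL left as a placeholder:

```julia
# From a shell (replace <repository-url> with the actual repository address):
#   git clone <repository-url> Whisper.jl
#   cd Whisper.jl
# Then activate and instantiate the project from Julia:
julia> using Pkg

julia> Pkg.activate(".")

julia> Pkg.instantiate()
```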
- Specify the GPU backend in the `LocalPreferences.toml` file (either `AMDGPU` or `CUDA`) if using a GPU for inference (see the sketch after the examples below).
- Run the model:
```julia
julia> using AMDGPU # If using AMDGPU for inference.

julia> using CUDA # If using CUDA for inference.

julia> using Whisper, Flux

# GPU inference at FP16 precision.
julia> Whisper.transcribe(
           "./input.flac", "./output.srt";
           model_name="tiny.en", dev=gpu, precision=f16)

# CPU inference.
julia> Whisper.transcribe(
           "./input.flac", "./output.srt";
           model_name="tiny.en", dev=cpu, precision=f32)
```
## Multilingual support

To transcribe from a non-English language, specify the (optional) `language` argument and drop `.en` from the model name:
```julia
julia> Whisper.transcribe(
           "ukrainian-sample.flac", "./output.srt";
           model_name="medium", language="ukrainian", dev=cpu, precision=f32)
```

To see which languages are supported, execute:
```julia
julia> values(Whisper.LANGUAGES)
```

- Supported input file: `.flac` with 1 channel and a 16 kHz sample rate.
- Other input files are converted to it using `ffmpeg`, which must be installed on your system and accessible from `PATH` (see the conversion sketch below).
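To perform the conversion manually, an equivalent `ffmpeg` invocation (file names are placeholders) downmixes to one channel with `-ac 1` and resamples to 16 kHz with `-ar 16000`:

```julia
# Placeholder file names; any input format ffmpeg understands works.
julia> run(`ffmpeg -i input.mp3 -ac 1 -ar 16000 input.flac`)
```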
## TODO

- Beam search decoder.
- Streaming support.