
voxcpm inference is occasionally unstable: sometimes it runs through all 4096 tokens (over 2 minutes) #124

@shengyouhengheng

Description

Image

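At the location shown in the image (voxcpm.py, VoxCPMModel inference), generation sometimes runs all the way to 4096 tokens, which takes more than 2 minutes. If I set max_length to a smaller value such as 1024, inference finishes after about 15 s, but the resulting audio is garbled and its length does not match the text. When my prompt_wav and prompt_text do not match, this happens frequently (roughly 5% of the time). When they do match, it still happens occasionally, though with a lower probability (< 1%).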

My setup: voxcpm 1.0.5, torch==2.6.0, H20 GPU

Part of my demo:

    self.tts_model = VoxCPM(voxcpm_model_path=self.model_dir)

    def generate(self, text, prompt_wav_path, prompt_text=None, cfg=2.0, steps=10):
        try:
            wav = self.tts_model.generate(
                text=text,
                prompt_wav_path=prompt_wav_path,
                prompt_text=prompt_text,
                cfg_value=cfg,
                inference_timesteps=steps,
                normalize=False,
                denoise=False,
                retry_badcase=False,
                max_length=1024,
            )
            return 16000, wav
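A minimal sketch of a caller-side check for the symptom described above (audio whose length does not correspond to the text). It is not part of voxcpm: `is_suspicious_length`, `generate_with_retry`, and the 0.5 seconds-per-character threshold are hypothetical names and values chosen for illustration, and the 16000 sample rate is taken from the demo's return value. It only flags outputs that are implausibly long and retries; it does not address why generation fails to stop.

```python
def is_suspicious_length(num_samples, sample_rate, text, max_seconds_per_char=0.5):
    """Return True when the audio is implausibly long for the given text.

    max_seconds_per_char is an assumed heuristic threshold, not a voxcpm value.
    """
    duration_s = num_samples / sample_rate
    n_chars = max(len(text.strip()), 1)  # avoid division by zero on empty text
    return duration_s / n_chars > max_seconds_per_char


def generate_with_retry(generate_fn, text, sample_rate=16000, max_attempts=3):
    """Call generate_fn(text) until the output passes the length check.

    generate_fn is any callable returning a sequence of samples, for example
    a small wrapper around the generate() method in the demo above.
    Returns the last output even if every attempt looked suspicious.
    """
    wav = None
    for _ in range(max_attempts):
        wav = generate_fn(text)
        if not is_suspicious_length(len(wav), sample_rate, text):
            break
    return wav
```

The threshold would need tuning per language and speaking rate. A check like this only limits the impact of the bad cases; the underlying run to max_length still costs the full inference time on each failed attempt.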
