ChatTTSSpeaker

class ChatTTSSpeaker(speaker: str | None = None, source: str = 'huggingface', force_redownload: bool = False, compile: bool = False, custom_path: str | None = None, device: str | object | None = None, coef: str | None = None, use_flash_attn: bool = False, use_vllm: bool = False, experimental: bool = False, enable_cache: bool = True, sample_rate: int = 24000)

Bases: BaseSpeaker

speak(text: str, **settings: object) → tuple[ndarray, int]

Convert text into audio data.

Parameters:
  • text (str) – Text to synthesize.

  • settings (dict[str, object]) – Additional ChatTTS inference settings.

Returns:

Generated audio samples and their sample rate in Hz.

Return type:

tuple[numpy.ndarray, int]
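
Example:

The following is a minimal sketch of how the tuple returned by speak() might be consumed. It does not load ChatTTS: StubSpeaker is a hypothetical stand-in that only mimics the documented signature and emits a test tone, and to_wav_bytes is an illustrative helper, not part of this API. The sketch assumes the returned integer is the sample rate in Hz and that the samples are mono floats in the range [-1, 1]; neither assumption is confirmed by the reference above.

```python
from __future__ import annotations

import io
import wave

import numpy as np


class StubSpeaker:
    """Hypothetical stand-in that mimics the documented speak() contract."""

    def __init__(self, sample_rate: int = 24000) -> None:
        self.sample_rate = sample_rate

    def speak(self, text: str, **settings: object) -> tuple[np.ndarray, int]:
        # Emit 0.1 s of a 440 Hz tone per character instead of real speech.
        n = int(0.1 * self.sample_rate) * len(text)
        t = np.arange(n) / self.sample_rate
        samples = (0.2 * np.sin(2 * np.pi * 440.0 * t)).astype(np.float32)
        return samples, self.sample_rate


def to_wav_bytes(samples: np.ndarray, sample_rate: int) -> bytes:
    """Encode mono float samples (assumed in [-1, 1]) as 16-bit PCM WAV."""
    pcm = (np.clip(samples, -1.0, 1.0) * 32767.0).astype("<i2")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono is an assumption
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)  # rate taken from the returned tuple
        wav.writeframes(pcm.tobytes())
    return buf.getvalue()


speaker = StubSpeaker()
samples, sample_rate = speaker.speak("hello")

# Duration follows directly from the sample count and the sample rate.
duration_s = len(samples) / sample_rate
wav_bytes = to_wav_bytes(samples, sample_rate)
```

With a real ChatTTSSpeaker instance in place of StubSpeaker, the last four lines would stay the same as long as the assumptions above hold. Keeping the rate from the returned tuple, rather than hard-coding 24000, keeps the WAV header consistent with whatever sample_rate the speaker was constructed with.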