Supported Models
Speech Recognition
Name in config: asr
SherpaOnnx [Recommended]
Dependency: pip install "xtalk[sherpa-onnx-asr] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/sherpa_onnx_asr.py
A high-performance framework for speech recognition and other speech tasks.
Qwen3ASRFlashRealtime
Dependency: pip install "xtalk[ali] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/qwen3_asr_flash_realtime.py
Zipformer
Dependency: pip install "xtalk[zipformer-local] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/zipformer_local.py
ElevenLabs
Dependency: pip install "xtalk[elevenlabs] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/elevenlabs.py
Text to Speech
Name in config: tts
IndexTTS [Recommended]
Dependency: pip install "xtalk[index-tts] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path:
- src/xtalk/speech/tts/index_tts.py
- src/xtalk/speech/tts/index_tts2.py
GPT-SoVITS
Experimental. Please open an issue if you run into any problems.
Dependency: pip install "xtalk[gpt-sovits] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/tts/gpt_sovits.py
CosyVoice
Dependency: pip install "xtalk[ali] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/tts/cosyvoice.py
ElevenLabs
Dependency: pip install "xtalk[elevenlabs] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/tts/elevenlabs.py
Voice Activity Detection
Name in config: vad
X-Talk already performs VAD on the client side, so you may not need to configure one.
Silero VAD
Dependency: pip install "xtalk[silero-vad] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/vad/silero_vad.py
Turn Detection
Name in config: turn_detector
Turn detectors decide when the user has finished speaking and the system should start generating a response.
SoulxDuplug
Dependency: pip install "xtalk[soulx-duplug] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/turn_detector/soulx_duplug.py
TurnSense
Dependency: pip install "xtalk[turn-sense] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/turn_detector/turn_sense.py
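To make the turn-detection decision concrete, here is a toy sketch in Python. It is a plain silence-timeout heuristic, not how SoulxDuplug or TurnSense work internally (this page does not describe their internals), and it is not part of the X-Talk API. The class name `SilenceTurnDetector` and its parameters `frame_ms` and `end_silence_ms` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SilenceTurnDetector:
    """Toy turn detector (hypothetical, not X-Talk code).

    Declares the turn finished once the user has spoken and then
    stayed silent for at least `end_silence_ms` milliseconds.
    """

    frame_ms: int = 20          # duration of one audio frame
    end_silence_ms: int = 600   # trailing silence that ends a turn
    _silent_ms: int = field(default=0, init=False)
    _heard_speech: bool = field(default=False, init=False)

    def update(self, is_speech: bool) -> bool:
        """Feed one frame's speech flag; return True when the turn ends."""
        if is_speech:
            self._heard_speech = True
            self._silent_ms = 0
            return False
        if not self._heard_speech:
            # Leading silence: the user has not started speaking yet.
            return False
        self._silent_ms += self.frame_ms
        if self._silent_ms >= self.end_silence_ms:
            # Reset state so the detector can be reused for the next turn.
            self._heard_speech = False
            self._silent_ms = 0
            return True
        return False


if __name__ == "__main__":
    detector = SilenceTurnDetector()
    # 5 silent frames, 10 speech frames, then 30 silent frames.
    flags = [False] * 5 + [True] * 10 + [False] * 30
    for index, flag in enumerate(flags):
        if detector.update(flag):
            print(f"turn ended at frame {index}")
```

A fixed timeout like this cannot tell a mid-sentence pause from a finished utterance, which is one reason to consider a dedicated turn detector model instead.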
Speech Enhancement
Name in config: speech_enhancer
FastEnhancer
Dependency: pip install onnxruntime
Path: src/xtalk/speech/speech_enhancer/speech_enhancer.py
Speaker Recognition
Name in config: speaker_encoder
Wespeaker-Voxceleb-Resnet34-LM
Dependency: pip install "xtalk[pyannote] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/speaker_encoder/pyannote_embedding.py
Captioner
Name in config: captioner
Captioners produce a text description of an audio clip.
Qwen3-Omni-30B-A3B-Captioner
Dependency: None
Path: src/xtalk/speech/captioner/qwen3_omni_captioner.py
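Most of the Dependency lines above follow one pattern: an extras name in square brackets, followed by the same Git URL. Standard pip requirement syntax accepts several comma-separated extras in one specifier, so multiple components can probably be installed with a single command. The sketch below builds such a requirement string. The helper `xtalk_requirement` is hypothetical and not part of X-Talk, and this page does not state whether every combination of extras can coexist in one environment.

```python
XTALK_GIT_URL = "git+https://github.com/xcc-zach/xtalk.git@main"


def xtalk_requirement(extras):
    """Build a pip requirement string for xtalk with the given extras.

    Hypothetical helper for illustration; extras names are the ones
    listed in the Dependency lines on this page.
    """
    # Drop duplicates while keeping the caller's order.
    unique = list(dict.fromkeys(extras))
    if not unique:
        return f"xtalk @ {XTALK_GIT_URL}"
    return f"xtalk[{','.join(unique)}] @ {XTALK_GIT_URL}"


if __name__ == "__main__":
    requirement = xtalk_requirement(["sherpa-onnx-asr", "index-tts", "silero-vad"])
    # Quote the requirement so the shell does not interpret the brackets.
    print(f'pip install "{requirement}"')
```

Components whose Dependency line is not an xtalk extra (FastEnhancer needs `onnxruntime`, and the captioner lists none) are outside the scope of this helper.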