Testing
Experimental functionality
X-Talk provides a single script, scripts/test.py, for both test-set generation and automated backend evaluation.
It has two modes:
- --create: generate a runnable audio dataset from text templates
- --input: start an embedded X-Talk server and run the dataset automatically
Preparing a Test Template
Create a dataset root with:
- one TTS config JSON at the root, such as tts_config.json
- an optional test_config.json at the root
- one or more case subdirectories
For example:
logs/test_templates/smoke/
├── tts_config.json
├── test_config.json
└── basic_turn/
    └── timestamp.txt
Writing tts_config.json
In --create mode, the script looks for one root JSON file such as tts_config.json, config.json, or sample_local.json.
That file can use either of these shapes:
- a standalone TTS model config
- a full X-Talk service config that contains a top-level tts field
For clarity, we recommend a standalone tts_config.json dedicated to dataset generation.
Minimal example with IndexTTS:
{
  "type": "IndexTTS",
  "params": {
    "host": "127.0.0.1",
    "port": 11996,
    "voices": [
      {
        "name": "man",
        "path": "/path/to/reference_voice.wav"
      }
    ]
  }
}
Equivalent full-service form:
{
  "tts": {
    "type": "IndexTTS",
    "params": {
      "host": "127.0.0.1",
      "port": 11996,
      "voices": [
        {
          "name": "man",
          "path": "/path/to/reference_voice.wav"
        }
      ]
    }
  }
}
type should match the Python class name of the TTS model, and params should match that class's initialization arguments. See Supported Models for available TTS backends and their optional dependencies.
For IndexTTS, each voices entry needs:
- name: the voice identifier exposed to X-Talk
- path: a reference WAV file, or a directory containing reference audio files
If you already have a working server config, reusing its tts section is usually the easiest option.
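The two accepted shapes can be reduced to one with a few lines of Python. The following is a minimal sketch, not the logic used by scripts/test.py: it assumes the full-service form is recognized by its top-level tts key, and the helper name extract_tts_config is hypothetical.

```python
from typing import Any, Dict


def extract_tts_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the TTS model config from either accepted shape.

    Hypothetical helper for illustration; not part of X-Talk.
    """
    # Full-service form: the TTS config is nested under a top-level "tts" key.
    # Standalone form: the whole document is already the TTS config.
    tts = config["tts"] if "tts" in config else config
    if "type" not in tts:
        raise ValueError("TTS config needs a 'type' field naming the model class")
    return tts


standalone = {
    "type": "IndexTTS",
    "params": {"host": "127.0.0.1", "port": 11996},
}
full_service = {"tts": standalone}

# Both shapes yield the same model config.
print(extract_tts_config(standalone) == extract_tts_config(full_service))
```

Normalizing early like this means later code only has to handle the standalone shape.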
Optional Configuration
You may place an optional test_config.json at the dataset root. It supports:
- concurrency
- with_vad
- vad_redemption_ms
- judge_llm
Example:
{
  "concurrency": 1,
  "with_vad": false,
  "vad_redemption_ms": 500,
  "judge_llm": {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "base_url": "http://127.0.0.1:8000/v1",
    "api_key": "YOUR_API_KEY"
  }
}
Each case directory may also contain an optional criteria.yaml:
judge_asr: true
When judge_asr is set to true, every entry in a runnable timestamp.txt must include the expected transcript text as its third colon-separated field. Datasets produced by --create already use this format.
Creating timestamp.txt
Each case directory must contain a timestamp.txt. In --create mode, each line uses the format <time_spec>:<text>:
# basic_turn/timestamp.txt
0:Hello, how are you today?
ai_end:Tell me more about your plan.
ai_end+2.5:I also want to ask about pricing.
<time_spec> can be:
- an absolute second value such as 0, 5.0, or 10.5
- ai_start
- ai_end
- user_start
- user_end
- one of the anchors above plus an offset such as ai_end+2.5
Relative timestamps are resolved in file order. ai_* anchors refer to the next AI response triggered by the previous user clip, while user_* anchors refer to the previous user clip itself.
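To make the template line format concrete, here is a minimal parsing sketch. It is an illustration under stated assumptions rather than the parser used by scripts/test.py: the names TimeSpec, parse_time_spec, and parse_template_line are hypothetical, only the + offset form shown above is accepted, and the line is split at the first colon so the text itself may contain colons.

```python
import re
from typing import NamedTuple, Optional, Tuple


class TimeSpec(NamedTuple):
    anchor: Optional[str]  # None for an absolute time
    seconds: float         # absolute time, or offset from the anchor


# Anchor name, optionally followed by "+<offset>" (assumption: only "+" is supported).
_ANCHOR_RE = re.compile(
    r"^(ai_start|ai_end|user_start|user_end)(?:\+(\d+(?:\.\d+)?))?$"
)


def parse_time_spec(spec: str) -> TimeSpec:
    spec = spec.strip()
    match = _ANCHOR_RE.match(spec)
    if match:
        return TimeSpec(match.group(1), float(match.group(2) or 0.0))
    # Not an anchor: treat as absolute seconds; float() raises ValueError otherwise.
    return TimeSpec(None, float(spec))


def parse_template_line(line: str) -> Tuple[TimeSpec, str]:
    # Split at the first colon only, so the text may itself contain colons.
    spec, text = line.split(":", 1)
    return parse_time_spec(spec), text.strip()


for raw in (
    "0:Hello, how are you today?",
    "ai_end:Tell me more about your plan.",
    "ai_end+2.5:I also want to ask about pricing.",
):
    print(parse_template_line(raw))
```

Resolving an anchored TimeSpec to a concrete time still requires the runtime events described above, so this sketch stops at parsing.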
Generating a Runnable Dataset
Install the script dependencies first:
pip install numpy requests soundfile websockets pyyaml uvicorn fastapi
Optional:
pip install soxr
Install the optional X-Talk dependency required by your chosen TTS backend as well. For example, IndexTTS needs the index-tts extra.
Then generate the dataset:
python scripts/test.py --create logs/test_templates/smoke --out logs/tests
The script loads the TTS model from the root JSON config, synthesizes one WAV file per line, and writes runnable case folders. The generated timestamp.txt will use the format <time_spec>:<audio_file>:<expected_text>.
For example:
logs/tests/smoke/
├── tts_config.json
├── test_config.json
└── basic_turn/
    ├── audio_000.wav
    ├── audio_001.wav
    ├── audio_002.wav
    └── timestamp.txt
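Entries in a generated timestamp.txt can be sanity-checked before a run. The sketch below splits one runnable line into its three fields and, when judge_asr is in use, insists on a non-empty expected transcript. The function name split_runnable_line is hypothetical, and the sketch assumes the audio file is a plain relative name without colons.

```python
from typing import Tuple


def split_runnable_line(line: str, judge_asr: bool = False) -> Tuple[str, str, str]:
    """Split '<time_spec>:<audio_file>:<expected_text>' into its fields.

    Hypothetical helper for illustration; not part of scripts/test.py.
    """
    # At most two splits, so the expected text may contain colons.
    parts = line.rstrip("\n").split(":", 2)
    if len(parts) < 2 or not parts[1].strip():
        raise ValueError(f"missing audio file in line: {line!r}")
    time_spec = parts[0].strip()
    audio_file = parts[1].strip()
    expected_text = parts[2].strip() if len(parts) == 3 else ""
    if judge_asr and not expected_text:
        raise ValueError(f"judge_asr needs an expected transcript: {line!r}")
    return time_spec, audio_file, expected_text


print(split_runnable_line("0:audio_000.wav:Hello, how are you today?", judge_asr=True))
```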
Running Automated Tests
In --input mode, the script starts an embedded uvicorn server itself, so you do not need to start X-Talk manually first.
Run the generated dataset against a backend service config:
python scripts/test.py --config server_configs/sample_local.json --input logs/tests/smoke --out logs/test_results/smoke
You can also override runtime options from the command line:
python scripts/test.py --config server_configs/sample_local.json --input logs/tests/smoke --out logs/test_results/smoke --concurrency 2 --with-vad
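One way to picture how command-line overrides interact with test_config.json is as an overlay: values from the file are the base, and any option given on the command line replaces the matching key. This is a sketch of that idea only. The function name effective_test_config is hypothetical, and the actual precedence rules in scripts/test.py may differ.

```python
from typing import Any, Dict, Optional


def effective_test_config(
    file_config: Dict[str, Any],
    cli_overrides: Dict[str, Optional[Any]],
) -> Dict[str, Any]:
    """Overlay CLI values on file values (assumed precedence: CLI wins)."""
    merged = dict(file_config)
    for key, value in cli_overrides.items():
        # None stands for "option not given on the command line".
        if value is not None:
            merged[key] = value
    return merged


from_file = {"concurrency": 1, "with_vad": False, "vad_redemption_ms": 500}
from_cli = {"concurrency": 2, "with_vad": True, "vad_redemption_ms": None}
print(effective_test_config(from_file, from_cli))
```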
Outputs
The test result folder contains:
- <case_name>.mp3: the final mono recording for that case, compressed from the analyzed stereo session audio with high-quality MP3 encoding to save space
- eval.json: overall latency and per-case pass/fail summary
- logs/<case_name>.asr.json: expected transcripts, observed ASR events, and optional judge results
- service_config.json: the effective backend config used for the run
- test_config.json: the effective dataset runtime config used for the run
Notes
- --with-vad enables client-side VAD. In that mode, remove backend vad from the server config to avoid duplicate turn events.
- --without-vad requires a backend vad model in the server config.
- If judge_asr is enabled for any case, configure judge_llm either in test_config.json or via CLI overrides.
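These notes can be turned into a small pre-run check. The following sketch is illustrative and makes two assumptions: that the backend VAD model sits under a top-level vad key in the server config (mirroring the tts key), and that the caller already knows whether any case enables judge_asr. The function name preflight_warnings is hypothetical.

```python
from typing import Any, Dict, List


def preflight_warnings(
    service_config: Dict[str, Any],
    test_config: Dict[str, Any],
    with_vad: bool,
    any_judge_asr: bool,
) -> List[str]:
    """Collect configuration problems described in the notes above."""
    problems: List[str] = []
    # Assumption: the backend VAD model is configured under a top-level "vad" key.
    has_backend_vad = bool(service_config.get("vad"))
    if with_vad and has_backend_vad:
        problems.append("client-side VAD is on: remove backend vad to avoid duplicate turn events")
    if not with_vad and not has_backend_vad:
        problems.append("client-side VAD is off: the server config needs a backend vad model")
    if any_judge_asr and not test_config.get("judge_llm"):
        problems.append("judge_asr is enabled: configure judge_llm")
    return problems


for message in preflight_warnings({"tts": {}}, {}, with_vad=False, any_judge_asr=True):
    print(message)
```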