Web UI for WhisperSpeech (https://github.com/collabora/WhisperSpeech)
Note
Versions 2.x, 3.x and 4.x allow voice generation via API.
Note
Version 4.x supports SillyTavern (1.13.1)
Name | Info |
---|---|
CPU | AMD Ryzen 9950X3D |
GPU | AMD Radeon 7900XTX |
RAM | 64GB DDR5 6600MHz |
Motherboard | ASRock B650E PG Riptide WiFi (BIOS 3.25) |
OS | Ubuntu 24.04.2 LTS |
Kernel | 6.11.0-29-generic |
ROCm | 6.4.1 |
Name | Info |
---|---|
CPU | IntelCore i5-12500H |
GPU | NVIDIA GeForce RTX 4050 |
RAM | 16GB DDR4 3200MHz |
Motherboard | GIGABYTE G5 MF (BIOS FB10) |
OS | Ubuntu 25.04 |
Kernel | 6.14.0-24 |
NVIDIA Driver | 570.169 |
CUDA | 12.8 |
1. Install uv and ffmpeg.
2. Clone repository.
3. Mount the repository directory.
3. Create and activate venv using uv.
4. Run webui.py:
CPU (not recommended):
uv run --extra cpu webui.py
ROCm 6.3:
uv run --extra rocm webui.py
CUDA 12.8:
uv run --extra cuda webui.py
1. Copy whisperspeech-webui folder to SillyTavern/public/scripts/extensions/third-party
2. Run app with api support.
# This is example. If you are using SillyTavern locally, you can remove --listen parameter.
uv run --extra cpu webui.py --listen --api
3. Launch SillyTavern. In the extensions tab, expand WhisperSpeech web UI and set the IP and port.
Languages |
---|
English |
Polish |
1. Extract messages.pot:
pybabel extract -F babel.cfg -o ./locale/messages.pot .
2. Generate files:
New language:
pybabel init -i ./locale/messages.pot -d ./locale -l pl_PL
# Replace pl_PL by your language
Update current:
pybabel update -i ./locale/messages.pot -d ./locale -l pl_PL
# Replace pl_PL by your language
3. Compile:
pybabel compile -d ./locale