Local S2S Shell Starter
A simple local speech-to-speech assistant that runs from a Windows terminal.
Stack
- STT: faster-whisper medium
- LLM: Qwen2.5 3B Instruct GGUF Q4_K_M
- TTS: Windows SAPI voice
- UI: terminal only
Pipeline
microphone -> faster-whisper -> Qwen2.5 3B GGUF -> Windows SAPI speech
Hardware Target
- CPU fallback supported
- NVIDIA GPU auto-used when available
- 8GB+ VRAM recommended for smoother local use
Setup
Run from PowerShell:
py -3.11 -m venv .venv ..venv\Scripts\python.exe -m pip install --upgrade pip setuptools wheel ..venv\Scripts\python.exe -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 ..venv\Scripts\python.exe -m pip install -r requirements.txt ..venv\Scripts\python.exe download_models.py
Run
.\run_shell_s2s.bat
Shell Commands
Enter = record mic and run speech-to-speech t = type text and hear reply d = list audio devices q = quit
Model Download
The downloader fetches:
Repo: bartowski/Qwen2.5-3B-Instruct-GGUF File: Qwen2.5-3B-Instruct-Q4_K_M.gguf
The GGUF model file is not committed to this repository.
Scope
This is a local voice-chat starter. It does not control the computer, run tools, or perform system automation.
- Downloads last month
- 6