TTSTT
A full-stack Arabic speech toolkit: text-to-speech with voice cloning (Habibi, SILMA), in-browser speech-to-text with Whisper, a web dashboard, and webhook APIs for n8n and automation. Synthesis and transcription run locally, with no required cloud dependencies.
Upload a voice sample once, then synthesize Arabic speech at the speed you choose. Drop in an audio file and get a transcript right in your browser. Expose both as webhooks for n8n automation. Audio and compute stay on your own machine, so there is no cloud vendor lock-in.
What it solves
- Most TTS/STT services fall short on Arabic in quality, latency, and privacy.
- Voice cloning usually means switching tools and re-uploading the reference audio again and again.
- Automation tools like n8n and Make lack native, low-friction Arabic TTS/STT nodes.
Impact
- Arabic TTS (Habibi + SILMA)
- Whisper v3 (no server needed)
- Zero cloud dependencies

Architecture
Data flow
- User uploads 5–15s voice sample
- Store reference audio on Python server
- TTS: enter Arabic text + speed
- Next.js proxies to FastAPI /synthesize
- F5-TTS or SILMA generates audio
- Visualized in Wavesurfer.js
- STT: drop audio → Whisper in browser
- Instant transcript, optionally logged to Neon
- Webhooks: n8n/Make calls + SSE broadcast
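The TTS leg of this flow (Arabic text plus speed, sent to the FastAPI /synthesize endpoint) can be sketched as a small client. This is a minimal sketch under stated assumptions: only the /synthesize path comes from the description above. The host and port, the JSON field names (text, speed, model), and the assumption that the response body is raw audio bytes are all hypothetical.

```python
import json
import urllib.request

# Assumed address of the local FastAPI server; only the /synthesize path
# is taken from the project description.
SYNTH_URL = "http://localhost:8000/synthesize"


def build_synthesize_request(text: str, speed: float = 1.0,
                             model: str = "silma") -> urllib.request.Request:
    """Build (but do not send) a POST request for speech synthesis.

    The payload field names are hypothetical placeholders.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    if speed <= 0:
        raise ValueError("speed must be positive")
    body = json.dumps(
        {"text": text, "speed": speed, "model": model},
        ensure_ascii=False,  # keep Arabic text readable in the payload
    ).encode("utf-8")
    return urllib.request.Request(
        SYNTH_URL,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def synthesize(text: str, speed: float = 1.0, model: str = "silma") -> bytes:
    """Send the request and return the response body.

    Assumes the server answers with audio bytes; requires a running server.
    """
    request = build_synthesize_request(text, speed, model)
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read()
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running server.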
Engineering decisions
Voice cloning with single reference upload
Upload a 5–15s voice sample once per model; F5-TTS and SILMA condition on that clip for consistent voice across all future syntheses. No repeated uploads, no API keys.
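The "upload once per model" rule can be illustrated with a small reference store that checks the 5–15s window before keeping a clip. This is a sketch, not the project's implementation: the ReferenceStore class, the in-memory dictionary, and the restriction to WAV input are assumptions made so the example stays within the standard library.

```python
import io
import wave

MIN_SECONDS, MAX_SECONDS = 5.0, 15.0  # window stated in the description


def wav_duration_seconds(data: bytes) -> float:
    """Return the duration of a WAV byte string in seconds."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


class ReferenceStore:
    """Hypothetical store keeping one reference clip per model name."""

    def __init__(self) -> None:
        self._clips: dict[str, bytes] = {}

    def save(self, model: str, data: bytes) -> float:
        """Validate the clip length, store it, and return its duration."""
        duration = wav_duration_seconds(data)
        if not MIN_SECONDS <= duration <= MAX_SECONDS:
            raise ValueError(
                f"reference clip is {duration:.1f}s; "
                f"expected {MIN_SECONDS:.0f}-{MAX_SECONDS:.0f}s"
            )
        self._clips[model] = data
        return duration

    def get(self, model: str) -> bytes:
        """Return the stored clip so later syntheses can reuse it."""
        return self._clips[model]


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Create a mono 16-bit silent WAV, used only to exercise the store."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()
```

A server would typically persist the clip to disk instead of memory; the lookup by model name is what lets every later synthesis reuse the same reference.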
Browser-native Whisper STT
Whisper Large V3 runs in-browser via ONNX.js, so transcription needs no server round trips and no API keys. Audio is processed on the user's own device, so users keep full control of their audio data.
Local-first architecture with optional cloud database
TTS (Python) and STT (browser) run entirely offline. Optional Neon PostgreSQL logs history for dashboards and n8n integration, but all compute stays local.
Webhook APIs for automation (n8n, Make, Zapier)
POST to /api/webhook/tts for voice synthesis or /api/webhook/stt for transcription; every call is logged with a timestamp and a request ID. n8n node templates are included.
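A caller outside n8n could hit the same endpoints with a plain HTTP client. The sketch below builds the POST request and a matching log record. Only the two endpoint paths come from the description above; the base URL, the payload fields, the X-Request-ID header, and the shape of the log record are assumptions, and the real logging happens on the server side.

```python
import json
import uuid
import urllib.request
from datetime import datetime, timezone

BASE_URL = "http://localhost:3000"  # assumed address of the Next.js app


def build_webhook_call(kind: str, payload: dict):
    """Build a webhook request plus a log record describing it.

    kind is "tts" or "stt", matching /api/webhook/tts and /api/webhook/stt.
    The X-Request-ID header and the log record layout are hypothetical.
    """
    if kind not in ("tts", "stt"):
        raise ValueError("kind must be 'tts' or 'stt'")
    endpoint = f"/api/webhook/{kind}"
    request_id = str(uuid.uuid4())
    log_entry = {
        "request_id": request_id,
        "endpoint": endpoint,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    request = urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Request-ID": request_id,
        },
        method="POST",
    )
    return request, log_entry


def send(request: urllib.request.Request) -> bytes:
    """Send a prepared webhook request; requires the app to be running."""
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read()
```

Generating the request ID on the caller side makes it possible to match a call against the server's log afterwards, assuming the server honors or echoes that header.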
Real-time webhook event stream via SSE
Connect to /api/events (Server-Sent Events) to watch every webhook call in real time; the dashboard shows live TTS/STT requests as they arrive.
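Outside the dashboard, the stream can be read by any client that understands the Server-Sent Events text format. The parser below is a minimal sketch of that format (event and data fields, comment lines, blank-line dispatch); it ignores the id and retry fields. The base URL in watch_events and any event names or payload shapes are assumptions, since the description only names the /api/events path.

```python
import urllib.request
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch if it has data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def watch_events(url: str = "http://localhost:3000/api/events") -> Iterator[dict]:
    """Stream events from the endpoint; the base URL is an assumption."""
    with urllib.request.urlopen(url) as response:
        yield from parse_sse(line.decode("utf-8") for line in response)
```

Because parse_sse accepts any iterable of lines, it can be exercised against a recorded stream without a live server.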