Thursday, December 18, 2025
Show HN: MiraTTS, a 48kHz Open-Source TTS at 100x Real-Time Speed https://ift.tt/MyZKhjn
I’ve been working on MiraTTS, a fine-tune of Spark-TTS designed for realistic, stable text-to-speech. The goal was to create a model that is very fast but still high quality. Most open TTS models are either computationally heavy or generate only 16-24kHz audio.

Mira achieves high fidelity and speed by combining two things:

FlashSR: generates crisper, clearer 48kHz audio output.
LMDeploy: heavily optimized inference, allowing 100x real-time speed and low latency (roughly 150ms).

I built this so that local users have access to a high-quality text-to-speech model that works for any use case. It’s in its early stages, and I'm experimenting with multilingual and multi-speaker versions. Streaming is coming soon as well.

Repo: https://ift.tt/Xb2Tyjx
Model: https://ift.tt/FT3rWvX

I also wrote a breakdown of how these LLM-based TTS models work: https://ift.tt/DN63pFM

December 18, 2025 at 11:24PM
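To put the "100x real-time" figure in context, here is a minimal sketch of the arithmetic, assuming it means seconds of audio produced per second of wall-clock compute. The function names and the timing numbers are hypothetical illustrations, not part of MiraTTS or any measurement from it.

```python
def real_time_speed(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio generated per second of wall-clock compute."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


def sample_count(audio_seconds: float, sample_rate_hz: int = 48_000) -> int:
    """Number of samples in a mono clip of the given length."""
    return round(audio_seconds * sample_rate_hz)


if __name__ == "__main__":
    # Hypothetical numbers: a 50-second clip generated in half a second.
    audio_seconds = 50.0
    wall_seconds = 0.5

    speed = real_time_speed(audio_seconds, wall_seconds)
    print(f"speed: {speed:.0f}x real time")
    print(f"samples at 48kHz: {sample_count(audio_seconds)}")
    print(f"samples at 16kHz: {sample_count(audio_seconds, 16_000)}")
```

Note that throughput and latency are separate numbers: a model can run at 100x real time overall and still take roughly 150ms before the first audio is available.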