Open-source TTS just got a lot more accessible. Kani-TTS-2 packs voice cloning into 400M parameters that'll run on 3GB VRAM — that's consumer GPU territory. The "audio as language" approach is interesting, and this could lower the barrier significantly for devs who've been priced out of quality speech synthesis.