Voice Synthesis Models
Find 14 advanced text-to-speech and voice generation models.
thomcle/chatterbox-tts
Experience Chatterbox TTS for state-of-the-art zero-shot voice cloning and expressive speech synthesis. Perfect for AI agents and video voiceovers.
nvidia/pdf-to-podcast
Transform your PDFs into engaging AI podcasts with NVIDIA PDF to Podcast. Convert documents to audio, customize voices, and set durations for rich, on-the-go content.
mosnfar/piper_persian
Get Piper Persian Text-to-Speech for efficient, local speech synthesis. Optimized for Raspberry Pi, it converts Persian text to audio, ideal for voice assistants and accessibility.
minimax/voice-cloning
Minimax Voice Cloning allows you to create custom AI voices from short audio samples, perfect for personalized TTS applications and generating unique audio content.
minimax/speech-02-turbo
Experience Minimax Speech-02 Turbo for real-time voice synthesis and voice cloning. Unlock emotional expression and multilingual support for dynamic audio applications.
minimax/speech-02-hd
Discover Minimax Speech-02-HD: advanced text-to-audio with emotional expression and multilingual support for high-fidelity voiceovers and audiobooks.
zsxkib/kimi-audio-7b-instruct
Explore Kimi Audio 7B Instruct for universal audio processing, speech transcription, and emotion recognition.
prunaai/dia-1.6b
PrunaAI Dia 1.6B generates expressive speech with multi-speaker dialogue and non-verbal cues, producing natural-sounding audio.
zsxkib/dia
Unlock realistic dialogue audio with Dia 1.6B. Generate multi-speaker conversations, non-verbal cues, and even clone voices for your projects.
acappemin/deepaudio-v1
DeepAudio-V1 handles Video-to-Speech and Video-to-Audio generation, transforming video inputs into synchronized speech and audio for multimedia content creation.
gianpaj/cog-orpheus-3b-0.1-ft
Experience Cog Orpheus 3B, a multilingual text-to-speech model offering voice cloning and emotional speech. Perfect for real-time applications and expressive audio.
tuannha/f5-tts-vi
F5-TTS Vietnamese converts text to speech with reference-audio voice cloning and adjustable speed. See required inputs, sample settings, and practical use cases before testing.
jichengdu/spark-tts
Spark TTS offers text-to-speech generation with voice cloning for personalized audio, plus custom voice creation with adjustable parameters.
OpenMOSS-Team/MOSS-TTS-v1.5
Review MOSS-TTS v1.5 model details: OpenMOSS model ID, voice cloning, long-form TTS, multilingual synthesis, Pinyin/IPA control, pause control, and local runtime caveats.
Related Categories
Image Generation
Discover 282 AI models for image generation and content creation.
Image Editing
Explore 39 professional AI models for photo editing, enhancement, and manipulation.
Language Models
Browse 27 large language models for conversation and reasoning.
Video Generation
Access 22 AI models for video creation and editing.