Voice Synthesis Models
Find 13 advanced text-to-speech and voice generation models.
thomcle/chatterbox-tts
Experience Chatterbox TTS for state-of-the-art zero-shot voice cloning and expressive speech synthesis. Perfect for AI agents and video voiceovers.
nvidia/pdf-to-podcast
Transform your PDFs into engaging AI podcasts with NVIDIA PDF to Podcast. Convert documents to audio, customize voices, and set durations for rich, on-the-go content.
mosnfar/piper_persian
Get Piper Persian Text-to-Speech for efficient, local speech synthesis. Optimized for Raspberry Pi, it converts Persian text to audio, ideal for voice assistants and accessibility.
minimax/voice-cloning
Minimax Voice Cloning allows you to create custom AI voices from short audio samples, perfect for personalized TTS applications and generating unique audio content.
minimax/speech-02-turbo
Experience Minimax Speech-02 Turbo for real-time voice synthesis and voice cloning. Unlock emotional expression and multilingual support for dynamic audio applications.
minimax/speech-02-hd
Discover Minimax Speech-02-HD: advanced text-to-audio with emotional expression and multilingual support for high-fidelity voiceovers and audiobooks.
zsxkib/kimi-audio-7b-instruct
Explore Kimi Audio 7B Instruct for universal audio processing, speech transcription, and emotion recognition. Unlock advanced audio AI capabilities now.
prunaai/dia-1.6b
PrunaAI Dia 1.6B generates expressive multi-speaker dialogue with non-verbal cues. Create dynamic, natural-sounding audio for diverse applications.
zsxkib/dia
Unlock realistic dialogue audio with Dia 1.6B. Generate multi-speaker conversations, non-verbal cues, and even clone voices for your projects.
acappemin/deepaudio-v1
DeepAudio-V1 handles video-to-speech and video-to-audio generation, transforming video inputs into synchronized speech and audio for multimedia content creation.
gianpaj/cog-orpheus-3b-0.1-ft
Experience Cog Orpheus 3B, a multilingual text-to-speech model offering voice cloning and emotional speech. Perfect for real-time applications and expressive audio.
tuannha/f5-tts-vi
The F5-TTS Vietnamese text-to-speech model by EraX-AI offers zero-shot voice cloning and adjustable speech for personalized voiceovers. Get natural Vietnamese audio.
jichengdu/spark-tts
Spark TTS offers text-to-speech generation with voice cloning and adjustable parameters for creating personalized, custom voices.
Related Categories
Image Generation
Discover 282 AI models for image generation and content creation.
Image Editing
Explore 39 professional AI models for photo editing, enhancement, and manipulation.
Video Generation
Access 22 AI models for video creation and editing.
Language Models
Browse 18 large language models for conversation and reasoning.