F5-TTS Vietnamese Text-to-Speech
F5-TTS Vietnamese Text-to-Speech is a model from EraX-AI that performs zero-shot voice cloning and produces natural-sounding Vietnamese speech.
🚀 Function Overview
A Vietnamese text-to-speech model that converts input text into speech, clones the speaker's voice from a reference audio clip, and lets you adjust the speech speed.
Key Features
- Converts Vietnamese text to natural speech
- Supports zero-shot voice cloning via reference audio
- Adjustable speech speed control
- Automatic transcription of the reference audio when no reference text is supplied
Use Cases
- Creating personalized voiceovers for Vietnamese content
- Developing voice assistants with custom voices
- Accessibility tools for text-to-audio conversion
- Generating speech samples for language learning
⚙️ Input Parameters
input_text
string: Input text to convert to speech.
reference_audio
string: Reference audio file for the voice to clone. Only wav files are accepted.
reference_text
string: Transcript of the reference audio. It must match what is spoken in the reference audio. If left empty, it is extracted automatically from the audio.
speed
number: Speed of the speech.
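The parameter list above can be sketched as a small Python helper that checks inputs before they are sent to the model. This is a minimal sketch under stated assumptions, not the model's own schema: the `TTSInput` class is hypothetical, the default `speed` of 1.0 and the "must be positive" rule are assumptions, and omitting `reference_text` from the payload when it is empty is only one plausible way to trigger the automatic extraction described above.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class TTSInput:
    """Hypothetical container for the four documented input parameters."""

    input_text: str
    reference_audio: str                  # URL or path to a wav file
    reference_text: Optional[str] = None  # empty means "extract from audio"
    speed: float = 1.0                    # assumed default, not from the source

    def validate(self) -> None:
        if not self.input_text.strip():
            raise ValueError("input_text must not be empty")
        # Look at the path component only, so query strings do not interfere.
        path = urlparse(self.reference_audio).path
        if not path.lower().endswith(".wav"):
            raise ValueError("reference_audio must be a wav file")
        # Assumption: a non-positive speed is meaningless for this model.
        if self.speed <= 0:
            raise ValueError("speed must be positive")

    def to_payload(self) -> dict:
        """Return a dict keyed by the documented parameter names."""
        self.validate()
        payload = {
            "input_text": self.input_text,
            "reference_audio": self.reference_audio,
            "speed": self.speed,
        }
        # Assumption: leaving the key out lets the model transcribe the audio.
        if self.reference_text and self.reference_text.strip():
            payload["reference_text"] = self.reference_text
        return payload
```

Validating the file extension locally is a cheap way to catch the most likely mistake (a non-wav reference clip) before a remote run is started.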
💡 Usage Examples
Example 1
Input Parameters
{
  "speed": 1,
  "input_text": "Đây là đài tiếng nói Việt Nam. Phát thanh từ kênh trung ương Hà Nội.",
  "reference_text": "Càng trưởng thành bạn càng nhận ra tranh luận đúng sai, hơn thua cũng không còn quan trọng nữa.",
  "reference_audio": "https://replicate.delivery/pbxt/Mr9f6kUPs0oOP841x4pV4n0vSBafLSf4yEgSjYkOAqN57CUl/ash_vn.wav"
}
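Since the model is hosted on Replicate, the example input could be submitted with the `replicate` Python client roughly as follows. This is a hedged sketch: `MODEL_REF` is a placeholder because this page does not give the model's Replicate identifier, `run_tts` is a hypothetical helper, and the type of the returned output depends on the model and is not documented here. The client is a third-party package and reads the API token from the `REPLICATE_API_TOKEN` environment variable.

```python
# Placeholder only: substitute the model's real "owner/name" or
# "owner/name:version" reference from its Replicate page.
MODEL_REF = "OWNER/MODEL_NAME"

# The example input from this page, keyed by the documented parameter names.
EXAMPLE_INPUT = {
    "speed": 1,
    "input_text": "Đây là đài tiếng nói Việt Nam. Phát thanh từ kênh trung ương Hà Nội.",
    "reference_text": (
        "Càng trưởng thành bạn càng nhận ra tranh luận đúng sai, "
        "hơn thua cũng không còn quan trọng nữa."
    ),
    "reference_audio": (
        "https://replicate.delivery/pbxt/"
        "Mr9f6kUPs0oOP841x4pV4n0vSBafLSf4yEgSjYkOAqN57CUl/ash_vn.wav"
    ),
}


def run_tts(model_ref: str, payload: dict):
    """Submit one run and return whatever the model outputs.

    The import is deferred so this module loads even where the
    third-party `replicate` package is not installed.
    """
    import replicate  # pip install replicate

    return replicate.run(model_ref, input=payload)


# Usage (needs network access, an API token, and a real model reference):
#   output = run_tts(MODEL_REF, EXAMPLE_INPUT)
```

Keeping the payload as a plain dict mirrors the JSON shown above, so the same structure can be pasted into Replicate's web form or sent through the HTTP API.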
Technical Specifications
- Hardware Type: T4
- Run Count: 79
- Commercial Use: Unknown/Restricted
- Platform: Replicate
Related Models
Chatterbox TTS
Chatterbox is a state-of-the-art zero-shot TTS model.
NVIDIA PDF to Podcast
Transform PDFs into AI podcasts for engaging on-the-go audio content.
Piper Persian Text-to-Speech
A fast, local neural text to speech system that sounds great and is optimized for the Raspberry Pi 4. Piper is used in a variety of projects.