PrunaAI Dia 1.6B
PrunaAI Dia 1.6B is a Pruna-optimized text-to-speech model for expressive voice generation. It produces dynamic multi-speaker dialogue with non-verbal cues.
🚀Function Overview
Generates voice audio from formatted dialogue text with customizable parameters for expression, duration, and speech characteristics.
Key Features
- Multi-speaker dialogue support (using [S1], [S2] markers)
- Non-verbal cue integration (e.g., laughs, whispers)
- Adjustable audio length via token control
- Speech faithfulness and randomness parameters
- Playback speed modification
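The speaker markers and non-verbal cues can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: `format_dialogue` is a hypothetical helper name, and separating turns with a blank line is an assumption that mirrors the example input further down this page.

```python
def format_dialogue(turns):
    """Join (speaker_number, line) pairs into [S1]/[S2]-tagged text.

    Hypothetical helper: only the [S1] and [S2] markers are documented
    on this page, so other speaker numbers are rejected here.
    """
    parts = []
    for speaker, line in turns:
        if speaker not in (1, 2):
            raise ValueError("this sketch only handles the [S1] and [S2] markers")
        parts.append(f"[S{speaker}] {line.strip()}")
    # Assumption: turns are separated by a blank line, as in the page's example.
    return "\n\n".join(parts)


text = format_dialogue([
    (1, "Did you hear the news? (laughs)"),
    (2, "(whispers) Not so loud."),
])
print(text)
```

Non-verbal cues such as (laughs) or (whispers) are written inline as part of each line, so the helper passes them through unchanged.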
Use Cases
- Generating voiceovers for animations
- Creating podcast dialogues with multiple speakers
- Producing expressive dialogue for virtual assistants
- Adding vocal effects in audio storytelling
⚙️Input Parameters
text
string: Input text for dialogue generation. Use [S1], [S2] to indicate different speakers and (description) in parentheses for non-verbal cues, e.g., (laughs), (whispers).
max_new_tokens
integer: Controls the length of generated audio. Higher values create longer audio (86 tokens ≈ 1 second of audio).
cfg_scale
number: Controls how closely the audio follows your text. Higher values (3-5) follow text more strictly; lower values may sound more natural but deviate more.
temperature
number: Controls randomness in generation. Higher values (1.3-2.0) increase variety; lower values (0.1-0.9) make output more consistent and predictable.
top_p
number: Controls diversity of word choice. Higher values include more unusual options. Most users shouldn't need to adjust this parameter.
cfg_filter_top_k
integer: Technical parameter for filtering audio generation tokens. Higher values allow more diverse sounds; lower values create more consistent audio.
speed_factor
number: Adjusts playback speed of the generated audio. Values below 1.0 slow down the audio; 1.0 is original speed.
seed
integer: Random seed for reproducible results. Use the same seed value to get the same output for identical inputs. Leave blank for random results each time.
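Because max_new_tokens is expressed in tokens rather than seconds, a small conversion can help when choosing a value. This is a rough sketch based only on the approximate figure of 86 tokens per second quoted above; the function names are illustrative, and actual audio length may differ.

```python
import math

# Approximate rate quoted in the max_new_tokens description.
TOKENS_PER_SECOND = 86


def tokens_for_seconds(seconds):
    """Estimate the max_new_tokens value needed for a target duration."""
    return math.ceil(seconds * TOKENS_PER_SECOND)


def seconds_for_tokens(tokens):
    """Estimate the audio duration that a token budget allows."""
    return tokens / TOKENS_PER_SECOND


print(tokens_for_seconds(10))               # 860
print(round(seconds_for_tokens(3072), 1))   # 35.7
```

By this estimate, a max_new_tokens value of 3072 allows roughly 36 seconds of audio before any speed_factor adjustment.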
💡Usage Examples
Example 1
Input Parameters
{
  "seed": -1,
  "text": "[S1] It's on Replicate!!! Oh fire! Oh my goodness! What's the procedure? What do we do people? The Dia text-to-speech model — now Pruna-optimized — just dropped on Replicate!!\n\n[S2] Oh my god! Okay… it's happening. Everybody stay calm!\n\n[S1] What's the procedure…\n\n[S2] Everybody stay fricking calm!!!... Everybody fudging calm down!!!!!\n\n[S1] Yes! Yes! Let's try it out at prunaai/dia-1.6b (laughs) — powered up and made leaner with Pruna!\n\n[S2] (whispers) try it now… (whispers) turbocharged by Pruna…",
  "top_p": 0.95,
  "cfg_scale": 3,
  "temperature": 1.3,
  "speed_factor": 0.94,
  "max_new_tokens": 3072,
  "cfg_filter_top_k": 35
}
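A payload like the one above could be sent through the Replicate Python client. The sketch below rests on several assumptions: the `replicate` package is installed, a `REPLICATE_API_TOKEN` environment variable is set, and the model identifier `prunaai/dia-1.6b` (taken from the example text) resolves without a version suffix, which may not hold. `build_input` and `generate` are hypothetical helper names, and the default values are copied from Example 1 rather than from any documented defaults.

```python
def build_input(text, **overrides):
    """Assemble an input payload using the parameter names listed on this page.

    Defaults are copied from Example 1; they are not documented model defaults.
    The seed is omitted so that each run is random unless one is passed in.
    """
    payload = {
        "text": text,
        "top_p": 0.95,
        "cfg_scale": 3,
        "temperature": 1.3,
        "speed_factor": 0.94,
        "max_new_tokens": 3072,
        "cfg_filter_top_k": 35,
    }
    payload.update(overrides)
    return payload


def generate(payload, model="prunaai/dia-1.6b"):
    """Run the model on Replicate (assumed identifier; may need ':<version>')."""
    import replicate  # third-party client, imported lazily

    return replicate.run(model, input=payload)


payload = build_input("[S1] Hello there. [S2] (laughs) Hi!", seed=42, temperature=1.0)
print(sorted(payload))
# To actually call the model (requires network access and an API token):
# output = generate(payload)
```

Keeping payload construction separate from the network call makes it possible to inspect or log the parameters before spending a run.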
Technical Specifications
- Hardware Type: A100 (80GB)
- Run Count: 1.7k
- Commercial Use: Unknown/Restricted
- Platform: Replicate