MMAudio V2 Video to Audio
Add high-quality, synchronized soundscapes to your videos with MMAudio V2 Video to Audio. Discover how this AI model can fit into your workflow!
🚀 Function Overview
Generates high-quality audio from video content with temporal synchronization, optimized for cost efficiency on T4 GPUs.
Key Features
- Transforms visual content into contextually appropriate audio
- Maintains temporal consistency with video events
- Adjustable audio parameters via prompts and settings
- Supports environmental sound synthesis and action-to-sound mapping
- Cost-optimized for T4 hardware
Use Cases
- Film and video post-production
- Silent film restoration
- Educational content enhancement
- Gaming and VR sound design
- Accessibility improvements for videos
⚙️ Input Parameters
- prompt (string): Text prompt for the generated audio
- negative_prompt (string): Negative prompt to avoid certain sounds
- video (string): Optional video file for video-to-audio generation
- duration (number): Duration of the output in seconds
- num_steps (integer): Number of inference steps
- cfg_strength (number): Guidance strength (CFG)
- seed (integer): Random seed; use -1 or leave blank to randomize the seed
- image (string): Optional image file for image-to-audio generation (experimental)
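As a rough sketch of how these parameters map onto an API call, the snippet below uses the Replicate Python client. The model identifier and all input values here are illustrative placeholders, not values taken from this listing; substitute the actual owner/name (and version hash) shown on the model page, and set REPLICATE_API_TOKEN in your environment so the client can authenticate.

```python
import replicate

# Placeholder identifier: replace with the real owner/name(:version)
# from the MMAudio V2 listing on Replicate.
MODEL = "owner/mmaudio-v2"

inputs = {
    "prompt": "gentle rain on a tin roof",   # text prompt for the generated audio
    "negative_prompt": "music",              # sounds to steer away from
    "duration": 8,                           # output length in seconds
    "num_steps": 25,                         # number of inference steps
    "cfg_strength": 4.5,                     # classifier-free guidance strength
    "seed": -1,                              # -1 or omitted randomizes the seed
    # "video": "https://example.com/clip.mp4",   # optional: video-to-audio
    # "image": "https://example.com/frame.png",  # optional: experimental image-to-audio
}

output = replicate.run(MODEL, input=inputs)
print(output)  # typically a URL or file handle for the generated audio
```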
💡 Usage Examples
Example 1
Input Parameters
{ "video": "https://huggingface.co/hkchengrex/MMAudio/resolve/main/examples/sora_kraken.mp4", "prompt": "waves, storm", "duration": 10, "num_steps": 25, "cfg_strength": 4.5, "negative_prompt": "music" }
Technical Specifications
- Hardware Type: T4
- Run Count: 391
- Commercial Use: Unknown/Restricted
- Platform: Replicate
Related Models
DrumTest2 Rhythmic Audio Transformer
Transforms any rhythmic sound—a drum kit, beatboxing, a toy drum, even drumming on your belly—into a pro-quality performance on Zohar's studio drum kit.
Speaker Diarization
Speaker Diarization with "pyannote/speaker-diarization-3.1"
Resemble Enhance AI
Enhances and cleans up audio files containing speech