GPT-4o Mini Transcribe
GPT-4o Mini Transcribe is a speech-to-text model from OpenAI.
🚀Function Overview
Transcribes audio files to text using GPT-4o mini, offering improved accuracy and language recognition over previous models.
Key Features
- Supports multiple audio formats (mp3, mp4, mpeg, etc.)
- Language specification via ISO-639-1 codes for enhanced accuracy
- Optional text prompts to guide transcription style
- Adjustable sampling temperature for output control
- 16,000 token context window for long audio handling
- 2,000 max output tokens
- Improved word error rate over Whisper models
Use Cases
- Converting spoken content into written transcripts
- Generating subtitles for video/audio content
- Accessibility tools for hearing-impaired users
- Meeting/lecture documentation
⚙️Input Parameters
- audio_file (string): The audio file to transcribe. Supported formats: mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
- language (string): The language of the input audio. Supplying the input language in ISO-639-1 format (e.g. en) will improve accuracy and latency.
- prompt (string): Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
- temperature (number): Sampling temperature between 0 and 1.
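The parameters above can be assembled and checked on the client side before a request is sent. The following is a minimal sketch, not part of the model's API: the helper name `build_input` and the constant `SUPPORTED_FORMATS` are hypothetical, and the checks only mirror the constraints stated in the parameter descriptions (supported file extensions, a two-letter language code, temperature between 0 and 1). The service may enforce them differently.

```python
import os
from urllib.parse import urlparse

# Extensions taken from the audio_file parameter description.
SUPPORTED_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}


def build_input(audio_file, language=None, prompt=None, temperature=None):
    """Return an input dict for the model (hypothetical helper).

    Raises ValueError when a value contradicts the documented constraints.
    """
    # Works for URLs and simple relative paths; the extension is read from the path part.
    ext = os.path.splitext(urlparse(audio_file).path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {ext!r}")

    payload = {"audio_file": audio_file}

    if language is not None:
        # ISO-639-1 codes are two ASCII letters, e.g. "en".
        if len(language) != 2 or not (language.isascii() and language.isalpha()):
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {language!r}")
        payload["language"] = language.lower()

    if prompt is not None:
        payload["prompt"] = prompt

    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        payload["temperature"] = temperature

    return payload
```

Optional parameters that are left as `None` are omitted from the dict, so the model's own defaults apply.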
💡Usage Examples
Example 1
Input Parameters
{ "language": "en", "audio_file": "https://replicate.delivery/xezq/ejt5KPWzFp25fUGtjPhwFmeeG5nFpCvu5zSMIySXnemTWn0lC/tmptuxz6n1z.mp3", "temperature": 0 }
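One way to submit the example input is through the Replicate Python client. This is a sketch under assumptions rather than a confirmed recipe: the model reference `openai/gpt-4o-mini-transcribe` is assumed and should be checked against the model page, the `replicate` package must be installed with a `REPLICATE_API_TOKEN` set in the environment, and the shape of the returned output (a string or a sequence of text chunks) is also an assumption.

```python
import json

# The example input from this page, parsed from its JSON form.
EXAMPLE_INPUT = json.loads(
    """
    {
      "language": "en",
      "audio_file": "https://replicate.delivery/xezq/ejt5KPWzFp25fUGtjPhwFmeeG5nFpCvu5zSMIySXnemTWn0lC/tmptuxz6n1z.mp3",
      "temperature": 0
    }
    """
)

# Assumed model reference; verify it on the model page before use.
MODEL_REF = "openai/gpt-4o-mini-transcribe"


def transcribe(payload, model_ref=MODEL_REF):
    """Run the model on Replicate and return the transcript as one string."""
    # Imported lazily so the module loads even without the third-party client.
    import replicate

    output = replicate.run(model_ref, input=payload)
    # Assumption: the output is either a string or an iterable of text chunks.
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)
```

Calling `transcribe(EXAMPLE_INPUT)` performs a network request and is billed by Replicate, so it is left out of the block itself.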
Technical Specifications
- Hardware Type
- Run Count: 197
- Commercial Use: Supported
- Pricing: Priced by multiple properties
- Platform: Replicate
Related Models
DrumTest2 Rhythmic Audio Transformer
Transforms any rhythmic sound—a drum kit, beatboxing, a toy drum, even drumming on your belly—into a pro-quality performance on Zohar's studio drum kit.
Speaker Diarization
Speaker Diarization with "pyannote/speaker-diarization-3.1"
Resemble Enhance AI
Optimizes audio files with speech