GPT-4o Transcribe
Transform your audio into accurate text with GPT-4o Transcribe.
🚀Function Overview
Transcribes audio files into text using GPT-4o technology with improved accuracy over previous models.
Key Features
- High-accuracy speech recognition
- Supports multiple audio formats (mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm)
- Optional language specification for better accuracy
- Prompt-guided transcription style control
- Temperature parameter for output variability
- 16,000 token context window
- 2,000 token output limit
- Knowledge cutoff: June 2024
Use Cases
- Transcribing meetings or interviews
- Generating captions for videos
- Accessibility applications for audio content
- Multilingual content transcription
- Academic research transcription
⚙️Input Parameters
audio_file
(string) The audio file to transcribe. Supported formats: mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
language
(string) The language of the input audio. Supplying the input language in ISO-639-1 (e.g. en) format will improve accuracy and latency.
prompt
(string) An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
temperature
(number) Sampling temperature between 0 and 1.
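The constraints listed for these parameters can be checked before a request is sent. The following is a minimal sketch, not part of the model's API: the helper name `build_input` and its error messages are hypothetical, and the checks only mirror the descriptions above (supported file extensions, a two-letter language code, a temperature between 0 and 1).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the audio_file parameter description.
SUPPORTED_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}


def build_input(audio_file, language=None, prompt=None, temperature=None):
    """Hypothetical helper: assemble an input dict and validate each field."""
    # Read the extension from the URL path so query strings do not interfere.
    path = urlparse(audio_file).path
    suffix = PurePosixPath(path).suffix.lstrip(".").lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {suffix!r}")

    payload = {"audio_file": audio_file}

    if language is not None:
        # ISO-639-1 codes are two ASCII letters, e.g. "en".
        if len(language) != 2 or not (language.isascii() and language.isalpha()):
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {language!r}")
        payload["language"] = language.lower()

    if prompt is not None:
        payload["prompt"] = prompt

    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        payload["temperature"] = temperature

    return payload
```

Only the fields that were supplied end up in the returned dict, so the optional parameters are simply left out when not needed.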
💡Usage Examples
Example 1
Input Parameters
{ "language": "en", "audio_file": "https://replicate.delivery/xezq/XoxHeakty0z3KKc46cMLPKC2ct54ekT3EtvcwDQuRIuxfJdpA/tmpsglqtqn5.mp3", "temperature": 0 }
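As a rough sketch, the example input above could be sent through the Replicate Python client (`pip install replicate`, with `REPLICATE_API_TOKEN` set in the environment). The model reference `openai/gpt-4o-transcribe` and the shape of the output (a single string or a sequence of text chunks) are assumptions here, so check the model page on Replicate before relying on them. The `transcribe` helper and its `run` argument are hypothetical names used for illustration.

```python
# Assumed model reference; verify it on the Replicate model page.
MODEL_REF = "openai/gpt-4o-transcribe"

EXAMPLE_INPUT = {
    "language": "en",
    "audio_file": "https://replicate.delivery/xezq/XoxHeakty0z3KKc46cMLPKC2ct54ekT3EtvcwDQuRIuxfJdpA/tmpsglqtqn5.mp3",
    "temperature": 0,
}


def transcribe(payload, run=None):
    """Hypothetical wrapper: run the model and return the transcript as one string."""
    if run is None:
        # Imported lazily so the module loads even without the client installed.
        import replicate

        run = replicate.run

    output = run(MODEL_REF, input=payload)

    # Assumption: the output is either a string or an iterable of text chunks.
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)


if __name__ == "__main__":
    print(transcribe(EXAMPLE_INPUT))
```

The `run` argument allows a stand-in function to be passed during testing, so the wrapper can be exercised without network access or an API token.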
Technical Specifications
- Hardware Type: not listed
- Run Count: 975
- Commercial Use: Supported
- Pricing: Priced by multiple properties
- Platform: Replicate
Related Models
DrumTest2 Rhythmic Audio Transformer
Transforms any rhythmic sound—a drum kit, beatboxing, a toy drum, even drumming on your belly—into a pro-quality performance on Zohar's studio drum kit.
Speaker Diarization
Speaker Diarization with "pyannote/speaker-diarization-3.1"
Resemble Enhance AI
Optimizes audio files with speech