Speech Recognition with whisper-large-v3
Transcribe audio effortlessly with whisper-large-v3.
Step 1: Prep Your Audio
Prepare your audio file in a common format like WAV, MP3, or M4A.
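A quick pre-flight check can catch an unexpected file type before you upload. The sketch below is a minimal example, not part of the API: `has_known_audio_suffix` and `KNOWN_AUDIO_SUFFIXES` are hypothetical names, and the set contains only the three formats mentioned above (the service may accept others).

```python
from pathlib import Path

# Only the formats named in this guide; the service may accept more.
KNOWN_AUDIO_SUFFIXES = {".wav", ".mp3", ".m4a"}


def has_known_audio_suffix(path: str) -> bool:
    """Return True if the file extension is one of the formats listed above."""
    return Path(path).suffix.lower() in KNOWN_AUDIO_SUFFIXES
```

The check looks only at the file name, so it will not detect a mislabeled or corrupt file.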
Step 2: Send a Transcription Request
import requests

url = "https://api.deeprequest.io/v1/audio/transcriptions"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

data = {
    "model": "whisper-large-v3",
    "response_format": "json",  # Options: json, text, srt, vtt, verbose_json
}
# Optional: specify language for better accuracy
# data["language"] = "en"

# Open the file in a context manager so the handle is closed after the upload
with open("path/to/your/audio.mp3", "rb") as audio:
    files = {"file": ("audio.mp3", audio)}
    response = requests.post(url, files=files, data=data, headers=headers)

response.raise_for_status()  # Surface HTTP errors before parsing the body
print(response.json()["text"])
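If you send requests from several places, it may help to build the form fields in one function so that a mistyped `response_format` fails early. This is a sketch under assumptions: `build_transcription_data` and `RESPONSE_FORMATS` are hypothetical names, and the allowed set is simply the list of options from the comment in the request above.

```python
from typing import Dict, Optional

# Formats listed in the request example; other values are not documented here.
RESPONSE_FORMATS = {"json", "text", "srt", "vtt", "verbose_json"}


def build_transcription_data(
    response_format: str = "json",
    language: Optional[str] = None,
    model: str = "whisper-large-v3",
) -> Dict[str, str]:
    """Build the form fields for a transcription request (hypothetical helper)."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    data = {"model": model, "response_format": response_format}
    # Only include the language field when the caller actually knows it.
    if language is not None:
        data["language"] = language
    return data
```

The returned dictionary can be passed as the `data` argument of `requests.post` in the example above.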
Step 3: Translation (Optional)
To translate audio directly to English:
import requests

url = "https://api.deeprequest.io/v1/audio/translations"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

data = {
    "model": "whisper-large-v3",
    "response_format": "json",
}

# Open the file in a context manager so the handle is closed after the upload
with open("path/to/your/audio.mp3", "rb") as audio:
    files = {"file": ("audio.mp3", audio)}
    response = requests.post(url, files=files, data=data, headers=headers)

response.raise_for_status()  # Surface HTTP errors before parsing the body
print(response.json()["text"])
Pro Tips
- For high-quality results, use clear audio with minimal background noise
- File size limit is 25MB
- For longer audio files, consider chunking them into smaller segments
- Set the language parameter for better accuracy when the language is known
- Use response_format: "verbose_json" to get timestamps and confidence scores
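The size limit and the chunking tip can be sketched as two small helpers. Treat this as an illustration under stated assumptions: `fits_upload_limit` and `plan_chunks` are hypothetical names, 25MB is assumed to mean 25 * 1024 * 1024 bytes, and the 600-second default chunk length is an arbitrary choice, not a value from the service.

```python
import os
from typing import List, Tuple

# Assumption: the 25MB limit is counted in binary megabytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def fits_upload_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file on disk is no larger than the upload limit."""
    return os.path.getsize(path) <= limit


def plan_chunks(duration_s: float, chunk_s: float = 600.0) -> List[Tuple[float, float]]:
    """Split [0, duration_s] into consecutive (start, end) windows in seconds.

    Every window is at most chunk_s long; the last one may be shorter.
    """
    if duration_s < 0 or chunk_s <= 0:
        raise ValueError("duration_s must be >= 0 and chunk_s must be > 0")
    duration_s = float(duration_s)
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        start = end
    return chunks
```

`plan_chunks` only computes the time windows. Cutting the audio itself requires an audio tool of your choice, and each resulting file should be checked with `fits_upload_limit` before it is sent.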