Description
The OpenAI API supports streaming for transcription with the gpt-4o-transcribe and gpt-4o-mini-transcribe models, but this functionality is not currently exposed in the library. This feature would be particularly useful for real-time transcription of longer audio files.
Current Behavior
The library only supports non-streaming transcription through the transcribe() method, which waits for the entire audio file to be processed before returning the result.
Expected Behavior
Add support for streaming transcription similar to how speechStreamed() is implemented for text-to-speech. This would allow for:
- Real-time transcription output
- Better handling of longer audio files
- Progress monitoring during transcription
- Reduced memory usage for large files
Proposed Implementation
The implementation could follow a similar pattern to the existing speechStreamed() method, but for the transcription endpoint. This would involve:
- Adding a new method transcribeStreamed() to the Audio resource class
- Supporting the stream: true parameter in the transcription request
- Handling the streaming response format
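To make the last step concrete, here is a rough, library-neutral sketch of what handling the streaming response could involve. It is not the library's code. It assumes the endpoint sends server-sent events whose data lines each carry one JSON object, with a type of transcript.text.delta (partial text in a delta field) or transcript.text.done (full text in a text field). That matches my reading of the API reference but should be verified. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_transcript_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of every `data:` line in an SSE stream.

    Simplification: assumes each event fits on a single `data:` line.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, comments, and other fields such as `event:`.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Defensive: some OpenAI streams end with a literal [DONE] sentinel.
        if payload == "[DONE]":
            break
        yield json.loads(payload)


def collect_transcript(lines: Iterable[str]) -> str:
    """Return the final transcript, falling back to the joined deltas."""
    deltas = []
    for event in iter_transcript_events(lines):
        kind = event.get("type")
        if kind == "transcript.text.delta":  # assumed event name
            deltas.append(event["delta"])
        elif kind == "transcript.text.done":  # assumed event name
            return event["text"]
    # Stream ended without a done event: use what was received so far.
    return "".join(deltas)


# Hand-written sample stream, for illustration only.
sample = [
    'data: {"type": "transcript.text.delta", "delta": "Hello"}',
    "",
    'data: {"type": "transcript.text.delta", "delta": " world"}',
    "",
    'data: {"type": "transcript.text.done", "text": "Hello world"}',
    "",
]
```

A transcribeStreamed() method would presumably expose something like the event iterator to callers, so that each delta can be shown as it arrives, rather than collecting the whole transcript as the second helper does.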
API Reference
The streaming functionality is documented in the OpenAI API reference:
https://platform.openai.com/docs/api-reference/audio/createTranscription
Additional Context
This feature would be particularly valuable for applications that need to:
- Show real-time transcription progress
- Handle large audio files efficiently
- Provide immediate feedback to users during transcription