How Transcription Works
Ospri Brain uses a two-stage transcription process:
Stage 1: Real-Time Streaming
During the meeting, the bot streams live transcript segments as they’re spoken. These appear in real time and are useful for monitoring but may have lower accuracy.
Stage 2: Final Transcript
After the meeting ends, the full recording is processed through Deepgram (or Recall.ai’s built-in transcription) to generate a high-quality final transcript. This replaces the real-time segments with a more accurate version.
Speaker Diarization
Ospri uses Perfect Diarization: each speaker is identified separately using individual audio streams when available (e.g., on Microsoft Teams). This means:
- Each speaker gets a consistent label throughout the transcript
- No confusion between speakers even in fast-paced conversations
- Names are matched to calendar attendees when possible
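
To make the labeling behavior above more concrete, here is a minimal sketch of how stream names could be mapped to consistent display labels. It is an illustration only, not Ospri's actual matching logic: the function `label_speakers`, the case-insensitive comparison, and the `Speaker N` fallback are all assumptions.

```python
def label_speakers(stream_names, attendees):
    """Map each raw stream name to one stable display label.

    Hypothetical helper: matches names case-insensitively against the
    calendar attendee list and falls back to a generic "Speaker N" label.
    """
    # Index attendees by a normalized key so "alice chen" matches "Alice Chen".
    by_key = {name.strip().casefold(): name for name in attendees}
    labels = {}
    unknown_count = 0
    for raw in stream_names:
        if raw in labels:
            # Already labeled: reuse it so the label stays consistent.
            continue
        match = by_key.get(raw.strip().casefold())
        if match is not None:
            labels[raw] = match
        else:
            unknown_count += 1
            labels[raw] = f"Speaker {unknown_count}"
    return labels
```

Because each raw name is labeled once and then reused, a speaker keeps the same label for the whole transcript, whether or not a calendar match was found.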
Viewing the Transcript
- Click any meeting in the Recordings tab
- Go to the Transcript tab
- The transcript is displayed chronologically with:
  - Speaker name (bold)
  - Timestamp (clickable if video is available)
  - Spoken text
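
If you work with transcript data outside the app, the layout described above can be approximated as follows. This is a sketch under assumptions: the `TranscriptEntry` fields, the timestamp format, and the Markdown-style bold markers are illustrative and do not describe Ospri's export format.

```python
from dataclasses import dataclass


@dataclass
class TranscriptEntry:
    # Hypothetical shape of one transcript entry.
    speaker_name: str
    start_time: float  # seconds from the start of the meeting
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS, or H:MM:SS for meetings over an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def render_transcript(entries):
    """Return one line per entry, in chronological order."""
    ordered = sorted(entries, key=lambda e: e.start_time)
    return [
        f"**{e.speaker_name}** [{format_timestamp(e.start_time)}] {e.text}"
        for e in ordered
    ]
```

For example, entries for Bob at 75.2 seconds and Alice at 3 seconds are rendered with Alice first, since the output is sorted by start time.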
Transcript Status
| Status | Meaning |
|---|---|
| Pending | Meeting ended but transcript hasn’t been generated yet |
| In Progress | Transcript is currently being generated |
| Done | Full transcript is available |
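
The three statuses in the table can be modeled as a small enumeration. The sketch below is illustrative: the status labels come from the table, while the class name and helper functions are assumptions, not part of a documented Ospri API.

```python
from enum import Enum


class TranscriptStatus(Enum):
    # Values mirror the labels in the status table.
    PENDING = "Pending"
    IN_PROGRESS = "In Progress"
    DONE = "Done"


def parse_status(label: str) -> TranscriptStatus:
    """Convert a status label to the enum; raises ValueError if unknown."""
    return TranscriptStatus(label.strip())


def is_transcript_ready(status: TranscriptStatus) -> bool:
    """Only the Done status means the full transcript is available."""
    return status is TranscriptStatus.DONE
```

Treating every status other than Done as "not ready" keeps the check simple.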
Accuracy Notes
- Final transcripts are generated with accuracy-prioritized settings
- Speaker identification uses separate audio streams when the platform supports it
- For platforms without separate streams, AI-based speaker separation is used
- Medical and scientific terminology is generally handled well but may occasionally need manual correction
Transcript Deduplication
The system automatically prevents duplicate transcript segments. A unique index on (meeting_id, speaker_name, start_time) ensures each spoken segment is stored exactly once, even if real-time and final transcription processes overlap.
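
The effect of such a unique index can be demonstrated with a small, self-contained example using SQLite. This is a sketch only: the three indexed columns come from the description above, while the database engine, the table name `transcript_segments`, the index name, the `text` column, and the `store_segment` helper are assumptions.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transcript_segments (
        id INTEGER PRIMARY KEY,
        meeting_id TEXT NOT NULL,
        speaker_name TEXT NOT NULL,
        start_time REAL NOT NULL,
        text TEXT NOT NULL
    )
    """
)
# The unique index: one row per (meeting, speaker, start time).
conn.execute(
    """
    CREATE UNIQUE INDEX idx_segment_identity
    ON transcript_segments (meeting_id, speaker_name, start_time)
    """
)


def store_segment(conn, meeting_id, speaker_name, start_time, text):
    """Insert a segment; return True if stored, False if it was a duplicate."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO transcript_segments "
        "(meeting_id, speaker_name, start_time, text) VALUES (?, ?, ?, ?)",
        (meeting_id, speaker_name, start_time, text),
    )
    # rowcount is 0 when the unique index caused the row to be skipped.
    return cursor.rowcount == 1
```

In this sketch a second write with the same key is silently skipped. Whether the real system skips the later write or overwrites the earlier one is not stated here. Either way, the index guarantees a single stored row per segment.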