What Is Speaker Diarization
Speaker diarization answers the question "who spoke when?". It is essential for analytics, compliance, and meeting transcription.
Who
Identify speaker
When
Timestamps
What
Transcription
Use Cases
Call Center Analytics
Analyze agent vs customer separately
Meeting Transcription
Attribute statements to participants
Multi-party Calls
Track who said what in group calls
Compliance Recording
Verify specific disclosures were made
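As a rough illustration of the call center use case, per-speaker talk time and word counts can be derived directly from diarized turns. The sketch below assumes each turn is a dictionary with `speaker`, `start`, `end`, and `text` keys (the shape used in the output example in this document). `summarize_speakers` is a hypothetical helper name, and counting words by whitespace splitting is only an approximation.

```python
from collections import defaultdict


def summarize_speakers(turns):
    """Aggregate talk time (seconds) and word count per speaker label."""
    totals = defaultdict(lambda: {"total_time": 0.0, "word_count": 0})
    for turn in turns:
        stats = totals[turn["speaker"]]
        stats["total_time"] += turn["end"] - turn["start"]
        # Whitespace splitting is a simplification of real tokenization.
        stats["word_count"] += len(turn["text"].split())
    # Round to hide floating-point noise from the subtractions.
    return {
        speaker: {
            "total_time": round(stats["total_time"], 2),
            "word_count": stats["word_count"],
        }
        for speaker, stats in totals.items()
    }


# Illustrative turns (assumed data, not real call output).
turns = [
    {"speaker": "Agent", "start": 0.0, "end": 3.2,
     "text": "Good day, how can I help you?"},
    {"speaker": "Customer", "start": 3.5, "end": 8.1,
     "text": "Hi, I have a problem with my order."},
    {"speaker": "Agent", "start": 8.4, "end": 12.0,
     "text": "Sure, can you tell me the order number?"},
]

summary = summarize_speakers(turns)
total_speech = sum(s["total_time"] for s in summary.values())
# Share of all speech time taken by the agent.
agent_talk_ratio = summary["Agent"]["total_time"] / total_speech
```

A talk ratio like `agent_talk_ratio` is one way to compare agent and customer behavior separately. Which statistics matter depends on the analytics goal.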
Processing Pipeline
1. Voice Activity Detection
Find when someone is speaking
2. Speaker Embedding
Extract voice signature for each segment
3. Clustering
Group segments by speaker
4. Speaker Assignment
Label each cluster as Speaker 1, 2, etc.
5. Timestamp Alignment
Sync with transcription
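Steps 3 to 5 above can be sketched in plain Python. This is a minimal illustration, not a production method. It assumes steps 1 and 2 have already produced speech segments and one embedding per segment, and it uses hand-made 2-D vectors where a real system would use embeddings from a neural model. The greedy threshold clustering is a simple stand-in chosen for brevity, and the names `cluster_segments` and `align_words` and the `threshold` value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_segments(embeddings, threshold=0.8):
    """Steps 3-4: greedy clustering, returning one 'Speaker N' label per segment."""
    centroids = []  # running sum of member embeddings (cosine ignores scale)
    labels = []
    for emb in embeddings:
        sims = [cosine(emb, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            # Similar enough: merge into the existing cluster.
            centroids[best] = [c + e for c, e in zip(centroids[best], emb)]
        else:
            # No close cluster: start a new speaker.
            centroids.append(list(emb))
            best = len(centroids) - 1
        labels.append(f"Speaker {best + 1}")
    return labels


def align_words(words, segments, labels):
    """Step 5: give each word the label of the segment it overlaps most.

    A word overlapping no segment falls back to the first segment here;
    a real system would handle that case explicitly.
    """
    aligned = []
    for w_start, w_end, text in words:
        overlaps = [
            max(0.0, min(w_end, s_end) - max(w_start, s_start))
            for s_start, s_end in segments
        ]
        best = max(range(len(segments)), key=overlaps.__getitem__)
        aligned.append((labels[best], text))
    return aligned


# Assumed inputs: segment boundaries in seconds and toy embeddings.
segments = [(0.0, 3.2), (3.5, 8.1), (8.4, 12.0)]
embeddings = [[0.9, 0.1], [0.1, 0.95], [0.85, 0.2]]
words = [(0.2, 0.6, "Hello"), (3.6, 4.0, "Hi"), (8.5, 9.0, "Sure")]

labels = cluster_segments(embeddings)
aligned = align_words(words, segments, labels)
```

With these toy inputs the first and third segments end up in the same cluster. Greedy clustering depends on segment order and on the threshold, so batch methods such as agglomerative or spectral clustering are common alternatives.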
Output Example
{
  "transcript": [
    {
      "speaker": "Agent",
      "start": 0.0,
      "end": 3.2,
      "text": "Good day, how can I help you?"
    },
    {
      "speaker": "Customer",
      "start": 3.5,
      "end": 8.1,
      "text": "Hi, I have a problem with my order."
    },
    {
      "speaker": "Agent",
      "start": 8.4,
      "end": 12.0,
      "text": "Sure, can you tell me the order number?"
    }
  ],
  "speakers": {
    "Agent": { "total_time": 6.8, "word_count": 15 },
    "Customer": { "total_time": 4.6, "word_count": 8 }
  }
}
Quality Metrics
| Metric | Description | Target |
|---|---|---|
| Diarization Error Rate (DER) | Missed speech, false-alarm speech, and speaker confusion combined, as a share of reference speech time | <15% |
| Speaker Confusion | Wrong speaker assignment | <10% |
| Boundary Accuracy | Turn-taking precision | ±200 ms |
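To make the DER row concrete, here is a minimal frame-level sketch. It assumes reference and hypothesis are equal-length lists of per-frame speaker labels with `None` for non-speech, that hypothesis labels have already been mapped to reference labels, and that there is no overlapping speech. Standard scoring tools also search for the best label mapping and may ignore a short collar around turn boundaries, both of which this sketch omits. The function name `diarization_error_rate` is hypothetical.

```python
def diarization_error_rate(reference, hypothesis):
    """Frame-level DER: (missed + false alarm + confusion) / reference speech."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    missed = false_alarm = confusion = ref_speech = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            ref_speech += 1
            if hyp is None:
                missed += 1        # speech present but not detected
            elif hyp != ref:
                confusion += 1     # speech detected, wrong speaker
        elif hyp is not None:
            false_alarm += 1       # speech reported where there is none
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")
    return {
        "missed": missed / ref_speech,
        "false_alarm": false_alarm / ref_speech,
        "confusion": confusion / ref_speech,
        "der": (missed + false_alarm + confusion) / ref_speech,
    }


# Toy example: one label per frame, None marks silence (assumed data).
reference = ["A", "A", "A", "A", None, None, "B", "B", "B", "B"]
hypothesis = ["A", "A", "A", None, None, "B", "B", "B", "B", "A"]

scores = diarization_error_rate(reference, hypothesis)
# One missed, one false-alarm, and one confused frame over 8 speech frames.
```

Because false alarms are counted against reference speech time only, DER can exceed 100% on recordings with little true speech.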