Codec = Quality + Efficiency

Codecs compress and decompress audio. The right choice balances voice quality against bandwidth use and latency.

Codec Comparison

Codec	Bitrate	Sample Rate	MOS	Latency	Best For
G.711 μ-law	64 kbps	8 kHz	4.1	Very Low	PSTN compatibility
G.711 A-law	64 kbps	8 kHz	4.1	Very Low	EU PSTN
Opus	6-510 kbps	8-48 kHz	4.5+	Low	WebRTC, quality
G.729	8 kbps	8 kHz	3.9	Medium	Bandwidth savings
G.722	64 kbps	16 kHz	4.3	Low	HD Voice
SILK	6-40 kbps	8-24 kHz	4.2	Low	Variable networks

Opus: Recommended for Voice AI

Why Opus?

✓ Adaptive bitrate based on network
✓ Wideband audio (16 kHz+) for clear speech
✓ Low latency (2.5-60ms frames)
✓ Excellent packet loss resilience
✓ Open and royalty-free
✓ Native WebRTC support

Opus Configuration

{
  "codec": "opus",
  "bitrate": 24000,      // 24 kbps
  "sampleRate": 16000,   // 16 kHz
  "channels": 1,         // Mono
  "frameSize": 20,       // 20ms frames
  "fec": true,           // Forward error correction
  "dtx": true            // Discontinuous transmission
}
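
As a rough illustration of what these settings imply per packet, the sketch below derives samples per frame, packet rate, and average payload size from the configuration above. The helper name `frame_stats` is hypothetical, and the payload size is only an average: Opus output varies frame to frame, and DTX suppresses packets during silence.

```python
# Values copied from the configuration above.
config = {
    "codec": "opus",
    "bitrate": 24000,     # bits per second
    "sampleRate": 16000,  # Hz
    "channels": 1,
    "frameSize": 20,      # milliseconds
}


def frame_stats(cfg):
    """Return (samples per frame, packets per second, average payload bytes).

    Assumes frameSize is a whole number of milliseconds that divides 1000.
    """
    samples_per_frame = cfg["sampleRate"] * cfg["frameSize"] // 1000
    packets_per_second = 1000 // cfg["frameSize"]
    avg_payload_bytes = cfg["bitrate"] // 8 // packets_per_second
    return samples_per_frame, packets_per_second, avg_payload_bytes


print(frame_stats(config))  # (320, 50, 60)
```

At 20 ms frames, each packet carries 320 samples in roughly 60 bytes of payload, which is why header overhead matters so much at low bitrates.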

G.711: PSTN Standard

μ-law (PCMU)

Used in North America and Japan. Optimized for voice frequencies.

Sample Rate: 8000 Hz

Bit Depth: 8 bits

Payload Type: 0

A-law (PCMA)

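Used in Europe and most of the rest of the world. Slightly lower quantization distortion on quiet signals than μ-law, at the cost of slightly less dynamic range.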

Sample Rate: 8000 Hz

Bit Depth: 8 bits

Payload Type: 8
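
The 8-bit depth listed above comes from logarithmic companding of linear PCM. The following is a minimal sketch of the commonly used segment-based μ-law conversion for 16-bit samples, written in pure Python; the function names are illustrative, and A-law follows the same idea with a different segment layout.

```python
BIAS = 0x84   # offset added before locating the segment
CLIP = 32635  # largest magnitude that still fits after adding BIAS


def linear_to_ulaw(sample):
    """Compress one signed 16-bit PCM sample into an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte):
    """Expand one 8-bit mu-law byte back to a signed PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


print(hex(linear_to_ulaw(0)))                 # 0xff (digital silence)
print(ulaw_to_linear(linear_to_ulaw(-1000)))  # -988 (small quantization error)
```

Quantization steps grow with amplitude, so quiet speech keeps fine resolution while loud peaks are coded coarsely.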

Codec Selection Strategy

PSTN Calls (Inbound/Outbound)

Use G.711 (PCMU/PCMA) - universal compatibility, no transcoding needed.

Offer: PCMU, PCMA → Accept: carrier default

WebRTC Browser Calls

Use Opus - best quality, adaptive to network conditions.

Offer: opus/48000/2 → Best audio quality

Low Bandwidth Scenarios

Use G.729 or Opus at low bitrate - efficient compression.

Offer: G729, opus@8kbps → Bandwidth efficient
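
The three scenarios above can be sketched as a lookup from call type to a preference-ordered offer, rendered as SDP `a=rtpmap` lines. The `CallType` enum and `rtpmap_lines` helper are hypothetical names. Payload types 0, 8, and 18 are the static RTP assignments for PCMU, PCMA, and G729; Opus uses a dynamic payload type, and 111 here is only an assumption.

```python
from enum import Enum


class CallType(Enum):
    PSTN = "pstn"
    WEBRTC = "webrtc"
    LOW_BANDWIDTH = "low_bandwidth"


# Preference-ordered offers, following the strategy described above.
OFFERS = {
    CallType.PSTN: ["PCMU/8000", "PCMA/8000"],
    CallType.WEBRTC: ["opus/48000/2"],
    CallType.LOW_BANDWIDTH: ["G729/8000", "opus/48000/2"],
}

# 0, 8, 18 are static RTP payload types; 111 for Opus is an assumed dynamic value.
PAYLOAD_TYPES = {
    "PCMU/8000": 0,
    "PCMA/8000": 8,
    "G729/8000": 18,
    "opus/48000/2": 111,
}


def rtpmap_lines(call_type):
    """Render the offer for a call type as SDP rtpmap attribute lines."""
    return [f"a=rtpmap:{PAYLOAD_TYPES[c]} {c}" for c in OFFERS[call_type]]


for line in rtpmap_lines(CallType.PSTN):
    print(line)
```

Opus is advertised as `opus/48000/2` in SDP regardless of the bitrate actually used, so a low-bitrate Opus offer differs only in its format parameters, not in its rtpmap line.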

Quality vs Bandwidth

Codec	Bandwidth*	Quality
G.711	80 kbps	82%
Opus 24kbps	40 kbps	90%
Opus 16kbps	32 kbps	85%
G.722	80 kbps	86%
G.729	24 kbps	78%

* Including RTP/UDP/IP header overhead (40 bytes per packet at 20 ms packetization, which adds 16 kbps). Quality is MOS expressed as a percentage of the 5.0 maximum.
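
The bandwidth figures above are the codec bitrate plus per-packet header overhead. A minimal sketch of that arithmetic, assuming 20 ms packetization and uncompressed IPv4, UDP, and RTP headers; the function name `on_wire_kbps` and the `extra_header_bytes` parameter are illustrative.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no header compression


def on_wire_kbps(codec_bps, frame_ms=20, extra_header_bytes=0):
    """Codec bitrate plus header overhead, in kbps.

    extra_header_bytes can model link-layer framing, e.g. 18 for Ethernet.
    """
    packets_per_second = 1000 // frame_ms
    header_bytes = IP_UDP_RTP_HEADER_BYTES + extra_header_bytes
    header_bps = header_bytes * 8 * packets_per_second
    return (codec_bps + header_bps) / 1000


for name, bps in [("G.711", 64000), ("Opus 24kbps", 24000),
                  ("Opus 16kbps", 16000), ("G.729", 8000)]:
    print(name, on_wire_kbps(bps))
```

With these assumptions the overhead is a fixed 16 kbps, so it doubles the cost of a 16 kbps stream but adds only 25% to a 64 kbps one. Adding 18 bytes of Ethernet framing per packet raises a 64 kbps codec to 87.2 kbps.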

Transcoding Considerations

Avoid When Possible

Transcoding adds latency and can degrade quality.

• +5-20ms latency per transcode
• Quality loss (especially lossy→lossy)
• CPU resources consumed

When Necessary

Use dedicated transcoding resources.

• WebRTC ↔ PSTN calls
• Different codec endpoints
• Recording in specific format
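
One way to apply this is to transcode only when the two call legs share no codec. The sketch below assumes each leg is described by a preference-ordered list of codec names; `pick_codecs` is a hypothetical helper, not part of any particular media server API.

```python
def pick_codecs(leg_a, leg_b):
    """Return (codec for leg A, codec for leg B, whether transcoding is needed).

    Both arguments are non-empty, preference-ordered lists of codec names.
    """
    # Prefer the first codec on leg A that leg B also supports.
    for codec in leg_a:
        if codec in leg_b:
            return codec, codec, False
    # No overlap: each leg uses its own top choice and a transcoder bridges them.
    return leg_a[0], leg_b[0], True


print(pick_codecs(["opus", "PCMU"], ["PCMU", "PCMA"]))  # ('PCMU', 'PCMU', False)
print(pick_codecs(["opus"], ["PCMU", "PCMA"]))          # ('opus', 'PCMU', True)
```

The first call avoids transcoding by settling on a shared codec even though it is not leg A's first preference. The second is the WebRTC ↔ PSTN case, where a transcoder is unavoidable.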

Voice AI Codec Requirements

Component	Preferred Format	Reason
STT (Speech-to-Text)	16 kHz, 16-bit PCM	Optimal for speech recognition
TTS (Text-to-Speech)	24 kHz, 16-bit PCM	High quality synthesis output
Recording Storage	Opus or MP3	Storage efficiency
Live Playback	Match caller codec	Avoid transcoding
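
When the caller's audio arrives as 8 kHz PCM (for example, decoded G.711) and the STT engine expects 16 kHz, the stream has to be resampled. The sketch below upsamples by an integer factor with linear interpolation. It is only an illustration of the idea: the function name is hypothetical, and a production pipeline would more likely use a band-limited resampler from an audio library.

```python
def upsample_linear(samples, factor=2):
    """Upsample a list of integer PCM samples by an integer factor."""
    if not samples:
        return []
    out = []
    for current, nxt in zip(samples, samples[1:]):
        # Emit the current sample, then evenly spaced points toward the next one.
        for step in range(factor):
            out.append(current + (nxt - current) * step // factor)
    # Hold the final sample so the output is exactly len(samples) * factor long.
    out.extend([samples[-1]] * factor)
    return out


print(upsample_linear([0, 100, 200]))  # [0, 50, 100, 150, 200, 200]
```

Upsampling changes the sample rate but does not restore frequencies above 4 kHz that the narrowband codec already discarded.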

Crystal Clear Audio

Codec optimization for superior voice quality.
