Always Available
Automatic failover and redundancy keep voice AI available even during network or carrier failures.
High Availability Architecture
| Role | Region | Mode |
|---|---|---|
| Primary Region | București | Active |
| Secondary Region | Frankfurt | Hot Standby |
| DR Region | Amsterdam | Warm Standby |
Failover order: Primary (București) → Secondary (Frankfurt) → DR (Amsterdam)
RTO: <30 seconds | RPO: 0 (real-time replication)
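
The region order above can be read as a simple priority list. The following is a minimal sketch, assuming each region exposes a boolean health signal; `REGION_PRIORITY` and `pick_active_region` are hypothetical names used for illustration and do not describe the platform's actual routing logic.

```python
from typing import Mapping, Optional, Sequence

# Priority order taken from the page: Primary -> Secondary -> DR.
REGION_PRIORITY: Sequence[str] = ("București", "Frankfurt", "Amsterdam")


def pick_active_region(
    healthy: Mapping[str, bool],
    priority: Sequence[str] = REGION_PRIORITY,
) -> Optional[str]:
    """Return the first healthy region in priority order, or None if all are down."""
    for region in priority:
        # A region missing from the health map is treated as unhealthy.
        if healthy.get(region, False):
            return region
    return None


# Example: the primary region is down, so traffic would go to the hot standby.
status = {"București": False, "Frankfurt": True, "Amsterdam": True}
print(pick_active_region(status))  # Frankfurt
```

Treating an unknown region as unhealthy is a conservative choice for this sketch; a real deployment might prefer to alert on missing health data instead.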
SIP Trunk Failover
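| Priority | Role | Carrier | Host | State | Detail |
|---|---|---|---|---|---|
| 1 | Primary | Twilio | sip.twilio.com | Active | Latency: 25ms |
| 2 | Secondary | Vonage | sip.vonage.com | Standby | Last check: 5s ago |
| 3 | Tertiary | Bandwidth | sip.bandwidth.com | Standby | Last check: 5s ago |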
Failover triggers: 503 Service Unavailable, 408 Request Timeout, Network Unreachable
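
One way the listed triggers might translate into a routing decision is sketched below. This is an illustration under stated assumptions, not the platform's implementation: `CallAttemptResult`, `should_fail_over`, and `next_trunk` are hypothetical names, and the trunk order simply mirrors the Primary, Secondary, Tertiary order shown above.

```python
from dataclasses import dataclass
from typing import List, Optional

# SIP final response codes named as failover triggers on this page.
FAILOVER_SIP_CODES = {503, 408}

# Trunk hosts in the priority order shown above.
TRUNKS: List[str] = ["sip.twilio.com", "sip.vonage.com", "sip.bandwidth.com"]


@dataclass
class CallAttemptResult:
    """Outcome of one call attempt (hypothetical structure for this sketch)."""
    sip_status: Optional[int] = None   # final SIP response code, if one arrived
    network_unreachable: bool = False  # True if the trunk could not be reached


def should_fail_over(result: CallAttemptResult) -> bool:
    """True if the attempt matches one of the three listed triggers."""
    return result.network_unreachable or result.sip_status in FAILOVER_SIP_CODES


def next_trunk(current: str, trunks: List[str] = TRUNKS) -> Optional[str]:
    """Return the next trunk in priority order, or None if none remain."""
    idx = trunks.index(current)
    return trunks[idx + 1] if idx + 1 < len(trunks) else None


attempt = CallAttemptResult(sip_status=503)
if should_fail_over(attempt):
    print(next_trunk("sip.twilio.com"))  # sip.vonage.com
```

A 486 Busy Here response, for example, would not trigger failover in this sketch, since it reflects the callee's state and not the carrier's.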
Failover Configuration
// Failover configuration
{
  "failover": {
    "mode": "automatic",
    "health_check_interval": 5000, // 5 seconds
    "failure_threshold": 3, // 3 consecutive failures
    "recovery_threshold": 5, // 5 consecutive successes
    "triggers": [
      "503_service_unavailable",
      "408_request_timeout",
      "network_unreachable",
      "latency_exceeded",
      "packet_loss_exceeded"
    ],
    "latency_threshold_ms": 200,
    "packet_loss_threshold_pct": 5,
    "notification": {
      "channels": ["slack", "pagerduty", "email"],
      "on_failover": true,
      "on_recovery": true
    }
  }
}
Component Redundancy
Voice AI Processing
- ✓ Multiple STT providers (Deepgram, Google, Whisper)
- ✓ Multiple TTS providers (ElevenLabs, Azure, Google)
- ✓ LLM failover (GPT-4 → Claude → Local)
- ✓ Load balanced media servers
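
The provider chains listed above (for example GPT-4 → Claude → Local) suggest an ordered fallback pattern. Below is a minimal sketch of that pattern, assuming each provider is wrapped in a plain callable. The stub functions are placeholders for illustration only and do not call any real SDK.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is a (name, callable) pair; the callable takes a prompt and returns text.
Provider = Tuple[str, Callable[[str], str]]


def call_with_fallback(providers: Sequence[Provider], prompt: str) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first success."""
    errors: List[str] = []
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except Exception as exc:  # any provider error moves on to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real clients (hypothetical, for illustration).
def gpt4_stub(prompt: str) -> str:
    raise TimeoutError("upstream timeout")


def claude_stub(prompt: str) -> str:
    return f"echo: {prompt}"


def local_stub(prompt: str) -> str:
    return f"local: {prompt}"


chain = [("gpt-4", gpt4_stub), ("claude", claude_stub), ("local", local_stub)]
print(call_with_fallback(chain, "hello"))  # ('claude', 'echo: hello')
```

The same shape would apply to the STT and TTS provider lists. A production version would likely add per-provider timeouts so that a slow primary does not delay the fallback.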
Infrastructure
- ✓ Multi-AZ deployment
- ✓ Database replication (sync)
- ✓ Redis cluster with sentinels
- ✓ Kubernetes with node auto-scaling
Health Checks
| Component | Check Type | Interval | Status | Last Check |
|---|---|---|---|---|
| Twilio SIP | OPTIONS ping | 5s | Healthy | 2s ago |
| Voice AI API | HTTP /health | 10s | Healthy | 3s ago |
| STT Service | Test transcription | 30s | Healthy | 15s ago |
| Database | Query test | 10s | Healthy | 1s ago |
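
The `failure_threshold` and `recovery_threshold` settings from the Failover Configuration imply that a component changes state only after a run of consecutive results. The sketch below shows one way to model that, with the thresholds taken from the configuration (3 failures, 5 successes). The `HealthTracker` class is a hypothetical illustration, not the platform's code.

```python
class HealthTracker:
    """Tracks one component's health using consecutive-result thresholds."""

    def __init__(self, failure_threshold: int = 3, recovery_threshold: int = 5) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_threshold = recovery_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_ok: bool) -> bool:
        """Record one health check result and return the resulting state."""
        if check_ok == self.healthy:
            # Result agrees with the current state, so any opposing streak is broken.
            self._streak = 0
        else:
            self._streak += 1
            limit = self.failure_threshold if self.healthy else self.recovery_threshold
            if self._streak >= limit:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy


tracker = HealthTracker()
print([tracker.record(False) for _ in range(3)])  # [True, True, False]
print([tracker.record(True) for _ in range(5)])   # [False, False, False, False, True]
```

With the 5-second check interval, three consecutive failures mean detection alone takes on the order of 10 to 15 seconds before any switch begins. Requiring more successes to recover than failures to trip helps avoid flapping between trunks.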
Recent Failover Events
- ✓ Recovered (Dec 5, 14:23): Twilio SIP trunk restored after 45s outage
- ⚡ Failover Triggered (Dec 5, 14:22): Switched to Vonage due to Twilio 503 errors
- ✓ Recovered (Dec 3, 09:15): STT provider back online, returned to primary
Uptime Metrics
| Metric | Value | Scope |
|---|---|---|
| Platform Uptime | 99.99% | Last 12 months |
| Avg Failover Time | 28s | Detection + Switch |
| Failover Events | 3 | Last 30 days |
| Dropped Calls | 0 | During failovers |
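
To put the uptime figure in perspective, the short calculation below converts an uptime percentage into a downtime budget and estimates the time spent in failover from the figures above. It is a back-of-the-envelope sketch. The function name is hypothetical, and a 365-day year is assumed.

```python
def downtime_budget_minutes(uptime_pct: float, period_days: float = 365) -> float:
    """Minutes of allowed downtime for a given uptime percentage over a period."""
    return (1 - uptime_pct / 100) * period_days * 24 * 60


# 99.99% over 12 months (assuming 365 days).
print(round(downtime_budget_minutes(99.99), 2))  # 52.56

# Three failover events at the 28 s average reported above.
print(3 * 28)  # 84 (seconds)
```

So 99.99% uptime allows roughly 52.6 minutes of downtime per year, and three failovers at the reported average account for about 84 seconds of transition time in a 30-day window.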