Media Processing Foundation

Media servers handle the audio streams: mixing, transcoding, recording, and the interface to voice AI processing.

Media Server Options

FreeSWITCH

Open-source, highly scalable. Best for complex deployments.

✓ High performance
✓ Lua/JavaScript scripting
✓ WebRTC native
✓ Enterprise features

Recommended

Asterisk

Mature, well-documented. Great for traditional PBX.

✓ Huge community
✓ Dialplan flexibility
✓ AGI scripting
✓ Proven stability

Established

Janus

WebRTC gateway focused. Lightweight and modern.

✓ WebRTC native
✓ Plugin architecture
✓ Low footprint
✓ Modern C code

WebRTC Focus

FreeSWITCH Architecture

Core

Event-driven architecture, handles call state, routing, and module coordination.

Endpoints

SIP, WebRTC, Verto

Codecs

G.711, Opus, G.729

Applications

IVR, Conference, Record

Event Socket

External control interface for Voice AI integration. Send commands, receive events.

Voice AI Integration

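// FreeSWITCH ESL integration for Voice AI
const { EventEmitter } = require('events');
const { Connection } = require('modesl');

// Placeholder for the Voice AI client: any emitter that fires
// 'transcription' (text, uuid) events can stand in here.
const voiceAI = new EventEmitter();

// 'ClueCon' is the stock ESL password; change it outside local testing.
const esl = new Connection('127.0.0.1', 8021, 'ClueCon');

esl.on('esl::ready', () => {
  // Subscribe to call events
  esl.subscribe(['CHANNEL_CREATE', 'CHANNEL_ANSWER',
                 'CHANNEL_HANGUP', 'DTMF']);
});

// modesl emits events as esl::event::<NAME>::<UUID>, so match any UUID
esl.on('esl::event::CHANNEL_ANSWER::*', (event) => {
  const uuid = event.getHeader('Unique-ID');

  // Start Voice AI processing. Requires the third-party mod_audio_fork
  // module; arguments are <uuid> start <url> <mix-type> <sample-rate>,
  // where mix-type is mono, mixed, or stereo.
  esl.api('uuid_audio_fork',
    `${uuid} start ws://voice-ai:8080/audio mixed 16k`);

  // Play greeting
  esl.execute('speak',
    'flite|kal|Hello, how can I help you?', uuid);
});

// Receive transcription from Voice AI
voiceAI.on('transcription', (text, uuid) => {
  // Process with LLM and respond...
});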

Media Server Functions

Transcoding

Convert between codecs (G.711 ↔ Opus).
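
To make the G.711 side of transcoding concrete, here is a minimal sketch of the standard μ-law expansion from 8-bit samples to 16-bit linear PCM, which is the first step before re-encoding to another codec such as Opus. The function names are illustrative only; a media server does this internally in its codec modules.

```javascript
// Expand one G.711 mu-law byte to a 16-bit linear PCM sample.
function muLawDecode(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  // 0x84 (132) is the mu-law bias added during encoding
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Decode a whole buffer of mu-law bytes into PCM16 samples.
function decodeMuLaw(bytes) {
  const pcm = new Int16Array(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    pcm[i] = muLawDecode(bytes[i]);
  }
  return pcm;
}
```

Calling `decodeMuLaw(Uint8Array.of(0xff, 0x80, 0x00))` yields silence followed by the largest positive and negative values the format can represent (±32124).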

Mixing

Combine multiple audio streams (conferencing).
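
At its simplest, mixing is a per-sample sum with clipping. The sketch below assumes every participant delivers 16-bit PCM frames at the same sample rate; `mixFrames` is a hypothetical helper, and a production conference mixer would also handle resampling, jitter, and excluding each speaker's own audio from their return mix.

```javascript
// Mix several PCM16 frames into one by summing samples and clamping
// the result to the signed 16-bit range to avoid wrap-around distortion.
function mixFrames(frames) {
  const length = frames.reduce((n, f) => Math.max(n, f.length), 0);
  const out = new Int16Array(length);
  for (let i = 0; i < length; i++) {
    let sum = 0;
    for (const frame of frames) {
      if (i < frame.length) sum += frame[i]; // shorter frames count as silence
    }
    out[i] = Math.max(-32768, Math.min(32767, sum));
  }
  return out;
}
```

Clamping is the bluntest way to deal with overflow; scaling each input down before summing is a common alternative when many participants talk at once.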

Recording

Capture audio to file (compliance, training).

DTMF

Generate and detect touch-tones.
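
Both halves of DTMF handling can be sketched in a few lines: each key is the sum of one row tone and one column tone, and the Goertzel algorithm is a common way to measure the energy at those eight frequencies. This is an illustrative sketch, not how any particular media server implements it; it has no silence or twist thresholds, so it always reports the strongest row/column pair even when no tone is present.

```javascript
const ROWS = [697, 770, 852, 941];      // Hz, low-frequency group
const COLS = [1209, 1336, 1477, 1633];  // Hz, high-frequency group
const KEYS = ['123A', '456B', '789C', '*0#D'];

// Generate a DTMF tone as floating-point samples in [-1, 1].
function generateTone(key, sampleRate = 8000, durationMs = 40) {
  const row = KEYS.findIndex((r) => key.length === 1 && r.includes(key));
  if (row < 0) throw new Error(`Not a DTMF key: ${key}`);
  const col = KEYS[row].indexOf(key);
  const samples = new Float32Array(Math.round((sampleRate * durationMs) / 1000));
  for (let i = 0; i < samples.length; i++) {
    const t = (2 * Math.PI * i) / sampleRate;
    samples[i] = 0.5 * Math.sin(ROWS[row] * t) + 0.5 * Math.sin(COLS[col] * t);
  }
  return samples;
}

// Goertzel: signal power at a single target frequency.
function goertzelPower(samples, freq, sampleRate) {
  const coeff = 2 * Math.cos((2 * Math.PI * freq) / sampleRate);
  let s1 = 0;
  let s2 = 0;
  for (const x of samples) {
    const s0 = x + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// Pick the strongest row and column frequency and map them to a key.
function detectKey(samples, sampleRate = 8000) {
  const strongest = (freqs) => {
    const powers = freqs.map((f) => goertzelPower(samples, f, sampleRate));
    return powers.indexOf(Math.max(...powers));
  };
  return KEYS[strongest(ROWS)][strongest(COLS)];
}
```

For example, `detectKey(generateTone('5'))` recovers the key from 40 ms of generated audio at 8 kHz.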

Audio Fork

Split audio stream to Voice AI.

Playback

Play audio files or TTS output.

VAD

Voice Activity Detection for barge-in.
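
A rough idea of how VAD can drive barge-in: measure the energy of each incoming frame and report speech only after several loud frames in a row, so a single click does not interrupt playback. The sketch below is a simple energy-based detector with assumed defaults (`threshold`, `minSpeechFrames`) that would need tuning; production VADs typically use spectral features or trained models instead.

```javascript
// Root-mean-square level of a PCM16 frame.
function frameRms(frame) {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (const s of frame) sumSquares += s * s;
  return Math.sqrt(sumSquares / frame.length);
}

// Returns a function that is fed one frame at a time and answers
// true once `minSpeechFrames` consecutive frames exceed `threshold`.
function createBargeInDetector({ threshold = 500, minSpeechFrames = 3 } = {}) {
  let consecutive = 0;
  return function push(frame) {
    consecutive = frameRms(frame) > threshold ? consecutive + 1 : 0;
    return consecutive >= minSpeechFrames;
  };
}
```

When the detector returns true, the application would stop the current prompt and start listening to the caller.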

AGC

Automatic Gain Control for levels.
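
The core of AGC is scaling audio toward a target level while capping the gain so that background noise is not amplified without limit. The following is a deliberately simplified per-frame sketch; `targetRms` and `maxGain` are assumed values, and a real AGC smooths the gain over time rather than recomputing it for every frame.

```javascript
// Scale a PCM16 frame toward a target RMS level, with a gain ceiling.
function applyAgc(frame, targetRms = 3000, maxGain = 8) {
  let sumSquares = 0;
  for (const s of frame) sumSquares += s * s;
  const rms = frame.length ? Math.sqrt(sumSquares / frame.length) : 0;

  // Leave pure silence untouched; otherwise aim for the target level.
  const gain = rms > 0 ? Math.min(maxGain, targetRms / rms) : 1;

  const samples = new Int16Array(frame.length);
  for (let i = 0; i < frame.length; i++) {
    const scaled = Math.round(frame[i] * gain);
    samples[i] = Math.max(-32768, Math.min(32767, scaled));
  }
  return { gain, samples };
}
```

A frame at RMS 1000 is tripled to reach the target of 3000, while a very quiet frame at RMS 100 is only boosted by the ceiling of 8.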

Server Metrics

Active Channels: 156

CPU Usage: 23%

Memory Used: 4.2 GB

Max Channels: 1000

Deployment Best Practices

Resource Allocation

• 1 CPU core per 50-100 concurrent calls
• 4GB RAM minimum, 8GB+ recommended
• SSD storage for recordings
• Dedicated network interface
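
The CPU rule of thumb above turns into a quick sizing calculation. This sketch only applies the 50-100 calls per core range quoted in the list; `estimateCores` is a hypothetical helper, and real capacity depends heavily on transcoding, recording, and conferencing load.

```javascript
// Estimate the CPU core range for a target number of concurrent calls,
// using the 50-100 calls-per-core guideline.
function estimateCores(concurrentCalls, callsPerCoreMin = 50, callsPerCoreMax = 100) {
  return {
    // Optimistic case: each core carries the most calls
    minCores: Math.max(1, Math.ceil(concurrentCalls / callsPerCoreMax)),
    // Conservative case: each core carries the fewest calls
    maxCores: Math.max(1, Math.ceil(concurrentCalls / callsPerCoreMin)),
  };
}
```

For 1000 concurrent calls this gives a range of 10 to 20 cores; sizing toward the upper end leaves headroom for transcoding-heavy traffic.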

Network Configuration

• Exempt RTP traffic from firewall connection tracking (conntrack)
• Use jumbo frames if available
• Mark RTP packets for QoS (DSCP EF, value 46)
• Separate signaling/media interfaces
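
One detail of QoS marking that often causes confusion: DSCP is a 6-bit value, but many tools and socket options expect the full 8-bit TOS / Traffic Class byte, in which DSCP occupies the upper six bits. The sketch below shows the conversion for Expedited Forwarding (EF, DSCP 46); `dscpToTos` is an illustrative helper, not part of any media server's configuration.

```javascript
const DSCP_EF = 46; // Expedited Forwarding, the usual class for voice media

// Convert a 6-bit DSCP value to the 8-bit TOS byte (ECN bits left at 0).
function dscpToTos(dscp) {
  if (!Number.isInteger(dscp) || dscp < 0 || dscp > 63) {
    throw new RangeError(`DSCP must be an integer from 0 to 63, got ${dscp}`);
  }
  return dscp << 2;
}
```

`dscpToTos(DSCP_EF)` evaluates to 184 (0xb8), which is the number to use wherever a setting asks for a TOS value rather than a DSCP value.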

Media Processing Power

Media servers for high-performance voice AI.
