# Code Analysis: Live Calls Service

## Overview
This is a **FastAPI-based WebSocket service** for handling real-time voice calls, integrating **Mcube** (telephony platform) with **ElevenLabs** (Conversational AI). It runs each call in its own isolated session, caps concurrency at 50 sessions, and releases session resources on disconnect.

## Architecture

### Core Components

#### 1. **Main Application** (`main.py`)
- FastAPI server with WebSocket endpoints
- Health check endpoints (`/health`, `/health/performance`)
- WebSocket handler at `/ws/{session_id}` for Mcube call connections
- CORS middleware enabled for all origins
- Runs on port 7900 (configurable)

#### 2. **Call Management** (`call_manager.py`)
- **CallManager**: Manages up to 50 concurrent call sessions
- Uses semaphore for concurrency control
- Session isolation: Each call gets its own `CallSession` instance
- Session registration/removal guarded by async locks (safe across coroutines on one event loop; `asyncio` locks do not provide thread safety)
- System status monitoring
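
The semaphore-plus-lock pattern described above can be sketched as follows. This is a minimal illustration, not the code in `call_manager.py`: the class name `CallManagerSketch` and its method names are hypothetical, and rejecting a call when all slots are taken (rather than queueing it) is an assumption.

```python
import asyncio
from typing import Dict


class CallManagerSketch:
    """Hypothetical stand-in: caps concurrent sessions and guards the registry."""

    def __init__(self, max_sessions: int = 50) -> None:
        self._slots = asyncio.Semaphore(max_sessions)  # one slot per live call
        self._lock = asyncio.Lock()                    # serializes registry updates
        self._sessions: Dict[str, object] = {}

    async def register(self, session_id: str, session: object) -> bool:
        async with self._lock:
            # Assumption: reject duplicates and over-limit calls instead of waiting.
            if session_id in self._sessions or self._slots.locked():
                return False
            await self._slots.acquire()  # returns immediately: a slot is free
            self._sessions[session_id] = session
            return True

    async def remove(self, session_id: str) -> None:
        async with self._lock:
            session = self._sessions.pop(session_id, None)
        if session is not None:
            self._slots.release()  # free the slot only if the session existed

    def active_count(self) -> int:
        return len(self._sessions)
```

Releasing the slot only when a session was actually removed keeps the semaphore count in step with the registry even if `remove` is called twice for the same call.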

#### 3. **Call Session** (`call_session.py`)
- **CallSession**: Isolated session per call with:
  - WebSocket connection manager
  - Audio service
  - Bot configuration service
  - ElevenLabs WebSocket service
- Handles:
  - Call start events (extracts business ID, agent ID, customer info)
  - Media events (audio streaming)
  - Played stream events (audio completion tracking)
  - ElevenLabs integration (audio, transcripts, VAD scores, interruptions)
  - Tool requests (end_call, transfer, voicemail detection, etc.)
- Audio queue management for reconnection scenarios
- VAD (Voice Activity Detection) for interruption handling

#### 4. **Connection Management** (`connection_manager.py`)
- **WebSocketConnectionManager**: Manages Mcube WebSocket communication
- **ConnectionState**: Tracks stream_id and call_id
- Routes events: `media`, `start`, `playedStream`
- Handles WebSocket disconnections gracefully
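
Routing the three event types can be pictured as a small dispatch table. The sketch below is illustrative only: the `EventRouterSketch` class is hypothetical, and the assumption that each Mcube frame is a JSON object with an `event` field is not confirmed by this analysis.

```python
import asyncio
import json
from typing import Awaitable, Callable, Dict

Handler = Callable[[dict], Awaitable[None]]


class EventRouterSketch:
    """Hypothetical router: maps an event name to an async handler."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, event: str, handler: Handler) -> None:
        self._handlers[event] = handler

    async def dispatch(self, raw: str) -> bool:
        """Parse one text frame and route it; return False if it was dropped."""
        try:
            message = json.loads(raw)
        except json.JSONDecodeError:
            return False  # malformed frame: drop it rather than crash the call
        if not isinstance(message, dict):
            return False
        event = message.get("event")  # assumed field name
        if not isinstance(event, str) or event not in self._handlers:
            return False
        await self._handlers[event](message)
        return True
```

A caller would register one handler each for `media`, `start`, and `playedStream`, then feed every incoming frame to `dispatch`.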

#### 5. **Audio Processing** (`audio_service.py`)
- **AudioService**: Coordinates audio operations
- **AudioFormatConverter**: Converts between formats (WAV, μ-law)
- **AudioTimingManager**: Tracks response timing
- **AudioBufferManager**: Manages audio chunks and synchronization marks
- Supports both 8000 Hz and 16000 Hz μ-law audio
- Processes raw audio bytes from TTS services

#### 6. **ElevenLabs Integration** (`elevenlabs_websocket_service.py`)
- **ElevenLabsWebSocketService**: Real-time conversational AI integration
- Features:
  - Bidirectional audio streaming
  - Conversation initiation with dynamic variables
  - Message handling (transcripts, responses, audio, VAD scores, interruptions)
  - Keep-alive mechanism (user_activity every 5 seconds)
  - Connection retry logic with exponential backoff
  - Rate limiting for audio chunks (80ms minimum interval)
- Audio format: `ulaw_8000` (configurable)
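
As a rough illustration of retry with exponential backoff, the sketch below doubles the delay after each failed attempt up to a cap. The function names, the retry count, and the `base` and `cap` values are assumptions; the service's actual parameters are not stated in this analysis.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Delay before each retry: base, 2*base, 4*base, ... limited to `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    retries: int = 4,
    base: float = 0.5,
    cap: float = 8.0,
) -> T:
    """Call `connect` until it succeeds or the retries are used up."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return await connect()
        except (ConnectionError, asyncio.TimeoutError):
            if attempt == retries:
                raise  # out of retries: surface the last error to the caller
            await asyncio.sleep(delays[attempt])
    raise RuntimeError("unreachable")
```

Adding random jitter to each delay is a common refinement when many sessions may reconnect at the same moment.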

#### 7. **Mcube Service** (`mcube_service.py`)
- **McubeService**: Static utility class for Mcube protocol
- Creates messages: `playAudio`, `checkpoint`, `clearAudio`, `transfer`, `terminate`
- Extracts data from Mcube events

#### 8. **Bot Configuration** (`bot_configuration_service.py`)
- **BotConfigurationService**: Dynamic bot configuration from database
- Database structure:
  - **Master DB** (`voicebot_master`): DID number mapping, business info
  - **Cluster DB** (`voicebot_cluster`): Bot configurations per business
- Async database operations using thread pool
- Loads bot config by DID number with parallel queries

#### 9. **Connection Limiter** (`elevenlabs_connection_limiter.py`)
- **ElevenLabsConnectionLimiter**: Singleton limiter for ElevenLabs connections
- Limits: 30 concurrent connections (configurable)
- Uses semaphore for slot management
- Prevents exceeding workspace subscription limits

#### 10. **Logging** (`log_utils.py`)
- **Log**: Centralized logging utility
- Logs to both console and file (`logs/voicebot.log`)
- Methods: `info`, `error`, `warning`, `debug`, `event`, `json`
- Handles Unicode encoding issues

### Configuration (`config.py`)
- Environment-based configuration using `python-dotenv`
- Key settings:
  - Port: 7900
  - Mcube WebSocket URL
  - ElevenLabs API key and Agent ID
  - Database credentials (MySQL)
  - Audio format and sample rate
  - Agent phone number for transfers

## Data Flow

### Call Flow
1. **Call Start**: Mcube sends `start` event → Extract metadata → Initialize ElevenLabs WebSocket
2. **Audio Streaming**: 
   - **Inbound**: Mcube `media` events → Convert μ-law to PCM → Send to ElevenLabs
   - **Outbound**: ElevenLabs audio → Convert to μ-law → Send to Mcube as `playAudio`
3. **Interruptions**: VAD scores → Detect sustained speech → Send `interrupt` → Clear Mcube audio
4. **Call End**: Tool request or WebSocket disconnect → Cleanup resources

### Audio Format Conversion
- **Mcube → ElevenLabs**: μ-law (8000 Hz) → PCM → ElevenLabs
- **ElevenLabs → Mcube**: ElevenLabs audio → PCM → μ-law (8000 Hz) → Mcube
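
The μ-law to PCM step can be illustrated with a pure-Python G.711 decoder. This is a sketch for understanding the conversion, not the service's code (which, per the dependency list, relies on `audioop` and `pydub`); the function names are hypothetical.

```python
import struct


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF             # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    # Rebuild the magnitude, then remove the 0x84 bias added by the encoder.
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


# Precompute all 256 values once; decoding then becomes a table lookup.
_ULAW_TABLE = [ulaw_to_pcm16(b) for b in range(256)]


def ulaw_bytes_to_pcm16le(data: bytes) -> bytes:
    """Decode a mu-law payload to little-endian 16-bit PCM bytes."""
    samples = [_ULAW_TABLE[b] for b in data]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each μ-law byte expands to two PCM bytes, so a 20 ms frame at 8000 Hz (160 bytes) becomes 320 bytes after decoding.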

## Key Features

### 1. **Concurrent Call Handling**
- Supports 50 concurrent calls
- Each call is completely isolated
- Resource cleanup on disconnect

### 2. **Connection Resilience**
- Automatic reconnection with exponential backoff
- Audio queue buffering during reconnection
- Connection state detection (checks actual WebSocket state)
- Keep-alive mechanism to prevent timeouts

### 3. **Interruption Detection**
- VAD (Voice Activity Detection) score monitoring
- Sustained speech detection (0.45 threshold, 0.15s duration)
- Debouncing to prevent false positives
- Automatic audio clearing on interruption
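
Sustained-speech detection with debouncing could be sketched as below, using the 0.45 threshold and 0.15 s duration listed above. The class name, the 0.5 s cooldown, and the choice of `>=` for the threshold comparison are assumptions made for illustration.

```python
from typing import Optional


class SustainedSpeechDetector:
    """Fires once when VAD scores stay at or above a threshold long enough."""

    def __init__(
        self,
        threshold: float = 0.45,
        min_duration: float = 0.15,
        cooldown: float = 0.5,  # assumed debounce window between interrupts
    ) -> None:
        self.threshold = threshold
        self.min_duration = min_duration
        self.cooldown = cooldown
        self._speech_started: Optional[float] = None
        self._last_fired: Optional[float] = None

    def update(self, score: float, now: float) -> bool:
        """Feed one VAD score with its timestamp in seconds."""
        if score < self.threshold:
            self._speech_started = None  # any dip restarts the timer
            return False
        if self._speech_started is None:
            self._speech_started = now
        sustained = now - self._speech_started >= self.min_duration
        cooled = self._last_fired is None or now - self._last_fired >= self.cooldown
        if sustained and cooled:
            self._last_fired = now
            self._speech_started = None
            return True  # caller would send the interrupt and clear Mcube audio
        return False
```

Passing the timestamp in explicitly, rather than reading a clock inside `update`, keeps the detector deterministic and easy to unit test.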

### 4. **Tool Integration**
- **end_call**: Terminates call
- **transfer_to_agent**: Transfers to configured agent number
- **transfer_to_number**: Transfers to specified number
- **voicemail_detection**: Detects voicemail from transcripts
- **detect_language**: Basic language detection
- **skip_turn**: Skips current turn
- **play_keypad_touch_tone**: Plays DTMF tones

### 5. **Performance Optimizations**
- Async database operations (thread pool)
- Parallel bot configuration loading
- Connection pooling for HTTP requests
- Rate limiting for audio chunks
- Audio queue management

## Dependencies

### Core
- `fastapi`: Web framework
- `uvicorn`: ASGI server
- `websockets`: WebSocket client/server
- `aiohttp`: Async HTTP client

### Audio Processing
- `pydub`: Audio manipulation
- `audioop`: Audio operations (standard library; deprecated since Python 3.11 and removed in Python 3.13)
- `numpy`: Numerical operations

### Database
- `mysql-connector-python`: MySQL driver

### AI/ML
- `faiss-cpu`: Vector similarity search
- `langchain-community`, `langchain-aws`: LangChain integration
- `boto3`: AWS SDK (for S3, etc.)

### System
- `psutil`: System metrics
- `python-dotenv`: Environment variables

## Security Considerations

### ⚠️ **Issues Found**

1. **Hardcoded API Keys** (in `config.py`):
   - ElevenLabs API key is hardcoded as default value
   - Database credentials have default values
   - **Recommendation**: Remove defaults, require environment variables

2. **CORS Configuration**:
   - Currently allows all origins (`allow_origins=["*"]`)
   - **Recommendation**: Restrict to specific domains in production

3. **Database Credentials**:
   - Default database password in code
   - **Recommendation**: Always use environment variables, never defaults

4. **No Authentication**:
   - WebSocket endpoints have no authentication
   - **Recommendation**: Add API key or token validation
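
Two of the recommendations above, requiring environment variables and validating a token, can be sketched with the standard library alone. The helper names are hypothetical, and how the token reaches the server (query parameter, header, or first message) is left open.

```python
import hmac
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a setting, failing fast instead of falling back to a default."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def token_is_valid(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

With this approach, `config.py` would call `require_env("ELEVENLABS_API_KEY")` at startup so a missing key stops the process immediately, and the WebSocket handler would reject a connection before accepting it when `token_is_valid` returns `False`.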

## Potential Issues

### 1. **Memory Management**
- Message history limited to 100 messages per session
- Audio queue max size: 100 chunks
- **Risk**: These two structures are bounded, but any other per-session state that lacks a cap could still accumulate memory during long-running calls
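
A fixed cap like the 100-item limits above is commonly expressed with `collections.deque`. The sketch below assumes that dropping the oldest entry is acceptable when the buffer is full; whether the service drops oldest or newest items is not stated here, and the helper name is hypothetical.

```python
from collections import deque


def make_bounded_buffer(max_items: int = 100) -> deque:
    """Create a buffer that discards its oldest item once it is full."""
    return deque(maxlen=max_items)


audio_queue = make_bounded_buffer(100)
for seq in range(250):
    audio_queue.append(seq)  # appends beyond the cap evict from the left

# Only the most recent 100 items remain: 150 through 249.
print(len(audio_queue), audio_queue[0], audio_queue[-1])  # 100 150 249
```

The eviction is silent, so for audio it is worth logging or counting dropped chunks to make buffer overruns visible.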

### 2. **Error Handling**
- Some exceptions are caught silently
- Connection errors may not always propagate correctly
- **Recommendation**: Improve error propagation and logging

### 3. **Database Connection Pooling**
- Each query creates new connection
- **Recommendation**: Implement connection pooling
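
The idea behind pooling is to create connections once and lend them out per query. The generic sketch below shows the mechanism with a hypothetical `SimplePool` class; in practice, `mysql-connector-python` ships its own `mysql.connector.pooling.MySQLConnectionPool`, which would likely be the better fit.

```python
import queue
from contextlib import contextmanager
from typing import Callable, Generic, Iterator, TypeVar

C = TypeVar("C")


class SimplePool(Generic[C]):
    """Hypothetical fixed-size pool: connections are created once and reused."""

    def __init__(self, factory: Callable[[], C], size: int = 5) -> None:
        self._idle: "queue.Queue[C]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout: float = 5.0) -> Iterator[C]:
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back
```

Because `queue.Queue` is thread safe, this pool works with the thread-pool execution model the service uses for database calls. A production pool would also need to detect and replace dead connections.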

### 4. **Rate Limiting**
- No rate limiting on WebSocket connections
- **Recommendation**: Add rate limiting middleware

### 5. **Logging**
- Logs to file without rotation
- **Recommendation**: Implement log rotation
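
Size-based rotation is available in the standard library through `logging.handlers.RotatingFileHandler`. The sketch below is one way to wire it up; the function name, the 10 MB limit, and the five backups are illustrative assumptions rather than values from `log_utils.py`.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def build_rotating_logger(
    path: str,
    max_bytes: int = 10 * 1024 * 1024,
    backups: int = 5,
) -> logging.Logger:
    """Return a logger that rolls `path` over once it reaches `max_bytes`."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    logger = logging.getLogger("voicebot")
    logger.setLevel(logging.INFO)
    # Avoid stacking duplicate handlers if this is called more than once.
    if not any(isinstance(h, RotatingFileHandler) for h in logger.handlers):
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backups, encoding="utf-8"
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `build_rotating_logger("logs/voicebot.log")` keeps the current file plus at most five older files named `voicebot.log.1` through `voicebot.log.5`, which bounds disk usage at roughly 60 MB with these values.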

## Recommendations

### Immediate
1. Remove hardcoded credentials from `config.py`
2. Restrict CORS origins
3. Add WebSocket authentication
4. Implement database connection pooling
5. Add log rotation

### Short-term
1. Add metrics/monitoring (Prometheus, Grafana)
2. Implement circuit breakers for external services
3. Add unit tests
4. Add integration tests
5. Improve error messages for debugging

### Long-term
1. Add distributed tracing (OpenTelemetry)
2. Implement graceful shutdown
3. Add health check dependencies (database, ElevenLabs)
4. Consider message queue for high load
5. Add API documentation (OpenAPI/Swagger)

## Code Quality

### Strengths
- ✅ Good separation of concerns
- ✅ Async/await used throughout
- ✅ Comprehensive error handling in most places
- ✅ Detailed logging
- ✅ Resource cleanup on disconnect
- ✅ Connection state management

### Areas for Improvement
- ⚠️ Some commented-out code (should be removed)
- ⚠️ Magic numbers (should be constants)
- ⚠️ Some functions are too long (e.g., `handle_media_event`)
- ⚠️ Limited type hints in some places
- ⚠️ No unit tests visible

## Deployment

### Requirements
- Python 3.8+ (the `audioop` dependency is not in the standard library from Python 3.13 onward)
- MySQL database
- Environment variables configured
- Port 7900 available

### Running
```bash
cd /var/www/html/live_calls/homebook
python main.py
# or
uvicorn main:app --host 0.0.0.0 --port 7900
```

### Environment Variables
- `PORT`: Server port (default: 7900)
- `ELEVENLABS_API_KEY`: ElevenLabs API key
- `ELEVENLABS_AGENT_ID`: ElevenLabs agent ID
- `DATABASE_HOST`, `DATABASE_NAME`, `DATABASE_USER`, `DATABASE_PASSWORD`
- `MCUBE_WEBSOCKET_URL`: Mcube WebSocket endpoint
- `AGENT_PHONE_NUMBER`: Default agent number for transfers

## Summary

This is a **well-architected real-time voice call service** with:
- ✅ Proper session isolation
- ✅ Concurrent call support (50 calls)
- ✅ Resilient connection handling
- ✅ Audio format conversion
- ✅ Integration with ElevenLabs Conversational AI
- ⚠️ Security improvements needed (credentials, CORS, auth)
- ⚠️ Database connection pooling needed
- ⚠️ Testing infrastructure needed

The codebase demonstrates good async programming practices and proper resource management, but requires security hardening and operational improvements for production use.

