# Homebook — Mcube Call Service (Voice Bot)

A FastAPI-based voice bot service that bridges **Mcube** (telephony/WebSocket) and **ElevenLabs Conversational AI**. It handles live calls with real-time bidirectional audio, resolves bot configuration by DID, and supports multiple concurrent sessions.

---

## Overview

- **Role:** Accepts WebSocket connections from Mcube for inbound and outbound calls, streams audio to and from ElevenLabs Conversational AI, and sends playback/transfer/terminate commands back to Mcube.
- **Service type:** Uses ElevenLabs WebSocket Conversational AI (service type 4) for full-duplex voice.
- **Concurrency:** Up to 50 concurrent call sessions; ElevenLabs connections are limited via a shared limiter (configurable, default 30).
- **Configuration:** Bot behavior is resolved per call using DID → bot mapping and bot config from MySQL (master + cluster databases).

---

## Architecture

```
┌─────────────┐     WebSocket (audio + events)      ┌──────────────────────────┐     WebSocket      ┌─────────────────┐
│   Mcube     │ ◄─────────────────────────────────► │  Homebook (FastAPI)      │ ◄─────────────────► │  ElevenLabs     │
│  (PBX/Call) │     /ws/{session_id}                │  - CallManager           │   Conv. AI API     │  Conversational  │
└─────────────┘                                     │  - CallSession (per call)│                     │  AI (Agent)      │
                                                   │  - AudioService          │                     └─────────────────┘
                                                   │  - BotConfigurationService│
                                                   └────────────┬─────────────┘
                                                                │
                                                                ▼
                                                   ┌──────────────────────────┐
                                                   │  MySQL (master + cluster) │
                                                   │  DID → bot_id, bot_config │
                                                   └──────────────────────────┘
```

- **Mcube** connects to `ws://host:port/ws/{session_id}` and sends `start`, `media`, `playedStream` events.
- **Homebook** creates one **CallSession** per `session_id`, loads bot config by DID, connects to ElevenLabs, and forwards audio both ways.
- **ElevenLabs** returns agent audio and events (transcripts, VAD, tool calls like `end_call`, transfer). Homebook turns agent audio into Mcube `playAudio` and handles tools (terminate, transfer).
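
The event routing described above can be sketched as a small dispatcher. This is a minimal illustration, not the code in `connection_manager.py`. Only the event names `start`, `media`, and `playedStream` come from this README; the `event` JSON key, `make_handlers`, and `dispatch` are assumptions made for the example.

```python
import json
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


def make_handlers(log: List[str]) -> Dict[str, Handler]:
    """Hypothetical handlers that only record which event arrived.

    In the real service the handlers would be CallSession methods.
    """
    return {
        "start": lambda message: log.append("start"),
        "media": lambda message: log.append("media"),
        "playedStream": lambda message: log.append("playedStream"),
    }


def dispatch(raw: str, handlers: Dict[str, Handler]) -> bool:
    """Route one JSON text frame to its handler.

    Returns False for malformed JSON or an unknown event type, so the
    caller can log and skip the frame instead of dropping the call.
    """
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(message, dict):
        return False
    event = message.get("event")  # the "event" key is an assumption
    if not isinstance(event, str) or event not in handlers:
        return False
    handlers[event](message)
    return True
```

Returning a boolean rather than raising keeps one bad frame from ending an otherwise healthy call; whether the real service does this is not stated here.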

---

## Project Structure

```
homebook/
├── main.py                    # FastAPI app: routes, /ws/{session_id}, health
├── config.py                  # Config from env (Mcube, ElevenLabs, DB, company)
├── requirements.txt           # Python dependencies
├── requirements-py38.txt      # Python 3.8–compatible pins
├── README.md                  # This documentation
├── LICENSE                    # MIT
├── logs/
│   └── voicebot.log          # Application log (path set in log_utils)
└── services/
    ├── __init__.py
    ├── call_manager.py       # CallManager: max concurrent calls, create/remove CallSession
    ├── call_session.py       # CallSession: per-call state, Mcube ↔ ElevenLabs orchestration
    ├── connection_manager.py # WebSocketConnectionManager + ConnectionState (Mcube WS)
    ├── audio_service.py      # Audio format conversion, playAudio/checkpoint/clearAudio
    ├── mcube_service.py      # Mcube message builders (playAudio, checkpoint, clearAudio, transfer, terminate)
    ├── bot_configuration_service.py  # DID → bot_id/bot_config (master + cluster DB)
    ├── elevenlabs_websocket_service.py # ElevenLabs WebSocket client (code present, may be commented out)
    ├── elevenlabs_connection_limiter.py # Singleton limiter for concurrent ElevenLabs connections
    ├── make_calls.py         # Script to trigger outbound calls via Mcube REST API
    └── log_utils.py          # Log helper (console + voicebot.log)
```

---

## Configuration

Configuration is read from environment variables (and `.env` via `python-dotenv`). See `config.py` for all keys.

### Required

| Variable | Description |
|----------|-------------|
| `ELEVENLABS_API_KEY` | ElevenLabs API key for Conversational AI |
| `ELEVENLABS_AGENT_ID` | ElevenLabs agent ID used for the conversation |
| `MCUBE_WEBSOCKET_URL` | Mcube WebSocket URL (used for validation; actual connections use `/ws/{session_id}`) |

### Optional (with defaults)

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | `7900` | Server listen port |
| `COMPANY_NAME` | `Mcube Telecom pvt Limited` | Default company name |
| `MCUBE_WEBSOCKET_BASE_URL` | — | Base URL for Mcube WebSocket |
| `MCUBE_AUDIO_FORMAT` | `audio/x-mulaw` | Audio format sent to Mcube |
| `MCUBE_SAMPLE_RATE` | `8000` | Sample rate (Hz) for Mcube |
| `AGENT_PHONE_NUMBER` | `7780787875` | Default agent number (e.g. transfer target) |
| `ELEVENLABS_VOICE_ID` | (set in code) | ElevenLabs voice ID |
| `ELEVENLABS_MODEL_ID` | `eleven_turbo_v2` | TTS model |
| `DATABASE_HOST` | `10.40.180.74` | MySQL host |
| `DATABASE_NAME` | `mainvoicebot_cluster` | Cluster DB name |
| `DATABASE_USER` | `admin` | DB user |
| `DATABASE_PASSWORD` | — | DB password |
| `DATABASE_CHARSET` | `utf8mb4` | DB charset |

Validation runs at import: missing required vars raise `ValueError`.
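
A minimal sketch of this kind of validation, assuming a `load_config` helper (a hypothetical name; the actual logic lives in `config.py` and may differ). The variable names and defaults are taken from the tables above; only a few of the optional keys are shown.

```python
import os
from typing import Dict, Mapping

REQUIRED = ("ELEVENLABS_API_KEY", "ELEVENLABS_AGENT_ID", "MCUBE_WEBSOCKET_URL")

# A subset of the optional variables and their documented defaults.
DEFAULTS = {
    "PORT": "7900",
    "MCUBE_AUDIO_FORMAT": "audio/x-mulaw",
    "MCUBE_SAMPLE_RATE": "8000",
}


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return a config dict, raising ValueError if a required key is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    config = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        config[name] = env.get(name, default)
    return config
```

Reporting every missing variable in one error, rather than stopping at the first, saves a restart cycle per variable during initial setup.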

---

## Installation

1. **Python:** Use Python 3.8 or later (3.10+ recommended). Use a virtualenv if desired.

2. **Install dependencies** (from project root):

   ```bash
   cd /var/www/html/live_calls/homebook
   python3 -m pip install -r requirements.txt
   ```

   For Python 3.8 use:

   ```bash
   python3 -m pip install -r requirements-py38.txt
   ```

3. **Environment:** Copy `.env.example` to `.env` (if present) or set the variables above. At minimum set `ELEVENLABS_API_KEY`, `ELEVENLABS_AGENT_ID`, and `MCUBE_WEBSOCKET_URL`.

---

## Running the Service

From the `homebook` directory:

```bash
python3 main.py
```

This starts Uvicorn on `0.0.0.0:PORT` (default 7900). Mcube should connect to:

- **WebSocket:** `ws://<host>:7900/ws/<session_id>`

For production, run Uvicorn via a process manager (systemd, supervisord, etc.) and set `PORT` and other env vars there.

---

## API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/` | Simple JSON: `{"message": "Mcube Call Service Server is running!"}` |
| GET | `/health` | Health check: status, timestamp, service name, version |
| GET | `/health/performance` | System metrics (CPU, memory, disk) and status; requires `psutil` |
| WebSocket | `/ws/{session_id}` | Mcube call connection: `start`, `media`, `playedStream` events |

---

## Call Flow (high level)

1. Mcube opens `ws://host:port/ws/{session_id}`.
2. **CallManager** creates a **CallSession** (or rejects if at max concurrent calls).
3. **CallSession** waits for a `start` event; from it reads stream/call IDs and headers (e.g. `X-BID`, `X-Agent-id`, `X-CustNumber`, `X-CustName`, `X-DID`).
4. **BotConfigurationService** resolves DID → bot_id and bot config from master + cluster DB; CallSession uses this (e.g. ElevenLabs agent_id if overridden per bot).
5. **ElevenLabs WebSocket** is initialized (with **ElevenLabsConnectionLimiter**); conversation starts.
6. **Media:** Mcube sends `media` (base64 audio). CallSession forwards to ElevenLabs. ElevenLabs returns agent audio; **AudioService** / **McubeService** build `playAudio` (+ checkpoint/clearAudio) and send to Mcube.
7. **playedStream** from Mcube is handled for sync (e.g. when a segment has been played).
8. **Tools:** On a tool call from ElevenLabs (such as `end_call` or a transfer), CallSession sends the matching terminate or transfer command to Mcube and cleans up.
9. On WebSocket disconnect or terminal event, **CallManager** removes the session and **CallSession.cleanup()** runs.
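
Step 6 mentions base64 audio, and the default Mcube format is `audio/x-mulaw` at 8000 Hz. The sketch below decodes one such payload to 16-bit little-endian PCM using the standard G.711 mu-law expansion. Whether and where Homebook performs this conversion is not stated in this README, and the function names are hypothetical. It is written in pure Python because the standard-library `audioop` module was removed in Python 3.13.

```python
import base64
import struct

BIAS = 0x84  # G.711 mu-law bias (132)


def mulaw_byte_to_pcm16(value: int) -> int:
    """Expand one mu-law byte to a signed 16-bit linear sample."""
    value = ~value & 0xFF                 # mu-law bytes are stored inverted
    exponent = (value & 0x70) >> 4        # 3-bit segment number
    mantissa = value & 0x0F               # 4-bit step within the segment
    magnitude = ((mantissa << 3) + BIAS) << exponent
    # The top bit carries the sign; the bias is removed after scaling.
    return BIAS - magnitude if value & 0x80 else magnitude - BIAS


def decode_media_payload(payload_b64: str) -> bytes:
    """Decode a base64 mu-law payload into 16-bit little-endian PCM bytes."""
    mulaw = base64.b64decode(payload_b64)
    samples = [mulaw_byte_to_pcm16(byte) for byte in mulaw]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each mu-law byte expands to two PCM bytes, so a 160-byte frame (20 ms at 8000 Hz) becomes 320 bytes.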

---

## Services (summary)

- **CallManager:** Caps concurrent calls (e.g. 50), creates/removes **CallSession**, holds `active_sessions`.
- **CallSession:** One per call; owns the connection_manager, audio_service, bot_configuration_service, and ElevenLabs client; implements `handle_call_start`, `handle_media_event`, `handle_played_stream_event`, and cleanup.
- **WebSocketConnectionManager:** Receives from Mcube (`receive_from_mcube`), dispatches to handlers; sends JSON to Mcube (`send_to_mcube`).
- **AudioService:** Converts/formats audio for Mcube; builds/queues playAudio and checkpoint/clearAudio.
- **McubeService:** Static helpers to build Mcube messages (playAudio, checkpoint, clearAudio, transfer, terminate).
- **BotConfigurationService:** DB access: DID → bot_id (and business_id) from master; bot config from cluster; used to select agent and company/bot name.
- **ElevenLabsConnectionLimiter:** Singleton semaphore to cap concurrent ElevenLabs WebSocket connections (e.g. 30).
- **Log (log_utils):** Writes to console and `logs/voicebot.log`.
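
The two concurrency limits above (a cap on call sessions and a separate cap on ElevenLabs connections) can be sketched as follows. The class names `CallManagerSketch` and `ConnectionLimiterSketch` are hypothetical stand-ins, not the real `CallManager` or `ElevenLabsConnectionLimiter`; the singleton behaviour of the real limiter is not modelled, and `demo` exists only to exercise the limiter.

```python
import asyncio
from typing import Dict


class CallManagerSketch:
    """Rejects new sessions once the cap is reached (README figure: 50)."""

    def __init__(self, max_calls: int = 50) -> None:
        self.max_calls = max_calls
        self.active_sessions: Dict[str, object] = {}

    def create_session(self, session_id: str) -> bool:
        if len(self.active_sessions) >= self.max_calls:
            return False  # caller would close the WebSocket
        self.active_sessions[session_id] = object()  # placeholder session
        return True

    def remove_session(self, session_id: str) -> None:
        self.active_sessions.pop(session_id, None)


class ConnectionLimiterSketch:
    """Semaphore-based cap on concurrent upstream connections (default 30)."""

    def __init__(self, max_connections: int = 30) -> None:
        self._semaphore = asyncio.Semaphore(max_connections)
        self.in_use = 0
        self.peak = 0

    async def __aenter__(self) -> "ConnectionLimiterSketch":
        await self._semaphore.acquire()  # waits while the cap is reached
        self.in_use += 1
        self.peak = max(self.peak, self.in_use)
        return self

    async def __aexit__(self, *exc_info) -> None:
        self.in_use -= 1
        self._semaphore.release()


async def demo(limit: int, workers: int) -> int:
    """Run `workers` fake connections and return the peak concurrency."""
    limiter = ConnectionLimiterSketch(limit)

    async def worker() -> None:
        async with limiter:
            await asyncio.sleep(0.01)  # stands in for a live connection

    await asyncio.gather(*(worker() for _ in range(workers)))
    return limiter.peak
```

The two limits behave differently in this sketch: the session cap rejects immediately, while the semaphore makes callers wait for a free slot. With 50 sessions and 30 connections, up to 20 calls could be waiting on ElevenLabs at once, so a timeout around the acquire may be worth considering.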

---

## Database

- **Master DB** (`mainvoicebot_master`): `did_numbers` (DID → bot_id, business_id), business name lookups.
- **Cluster DB** (e.g. `mainvoicebot_cluster`): Bot configurations used for the call (e.g. agent_id, company name).
- Connections are created per request in **BotConfigurationService** (no long-lived connection pool in code).
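
The two-step lookup with per-request connections can be sketched as below. To keep the example runnable without a MySQL server, it uses the standard-library `sqlite3` module as a stand-in. The `did_numbers` table and its `bot_id`/`business_id` columns come from this README; the `did` column, the `bot_configs` table, its columns, and both function names are assumptions for illustration.

```python
import os
import sqlite3
import tempfile
from contextlib import closing
from typing import Optional, Tuple


def create_demo_databases() -> Tuple[str, str]:
    """Create throwaway master/cluster databases with one demo row each."""
    directory = tempfile.mkdtemp()
    master_path = os.path.join(directory, "master.db")
    cluster_path = os.path.join(directory, "cluster.db")
    with closing(sqlite3.connect(master_path)) as db:
        db.execute(
            "CREATE TABLE did_numbers (did TEXT, bot_id INTEGER, business_id INTEGER)"
        )
        db.execute("INSERT INTO did_numbers VALUES ('8001', 7, 3)")
        db.commit()
    with closing(sqlite3.connect(cluster_path)) as db:
        db.execute(
            "CREATE TABLE bot_configs (bot_id INTEGER, agent_id TEXT, company_name TEXT)"
        )
        db.execute("INSERT INTO bot_configs VALUES (7, 'agent_demo', 'Example Co')")
        db.commit()
    return master_path, cluster_path


def resolve_bot(master_path: str, cluster_path: str, did: str) -> Optional[dict]:
    """Resolve DID -> bot_id in master, then bot config in cluster.

    Each query opens and closes its own connection (no pool). Note that
    MySQL drivers typically use %s placeholders rather than sqlite's ?.
    """
    with closing(sqlite3.connect(master_path)) as master:
        row = master.execute(
            "SELECT bot_id, business_id FROM did_numbers WHERE did = ?", (did,)
        ).fetchone()
    if row is None:
        return None  # unknown DID
    bot_id, business_id = row
    with closing(sqlite3.connect(cluster_path)) as cluster:
        config = cluster.execute(
            "SELECT agent_id, company_name FROM bot_configs WHERE bot_id = ?",
            (bot_id,),
        ).fetchone()
    if config is None:
        return None  # DID mapped, but no bot config found
    return {
        "bot_id": bot_id,
        "business_id": business_id,
        "agent_id": config[0],
        "company_name": config[1],
    }
```

Opening a connection per lookup is simple and avoids stale connections, at the cost of connection setup time on every call start.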

---

## Logging

- **Destination:** `homebook/logs/voicebot.log` (see `log_utils._log_file_path`).
- **Content:** Session lifecycle, ElevenLabs connection, call start (DID, bot, customer), errors, and cleanup.
- **Levels:** Use `Log.info`, `Log.warning`, `Log.error`, `Log.debug`, `Log.header` as needed; all go to file and console.
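
A helper that writes to both console and file can be sketched with the standard `logging` module. This is not the `Log` class from `log_utils.py`: `build_logger` is a hypothetical name, the message format is an assumption, and the custom `Log.header` level is not reproduced.

```python
import logging
import os
import sys


def build_logger(log_path: str, name: str = "voicebot") -> logging.Logger:
    """Return a logger that writes every record to stdout and to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:   # guard against adding handlers twice
        os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
        formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        handlers = (
            logging.StreamHandler(sys.stdout),
            logging.FileHandler(log_path, encoding="utf-8"),
        )
        for handler in handlers:
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger
```

For a long-running service, swapping `FileHandler` for `logging.handlers.RotatingFileHandler` would keep `voicebot.log` from growing without bound.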

---

## Outbound calls (optional)

`services/make_calls.py` is a standalone script that calls the Mcube REST API (`outbound-calls-superdash`) to place outbound calls. It uses a hard-coded auth token and a hard-coded list of contact numbers; adjust the URL, headers, and list for your environment. It is not required for the WebSocket voice bot to run.

---

## License

MIT License. See [LICENSE](LICENSE) in this directory.
