# Server Capacity Analysis for WebSocket Connections

## System Resource Limits

### File Descriptors (Critical for Connections)
```
User limit (ulimit):           1,048,576  ✅ Excellent
System max files:              1,000,000  ✅ Excellent  
Service hard limit:              524,288  ✅ Excellent
Service soft limit:                1,024  ⚠️ Low but can be increased
```

### Network Limits
```
Socket backlog (somaxconn):        2,048  ✅ Good
Available ports:              28,231 ports (32768-60999)
Max tasks (TasksMax):             77,149  ✅ Excellent
```

### Hardware Resources
```
CPU Cores:                    24 cores    ✅ Excellent
Total RAM:                    64 GB       ✅ Excellent
Used RAM:                     10.6 GB (16.5%)
Available RAM:                53 GB       ✅ Plenty available
Current connections:          51          (baseline)
```

## Per-Connection Resource Estimates

### Memory Usage Per Concurrent Call
```
WebSocket connection overhead:     2-5 MB
Call session (services, state):    5-10 MB
Audio buffering:                   3-8 MB
Database connections/pooling:      1-2 MB
Misc (logging, queues):            2-5 MB
────────────────────────────────────────
Total per call (sum of ranges):    13-30 MB
Conservative estimate:             25 MB per call
```

### CPU Usage Per Concurrent Call
```
WebSocket I/O (async):             Minimal
Audio processing:                  2-5% per call
ElevenLabs streaming:              1-2% per call
Database queries:                  <1% per call
────────────────────────────────────────
Total per call:                    3-8% CPU
With 24 cores (2400% total):       ~300-800 calls at 100% utilization
```

### File Descriptors Per Call
```
Mcube WebSocket (incoming):        1 FD
ElevenLabs WebSocket (outgoing):   1 FD
Database connection:               1 FD
Log files:                         ~1 FD (shared)
Misc (buffers, pipes):             1-2 FD
────────────────────────────────────────
Total per call:                    4-6 FD
With 524,288 hard limit:           Can handle 87,000+ calls
With 1,024 soft limit (current):   ~170-250 calls until raised
```

## Capacity Calculations

### Based on RAM (53 GB available)
```
Memory per call: 25 MB

100 calls    =   2.5 GB  → 50.5 GB free  ✅
200 calls    =   5.0 GB  → 48.0 GB free  ✅
500 calls    =  12.5 GB  → 40.5 GB free  ✅
1,000 calls  =  25.0 GB  → 28.0 GB free  ✅
2,000 calls  =  50.0 GB  →  3.0 GB free  ⚠️
```

**RAM limit: ~1,800-2,000 concurrent calls**
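
The RAM ceiling is available memory divided by the per-call estimate. A minimal sketch of that arithmetic, using decimal units (1 GB = 1,000 MB) as the table above does. The function name and the `reserve_mb` parameter are illustrative assumptions, not part of the service:

```python
def calls_supported_by_ram(available_mb: int, mb_per_call: int, reserve_mb: int = 0) -> int:
    """Return how many calls fit in RAM after holding back a reserve."""
    return (available_mb - reserve_mb) // mb_per_call


# 53 GB available and 25 MB per call are the figures from this analysis.
print(calls_supported_by_ram(53_000, 25))                    # no reserve
print(calls_supported_by_ram(53_000, 25, reserve_mb=3_000))  # 3 GB held back
print(calls_supported_by_ram(53_000, 25, reserve_mb=8_000))  # 8 GB held back
```

With no reserve the result is 2,120 calls. Holding back 3-8 GB for the OS and page cache gives 2,000 and 1,800, which matches the range quoted above.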

### Based on CPU (24 cores)
```
CPU per call: ~5% under load

100 calls    =  500% CPU  → 1,900% free  ✅
200 calls    = 1,000% CPU  → 1,400% free  ✅
500 calls    = 2,500% CPU  →  -100% free  ⚠️ (over capacity at peak)
1,000 calls  = 5,000% CPU  →            ❌ (2x overload)

With async I/O, calls spend much of their time waiting.
Assuming a ~2.4-3% average per call (below the 5% peak):
Realistic: 800-1,000 calls  ✅
```

**CPU limit: ~800-1,000 concurrent calls** (with async efficiency)
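
The gap between the 5% per-call figure and the 800-1,000 estimate is easier to see as arithmetic. The sketch below is a plain calculation under the numbers in this section; the function names are illustrative, and the average per-call load it derives is an implied assumption, not a measurement:

```python
TOTAL_CPU_PCT = 24 * 100  # 24 cores, 100% each


def max_calls(cpu_pct_per_call: float) -> int:
    """Calls that fit if every call constantly uses cpu_pct_per_call of one core."""
    return int(TOTAL_CPU_PCT // cpu_pct_per_call)


def implied_cpu_pct_per_call(calls: int) -> float:
    """Average per-call CPU at which `calls` exactly saturate all cores."""
    return TOTAL_CPU_PCT / calls


print(max_calls(5))                    # ceiling at the 5% peak figure
print(implied_cpu_pct_per_call(800))   # average needed to reach 800 calls
print(implied_cpu_pct_per_call(1000))  # average needed to reach 1,000 calls
```

At a constant 5% per call the cores saturate at 480 calls, so the 800-1,000 figure holds only if the average per-call load is about 2.4-3%. That average is worth confirming with a load test before relying on the higher limits.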

### Based on Network Connections
```
Outgoing connections per call: ~2-3
Available ephemeral ports: 28,231

28,231 ports ÷ 3 per call = ~9,400 calls maximum
```

**Network limit: ~9,000 concurrent calls**

### Based on File Descriptors
```
Service hard limit: 524,288 FD
FD per call: 5

524,288 ÷ 5 = 104,857 calls
```

**File descriptor limit: ~100,000 concurrent calls**
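
The figure above uses the hard limit, but a process is held to its soft limit unless that is raised. A small sketch comparing the two, assuming 5 FDs per call; `baseline_fds` is a hypothetical parameter for descriptors the process holds while idle:

```python
def fd_capacity(fd_limit: int, fds_per_call: int, baseline_fds: int = 0) -> int:
    """Return how many calls fit under an FD limit after subtracting idle usage."""
    return max(0, (fd_limit - baseline_fds) // fds_per_call)


print(fd_capacity(524_288, 5))  # service hard limit
print(fd_capacity(1_024, 5))    # service soft limit listed under System Resource Limits
```

This gives 104,857 calls at the hard limit and 204 at the 1,024 soft limit, so the soft limit is the one to watch until it is increased.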

## Bottleneck Analysis

| Resource | Limit | Bottleneck? |
|----------|-------|-------------|
| **RAM** | 1,800-2,000 calls | 🟡 **Possible** |
| **CPU** | 800-1,000 calls | 🔴 **PRIMARY BOTTLENECK** |
| **Network Ports** | 9,000 calls | 🟢 Not limiting |
| **File Descriptors** | 100,000+ calls | 🟢 Not limiting (once the 1,024 soft limit is raised) |
| **Socket Backlog** | 2,048 queue | 🟢 Sufficient |

**Primary Bottleneck: CPU (24 cores)**
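
The bottleneck is simply the resource with the lowest ceiling. A short sketch using the lower bound of each range from the table; socket backlog is left out because it is a queue depth, not a call count:

```python
# Per-resource ceilings in concurrent calls (lower bounds from the table above)
limits = {
    "RAM": 1_800,
    "CPU": 800,
    "Network ports": 9_000,
    "File descriptors": 100_000,
}

# The resource with the smallest ceiling caps overall capacity.
bottleneck = min(limits, key=limits.get)
print(f"{bottleneck} caps capacity at ~{limits[bottleneck]} calls")
```

Re-running this with measured values after a load test is a quick way to check whether CPU is still the limiting resource.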

## Recommendations by Use Case

### Conservative (Guaranteed Stable)
```
Recommended: 200-300 concurrent calls

Why:
- Leaves ~40-60% CPU headroom at 5% per call
- Only uses 5-7.5 GB RAM (8-12% of 64 GB)
- Very stable under load
- Room for growth
```

### Moderate (Production Recommended)
```
Recommended: 500-700 concurrent calls

Why:
- Uses ~50-60% CPU at 500 calls (up to ~90% at 700), assuming ~2.4-3% average per call
- Uses ~12.5-17.5 GB RAM (20-27% of 64 GB)
- Good balance of capacity vs stability
- Handles traffic spikes well
```

### Aggressive (Maximum Capacity)
```
Recommended: 800-1,000 concurrent calls

Why:
- Near CPU capacity (~90-100% at peak)
- Uses ~20-25 GB RAM (31-39% of 64 GB)
- Maximum throughput
- May struggle with sudden spikes
- Requires active monitoring
```

### Extreme (Theoretical Maximum)
```
Maximum: 1,500-2,000 concurrent calls

Why NOT recommended:
- CPU will be overloaded (200%+)
- High risk of service degradation
- Increased latency and dropped calls
- Only for burst traffic, not sustained
```

## Our Recommendation for Your Server

### **Start with: 200 concurrent calls**
- 4x your current limit (50)
- Very safe and stable
- Allows you to monitor and tune

### **Scale to: 500 concurrent calls**
- Sweet spot for your hardware
- Good capacity without risk
- Production-ready configuration

### **Maximum: 800-1,000 concurrent calls**
- Only if you need this capacity
- Requires monitoring and tuning
- May need optimization of code

## How to Configure

Edit `homebook/services/call_manager.py` line 129:

### Conservative (Start here)
```python
call_manager = CallManager(max_concurrent_calls=200)
```

### Moderate (Production)
```python
call_manager = CallManager(max_concurrent_calls=500)
```

### Aggressive (Maximum)
```python
call_manager = CallManager(max_concurrent_calls=1000)
```

Then restart:
```bash
sudo systemctl restart websocket_api.service
```
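
For readers unfamiliar with how a `max_concurrent_calls` cap typically behaves, here is a minimal sketch. The real `CallManager` implementation is not shown in this document, so `CallLimiter` and its methods are hypothetical stand-ins that only illustrate the idea of rejecting calls once the cap is reached:

```python
class CallLimiter:
    """Hypothetical concurrency cap; not the actual CallManager."""

    def __init__(self, max_concurrent_calls: int) -> None:
        self.max_concurrent_calls = max_concurrent_calls
        self.active_calls = 0

    def try_acquire(self) -> bool:
        """Reserve a slot for a new call, or return False when at capacity."""
        if self.active_calls >= self.max_concurrent_calls:
            return False
        self.active_calls += 1
        return True

    def release(self) -> None:
        """Free a slot when a call ends."""
        if self.active_calls > 0:
            self.active_calls -= 1


limiter = CallLimiter(max_concurrent_calls=200)
accepted = limiter.try_acquire()  # True while fewer than 200 calls are active
```

Whatever the real implementation looks like, a cap of this kind means calls beyond the configured value are refused or queued, so the value should stay below the measured bottleneck.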

## Monitoring Commands

### Real-time Resource Monitoring
```bash
# Watch CPU and memory
htop

# Monitor service memory
watch -n 2 'ps aux | grep "python3.*main.py" | grep -v grep'

# Count active connections
watch -n 2 'netstat -an | grep :7900 | grep ESTABLISHED | wc -l'

# Monitor system resources
vmstat 2
```

### Load Testing
```bash
# Monitor during load test
# CPU usage
mpstat 2

# Memory usage
free -m -s 2

# Network connections
ss -s

# Open files for user vmc (approximate: lsof also lists memory-mapped files and working directories)
lsof -u vmc | wc -l
```
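
For a per-process descriptor count during a load test, the Linux `/proc` filesystem can be read directly. The sketch below is one way to do that; the function names are illustrative, it is Linux-only, and reading another user's process requires the same user or root:

```python
import os


def count_open_fds(pid: int) -> int:
    """Count the entries in /proc/<pid>/fd, one per open descriptor."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def read_nofile_limits(pid: int) -> tuple[str, str]:
    """Return the (soft, hard) 'Max open files' values from /proc/<pid>/limits."""
    with open(f"/proc/{pid}/limits") as limits_file:
        for line in limits_file:
            if line.startswith("Max open files"):
                # Fields: Max, open, files, <soft>, <hard>, files
                fields = line.split()
                return fields[3], fields[4]
    raise RuntimeError("'Max open files' not found in limits file")
```

Calling both functions with the service's PID shows how close the process is to its soft limit; the values are returned as text because a limit may read `unlimited`.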

## System Tuning (Optional)

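If you want to go above 500 calls, consider these optimizations. Step 1 matters sooner: at 4-6 FDs per call, the current 1,024 soft limit is reached at roughly 170-250 calls unless the service raises its own limit at startup.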

### 1. Increase Service File Descriptor Soft Limit

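Edit `/etc/systemd/system/websocket_api.service`, then run `sudo systemctl daemon-reload` and restart the service. A single value sets both the soft and hard limits: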
```ini
[Service]
LimitNOFILE=100000
```
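
A process can also raise its own soft limit up to the hard limit at startup, which may be useful as a safeguard alongside the unit-file change. This is a sketch using Python's standard `resource` module; whether the service does anything like this today is not known, and `raise_nofile_soft_limit` is a hypothetical helper name:

```python
import resource


def raise_nofile_soft_limit(target: int) -> tuple[int, int]:
    """Raise the soft RLIMIT_NOFILE toward `target`, never above the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, hard  # already unlimited, nothing to raise
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    # Report the limits actually in effect after the attempt.
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling `raise_nofile_soft_limit(100_000)` early in startup and logging the returned pair makes the effective limit visible in the service logs.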

### 2. Increase Socket Backlog
```bash
sudo sysctl -w net.core.somaxconn=4096
```

### 3. Optimize TCP Settings
```bash
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=4096
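# Caution: this range covers port 7900 (the port watched in the monitoring commands);
# if the service listens there, consider reserving it with net.ipv4.ip_local_reserved_ports
sudo sysctl -w net.ipv4.ip_local_port_range="1024 65535"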
```

### 4. Make Changes Permanent
Add to `/etc/sysctl.conf`, then load the file with `sudo sysctl -p`:
```
net.core.somaxconn=4096
net.ipv4.tcp_max_syn_backlog=4096
net.ipv4.ip_local_port_range=1024 65535
```
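
To confirm that the kernel settings took effect, the current values can be read back from `/proc/sys`. A minimal sketch, assuming a Linux host and the usual mapping of dots in a sysctl name to path separators; the helper names are illustrative:

```python
from pathlib import Path


def sysctl_path(name: str) -> Path:
    """Map a sysctl name such as net.core.somaxconn to its /proc/sys path."""
    return Path("/proc/sys") / name.replace(".", "/")


def read_sysctl(name: str) -> str:
    """Return the current value of a sysctl as text (Linux only)."""
    return sysctl_path(name).read_text().strip()
```

For example, `read_sysctl("net.core.somaxconn")` returns the backlog value currently in effect, which can be compared against the value written to `/etc/sysctl.conf`.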

## Summary Table

| Configuration | Concurrent Calls | CPU Usage | RAM Usage | Stability | Use Case |
|---------------|------------------|-----------|-----------|-----------|----------|
| **Current** | 50 | ~10% | 1.2 GB | ✅ Stable | Testing |
| **Conservative** | 200 | ~40% | 5 GB | ✅ Very Stable | Low risk |
| **Moderate** | 500 | ~60% | 12 GB | ✅ Stable | **Recommended** |
| **Aggressive** | 1,000 | ~90% | 25 GB | ⚠️ Risky | High load |
| **Maximum** | 2,000 | 🔴 200%+ | 50 GB | ❌ Unstable | Not recommended |

## Final Answer

**Your server can handle: 500-1,000 concurrent calls**

**Recommended starting point: 200 concurrent calls**
**Production recommendation: 500 concurrent calls**
**Absolute maximum (with tuning): 1,000 concurrent calls**

The bottleneck is CPU (24 cores), not RAM. Each call uses up to ~5% of one core at peak due to audio processing and streaming, and less on average because async I/O leaves calls waiting much of the time. On that basis, 500-700 sustained concurrent calls is a realistic, comfortable range.

