# Call Transcription & Analysis Pipeline — Complete Guide

**BID 6004 (Mcube Presales) | Updated: 2026-03-11**

---

## Table of Contents

1. [Overview](#1-overview)
2. [Architecture Diagram](#2-architecture-diagram)
3. [Prerequisites](#3-prerequisites)
4. [Database Tables Involved](#4-database-tables-involved)
5. [Stage 1 — Find an Eligible Call](#5-stage-1--find-an-eligible-call)
6. [Stage 2 — Transcription (Sarvam saaras:v2.5)](#6-stage-2--transcription-sarvam-saarasv25)
7. [Stage 3 — Quality Analysis (AWS Bedrock Nova)](#7-stage-3--quality-analysis-aws-bedrock-nova)
8. [Stage 4 — BANT Analysis](#8-stage-4--bant-analysis)
9. [Stage 5 — Results & Storage](#9-stage-5--results--storage)
10. [Self-Service Module Reference](#10-self-service-module-reference)
11. [Worked Example — Call 97393902541773212620](#11-worked-example--call-97393902541773212620)
12. [Troubleshooting](#12-troubleshooting)
13. [Key Files Reference](#13-key-files-reference)

---

## 1. Overview

The PCAA (Post-Call Analytics) pipeline for BID 6004 processes sales calls through three sequential stages:

```
Source Call Recording (WAV/MP3)
        │
        ▼
  ┌─────────────┐
  │  STAGE 1   │  Pick eligible call from 6004_raw_calls
  │   SELECT   │  Criteria: ANSWER, duration > 5min, not yet in sarvamresponse
  └──────┬──────┘
         │
         ▼
  ┌─────────────┐
  │  STAGE 2   │  Sarvam saaras:v2.5 → English transcript + speaker diarization
  │ TRANSCRIBE │  Output saved to 6004_sarvamresponse
  └──────┬──────┘
         │
         ▼
  ┌─────────────┐
  │  STAGE 3   │  AWS Bedrock Nova-Lite → Quality scores + BANT + summary
  │  ANALYZE   │  Output saved to 6004_callanalytics + 6004_bant_analysis
  └─────────────┘
```

The entire flow can be run end-to-end via the self-service module:

```bash
cd /home/aiteam/pcaa-dev/dashboard-backend
python3 call_self_service.py
```

---

## 2. Architecture Diagram

```
┌──────────────────────────────────────────────────────────────────────┐
│                        PCAA Pipeline — BID 6004                      │
│                                                                      │
│  ┌─────────────────────┐       ┌─────────────────────────────────┐  │
│  │  MySQL (voicebot_   │       │  External APIs                  │  │
│  │  cluster @ 127.0.   │       │                                 │  │
│  │  0.1:3306)          │       │  ┌──────────────────────────┐   │  │
│  │                     │       │  │  Sarvam AI               │   │  │
│  │  6004_raw_calls     │──────▶│  │  api.sarvam.ai           │   │  │
│  │  (call metadata,    │       │  │  model: saaras:v2.5      │   │  │
│  │   fileurl, status)  │       │  │  - diarization (2 spkrs) │   │  │
│  │                     │       │  │  - English translation   │   │  │
│  │  6004_sarvamresponse│◀──────│  └──────────────────────────┘   │  │
│  │  (transcript, segs) │       │                                 │  │
│  │                     │       │  ┌──────────────────────────┐   │  │
│  │  6004_callanalytics │◀──────│  │  AWS Bedrock             │   │  │
│  │  (scores, BANT, etc)│       │  │  region: eu-north-1      │   │  │
│  │                     │       │  │  model: nova-lite-v1:0   │   │  │
│  │  6004_bant_analysis │◀──────│  └──────────────────────────┘   │  │
│  │  (BANT fields)      │       │                                 │  │
│  └─────────────────────┘       └─────────────────────────────────┘  │
│                                                                      │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Python Scripts                                               │  │
│  │                                                               │  │
│  │  call_self_service.py          ← Main self-service CLI        │  │
│  │  run_single_transcription.py   ← Transcription only          │  │
│  │  run_single_analysis.py        ← Analysis only               │  │
│  │  pipeline_6004.py              ← Continuous pipeline daemon   │  │
│  │  analyze_calls_with_parameters.py  ← Core analysis engine    │  │
│  │  agent_runner.py               ← AI agent executor           │  │
│  └───────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘
```

---

## 3. Prerequisites

### Python packages
```bash
pip install sarvamai pymysql python-dotenv boto3 requests
```

### Environment variables (in `dashboard-backend/.env`)

| Variable | Purpose |
|---|---|
| `DB_HOST` | MySQL host (default: `127.0.0.1`) |
| `DB_USER` | MySQL user (default: `admin`) |
| `DB_PASSWORD` | MySQL password |
| `DB_NAME` | MySQL database (default: `voicebot_cluster`) |
| `SARVAM_PIPELINE_KEY` | Sarvam AI subscription key (preferred) |
| `SARVAM_SUBSCRIPTION_KEY` | Sarvam AI subscription key (fallback) |
| `AWS_REGION` | AWS region for Bedrock (e.g. `eu-north-1`) |
| `AWS_ACCESS_KEY_ID` | AWS access key |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key |
| `AWS_NOVA_MODEL` | Bedrock model ID (default: `amazon.nova-lite-v1:0`) |

### Working directory
All scripts must be run from `dashboard-backend/`:
```bash
cd /home/aiteam/pcaa-dev/dashboard-backend
```

---

## 4. Database Tables Involved

### `6004_raw_calls` — Source call metadata
| Column | Type | Description |
|---|---|---|
| `callid` | VARCHAR | Unique call identifier (PK) |
| `agentname` | VARCHAR | Agent name |
| `call_starttime` | DATETIME | Call start |
| `call_endtime` | DATETIME | Call end |
| `fileurl` | TEXT | URL to WAV/MP3 audio file |
| `call_status` | VARCHAR | `ANSWER` for connected calls |
| `status` | INT | `0`=unprocessed, `1`=transcribed, `2`=analyzed |
| `transcription_status` | VARCHAR | `pending`, `completed`, `failed` |

**Eligibility query** (calls ready for transcription):
```sql
SELECT r.callid, r.fileurl
FROM 6004_raw_calls r
LEFT JOIN 6004_sarvamresponse s ON r.callid = s.callid
WHERE r.call_status = 'ANSWER'
  AND r.fileurl IS NOT NULL AND r.fileurl != ''
  AND s.callid IS NULL                                          -- not yet transcribed
  AND TIMESTAMPDIFF(SECOND, r.call_starttime, r.call_endtime) >= 300  -- > 5 minutes
ORDER BY r.call_starttime DESC;
```

---

### `6004_sarvamresponse` — Transcription output
| Column | Type | Description |
|---|---|---|
| `callid` | VARCHAR | FK → raw_calls |
| `transcript` | LONGTEXT | Full English transcript (flat) |
| `speaker_segments` | JSON | Array of `{speaker, role, text, start, end}` objects |
| `duration` | FLOAT | Audio duration in seconds |
| `num_speakers` | INT | Number of detected speakers |
| `request_id` | VARCHAR | Sarvam job ID |
| `raw_response` | LONGTEXT | Complete Sarvam API JSON response |
| `stt_provider` | VARCHAR | `sarvam` |
| `created_at` | DATETIME | Insert timestamp |

**Speaker segment format:**
```json
{
  "speaker": "Speaker 0",
  "speaker_id": "speaker_0",
  "role": "agent",
  "text": "Hello, how can I help you today?",
  "start": 0.5,
  "end": 3.2,
  "start_time": 0.5,
  "end_time": 3.2
}
```
> **Convention:** `speaker_0` = agent, `speaker_1` = customer

---

### `6004_callanalytics` — Analysis output
| Column | Type | Description |
|---|---|---|
| `callid` | VARCHAR | FK → raw_calls |
| `summary` | TEXT | 2–3 sentence call summary |
| `call_purpose` | VARCHAR | Main reason for the call |
| `sentiment` | ENUM | `positive`, `neutral`, `negative` |
| `quality_score` | FLOAT | Overall score as percentage (0–100) |
| `total_possible_score` | INT | Max score across all applicable parameters |
| `parameter_scores` | JSON | Per-parameter `{score, max_score, applicable}` |
| `parameter_detections` | JSON | Per-parameter `{transcript_segment, timestamp, reasoning}` |
| `parameters_not_applicable` | JSON | List of parameter names skipped |
| `objections_concerns` | TEXT | Customer objections identified |
| `objection_type` | VARCHAR | Short classification (e.g. "Spam and Cost Concerns") |
| `agent_talk_time` | FLOAT | Seconds agent spoke |
| `customer_talk_time` | FLOAT | Seconds customer spoke |
| `agent_speak_percentage` | FLOAT | % of call agent was talking |
| `customer_speak_percentage` | FLOAT | % of call customer was talking |
| `talk_listen_ratio` | VARCHAR | Ratio string e.g. `"42:58"` |
| `dead_air_percentage` | FLOAT | % of call with silence |
| `talk_listen_assessment` | TEXT | Qualitative assessment |
| `analysis_model` | VARCHAR | Bedrock model used |
| `raw_response` | JSON | Full AI response |

---

### `6004_bant_analysis` — BANT output
| Column | Description |
|---|---|
| `callid` | FK → raw_calls |
| `budget_value` | Budget mentioned (or "Not mentioned") |
| `budget_evidence` | Exact transcript quote |
| `budget_confidence` | `high` / `medium` / `low` |
| `authority_value` | Decision-maker status |
| `authority_evidence` | Transcript evidence |
| `need_value` | Customer's stated need |
| `need_evidence` | Transcript evidence |
| `timeline_value` | Purchasing timeline |
| `timeline_evidence` | Transcript evidence |
| `customer_profile_summary` | 1–2 sentence BANT summary |

---

## 5. Stage 1 — Find an Eligible Call

**Criteria for an eligible call:**
- `call_status = 'ANSWER'` — must be a connected call
- `fileurl` is not empty — audio must be accessible
- No matching row in `6004_sarvamresponse` — not yet transcribed
- Duration ≥ 300 seconds (5 minutes) — short calls produce poor transcripts

**Using the self-service module to list eligible calls:**
```bash
python3 call_self_service.py --list
```

Sample output:
```
  CALLID                    AGENT                        START TIME               DURATION
  ───────────────────────── ──────────────────────────── ────────────────────── ──────────
  97393902541773221196      Richa vishwakarma            2026-03-11 14:56:38    10m 17s
  97019434441773207277      Durga Devi Malleswarapu      2026-03-11 11:04:37    12m 22s
  88809649641773035697      Shubham Kumari               2026-03-09 11:24:58    13m 39s
```

**Manual SQL:**
```sql
SELECT r.callid,
       r.agentname,
       r.call_starttime,
       TIMESTAMPDIFF(SECOND, r.call_starttime, r.call_endtime) AS duration_sec,
       r.fileurl
FROM 6004_raw_calls r
LEFT JOIN 6004_sarvamresponse s ON r.callid = s.callid
WHERE r.call_status = 'ANSWER'
  AND r.fileurl IS NOT NULL
  AND s.callid IS NULL
  AND TIMESTAMPDIFF(SECOND, r.call_starttime, r.call_endtime) >= 300
ORDER BY r.call_starttime DESC
LIMIT 10;
```

---

## 6. Stage 2 — Transcription (Sarvam saaras:v2.5)

### How it works

1. **Download audio** — HTTP GET the `fileurl` (WAV file from Mcube recordings server)
2. **Create Sarvam job** — `speech_to_text_translate_job.create_job()` with:
   - `model = "saaras:v2.5"` — Sarvam's multilingual model with English translation
   - `with_diarization = True` — separates agent/customer speech
   - `num_speakers = 2`
   - `prompt = "Translate all speech to English"`
3. **Upload audio** — `job.upload_files([tmp_wav_path])`
4. **Poll for completion** — `job.wait_until_complete(poll_interval=5, timeout=600)`
5. **Download result JSON** — Sarvam returns a JSON with `transcript` + `diarized_transcript.entries`
6. **Parse & save** — Insert into `6004_sarvamresponse`, update `6004_raw_calls.status = 1`

### Known issue: Azure 403 on upload
Sarvam's SDK internally uploads to Azure Blob Storage. The first attempt occasionally returns HTTP 403. The pipeline retries up to 5 times with a 1-second delay before failing:

```python
for attempt in range(5):
    try:
        job = client.speech_to_text_translate_job.create_job(...)
        if job.upload_files([tmp.name]):
            break
    except RuntimeError as e:
        if '403' in str(e) and attempt < 4:
            time.sleep(1)
            job = None
        else:
            raise
```

### Sarvam output JSON structure
```json
{
  "transcript": "Hello how are you... [full flat English text]",
  "request_id": "20260311_15dac2d6-e2f4-4259-acef-268f0ffc0266",
  "diarized_transcript": {
    "entries": [
      {
        "speaker_id": "speaker_0",
        "transcript": "Hello, how can I help you?",
        "start_time_seconds": 0.5,
        "end_time_seconds": 3.2
      },
      {
        "speaker_id": "speaker_1",
        "transcript": "Yes, I wanted to ask about...",
        "start_time_seconds": 3.5,
        "end_time_seconds": 7.1
      }
    ]
  }
}
```

### Running transcription only
```bash
# Auto-pick next call and transcribe
python3 call_self_service.py --transcribe-only

# Transcribe a specific call
python3 call_self_service.py --callid 97393902541773212620 --transcribe-only

# Or use the dedicated single-call script
python3 run_single_transcription.py 97393902541773212620
```

### Verify transcription saved
```sql
SELECT callid, LENGTH(transcript) AS transcript_len,
       num_speakers, duration, created_at
FROM 6004_sarvamresponse
WHERE callid = '97393902541773212620';
```

---

## 7. Stage 3 — Quality Analysis (AWS Bedrock Nova)

### How it works

1. **Load quality parameters** — fetched from the `quality_parameters` table for BID 6004 (9 parameters configured)
2. **Build prompt** — transcript + all 9 parameters + scoring instructions are combined into a single prompt
3. **Call AWS Bedrock** — `bedrock-runtime.invoke_model()` with `amazon.nova-lite-v1:0`
4. **Parse JSON response** — extract per-parameter scores, evidence, timestamps, reasoning
5. **Calculate overall score** — `round((total_earned / total_possible) * 100, 2)`
6. **Calculate talk-listen ratio** — derived from `speaker_segments` timing data
7. **Classify objections** — secondary Nova call to map objections to predefined categories (if configured)
8. **Save to DB** — `6004_callanalytics` + `6004_bant_analysis`

### Quality parameters for BID 6004 (9 total, 100 points)

| # | Parameter | Max Score |
|---|---|---|
| 1 | Professional greeting and introduction | 10 |
| 2 | Active listening and understanding customer needs | 15 |
| 3 | Building rapport with customer | 10 |
| 4 | Lead qualification and discovery questions | 10 |
| 5 | Product or service knowledge and presentation | 15 |
| 6 | Handling objections and concerns effectively | 15 |
| 7 | Clear communication and articulation | 5 |
| 8 | Professional and courteous tone throughout | 10 |
| 9 | Proper closing and next steps | 10 |
| **TOTAL** | | **100** |

### Prompt structure
```
You are an expert call quality analyst. Analyze the following call transcript
against specific quality parameters.

CALL TRANSCRIPT:
[full transcript text]

QUALITY PARAMETERS TO EVALUATE:
**Parameter:** Professional greeting and introduction
- Max Score: 10
- What to check: [check_description from DB]
- Sample utterances: [examples]
...

INSTRUCTIONS:
For EACH parameter:
1. Determine Applicability
2. Score 0–max_score (generous — only deduct for clear failures)
3. Provide Evidence (exact quote + timestamp)
4. Reasoning

CUSTOMER PROFILING (BANT):
Also extract Budget, Authority, Need, Timeline...

OUTPUT FORMAT (JSON only):
{
  "parameters": { ... },
  "customer_profile": { budget, authority, need, timeline },
  "overall_summary": "...",
  "sentiment": "positive/neutral/negative",
  ...
}
```

### Running analysis only (call must already be transcribed)
```bash
# Specific call
python3 call_self_service.py --callid 97393902541773212620 --analyze-only

# Or use the dedicated script
python3 run_single_analysis.py 97393902541773212620

# With JSON output
python3 call_self_service.py --callid 97393902541773212620 --analyze-only --output report.json
```

### Verify analysis saved
```sql
SELECT callid, quality_score, sentiment, call_purpose,
       agent_speak_percentage, customer_speak_percentage
FROM 6004_callanalytics
WHERE callid = '97393902541773212620';
```

---

## 8. Stage 4 — BANT Analysis

BANT is extracted **within the same Nova call** as the quality analysis (no separate API request). The prompt instructs the model to return a `customer_profile` block alongside the quality parameters.

### BANT fields extracted

| Field | What it captures |
|---|---|
| **Budget** | Any mention of budget, price sensitivity, current spend |
| **Authority** | Whether the caller is the decision maker or influencer |
| **Need** | The customer's explicit pain points and requirements |
| **Timeline** | Any urgency or timeframe mentioned for a purchase |

Each field returns:
```json
{
  "value": "Voice broadcasting services to avoid spam",
  "evidence": "exact transcript quote",
  "reasoning": "why this was inferred",
  "confidence": "high/medium/low"
}
```

The `save_bant_analysis()` method in `db_handler.py` writes these into `6004_bant_analysis`.

---

## 9. Stage 5 — Results & Storage

### What gets saved

| Table | What's written | Trigger |
|---|---|---|
| `6004_sarvamresponse` | Transcript + speaker segments | After Sarvam job completes |
| `6004_raw_calls` | `status=1`, `transcription_status='completed'` | After transcript saved |
| `6004_callanalytics` | Quality scores, sentiment, BANT, talk ratios | After Nova analysis |
| `6004_bant_analysis` | BANT fields per-call | After Nova analysis |

### Intermediate files (for debugging)
- `/tmp/sarvam_{callid}.json` — raw Sarvam API response
- `/tmp/analysis_{callid}.json` — raw Nova analysis response
- `/tmp/full_report_{callid}.json` — structured report (only when `--output` is used)

### Viewing results via API (dashboard backend on port 8002)
```bash
# List analyzed calls
GET /api/calls/6004?status=analyzed

# Single call details
GET /api/calls/6004/{callid}
```

---

## 10. Self-Service Module Reference

**File:** `dashboard-backend/call_self_service.py`

### Command-line options

| Flag | Description | Default |
|---|---|---|
| *(no flags)* | Auto-pick next eligible call, run full pipeline | — |
| `--callid <ID>` | Process a specific call ID | auto-pick |
| `--bid <BID>` | Business ID | `6004` |
| `--min-duration <N>` | Minimum duration in seconds | `300` (5 min) |
| `--transcribe-only` | Run transcription only, skip analysis | off |
| `--analyze-only` | Run analysis only (call must be transcribed) | off |
| `--list` | Print eligible calls and exit | off |
| `--output <path>` | Save full JSON report to file | not saved |
| `--quiet` | Suppress progress messages | off |

### Usage examples

```bash
# Full pipeline on next eligible call
python3 call_self_service.py

# Full pipeline on a specific call, save report
python3 call_self_service.py \
  --callid 97393902541773212620 \
  --output /tmp/report.json

# Only transcribe (no analysis)
python3 call_self_service.py --transcribe-only

# Only analyze a call already transcribed
python3 call_self_service.py \
  --callid 97393902541773212620 \
  --analyze-only

# List untranscribed calls above 10 minutes for BID 6004
python3 call_self_service.py --list --min-duration 600

# Process a call from a different BID
python3 call_self_service.py --bid 7408

# Silent mode (no progress, just the report)
python3 call_self_service.py --quiet
```

### Module internals

```
call_self_service.py
│
├── list_eligible_calls(bid, min_duration)
│       └── SQL: raw_calls LEFT JOIN sarvamresponse WHERE no transcript + duration filter
│
├── transcribe_audio(audio_url, callid)
│       ├── requests.get(audio_url)          # download WAV
│       ├── SarvamAI.create_job(...)          # submit to saaras:v2.5
│       ├── job.upload_files([tmp.wav])       # with 403-retry loop
│       ├── job.wait_until_complete(...)      # poll every 5s, timeout 600s
│       └── requests.get(download_url).json()# fetch result JSON
│
├── save_transcript(bid, callid, sarvam_result)
│       ├── parse diarized_transcript.entries → speaker_segments list
│       └── INSERT INTO {bid}_sarvamresponse + UPDATE {bid}_raw_calls
│
├── analyze_call(bid, callid, transcript, segments, duration)
│       ├── CallAnalyzer(config).analyze_call(...)
│       │       ├── get quality parameters from DB
│       │       ├── build_analysis_prompt(transcript, parameters)
│       │       ├── bedrock_runtime.invoke_model(nova-lite-v1:0)
│       │       ├── calculate_talk_listen_ratio(speaker_segments)
│       │       └── save_call_analytics() + save_bant_analysis()
│       └── returns analytics_data dict
│
├── print_report(...)         # formatted console output
└── build_json_report(...)    # structured dict for --output
```

---

## 11. Worked Example — Call 97393902541773212620

**Context:** Presales call between agent Richa vishwakarma and prospect Nitin (real estate company using voice broadcasting for marketing campaigns).

### Step 1 — Discovery
```bash
python3 call_self_service.py --list
# → 97393902541773212620  Richa vishwakarma  2026-03-11 12:33  16m 56s
```

### Step 2 — Transcription
```bash
python3 call_self_service.py --callid 97393902541773212620 --transcribe-only
```
- Downloaded: 15.4 MB WAV
- Sarvam Job ID: `20260311_15dac2d6-e2f4-4259-acef-268f0ffc0266`
- Result: 17,020 char transcript, 254 diarized segments
- Saved to: `6004_sarvamresponse`

**Transcript opening:**
> *"Hello? Hello, hello. Yes, Mr. Nitin, Richard is fine. Hi Richa, how are you?... Just first before my requirement, I exactly want to understand what exactly are your portfolio of services..."*

### Step 3 — Analysis
```bash
python3 call_self_service.py --callid 97393902541773212620 --analyze-only
```

### Results

#### Overall Quality Score: **87/100**

| Parameter | Score | Key Evidence |
|---|---|---|
| Professional greeting | 8/10 | "Hello? Yes, Mr. Nitin, Richard is fine" — slightly informal |
| Active listening | 14/15 | Acknowledged spam concern, asked clarifying questions |
| Building rapport | 8/10 | Friendly, informative tone throughout |
| Lead qualification | 9/10 | "How many people are there in your sales team, sir?" |
| Product knowledge | 14/15 | Explained IVR, click-to-call, OBD pricing clearly |
| Handling objections | 12/15 | Addressed DID spam & registration concerns |
| Clear communication | 4/5 | Minor repetition/filler words |
| Professional tone | 9/10 | Courteous throughout |
| Proper closing | 9/10 | Scheduled follow-up call |

#### Sentiment: Neutral

#### BANT Analysis

| Dimension | Value | Confidence |
|---|---|---|
| Budget | Not mentioned | Low |
| Authority | Decision Maker | Medium |
| **Need** | Voice broadcasting + cloud telephony to avoid spam & improve campaign effectiveness | **High** |
| Timeline | Not mentioned | Low |

**BANT Summary:** Decision maker seeking effective, cost-efficient voice broadcasting that avoids spam — primary pain point clearly established.

#### Objections
- Concerns about numbers being marked as spam
- Cost comparison with competitors
- DID registration complexity
- **Objection Type:** Service Effectiveness

#### Call Purpose
Prospect wants to understand Mcube's cloud telephony and voice broadcasting offering to solve their spam-marking problem on outbound campaigns.

---

## 12. Troubleshooting

### "No module named 'sarvamai'"
```bash
pip install sarvamai
```

### Sarvam upload fails with "403" repeatedly
This is an Azure Blob Storage intermittent 403. The pipeline retries 5 times. If it keeps failing:
- Check `SARVAM_PIPELINE_KEY` is valid
- Try re-running — the issue is transient

### "No transcript found" on analysis
The call hasn't been transcribed yet. Run without `--analyze-only`:
```bash
python3 call_self_service.py --callid <ID>
```

### AWS Bedrock `AccessDeniedException`
- Verify `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in `.env`
- Verify `AWS_REGION` matches where the model is enabled (`eu-north-1`)
- Check IAM permissions include `bedrock:InvokeModel`

### Talk-listen ratio shows 0% agent / 100% customer
Sarvam diarization occasionally assigns all segments to one speaker. This is a known limitation when audio quality is poor or both speakers have similar voice characteristics. The transcript content is still correct.

### "No quality parameters found for BID 6004"
The `quality_parameters` table may not have rows for BID 6004. Check:
```sql
SELECT COUNT(*) FROM quality_parameters WHERE bid = '6004';
```
If 0, the analyzer falls back to a generic analysis without parameter scores.

---

## 13. Key Files Reference

| File | Purpose |
|---|---|
| `call_self_service.py` | **Main self-service CLI** — full pipeline in one script |
| `run_single_transcription.py` | Transcribe a single specific call |
| `run_single_analysis.py` | Analyze a single already-transcribed call |
| `pipeline_6004.py` | Continuous pipeline daemon (sync+transcribe+analyze) |
| `analyze_calls_with_parameters.py` | Core quality analysis engine (`CallAnalyzer` class) |
| `agent_runner.py` | Generic AI agent runner (used by new `call_processor.py`) |
| `quality_parameters_handler.py` | Load/manage quality parameters from DB |
| `talk_listen_calculator.py` | Calculate talk:listen ratios from speaker segments |
| `db_handler.py` | All database operations (`save_call_analytics`, `save_bant_analysis`, etc.) |
| `config.py` | Environment-based configuration |
| `.env` | API keys and DB credentials |

---

*Documentation generated: 2026-03-11 | Pipeline version: saaras:v2.5 + nova-lite-v1:0*
