# Webhook Payloads

Gensail delivers a webhook payload when an algorithm run finishes, whether it succeeds or fails. This guide documents the payload structure for both outcomes.

## Payload Types

Gensail sends one webhook per algorithm run. Because a job can have multiple algorithms configured, a single job may produce several webhooks. Each payload includes a `status` field that indicates the outcome:

| Status | Description |
|  --- | --- |
| `completed` | Algorithm processed successfully, extracted data included |
| `failed` | Algorithm processing failed, error details included |


## Success Payload (`status: "completed"`)

When an algorithm completes successfully, the webhook payload includes the full extracted data:

```json
{
    "status": "completed",
    "timestamp": "2026-01-22T12:00:00.000000Z",
    "call": {
        "algorithm_run_id": 12345,
        "algorithm_id": "appointment_scheduling",
        "job_id": 7913,
        "recording_url": "https://storage.example.com/recording.mp3",
        "recording_duration": 120.5,
        "contacted_date": "2026-01-22",
        "contacted_time": "10:30:00",
        "answered_by": "Sofia",
        "insurance_type": "PPO",
        "extracted_source": "Direct",
        "call_sentiment": "Positive",
        "call_sentiment_reason": "Patient scheduled appointment and expressed satisfaction",
        "call_summary": "Patient called to schedule a dental checkup...",
        "call_transcript": "Receptionist: Thank you for calling...",
        "confidence_score": 85.5,
        "annotated_transcript": "Receptionist: Thank you for calling <answered_by>Sofia</answered_by>..."
    },
    "data": [
        {
            "patient": {
                "name": "John Doe",
                "phone": "+15551234567",
                "email": "john@example.com",
                "date_of_birth": "1985-03-15",
                "additional_spellings": ["Jon Doe"]
            },
            "appointment": {
                "patient_type": "New Patient",
                "visit_reason": "General Checkup",
                "scheduled_date": "2026-01-25",
                "scheduled_time": "14:00:00",
                "notes": "Patient prefers afternoon appointments"
            }
        }
    ],
    "confidence": {
        "overall_score": 85.5,
        "flags": ["high_confidence"],
        "breakdown": {
            "call": {
                "answered_by": {
                    "score": 5.0,
                    "max": 5,
                    "confidence": "high",
                    "agreement": 1.0
                }
            },
            "data": [
                {
                    "patient": {
                        "name": {
                            "score": 10.0,
                            "max": 10,
                            "confidence": "high"
                        }
                    }
                }
            ]
        }
    }
}
```
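The example payload above suggests that each leaf of `confidence.breakdown` carries a `score` and a `max`. As a minimal sketch, assuming that shape holds for every field (this guide does not spell it out), the hypothetical helper `low_confidence_fields` walks the breakdown and lists the fields whose `score / max` ratio falls below a threshold you choose:

```python
def low_confidence_fields(breakdown, threshold=0.8, path=""):
    """Return (path, ratio) pairs for leaves scoring below `threshold`.

    Assumes a leaf is a dict holding numeric `score` and `max` keys, as in
    the example payload; any other dict or list is treated as a container.
    """
    found = []
    if isinstance(breakdown, dict):
        if "score" in breakdown and "max" in breakdown:
            maximum = breakdown["max"]
            ratio = breakdown["score"] / maximum if maximum else 0.0
            if ratio < threshold:
                found.append((path, ratio))
            return found
        for key, value in breakdown.items():
            child = f"{path}.{key}" if path else key
            found.extend(low_confidence_fields(value, threshold, child))
    elif isinstance(breakdown, list):
        for index, value in enumerate(breakdown):
            found.extend(low_confidence_fields(value, threshold, f"{path}[{index}]"))
    return found


# Trimmed-down breakdown; the 4.0 score is made up for illustration.
example = {
    "call": {"answered_by": {"score": 5.0, "max": 5, "confidence": "high"}},
    "data": [{"patient": {"name": {"score": 4.0, "max": 10, "confidence": "low"}}}],
}
print(low_confidence_fields(example))
# [('data[0].patient.name', 0.4)]
```

The 0.8 default is arbitrary; pick a threshold that matches how much manual review you can afford.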

### Success Payload Fields

#### Top Level

| Field | Type | Description |
|  --- | --- | --- |
| `status` | string | Always `"completed"` for successful processing |
| `timestamp` | string | ISO 8601 timestamp when webhook was generated |
| `call` | object | Call-level metadata and extracted fields |
| `data` | array | Array of patient/appointment entries |
| `confidence` | object | Confidence scoring breakdown |


#### Call Section

| Field | Type | Description |
|  --- | --- | --- |
| `algorithm_run_id` | integer | Unique ID of this algorithm run |
| `algorithm_id` | string | Algorithm that processed this call (e.g., `appointment_scheduling`) |
| `job_id` | integer | Job ID in the pipeline |
| `recording_url` | string | URL of the source audio recording |
| `recording_duration` | number | Duration in seconds |
| `contacted_date` | string | Date when call occurred (YYYY-MM-DD) |
| `contacted_time` | string | Time when call occurred (HH:MM:SS) |
| `answered_by` | string | Name of the person who answered |
| `call_transcript` | string | Full transcript with speaker labels |
| `confidence_score` | number | Overall confidence score (0-100) |
| `annotated_transcript` | string or array | Transcript with extraction annotations (see below) |

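The date and time fields arrive as plain strings. A minimal sketch of parsing them with the Python standard library follows. It assumes the formats match the example payload exactly, and it leaves `contacted_date`/`contacted_time` as a naive datetime because this guide does not state their time zone. `parse_call_times` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def parse_call_times(payload):
    """Parse the webhook timestamp and the call's contact date/time."""
    # The example timestamp ends in "Z", the ISO 8601 designator for UTC.
    generated_at = datetime.strptime(
        payload["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)

    call = payload["call"]
    # No time zone is documented for these fields, so keep the result naive.
    contacted_at = datetime.strptime(
        f"{call['contacted_date']} {call['contacted_time']}", "%Y-%m-%d %H:%M:%S"
    )
    return generated_at, contacted_at


generated_at, contacted_at = parse_call_times({
    "timestamp": "2026-01-22T12:00:00.000000Z",
    "call": {"contacted_date": "2026-01-22", "contacted_time": "10:30:00"},
})
print(generated_at.isoformat())  # 2026-01-22T12:00:00+00:00
print(contacted_at.isoformat())  # 2026-01-22T10:30:00
```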

Additional passthrough fields from the source system may be included based on your webhook configuration.

### Annotated Transcript Format

The `annotated_transcript` field contains the transcript with XML-style annotations marking extracted values. The format depends on your transcription configuration:

#### String Format (default)

When `utterances` is not enabled, `annotated_transcript` is a string with newline-separated lines:

```json
{
    "annotated_transcript": "Receptionist: Thank you for calling. This is <answered_by>Sofia</answered_by>.\nPatient: Hi, my name is <patient.name data-index=\"0\">John</patient.name>."
}
```
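To pull the tagged values back out of the string form, a regular expression over the XML-style tags is usually enough. The sketch below assumes tags are never nested and that `data-index` is the only attribute, which is all the examples in this guide show. `extract_annotations` and `strip_annotations` are hypothetical helpers.

```python
import re

# Matches <field ...>value</field>; the closing tag must repeat the opening name.
ANNOTATION = re.compile(
    r'<(?P<field>[\w.]+)(?:\s+data-index="(?P<index>\d+)")?>'
    r'(?P<value>.*?)</(?P=field)>',
    re.DOTALL,
)


def extract_annotations(transcript):
    """Return a list of (field, data_index, value) tuples found in `transcript`."""
    results = []
    for match in ANNOTATION.finditer(transcript):
        index = match.group("index")
        results.append((
            match.group("field"),
            int(index) if index is not None else None,
            match.group("value"),
        ))
    return results


def strip_annotations(transcript):
    """Remove the tags but keep the tagged text, yielding a plain transcript."""
    return ANNOTATION.sub(lambda m: m.group("value"), transcript)


text = (
    "Receptionist: Thank you for calling. This is <answered_by>Sofia</answered_by>.\n"
    'Patient: Hi, my name is <patient.name data-index="0">John</patient.name>.'
)
print(extract_annotations(text))
# [('answered_by', None, 'Sofia'), ('patient.name', 0, 'John')]
```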

#### Array Format (when `utterances: true`)

When `utterances: true` is enabled in transcription settings (alongside `diarize: true`), `annotated_transcript` becomes an array of objects with timing information:

```json
{
    "annotated_transcript": [
        {
            "name": "Receptionist",
            "text": "Thank you for calling. This is <answered_by>Sofia</answered_by>.",
            "start_time": 0.0
        },
        {
            "name": "Patient",
            "text": "Hi, my name is <patient.name data-index=\"0\">John</patient.name>.",
            "start_time": 2.5
        }
    ]
}
```

#### Array Entry Fields

| Field | Type | Description |
|  --- | --- | --- |
| `name` | string or null | Speaker role (e.g., "Receptionist", "Patient"). Null for voicemails. |
| `text` | string | Utterance text with XML annotation tags |
| `start_time` | number or null | Start timestamp in seconds. Null if alignment failed. |


#### Edge Cases

**Voicemail calls**: Speaker labels are stripped, so `name` will be `null` for all entries:

```json
{
    "annotated_transcript": [
        {
            "name": null,
            "text": "Hi, this is <patient.name data-index=\"0\">Sarah</patient.name>. Please call me back.",
            "start_time": null
        }
    ]
}
```

**Alignment failure**: If timestamps cannot be aligned with transcript lines, `start_time` will be `null` but the array format is still used:

```json
{
    "annotated_transcript": [
        {
            "name": "Receptionist",
            "text": "Hello, dental office.",
            "start_time": null
        }
    ]
}
```
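Because the field can be either a string or an array, consumers may find it simpler to normalize both into one shape before processing. The following is a sketch under the assumption that string-format lines look like `Speaker: text`; lines without that prefix (as with voicemails, where labels are stripped) get a `name` of `None`. `normalize_transcript` is a hypothetical helper, not part of any Gensail SDK.

```python
def normalize_transcript(annotated_transcript):
    """Return a list of {"name", "text", "start_time"} dicts for either format."""
    if isinstance(annotated_transcript, list):
        # Array format: copy the documented keys, tolerating nulls.
        return [
            {
                "name": entry.get("name"),
                "text": entry.get("text", ""),
                "start_time": entry.get("start_time"),
            }
            for entry in annotated_transcript
        ]

    utterances = []
    for line in annotated_transcript.split("\n"):
        if not line.strip():
            continue
        # Assumption: a speaker label is a prefix ending in ": " with no tags in it.
        name, sep, text = line.partition(": ")
        if not sep or "<" in name:
            name, text = None, line
        # The string format carries no timing information.
        utterances.append({"name": name, "text": text, "start_time": None})
    return utterances


print(normalize_transcript("Receptionist: Hello, dental office.\nPatient: Hi."))
```

Downstream code can then iterate over one list shape and treat `None` in `name` or `start_time` as "unknown".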

## Failure Payload (`status: "failed"`)

When algorithm processing fails, a failure webhook is sent with error details:

```json
{
    "status": "failed",
    "timestamp": "2026-01-22T12:00:00.000000Z",
    "call": {
        "algorithm_run_id": 12346,
        "algorithm_id": "appointment_scheduling",
        "job_id": 7914,
        "recording_url": "https://storage.example.com/recording.mp3",
        "recording_duration": 45.0,
        "contacted_date": "2026-01-22",
        "contacted_time": "11:00:00"
    },
    "error": {
        "stage": "extraction",
        "message": "LLM failed to extract structured data from transcript",
        "code": "EXTRACTION_FAILED"
    }
}
```

### Failure Payload Fields

#### Top Level

| Field | Type | Description |
|  --- | --- | --- |
| `status` | string | Always `"failed"` for failed processing |
| `timestamp` | string | ISO 8601 timestamp when webhook was generated |
| `call` | object | Minimal call metadata (no transcript) |
| `error` | object | Error details |


#### Call Section (Failure)

The failure payload includes a minimal `call` section for identification:

| Field | Type | Description |
|  --- | --- | --- |
| `algorithm_run_id` | integer | Unique ID of the failed algorithm run |
| `algorithm_id` | string | Algorithm that failed |
| `job_id` | integer | Job ID in the pipeline |
| `recording_url` | string | URL of the source audio recording |
| `recording_duration` | number | Duration in seconds (if available) |
| `contacted_date` | string | Date when call occurred |
| `contacted_time` | string | Time when call occurred |


**Note:** The transcript is NOT included in failure payloads to keep them minimal.

#### Error Section

| Field | Type | Description |
|  --- | --- | --- |
| `stage` | string | Processing stage where failure occurred |
| `message` | string | Human-readable error description |
| `code` | string | Machine-readable error code |


### Error Codes

| Code | Stage | Description |
|  --- | --- | --- |
| `TRANSCRIPTION_FAILED` | transcription | Audio transcription failed |
| `EXTRACTION_FAILED` | extraction | LLM failed to extract structured data |
| `VALIDATION_FAILED` | validation | Extracted data failed schema validation |
| `PROCESSING_FAILED` | (any) | General processing failure |

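One way to keep failure handling tidy is to mirror the table above in code and fall back to the general code for anything unrecognized. This is only a sketch: the `ErrorCode` enum and the `parse_error_code` and `describe_failure` helpers are hypothetical, and treating unknown codes as `PROCESSING_FAILED` is an assumption rather than documented behavior.

```python
from enum import Enum


class ErrorCode(str, Enum):
    TRANSCRIPTION_FAILED = "TRANSCRIPTION_FAILED"
    EXTRACTION_FAILED = "EXTRACTION_FAILED"
    VALIDATION_FAILED = "VALIDATION_FAILED"
    PROCESSING_FAILED = "PROCESSING_FAILED"


def parse_error_code(raw):
    """Map the payload's `error.code` to ErrorCode, defaulting to the general code."""
    try:
        return ErrorCode(raw)
    except ValueError:
        return ErrorCode.PROCESSING_FAILED


def describe_failure(payload):
    """Build a one-line summary of a failure payload for logs or alerts."""
    call, error = payload["call"], payload["error"]
    code = parse_error_code(error.get("code"))
    return (
        f"run {call['algorithm_run_id']} (job {call['job_id']}) failed at "
        f"{error.get('stage', 'unknown')}: {code.value}: {error.get('message', '')}"
    )


failure = {
    "status": "failed",
    "call": {"algorithm_run_id": 12346, "job_id": 7914},
    "error": {
        "stage": "extraction",
        "message": "LLM failed to extract structured data from transcript",
        "code": "EXTRACTION_FAILED",
    },
}
print(describe_failure(failure))
```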

**Notes:**

- Webhook delivery failures (stage `publishing`) do NOT trigger failure webhooks. This avoids infinite loops.
- Transcription failures (stage `transcription`) generate a failure webhook for each configured algorithm, even though no extraction was attempted. This ensures you are notified of every processing failure.


## Handling Both Payload Types

Your webhook endpoint should check the `status` field to determine how to process the payload:

### Python Example

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    # silent=True returns None instead of raising on a malformed body
    payload = request.get_json(silent=True) or {}
    status = payload.get('status')

    if status == 'completed':
        # Process successful extraction
        job_id = payload['call']['job_id']
        algorithm_run_id = payload['call']['algorithm_run_id']
        patients = payload['data']

        for entry in patients:
            patient = entry.get('patient', {})
            appointment = entry.get('appointment', {})
            # ... process extracted data

        return jsonify({'status': 'processed'}), 200

    elif status == 'failed':
        # Handle failure notification
        job_id = payload['call']['job_id']
        algorithm_run_id = payload['call']['algorithm_run_id']
        error = payload['error']

        # Log through Flask's built-in logger
        app.logger.error(
            'Job %s (run %s) failed at %s: %s (%s)',
            job_id, algorithm_run_id,
            error['stage'], error['message'], error['code'],
        )

        return jsonify({'status': 'acknowledged'}), 200

    # Missing or unrecognized status
    return jsonify({'error': 'Unknown status'}), 400
```

### Node.js Example

```javascript
const express = require('express');

const app = express();
app.use(express.json()); // parse JSON request bodies into req.body

app.post('/webhook', (req, res) => {
    const payload = req.body || {};

    if (payload.status === 'completed') {
        // Process successful extraction
        const { job_id, algorithm_run_id } = payload.call;
        const patients = payload.data;

        patients.forEach(entry => {
            const { patient, appointment } = entry;
            // ... process extracted data
        });

        res.json({ status: 'processed' });

    } else if (payload.status === 'failed') {
        // Handle failure notification
        const { job_id, algorithm_run_id } = payload.call;
        const { stage, code, message } = payload.error;

        console.error(`Job ${job_id} (run ${algorithm_run_id}) failed at ${stage}: ${message} (${code})`);

        res.json({ status: 'acknowledged' });

    } else {
        // Always respond; otherwise the request hangs until it times out
        res.status(400).json({ error: 'Unknown status' });
    }
});
```

## Webhook Delivery

- Webhooks are sent per algorithm run (a job may have multiple algorithms)
- Both success and failure webhooks are sent to the same configured URL
- Failed webhook deliveries are retried up to 3 times with exponential backoff
- All webhooks include `X-Job-ID` and `X-Algorithm-ID` headers for routing

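Since failed deliveries are retried, an endpoint could plausibly receive the same algorithm run more than once, for example if it processed a request but its response was lost. This guide does not say whether that can happen, so the sketch below simply guards against it by remembering which `algorithm_run_id` and `status` pairs were already handled. The in-memory set and the `handle_once` name are illustrative; a real deployment would likely use a database or cache instead.

```python
import threading

_seen_runs = set()
_seen_lock = threading.Lock()


def handle_once(payload, process):
    """Call `process(payload)` only the first time a run/status pair is seen.

    Returns True if the payload was processed, False if it was a duplicate.
    """
    key = (payload["call"]["algorithm_run_id"], payload["status"])
    with _seen_lock:
        if key in _seen_runs:
            return False
        _seen_runs.add(key)
    try:
        process(payload)
    except Exception:
        # Forget the key so that a retried delivery can be processed again.
        with _seen_lock:
            _seen_runs.discard(key)
        raise
    return True


handled = []
delivery = {"status": "completed", "call": {"algorithm_run_id": 12345}}
print(handle_once(delivery, handled.append))  # True
print(handle_once(delivery, handled.append))  # False
```

Returning a 200 response for duplicates as well as first deliveries keeps the sender from retrying a run you have already handled.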

## Related Documentation

- [Webhook Authentication](/gensail-analytics/guides/webhook-authentication) - Verify webhook signatures
- [Webhook Credential Configuration](/gensail-analytics/guides/webhook-credential-configuration) - Configure outgoing authentication


## Support

For questions about webhook payloads, contact [support@gensail.com](mailto:support@gensail.com).