# Turns

Turn-level speech-to-text over WebSocket. The server emits events at turn boundaries.

```
wss://api.reson8.dev/v1/speech-to-text/turns
```

## Request

### Headers

| Header                   | Value                                         |
|--------------------------|-----------------------------------------------|
| Authorization            | `ApiKey <api_key>` or `Bearer <access_token>` |
| Sec-WebSocket-Protocol   | `bearer, <access_token>`                      |

See [Authentication](../../documentation/general/authentication.md) for which header to use in different situations.

### Query Parameters

| Parameter            | Type    | Default | Description                                                |
|----------------------|---------|---------|------------------------------------------------------------|
| `encoding`           | string  | `auto`  | Audio encoding: `auto`, `pcm_s16le`, `mulaw`, or `alaw`    |
| `sample_rate`        | number  | `16000` | Sample rate in Hz. Used only when the encoding requires it |
| `channels`           | number  | `1`     | Number of audio channels. Used only when the encoding requires it |
| `language`           | string  |         | Language to transcribe. Recommended for best quality. When omitted, the server auto-detects each utterance independently. See [Languages](../../documentation/speech-to-text/languages.md) for supported codes |
| `custom_model_id`    | string  |         | Optional. ID of a [custom model](../custom-model/create.md) to bias transcription. Overrides the model configured on the API client |
| `include_timestamps` | boolean | `false` | Include `start_ms` and `duration_ms` on `turn_end_candidate` and on each entry in `words` |
| `include_words`      | boolean | `false` | Include word-level detail on `turn_end_candidate`          |
| `include_language`   | boolean | `false` | Include detected `language` on `turn_end_candidate`        |
| `patterns`           | string  |         | Optional. Comma-separated regex-style patterns for short alphanumeric tokens (order codes, licence plates) to recover. Only set when the token is likely present — see [Patterns](../../documentation/speech-to-text/patterns.md) |
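
The parameters above are appended to the endpoint URL as a query string. A minimal sketch of one way to build it with the standard library follows; `build_turns_url` is a hypothetical helper, not part of any SDK, and the `en` language code and lowercase `true`/`false` boolean spelling are assumptions based on the examples on this page.

```python
from urllib.parse import urlencode

BASE_URL = "wss://api.reson8.dev/v1/speech-to-text/turns"

def build_turns_url(**params):
    """Return the endpoint URL with the given query parameters appended.

    Hypothetical helper for illustration only.
    """
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # leave the parameter out so the server default applies
        if isinstance(value, bool):
            # Assumption: booleans are spelled in lowercase, as in `include_words=true`
            value = "true" if value else "false"
        query[name] = value
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL

url = build_turns_url(
    encoding="pcm_s16le",
    sample_rate=16000,
    language="en",
    include_words=True,
)
print(url)
```

`urlencode` also percent-encodes values, which matters for `patterns`, where regex-style characters would otherwise be misread as part of the URL.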

### Example

=== "Python"

    ```python
    import asyncio
    import websockets

    async def transcribe():
        url = "wss://api.reson8.dev/v1/speech-to-text/turns"
        headers = {"Authorization": "ApiKey <your_api_key>"}

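        # `additional_headers` is the keyword in recent websockets releases;
        # older releases use `extra_headers` instead.
        async with websockets.connect(url, additional_headers=headers) as ws: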
            # Send audio data...

            async for message in ws:
                print(message)

    asyncio.run(transcribe())
    ```

=== "JavaScript (Browser)"

    ```javascript
    const token = "<your_access_token>";
    const url = "wss://api.reson8.dev/v1/speech-to-text/turns";

    // Passes token via Sec-WebSocket-Protocol header
    const ws = new WebSocket(url, ["bearer", token]);

    ws.onopen = () => {
      // Send audio data...
    };

    ws.onmessage = (event) => {
      console.log(event.data);
    };
    ```

## Sending Messages

### Audio

Send audio as binary WebSocket frames. Each frame contains audio data in the encoding selected by the `encoding` query parameter.
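
As a rough sketch of how a client might split raw audio into binary frames, the example below slices 16-bit PCM into fixed-duration chunks and sends each one as a frame. The 100 ms chunk duration is an assumption chosen for illustration (this page does not specify a frame size), and `iter_audio_chunks` and `send_audio` are hypothetical helper names.

```python
def iter_audio_chunks(audio: bytes, sample_rate: int = 16000,
                      channels: int = 1, chunk_ms: int = 100):
    """Yield successive slices of raw pcm_s16le audio, about chunk_ms each.

    Assumption: 100 ms per frame is an illustrative choice, not a documented limit.
    """
    bytes_per_sample = 2  # pcm_s16le uses 16-bit (2-byte) samples
    chunk_size = sample_rate * channels * bytes_per_sample * chunk_ms // 1000
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]

async def send_audio(ws, audio: bytes):
    """Send each chunk as one binary frame.

    `ws` is assumed to be an open connection from the `websockets` library,
    whose `send` transmits `bytes` as a binary frame.
    """
    for chunk in iter_audio_chunks(audio):
        await ws.send(chunk)

# One second of 16 kHz mono silence splits into ten 3200-byte chunks.
chunks = list(iter_audio_chunks(bytes(32000)))
print(len(chunks), len(chunks[0]))
```

For live sources such as a microphone, the chunks would typically be sent as they are captured rather than read from a buffer up front.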

## Receiving Messages

### Turn Start

Sent when the speaker begins a new turn. Not emitted when a turn resumes after a [Turn Continuation](#turn-continuation).

```json
{
  "type": "turn_start"
}
```

### Turn End Candidate

Sent when the server detects a silence that is likely to end the turn. The candidate is either confirmed by a `turn_end` or cancelled by a `turn_continuation`.

=== "Default"

    ```json
    {
      "type": "turn_end_candidate",
      "text": "the patient presented with chest pain"
    }
    ```

=== "Everything Included"

    ```json
    {
      "type": "turn_end_candidate",
      "text": "the patient presented with chest pain",
      "language": "en",
      "start_ms": 1200,
      "duration_ms": 2400,
      "words": [
        { "text": "the", "start_ms": 1200, "duration_ms": 200 },
        { "text": "patient", "start_ms": 1410, "duration_ms": 450 },
        { "text": "presented", "start_ms": 1880, "duration_ms": 500 },
        { "text": "with", "start_ms": 2400, "duration_ms": 200 },
        { "text": "chest", "start_ms": 2620, "duration_ms": 350 },
        { "text": "pain", "start_ms": 3000, "duration_ms": 600 }
      ]
    }
    ```

| Field         | Type   | Included                       | Description                |
|---------------|--------|--------------------------------|----------------------------|
| `text`        | string | Always                         | The recognized text        |
| `language`    | string | When `include_language=true`   | The detected language code |
| `start_ms`    | number | When `include_timestamps=true` | Start time in milliseconds |
| `duration_ms` | number | When `include_timestamps=true` | Duration in milliseconds   |
| `words`       | array  | When `include_words=true`      | Word-level detail          |

Each word contains:

| Field         | Type   | Included                       | Description                |
|---------------|--------|--------------------------------|----------------------------|
| `text`        | string | Always                         | The recognized word        |
| `start_ms`    | number | When `include_timestamps=true` | Start time in milliseconds |
| `duration_ms` | number | When `include_timestamps=true` | Duration in milliseconds   |
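
When both `include_words` and `include_timestamps` are enabled, each word's end time is `start_ms + duration_ms`. The sketch below applies that arithmetic to the "Everything Included" payload above to derive word end times and the pauses between words; `word_end_ms` and `gaps_between_words` are hypothetical helper names.

```python
def word_end_ms(word: dict) -> int:
    """Return the time at which a word ends, in milliseconds."""
    return word["start_ms"] + word["duration_ms"]

def gaps_between_words(words: list) -> list:
    """Return the pause in milliseconds between each consecutive pair of words."""
    return [nxt["start_ms"] - word_end_ms(cur) for cur, nxt in zip(words, words[1:])]

# The `words` array from the "Everything Included" example.
words = [
    {"text": "the", "start_ms": 1200, "duration_ms": 200},
    {"text": "patient", "start_ms": 1410, "duration_ms": 450},
    {"text": "presented", "start_ms": 1880, "duration_ms": 500},
    {"text": "with", "start_ms": 2400, "duration_ms": 200},
    {"text": "chest", "start_ms": 2620, "duration_ms": 350},
    {"text": "pain", "start_ms": 3000, "duration_ms": 600},
]

print(gaps_between_words(words))  # [10, 20, 20, 20, 30]
print(word_end_ms(words[-1]))     # 3600
```

In that example the last word ends at 3600 ms, which matches the turn-level `start_ms + duration_ms` (1200 + 2400).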

### Turn End

Confirms that the previous `turn_end_candidate` marks the final end of the turn.

```json
{
  "type": "turn_end"
}
```

### Turn Continuation

Cancels the previous `turn_end_candidate` because the speaker resumed speaking. A later `turn_end_candidate` will include the full text from before and after the continuation.

```json
{
  "type": "turn_continuation"
}
```
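
The four message types together form a small state machine: a candidate is held until it is either confirmed or cancelled. The following is a minimal client-side sketch, assuming each server message arrives as a JSON text frame shaped like the examples above; `TurnTracker` is a hypothetical class, not part of any SDK.

```python
import json

class TurnTracker:
    """Track turn events and collect the text of confirmed turns."""

    def __init__(self):
        self.in_turn = False    # True between turn_start and turn_end
        self.candidate = None   # text of the pending turn_end_candidate, if any
        self.completed = []     # text of every confirmed turn, in order

    def handle(self, raw):
        """Process one server message (assumed to be a JSON string)."""
        event = json.loads(raw)
        kind = event["type"]
        if kind == "turn_start":
            self.in_turn = True
            self.candidate = None
        elif kind == "turn_end_candidate":
            # Hold the text until it is confirmed or cancelled.
            self.candidate = event["text"]
        elif kind == "turn_continuation":
            # The speaker resumed; a later candidate carries the full text.
            self.candidate = None
        elif kind == "turn_end":
            if self.candidate is not None:
                self.completed.append(self.candidate)
            self.candidate = None
            self.in_turn = False

# Simulated message sequence: one turn with a cancelled candidate.
tracker = TurnTracker()
for message in [
    '{"type": "turn_start"}',
    '{"type": "turn_end_candidate", "text": "the patient presented"}',
    '{"type": "turn_continuation"}',
    '{"type": "turn_end_candidate", "text": "the patient presented with chest pain"}',
    '{"type": "turn_end"}',
]:
    tracker.handle(message)

print(tracker.completed)
```

Discarding the cancelled candidate instead of concatenating it avoids duplicated text, since the later candidate already contains the words from before the continuation.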

## Errors

| Status | Code              | Description                    |
|--------|-------------------|--------------------------------|
| 400    | `INVALID_REQUEST` | Missing or invalid parameters  |
| 401    | `UNAUTHORIZED`    | Invalid or expired credentials |
| 500    | `INTERNAL_ERROR`  | Unexpected server error        |
