LiveKit Agents

Use Reson8 speech-to-text with LiveKit Agents to build real-time voice AI applications.

Installation

pip install livekit-plugins-reson8

Quick start

from livekit.agents import AutoSubscribe, JobContext, WorkerOptions, cli
from livekit.agents.voice import VoiceAgent
from livekit.plugins import openai, reson8

async def entrypoint(ctx: JobContext):
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    agent = VoiceAgent(
        instructions="You are a helpful assistant.",
        stt=reson8.STT(),   # streaming + server-side turn detection, any language
        llm=openai.LLM(),   # any LLM
        tts=openai.TTS(),   # any TTS
    )
    agent.start(ctx.room)

    await agent.say("Hallo, hoe kan ik je helpen?")

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

reson8.STT() auto-detects the spoken language. Pass any language code to pin it, e.g. reson8.STT(language="nl") for Dutch. The LLM and TTS can be any provider supported by LiveKit, including models available through LiveKit Inference.

How it works

Microphone → Reson8 STT (streaming + turn detection) → LLM → TTS → Speaker

When LiveKit uses the plugin in a voice agent, reson8.STT() opens a streaming connection to Reson8's turn-aware endpoint. Reson8 detects conversational turn boundaries server-side — no separate VAD plugin is required:

  1. Preflight transcript — an eager guess that the turn is over, so your agent can start responding immediately.
  2. Final transcript — confirms the preflight once the turn really is complete.
  3. Cancellation — if the speaker keeps talking, the preflight is withdrawn and streaming continues.

This server-side turn detection keeps voice-agent responses low-latency while avoiding premature interruptions when the user pauses mid-sentence.
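
The three signals above amount to a small state machine on the agent side. The sketch below is illustrative only: TurnEvent, TurnTracker, and the returned action strings are hypothetical names, not part of the Reson8 plugin or LiveKit, which handle this flow internally. It only shows how a speculative reply started on a preflight is either committed or thrown away.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class TurnEvent(Enum):
    # Hypothetical labels for the three signals described above.
    PREFLIGHT = auto()   # eager guess that the turn is over
    FINAL = auto()       # the turn really is complete
    CANCEL = auto()      # the speaker kept talking; preflight withdrawn


@dataclass
class TurnTracker:
    """Tracks at most one speculative reply per conversational turn."""
    pending: Optional[str] = None                     # transcript behind the speculative reply
    committed: List[str] = field(default_factory=list)
    discarded: int = 0                                # speculative replies thrown away

    def handle(self, event: TurnEvent, text: str = "") -> str:
        if event is TurnEvent.PREFLIGHT:
            # Start generating a reply early, but do not commit to it yet.
            self.pending = text
            return "start_reply"
        if event is TurnEvent.CANCEL:
            # The user was only pausing: drop the speculative reply.
            if self.pending is not None:
                self.discarded += 1
            self.pending = None
            return "abort_reply"
        if event is TurnEvent.FINAL:
            # The turn is confirmed: the reply may now be spoken.
            final_text = text or self.pending or ""
            self.pending = None
            self.committed.append(final_text)
            return "commit_reply"
        raise ValueError(f"unknown event: {event!r}")
```

For example, a preflight followed by a cancellation leaves nothing committed, while a later preflight followed by a final transcript commits exactly one turn. This is why a mid-sentence pause costs a discarded draft rather than an interruption.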

Upgrading from Silero VAD

Earlier versions of this guide paired Reson8's batch API with livekit-plugins-silero and a StreamAdapter. The current plugin streams with built-in turn detection, so you no longer need livekit-plugins-silero, livekit-plugins-turn-detector, or a StreamAdapter — just pass reson8.STT() directly as the agent's stt.

Configuration

Only api_key is required; it can be passed directly or supplied through the RESON8_API_KEY environment variable. All other parameters are optional.

Parameter           Required  Default                 Description
api_key             Yes       RESON8_API_KEY env var  API key from console.reson8.dev
api_url             No        https://api.reson8.dev  API base URL
language            No        None (auto-detect)      One of the supported languages
custom_model_id     No        None                    Custom model ID for domain-specific transcription
sample_rate         No        16000                   Audio sample rate in Hz
encoding            No        pcm_s16le               Audio encoding
channels            No        1                       Number of audio channels
include_timestamps  No        False                   Include timing data on transcripts
include_words       No        False                   Include word-level detail
include_confidence  No        False                   Include confidence scores (batch recognition)
include_language    No        False                   Report the detected language while streaming

Call STT.update_options(...) to change settings at runtime; active streaming sessions reconnect automatically to apply them.
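
To make the audio defaults concrete, here is a small worked calculation of how much data the default format produces. It relies only on the fact that pcm_s16le is signed 16-bit PCM (2 bytes per sample). The helper names and the 20 ms frame length are assumptions chosen for illustration, not values taken from the plugin.

```python
# Bytes per sample for the encoding listed in the table above.
# pcm_s16le = signed 16-bit little-endian PCM, i.e. 2 bytes per sample.
BYTES_PER_SAMPLE = {"pcm_s16le": 2}


def bytes_per_second(sample_rate: int = 16000,
                     encoding: str = "pcm_s16le",
                     channels: int = 1) -> int:
    """Raw audio throughput for the given format, in bytes per second."""
    return sample_rate * BYTES_PER_SAMPLE[encoding] * channels


def frame_bytes(duration_ms: int, **audio_format) -> int:
    """Size in bytes of one audio frame lasting duration_ms milliseconds."""
    return bytes_per_second(**audio_format) * duration_ms // 1000


# With the defaults: 16000 samples/s * 2 bytes * 1 channel = 32000 bytes/s,
# so an (assumed) 20 ms frame carries 640 bytes.
default_rate = bytes_per_second()
default_frame = frame_bytes(20)
```

If the audio source runs at a different rate or channel count, sample_rate and channels should describe the audio actually being sent; the same arithmetic then gives the expected frame size.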

Environment variables

RESON8_API_KEY=your-api-key
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-livekit-key
LIVEKIT_API_SECRET=your-livekit-secret
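
A missing variable usually surfaces as an authentication or connection error at runtime, so it can help to check for all four before starting the worker. The following is a minimal sketch using only the standard library; missing_env_vars is a hypothetical helper, not part of LiveKit or the Reson8 plugin.

```python
import os
from typing import List, Mapping, Optional

# The four variables listed above.
REQUIRED_VARS = (
    "RESON8_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

One way to use it is to call missing_env_vars() before cli.run_app(...) and exit with a message naming the missing variables if the returned list is not empty.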