xAI

Overview

xAI provides two text-to-speech services:

XAIHttpTTSService: Batch synthesis via HTTP API. Sends complete text and receives the full audio response.
XAITTSService: Streaming synthesis via WebSocket. Streams text incrementally and receives audio chunks as they’re synthesized, reducing latency.

Both support multiple languages and audio encoding formats.

xAI TTS API Reference: Complete API reference for all parameters and methods.
WebSocket Example: Streaming WebSocket example with interruption handling.
HTTP Example: Batch HTTP example.
xAI Documentation: Official xAI voice API documentation.

Installation

uv add "pipecat-ai[xai]"

Prerequisites

xAI Account: Sign up at xAI
API Key: Generate an API key from your account dashboard (also works with Grok API keys)

Set the following environment variable:

export GROK_API_KEY=your_api_key
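
The usage examples on this page read the key with os.getenv, which returns None when the variable is unset, so a missing key only surfaces later as an authentication failure. A minimal sketch of a fail-fast lookup follows; load_xai_api_key is a hypothetical helper for illustration, not part of Pipecat.

```python
import os
from typing import Mapping


def load_xai_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the key stored in GROK_API_KEY, or raise if it is unset or blank."""
    # Hypothetical helper: strips stray whitespace and rejects empty values.
    key = env.get("GROK_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROK_API_KEY is not set; export it before starting the bot")
    return key
```

The returned string can then be passed as api_key to either service.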

Configuration

XAIHttpTTSService

Parameter	Type	Default	Description
`api_key`	`str`	required	xAI API key for authentication.
`base_url`	`str`	`"https://api.x.ai/v1/tts"`	xAI TTS endpoint URL. Override for custom or proxied deployments.
`sample_rate`	`int`	`None`	Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
`encoding`	`str`	`"pcm"`	Output audio encoding format. Supported formats: "pcm", "mp3", "wav", "mulaw", "alaw".
`aiohttp_session`	`aiohttp.ClientSession`	`None`	Optional shared aiohttp session for HTTP requests. If None, the service creates and manages its own session.
`settings`	`XAIHttpTTSService.Settings`	`None`	Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using XAIHttpTTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.

Parameter	Type	Default	Description
`model`	`str`	`None`	Model identifier. (Inherited from base settings.)
`voice`	`str`	`"eve"`	Voice identifier. (Inherited from base settings.)
`language`	`Language | str`	`Language.EN`	Language code. (Inherited from base settings.)
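
TTSUpdateSettingsFrame carries a delta rather than a full settings object. The sketch below shows one plausible reading of that: fields left unset in the delta keep their current value. Both SketchTTSSettings and apply_delta are hypothetical stand-ins written for illustration, and the merge rule itself is an assumption, not confirmed Pipecat behavior.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class SketchTTSSettings:
    """Hypothetical stand-in mirroring the settings table; not a Pipecat class."""

    model: Optional[str] = None
    voice: Optional[str] = None
    language: Optional[str] = None


def apply_delta(current: SketchTTSSettings, delta: SketchTTSSettings) -> SketchTTSSettings:
    """Return a copy of current with every non-None field of delta applied."""
    # Assumption: None in the delta means "leave this field unchanged".
    changes = {
        f.name: getattr(delta, f.name)
        for f in fields(delta)
        if getattr(delta, f.name) is not None
    }
    return replace(current, **changes)
```

Under this reading, a delta that sets only language leaves the configured voice untouched.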

XAITTSService

Parameter	Type	Default	Description
`api_key`	`str`	required	xAI API key for authentication.
`base_url`	`str`	`"wss://api.x.ai/v1/tts"`	xAI TTS WebSocket endpoint URL. Override for custom or proxied deployments.
`sample_rate`	`int`	`None`	Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
`codec`	`str`	`"pcm"`	Output audio codec. Supported codecs: "pcm", "wav", "mulaw", "alaw". Defaults to "pcm" so emitted TTSAudioRawFrame objects need no decoding downstream.
`settings`	`XAITTSService.Settings`	`None`	Runtime-configurable settings. Uses the same settings structure as XAIHttpTTSService. Changing voice or language settings at runtime reconnects the WebSocket with new query parameters.
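
The two services accept slightly different format lists: "mp3" is listed for encoding on XAIHttpTTSService but not for codec on XAITTSService. A small sketch of checking a configured value before constructing a service follows. check_audio_format and the "http"/"websocket" labels are hypothetical names used only for this illustration; the format sets are copied from the parameter descriptions above.

```python
# Format lists copied from the parameter descriptions on this page.
HTTP_ENCODINGS = frozenset({"pcm", "mp3", "wav", "mulaw", "alaw"})
WEBSOCKET_CODECS = frozenset({"pcm", "wav", "mulaw", "alaw"})

_SUPPORTED = {"http": HTTP_ENCODINGS, "websocket": WEBSOCKET_CODECS}


def check_audio_format(transport: str, fmt: str) -> str:
    """Return fmt if the given transport lists it as supported, else raise ValueError."""
    supported = _SUPPORTED[transport]  # hypothetical labels: "http" or "websocket"
    if fmt not in supported:
        raise ValueError(
            f"{fmt!r} is not listed for the {transport} service; "
            f"choose one of {sorted(supported)}"
        )
    return fmt
```

For example, check_audio_format("http", "mp3") passes, while the same format with "websocket" raises, which catches a config copied from one service to the other.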

Supported Languages

xAI TTS supports 20 languages, counting regional variants. Use the Language enum from pipecat.transcriptions.language:

Arabic (Egyptian, Saudi, UAE): Language.AR, Language.AR_EG, Language.AR_SA, Language.AR_AE
Bengali: Language.BN
Chinese: Language.ZH
English: Language.EN
French: Language.FR
German: Language.DE
Hindi: Language.HI
Indonesian: Language.ID
Italian: Language.IT
Japanese: Language.JA
Korean: Language.KO
Portuguese (Brazil, Portugal): Language.PT, Language.PT_BR, Language.PT_PT
Russian: Language.RU
Spanish (Spain, Mexico): Language.ES, Language.ES_ES, Language.ES_MX
Turkish: Language.TR
Vietnamese: Language.VI

Usage

WebSocket Streaming (XAITTSService)

Basic Setup

import os
from pipecat.services.xai.tts import XAITTSService

tts = XAITTSService(
    api_key=os.getenv("GROK_API_KEY"),
    settings=XAITTSService.Settings(
        voice="eve",
    ),
)

With Custom Language

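import os

from pipecat.services.xai.tts import XAITTSService
from pipecat.transcriptions.language import Language

# Synthesize Spanish speech with the "eve" voice.
tts = XAITTSService(
    api_key=os.getenv("GROK_API_KEY"),
    settings=XAITTSService.Settings(
        voice="eve",
        language=Language.ES,
    ),
)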

With Custom Sample Rate and Codec

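import os

from pipecat.services.xai.tts import XAITTSService

# Request 24 kHz WAV output instead of the default raw PCM.
tts = XAITTSService(
    api_key=os.getenv("GROK_API_KEY"),
    sample_rate=24000,
    codec="wav",
    settings=XAITTSService.Settings(
        voice="eve",
    ),
)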

HTTP Batch (XAIHttpTTSService)

Basic Setup

import os
from pipecat.services.xai.tts import XAIHttpTTSService

tts = XAIHttpTTSService(
    api_key=os.getenv("GROK_API_KEY"),
    settings=XAIHttpTTSService.Settings(
        voice="eve",
    ),
)

With Custom Encoding

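import os

from pipecat.services.xai.tts import XAIHttpTTSService

# Request MP3 output instead of the default raw PCM.
tts = XAIHttpTTSService(
    api_key=os.getenv("GROK_API_KEY"),
    encoding="mp3",
    settings=XAIHttpTTSService.Settings(
        voice="eve",
    ),
)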

With Shared HTTP Session

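import asyncio
import os

import aiohttp
from pipecat.services.xai.tts import XAIHttpTTSService


async def main():
    # The caller owns the shared session; it closes when this block exits.
    async with aiohttp.ClientSession() as session:
        tts = XAIHttpTTSService(
            api_key=os.getenv("GROK_API_KEY"),
            aiohttp_session=session,
            settings=XAIHttpTTSService.Settings(
                voice="eve",
            ),
        )
        # Build and run the pipeline here, while the session is still open.


asyncio.run(main())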

Updating Settings at Runtime

Voice and language settings can be changed mid-conversation using TTSUpdateSettingsFrame. This works for both services:

from pipecat.frames.frames import TTSUpdateSettingsFrame
from pipecat.services.xai.tts import XAITTSSettings
from pipecat.transcriptions.language import Language

# `task` is the running PipelineTask; call this from inside an async function.
await task.queue_frame(
    TTSUpdateSettingsFrame(
        delta=XAITTSSettings(
            language=Language.FR,
        )
    )
)

Note: For XAITTSService, changing voice or language settings reconnects the WebSocket with updated query parameters.
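
Because voice and language travel as query parameters on the WebSocket URL, a change to either one cannot be applied to an open connection. The sketch below illustrates that idea with the standard library only. The parameter names "voice" and "language", and the helpers build_ws_url and needs_reconnect, are assumptions made for illustration; the actual wire format is defined by xAI and handled inside XAITTSService.

```python
from typing import Mapping
from urllib.parse import urlencode, urlsplit, urlunsplit

# Settings that this sketch assumes are encoded in the connection URL.
RECONNECT_FIELDS = ("voice", "language")


def build_ws_url(base_url: str, voice: str, language: str) -> str:
    """Attach voice and language to base_url as query parameters (names assumed)."""
    parts = urlsplit(base_url)
    query = urlencode({"voice": voice, "language": language})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def needs_reconnect(old: Mapping[str, str], new: Mapping[str, str]) -> bool:
    """Return True when any URL-encoded setting differs between old and new."""
    return any(old.get(name) != new.get(name) for name in RECONNECT_FIELDS)
```

A settings update that leaves both fields unchanged would, under this sketch, keep the existing connection.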

Notes

Service choice:
- Use XAITTSService (WebSocket) for lower-latency streaming synthesis, where audio begins playing before the full utterance finishes.
- Use XAIHttpTTSService (HTTP) for simpler batch synthesis or when WebSocket connections are not available.
Default audio format: Both services default to raw PCM output, which matches Pipecat’s downstream expectations without extra decoding.
Encoding/codec options: When using non-PCM formats (mp3, wav, mulaw, alaw), ensure your audio pipeline can handle the selected format.
Session management:
- XAIHttpTTSService: If you don’t provide an aiohttp_session, the service creates and manages its own session lifecycle automatically.
- XAITTSService: WebSocket connection is managed automatically; settings changes that affect URL parameters (voice, language) trigger a reconnection.
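
To make the PCM versus container distinction concrete, the sketch below computes the duration of a raw PCM buffer and wraps the same bytes in a WAV container using Python's standard wave module. It assumes 16-bit mono samples, which this page does not state, and both helper names are hypothetical.

```python
import io
import wave


def pcm_duration_seconds(
    pcm: bytes, sample_rate: int, channels: int = 1, sample_width: int = 2
) -> float:
    """Duration of a raw PCM buffer; sample_width=2 assumes 16-bit samples."""
    return len(pcm) / (sample_rate * channels * sample_width)


def pcm_to_wav_bytes(
    pcm: bytes, sample_rate: int, channels: int = 1, sample_width: int = 2
) -> bytes:
    """Wrap raw PCM in a WAV container held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # wave does not close file objects it did not open, so buf is still readable.
    return buf.getvalue()
```

Raw PCM carries no header, so the sample rate has to be known out of band; the WAV form records it alongside the samples.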
