

Overview

Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on LLMAssistantAggregatorParams, configured via LLMAutoContextSummarizationConfig (auto-trigger thresholds) and LLMContextSummaryConfig (summary generation params), and managed by LLMContextSummarizer. For a walkthrough of how to enable and customize context summarization, see the Context Summarization guide.

LLMAssistantAggregatorParams

from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
The summarization-related fields on LLMAssistantAggregatorParams.
enable_auto_context_summarization
bool
default:"False"
Enables automatic context summarization. When False (the default), the summarizer is still created internally so that on-demand summarization via LLMSummarizeContextFrame works, but automatic trigger checks are skipped. Set to True to enable automatic summarization when either max_context_tokens or max_unsummarized_messages is reached.
auto_context_summarization_config
LLMAutoContextSummarizationConfig | None
default:"None"
Configuration for automatic summarization thresholds and summary generation. When None, default LLMAutoContextSummarizationConfig values are used.
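Putting these two fields together, enabling automatic summarization with explicit trigger thresholds might look like this (a sketch; pipeline wiring omitted):

```python
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
)
from pipecat.utils.context.llm_context_summarization import (
    LLMAutoContextSummarizationConfig,
)

# Enable automatic summarization, making the default thresholds explicit.
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
    auto_context_summarization_config=LLMAutoContextSummarizationConfig(
        max_context_tokens=8000,
        max_unsummarized_messages=20,
    ),
)
```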

LLMAutoContextSummarizationConfig

from pipecat.utils.context.llm_context_summarization import LLMAutoContextSummarizationConfig
Controls when automatic context summarization triggers.
max_context_tokens
int | None
default:"8000"
Maximum context size in estimated tokens before triggering summarization. Tokens are estimated using the heuristic of 1 token per 4 characters. Set to None to disable token-based triggering. At least one of max_context_tokens or max_unsummarized_messages must be set.
max_unsummarized_messages
int | None
default:"20"
Maximum number of new messages before triggering summarization, even if the token limit has not been reached. Set to None to disable message-count triggering. At least one of max_context_tokens or max_unsummarized_messages must be set.
summary_config
LLMContextSummaryConfig
default:"LLMContextSummaryConfig()"
Configuration for how summaries are generated. See below.
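The token threshold above relies on the documented heuristic of roughly 1 token per 4 characters. A minimal sketch of that estimate (assuming messages are dicts with a content string; pipecat's internal counting may differ in detail):

```python
def estimate_tokens(messages: list[dict]) -> int:
    """Estimate context size using the documented heuristic of
    roughly 1 token per 4 characters of message content."""
    total_chars = sum(len(m.get("content") or "") for m in messages)
    return total_chars // 4

# A context whose messages total 32,000 characters is estimated at
# 8,000 tokens, which would hit the default max_context_tokens.
```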

LLMContextSummaryConfig

from pipecat.utils.context.llm_context_summarization import LLMContextSummaryConfig
Controls how summaries are generated. Used as summary_config inside LLMAutoContextSummarizationConfig, or passed directly to LLMSummarizeContextFrame for on-demand summarization.
target_context_tokens
int
default:"6000"
Target token count for the generated summary. Passed to the LLM as max_tokens. Auto-adjusted to 80% of max_context_tokens if it exceeds that value.
min_messages_after_summary
int
default:"4"
Number of recent messages to preserve uncompressed after each summarization.
summarization_prompt
str | None
default:"None"
Custom system prompt for the LLM when generating summaries. When None, uses a built-in default prompt.
summary_message_template
str
default:"\"Conversation summary: {summary}\""
Template for formatting the summary when injected into context. Must contain {summary} as a placeholder. Allows wrapping summaries in custom delimiters (e.g., XML tags) so system prompts can distinguish summaries from live conversation.
llm
LLMService | None
default:"None"
Dedicated LLM service for generating summaries. When set, summarization requests are sent to this service instead of the pipeline’s primary LLM. Useful for routing summarization to a cheaper or faster model. When None, the pipeline LLM handles summarization.
summarization_timeout
float
default:"120.0"
Maximum time in seconds to wait for the LLM to generate a summary. If exceeded, summarization is aborted and future summarization attempts are unblocked.
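As an illustration of summary_message_template, wrapping summaries in delimiters lets a system prompt distinguish them from live turns. The template string below is hypothetical; substitution is standard str.format:

```python
# Hypothetical template wrapping the summary in XML-style tags so a
# system prompt can tell summaries apart from live conversation turns.
template = "<conversation_summary>{summary}</conversation_summary>"

# The template must contain the {summary} placeholder.
assert "{summary}" in template

message = template.format(summary="User asked about pricing tiers.")
# message == "<conversation_summary>User asked about pricing tiers.</conversation_summary>"
```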

LLMSummarizeContextFrame

from pipecat.frames.frames import LLMSummarizeContextFrame
Push this frame into the pipeline to trigger on-demand context summarization without waiting for automatic thresholds.
config
LLMContextSummaryConfig | None
default:"None"
Per-request override for summary generation settings (prompt, token budget, messages to keep). When None, the summarizer’s default LLMContextSummaryConfig is used.
On-demand summarization works even when enable_auto_context_summarization is False — the summarizer is always created internally to handle manually pushed frames.
from pipecat.frames.frames import LLMSummarizeContextFrame
from pipecat.utils.context.llm_context_summarization import LLMContextSummaryConfig

# Trigger with default settings
await llm.queue_frame(LLMSummarizeContextFrame())

# Trigger with per-request overrides
await llm.queue_frame(
    LLMSummarizeContextFrame(
        config=LLMContextSummaryConfig(
            target_context_tokens=2000,
            min_messages_after_summary=2,
        )
    )
)
If a summarization is already in progress, the manual request is ignored.

LLMContextSummarizer

from pipecat.processors.aggregators.llm_context_summarizer import LLMContextSummarizer
Monitors context size and orchestrates summarization. Created automatically by LLMAssistantAggregator when enable_auto_context_summarization=True.

Event Handlers

Event | Parameters | Description
on_summary_applied | event: SummaryAppliedEvent | Emitted after a summary has been successfully applied to the context.

on_summary_applied

The on_summary_applied event is exposed on both LLMContextSummarizer and LLMAssistantAggregator. Register handlers on the aggregator for cleaner access:
@assistant_aggregator.event_handler("on_summary_applied")
async def on_summary_applied(aggregator, summarizer, event: SummaryAppliedEvent):
    logger.info(
        f"Context summarized: {event.original_message_count} -> "
        f"{event.new_message_count} messages "
        f"({event.summarized_message_count} summarized, "
        f"{event.preserved_message_count} preserved)"
    )
You can also register handlers directly on the summarizer if you have access to it:
summarizer = assistant_aggregator._summarizer
@summarizer.event_handler("on_summary_applied")
async def on_summary_applied(summarizer, event: SummaryAppliedEvent):
    logger.info(
        f"Context summarized: {event.original_message_count} -> "
        f"{event.new_message_count} messages"
    )

SummaryAppliedEvent

from pipecat.processors.aggregators.llm_context_summarizer import SummaryAppliedEvent
Event data emitted when context summarization completes successfully.
original_message_count
int
Number of messages in context before summarization.
new_message_count
int
Number of messages in context after summarization.
summarized_message_count
int
Number of messages that were compressed into the summary.
preserved_message_count
int
Number of messages preserved uncompressed (initial system message at messages[0] if present, plus recent messages).
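As a rough illustration of how these counts relate (an assumption about the bookkeeping, not guaranteed by the API: one summary message is injected, and the preserved set is the initial system message plus the recent messages kept by min_messages_after_summary):

```python
original_message_count = 30        # context size before summarization
preserved_message_count = 1 + 4    # system message + min_messages_after_summary
summarized_message_count = original_message_count - preserved_message_count
new_message_count = preserved_message_count + 1  # preserved + injected summary

# Under this assumption: 30 -> 6 messages, with 25 compressed into the summary.
```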

Deprecated: LLMContextSummarizationConfig

from pipecat.utils.context.llm_context_summarization import LLMContextSummarizationConfig
LLMContextSummarizationConfig is deprecated since v0.0.104. Use LLMAutoContextSummarizationConfig with a nested LLMContextSummaryConfig instead. The old class still works but emits a DeprecationWarning.
Both max_context_tokens and max_unsummarized_messages can now be set to None independently to disable that threshold. At least one must remain set.
The old class flattened all parameters into a single object. Migrate by splitting trigger thresholds (max_context_tokens, max_unsummarized_messages) into LLMAutoContextSummarizationConfig and summary generation params into LLMContextSummaryConfig:
# Before (deprecated)
config = LLMContextSummarizationConfig(
    max_context_tokens=4000,
    target_context_tokens=3000,
    max_unsummarized_messages=10,
)

# After
config = LLMAutoContextSummarizationConfig(
    max_context_tokens=4000,
    max_unsummarized_messages=10,
    summary_config=LLMContextSummaryConfig(
        target_context_tokens=3000,
    ),
)
Similarly, the LLMAssistantAggregatorParams fields were renamed:
  • enable_context_summarization → enable_auto_context_summarization
  • context_summarization_config → auto_context_summarization_config
The old field names still work with a DeprecationWarning.
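Applied to the aggregator params, the rename looks like this (a sketch; both forms construct the same configuration, with the old names emitting a DeprecationWarning):

```python
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
)
from pipecat.utils.context.llm_context_summarization import (
    LLMAutoContextSummarizationConfig,
)

config = LLMAutoContextSummarizationConfig(max_context_tokens=4000)

# Before (deprecated field names, still accepted with a DeprecationWarning)
params = LLMAssistantAggregatorParams(
    enable_context_summarization=True,
    context_summarization_config=config,
)

# After
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
    auto_context_summarization_config=config,
)
```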