

Overview

Context summarization automatically compresses older conversation history when token or message limits are reached. It is enabled on LLMAssistantAggregatorParams, configured via LLMAutoContextSummarizationConfig (auto-trigger thresholds) and LLMContextSummaryConfig (summary generation params), and managed by LLMContextSummarizer. For a walkthrough of how to enable and customize context summarization, see the Context Summarization guide.

LLMAssistantAggregatorParams

from pipecat.processors.aggregators.llm_response_universal import LLMAssistantAggregatorParams
The summarization-related fields on LLMAssistantAggregatorParams.
enable_auto_context_summarization
bool
default:"False"
Enables automatic context summarization. When False (the default), the summarizer is still created internally so that on-demand summarization via LLMSummarizeContextFrame works, but automatic trigger checks are skipped. Set to True to enable automatic summarization when either max_context_tokens or max_unsummarized_messages is reached.
auto_context_summarization_config
LLMAutoContextSummarizationConfig | None
default:"None"
Configuration for automatic summarization thresholds and summary generation. When None, default LLMAutoContextSummarizationConfig values are used.
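Putting these two fields together, enabling automatic summarization with explicit trigger thresholds might look like this (a sketch; pipeline wiring omitted):

```python
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
)
from pipecat.utils.context.llm_context_summarization import (
    LLMAutoContextSummarizationConfig,
)

# Enable automatic summarization, making the default thresholds explicit.
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
    auto_context_summarization_config=LLMAutoContextSummarizationConfig(
        max_context_tokens=8000,
        max_unsummarized_messages=20,
    ),
)
```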

LLMAutoContextSummarizationConfig

from pipecat.utils.context.llm_context_summarization import LLMAutoContextSummarizationConfig
Controls when automatic context summarization triggers.
max_context_tokens
int | None
default:"8000"
Maximum context size in estimated tokens before triggering summarization. Tokens are estimated using the heuristic of 1 token per 4 characters. Set to None to disable token-based triggering. At least one of max_context_tokens or max_unsummarized_messages must be set.
max_unsummarized_messages
int | None
default:"20"
Maximum number of new messages before triggering summarization, even if the token limit has not been reached. Set to None to disable message-count triggering. At least one of max_context_tokens or max_unsummarized_messages must be set.
summary_config
LLMContextSummaryConfig
default:"LLMContextSummaryConfig()"
Configuration for how summaries are generated. See below.
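The token threshold above relies on the documented heuristic of roughly 1 token per 4 characters. A minimal sketch of that estimate (assuming messages are dicts with a content string; pipecat's internal counting may differ in detail):

```python
def estimate_tokens(messages: list[dict]) -> int:
    """Estimate context size using the documented heuristic of
    roughly 1 token per 4 characters of message content."""
    total_chars = sum(len(m.get("content") or "") for m in messages)
    return total_chars // 4

# A context whose messages total 32,000 characters is estimated at
# 8,000 tokens, which would hit the default max_context_tokens.
```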

LLMContextSummaryConfig

from pipecat.utils.context.llm_context_summarization import LLMContextSummaryConfig
Controls how summaries are generated. Used as summary_config inside LLMAutoContextSummarizationConfig, or passed directly to LLMSummarizeContextFrame for on-demand summarization.
target_context_tokens
int
default:"6000"
Target token count for the generated summary. Passed to the LLM as max_tokens. Auto-adjusted to 80% of max_context_tokens if it exceeds that value.
min_messages_after_summary
int
default:"4"
Number of recent messages to preserve uncompressed after each summarization.
summarization_prompt
str | None
default:"None"
Custom system prompt for the LLM when generating summaries. When None, uses a built-in default prompt.
summary_message_template
str
default:"\"Conversation summary: {summary}\""
Template for formatting the summary when injected into context. Must contain {summary} as a placeholder. Allows wrapping summaries in custom delimiters (e.g., XML tags) so system prompts can distinguish summaries from live conversation.
llm
LLMService | None
default:"None"
Dedicated LLM service for generating summaries. When set, summarization requests are sent to this service instead of the pipeline’s primary LLM. Useful for routing summarization to a cheaper or faster model. When None, the pipeline LLM handles summarization.
summarization_timeout
float
default:"120.0"
Maximum time in seconds to wait for the LLM to generate a summary. If exceeded, summarization is aborted and future summarization attempts are unblocked.
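As an illustration of summary_message_template, wrapping summaries in delimiters lets a system prompt distinguish them from live turns. The template string below is hypothetical; substitution is standard str.format:

```python
# Hypothetical template wrapping the summary in XML-style tags so a
# system prompt can tell summaries apart from live conversation turns.
template = "<conversation_summary>{summary}</conversation_summary>"

# The template must contain the {summary} placeholder.
assert "{summary}" in template

message = template.format(summary="User asked about pricing tiers.")
# message == "<conversation_summary>User asked about pricing tiers.</conversation_summary>"
```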

LLMSummarizeContextFrame

from pipecat.frames.frames import LLMSummarizeContextFrame
Push this frame into the pipeline to trigger on-demand context summarization without waiting for automatic thresholds.
config
LLMContextSummaryConfig | None
default:"None"
Per-request override for summary generation settings (prompt, token budget, messages to keep). When None, the summarizer’s default LLMContextSummaryConfig is used.
On-demand summarization works even when enable_auto_context_summarization is False — the summarizer is always created internally to handle manually pushed frames.
from pipecat.frames.frames import LLMSummarizeContextFrame
from pipecat.utils.context.llm_context_summarization import LLMContextSummaryConfig

# Trigger with default settings
await llm.queue_frame(LLMSummarizeContextFrame())

# Trigger with per-request overrides
await llm.queue_frame(
    LLMSummarizeContextFrame(
        config=LLMContextSummaryConfig(
            target_context_tokens=2000,
            min_messages_after_summary=2,
        )
    )
)
If a summarization is already in progress, the manual request is ignored.

LLMContextSummarizer

from pipecat.processors.aggregators.llm_context_summarizer import LLMContextSummarizer
Monitors context size and orchestrates summarization. Created automatically by LLMAssistantAggregator when enable_auto_context_summarization=True.

Event Handlers

Event | Parameters | Description
on_summary_applied | event: SummaryAppliedEvent | Emitted after a summary has been successfully applied to the context.

on_summary_applied

The on_summary_applied event is exposed on both LLMContextSummarizer and LLMAssistantAggregator. Register handlers on the aggregator for cleaner access:
@assistant_aggregator.event_handler("on_summary_applied")
async def on_summary_applied(aggregator, summarizer, event: SummaryAppliedEvent):
    logger.info(
        f"Context summarized: {event.original_message_count} -> "
        f"{event.new_message_count} messages "
        f"({event.summarized_message_count} summarized, "
        f"{event.preserved_message_count} preserved)"
    )
You can also register handlers directly on the summarizer if you have access to it:
summarizer = assistant_aggregator._summarizer
@summarizer.event_handler("on_summary_applied")
async def on_summary_applied(summarizer, event: SummaryAppliedEvent):
    logger.info(
        f"Context summarized: {event.original_message_count} -> "
        f"{event.new_message_count} messages"
    )

SummaryAppliedEvent

from pipecat.processors.aggregators.llm_context_summarizer import SummaryAppliedEvent
Event data emitted when context summarization completes successfully.
original_message_count
int
Number of messages in context before summarization.
new_message_count
int
Number of messages in context after summarization.
summarized_message_count
int
Number of messages that were compressed into the summary.
preserved_message_count
int
Number of messages preserved uncompressed (initial system message at messages[0] if present, plus recent messages).
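As a rough illustration of how these counts relate (an assumption about the bookkeeping, not guaranteed by the API: one summary message is injected, and the preserved set is the initial system message plus the recent messages kept by min_messages_after_summary):

```python
original_message_count = 30        # context size before summarization
preserved_message_count = 1 + 4    # system message + min_messages_after_summary
summarized_message_count = original_message_count - preserved_message_count
new_message_count = preserved_message_count + 1  # preserved + injected summary

# Under this assumption: 30 -> 6 messages, with 25 compressed into the summary.
```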

Deprecated: LLMContextSummarizationConfig

from pipecat.utils.context.llm_context_summarization import LLMContextSummarizationConfig
LLMContextSummarizationConfig is deprecated since v0.0.104. Use LLMAutoContextSummarizationConfig with a nested LLMContextSummaryConfig instead. The old class still works but emits a DeprecationWarning.
Both max_context_tokens and max_unsummarized_messages can now be set to None independently to disable that threshold. At least one must remain set.
The old class flattened all parameters into a single object. Migrate by splitting trigger thresholds (max_context_tokens, max_unsummarized_messages) into LLMAutoContextSummarizationConfig and summary generation params into LLMContextSummaryConfig:
# Before (deprecated)
config = LLMContextSummarizationConfig(
    max_context_tokens=4000,
    target_context_tokens=3000,
    max_unsummarized_messages=10,
)

# After
config = LLMAutoContextSummarizationConfig(
    max_context_tokens=4000,
    max_unsummarized_messages=10,
    summary_config=LLMContextSummaryConfig(
        target_context_tokens=3000,
    ),
)
Similarly, the LLMAssistantAggregatorParams fields were renamed:
  • enable_context_summarization → enable_auto_context_summarization
  • context_summarization_config → auto_context_summarization_config
The old field names still work with a DeprecationWarning.
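Applied to the aggregator params, the rename looks like this (a sketch; both forms construct the same configuration, with the old names emitting a DeprecationWarning):

```python
from pipecat.processors.aggregators.llm_response_universal import (
    LLMAssistantAggregatorParams,
)
from pipecat.utils.context.llm_context_summarization import (
    LLMAutoContextSummarizationConfig,
)

config = LLMAutoContextSummarizationConfig(max_context_tokens=4000)

# Before (deprecated field names, still accepted with a DeprecationWarning)
params = LLMAssistantAggregatorParams(
    enable_context_summarization=True,
    context_summarization_config=config,
)

# After
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
    auto_context_summarization_config=config,
)
```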