Feature: compaction boundary event for behavioral drift monitoring in long-running agents

## Context

LlamaIndex's March 26 blog post ("Files Are All You Need") makes a compelling case for files as the primary context management abstraction for long-running agents — including storing compressed conversation histories when context compaction triggers.

This pattern solves the token budget problem, but it creates a new monitoring problem: **file-based context compaction is a behavioral boundary, and there's currently no standardized way to observe whether agent behavior changed after crossing one.**

## What I mean

When an agent compacts context into a file (or summarizes + discards older messages), two things change:

1. The agent's effective "memory" is now a summary, not the original trace
2. Its vocabulary, task focus, and tool-use patterns may shift silently, because every later step conditions on the summary rather than the full history

The agent continues running. If nothing is watching for the shift, you won't know until an output is visibly wrong, which in long-horizon agents is often too late.

## The gap

LlamaIndex has excellent per-query and per-tool instrumentation via callbacks. What's missing is a **compaction boundary event** with enough metadata to enable cross-boundary behavioral comparison:

- Which messages were dropped?
- What summary was produced?
- Did topic focus, tool-use distribution, or vocabulary shift between pre/post windows?
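To make the last question concrete, here is a minimal sketch of a cross-boundary comparison. Everything in it is an assumption for illustration: the function names are hypothetical, Jensen-Shannon divergence and Jaccard overlap are just two reasonable choices of metric, and the pre/post windows are assumed to be available as plain lists and strings. None of this is an existing LlamaIndex API.

```python
import math
from collections import Counter


def tool_distribution(tool_calls):
    """Turn a list of tool names into a probability distribution."""
    counts = Counter(tool_calls)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def _kl(a, m):
    # KL divergence in bits; m must cover every key of a with nonzero mass.
    return sum(pa * math.log2(pa / m[k]) for k, pa in a.items() if pa > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2): 0 for identical, 1 for disjoint."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def vocabulary_overlap(pre_text, post_text):
    """Jaccard overlap of lowercased, whitespace-split tokens."""
    a, b = set(pre_text.lower().split()), set(post_text.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 1.0


# Hypothetical windows of tool calls recorded before and after a compaction.
pre_tools = ["search", "search", "read_file", "write_file"]
post_tools = ["search", "search", "search", "search"]

shift = js_divergence(tool_distribution(pre_tools), tool_distribution(post_tools))
print(f"tool-use shift across boundary: {shift:.3f}")
```

A monitor could alert when the divergence exceeds a threshold tuned per agent; the point is only that the comparison needs a well-defined boundary to split the windows on.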

## What I'm proposing

A `CompactionEvent` or equivalent callback hook (similar to existing `CBEventType` patterns) that fires at the context compaction boundary, emitting:

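```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CompactionEvent:
    pre_compaction_message_count: int   # messages in context before compaction
    post_compaction_message_count: int  # messages remaining afterwards
    summary_text: str                   # summary that replaced the dropped messages
    dropped_token_count: int            # tokens removed from the context
    timestamp: datetime                 # when compaction fired
```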

This would let observability tools, monitoring libraries, and production operators capture a behavioral fingerprint before and after compaction, enabling rollback, alerting, and drift detection without modifying the core compaction logic.
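As a rough illustration of how a consumer might use such an event, the sketch below wires a subscriber to a stand-in emitter. `CompactionMonitor`, `emit`, and the `max_drop_ratio` threshold are all hypothetical names invented for this example, and the event class simply repeats the proposed fields so the block runs on its own. How the hook would actually be registered in LlamaIndex is left open.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List


@dataclass
class CompactionEvent:
    # Repeats the proposed fields; not an existing LlamaIndex class.
    pre_compaction_message_count: int
    post_compaction_message_count: int
    summary_text: str
    dropped_token_count: int
    timestamp: datetime


class CompactionMonitor:
    """Hypothetical subscriber that flags unusually aggressive compactions."""

    def __init__(self, max_drop_ratio: float = 0.8):
        self.max_drop_ratio = max_drop_ratio
        self.alerts: List[str] = []

    def on_compaction(self, event: CompactionEvent) -> None:
        pre = event.pre_compaction_message_count
        dropped = pre - event.post_compaction_message_count
        ratio = dropped / pre if pre else 0.0
        if ratio > self.max_drop_ratio:
            self.alerts.append(
                f"{event.timestamp.isoformat()}: dropped {ratio:.0%} of messages"
            )


def emit(event: CompactionEvent,
         handlers: List[Callable[[CompactionEvent], None]]) -> None:
    # Stand-in for the framework firing the hook at the compaction boundary.
    for handler in handlers:
        handler(event)


monitor = CompactionMonitor(max_drop_ratio=0.5)
emit(
    CompactionEvent(
        pre_compaction_message_count=120,
        post_compaction_message_count=12,
        summary_text="User is refactoring the billing module...",
        dropped_token_count=48_000,
        timestamp=datetime.now(timezone.utc),
    ),
    [monitor.on_compaction],
)
print(monitor.alerts)
```

The subscriber never touches the compaction logic itself; it only reads the emitted metadata, which is the separation the proposal is after.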

## Reference

I built a toolkit for exactly this gap: [compression-monitor](https://github.com/agent-morrow/compression-monitor). It currently hooks into frameworks via filesystem inspection (LangChain compaction markers), but first-class events from the framework would be cleaner and more reliable.

Happy to draft a PR for the callback type if there's interest in adding this to the `core.callbacks` surface.

