
Conversation Manager

While ContextWindow handles single-shot, point-in-time prompt packing, the ConversationManager is designed for stateful, multi-turn chat applications. It automatically maintains a sliding window of conversation history within a fixed token budget.

```typescript
import { ConversationManager } from 'llm-context-forge';

const manager = new ConversationManager("gpt-4o", {
  systemPrompt: "You are a helpful assistant.",
  maxTokens: 8000,     // total context budget
  reserveOutput: 1000  // tokens held back for the model's reply
});
```
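The practical effect of these options is simple arithmetic: what remains after reserving output space (and the always-retained system prompt) is the budget available for history. The exact accounting is internal to the library; the sketch below assumes token counts are subtracted directly, and the system-prompt count is a hypothetical figure.

```typescript
// Illustrative budget arithmetic (not part of the library API).
const maxTokens = 8000;        // total context budget
const reserveOutput = 1000;    // held back for the model's reply
const systemPromptTokens = 12; // hypothetical count for the system prompt

// What is left over for conversation history and ephemeral context.
const historyBudget = maxTokens - reserveOutput - systemPromptTokens;
console.log(historyBudget); // 6988
```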

How It Works

When the conversation exceeds the token budget, the ConversationManager applies a fixed retention strategy, in priority order:

  1. System Prompt: Always retained (CRITICAL)
  2. Current Query: Always retained (HIGH)
  3. Recent History: Retained starting from the newest messages, working backwards (MEDIUM)
  4. Old History: Dropped when budget is exhausted

This ensures the LLM always has the instructions and the immediate context, while gracefully "forgetting" older turns.
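The retention pass above can be sketched as a newest-first walk over the history. This is an illustration of the strategy, not the library's internals; `estimateTokens` is a crude stand-in for a real tokenizer, and the `Message` shape is assumed.

```typescript
// Hedged sketch of the retention strategy described above.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Rough heuristic: ~4 characters per token (a real tokenizer would differ).
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

function trimToBudget(
  systemPrompt: Message,
  history: Message[],
  currentQuery: Message,
  budget: number
): Message[] {
  // 1 & 2: the system prompt and current query are always retained.
  let remaining =
    budget -
    estimateTokens(systemPrompt.content) -
    estimateTokens(currentQuery.content);

  // 3: walk history from the newest message backwards, keeping turns
  // while the budget allows.
  const kept: Message[] = [];
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (cost > remaining) break; // 4: this and all older turns are dropped
    kept.unshift(history[i]);
    remaining -= cost;
  }

  return [systemPrompt, ...kept, currentQuery];
}
```

Note that once one message fails to fit, everything older is dropped as well, so the retained history is always a contiguous suffix of the conversation.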

Usage

```typescript
// 1. User sends a message
manager.addUserMessage("Can you summarize the document?");

// 2. Add some RAG context for this turn only
manager.addEphemeralContext(documentText);

// 3. Assemble the prompt for the LLM
const prompt = manager.assemble();

// ... Call OpenAI/Anthropic API ...
const llmResponse = await llmAPI(prompt);

// 4. Record the assistant's reply to maintain history
manager.addAssistantMessage(llmResponse.text);
```

History Format

The assembled prompt flattens the history into a single string. You can customize the role prefixes if your completion API requires model-specific formatting:

```typescript
const manager = new ConversationManager("llama-3", {
  rolePrefixes: {
    user: "<|start_header_id|>user<|end_header_id|>\n",
    assistant: "<|start_header_id|>assistant<|end_header_id|>\n"
  }
});
```
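To make the effect of rolePrefixes concrete, here is a minimal sketch of how such prefixes might be applied when history is flattened into a completion prompt. This is illustrative only; `formatHistory` and the `Turn` type are hypothetical, not part of the library.

```typescript
// Hypothetical illustration of role-prefix formatting (not library code).
type Turn = { role: "user" | "assistant"; content: string };

function formatHistory(
  turns: Turn[],
  rolePrefixes: Record<"user" | "assistant", string>
): string {
  // Prepend each turn's role prefix, then join turns with newlines.
  return turns.map((t) => rolePrefixes[t.role] + t.content).join("\n");
}

const flattened = formatHistory(
  [
    { role: "user", content: "Hi" },
    { role: "assistant", content: "Hello!" },
  ],
  {
    user: "<|start_header_id|>user<|end_header_id|>\n",
    assistant: "<|start_header_id|>assistant<|end_header_id|>\n",
  }
);
```

With the Llama 3 prefixes shown above, each message is preceded by its header token sequence, which is what a raw completion endpoint for that model family expects.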