# Conversation Manager
While `ContextWindow` handles single-shot, point-in-time packing, the `ConversationManager` is designed for stateful, multi-turn chat applications. It automatically maintains a sliding window of conversation history.
```ts
import { ConversationManager } from 'llm-context-forge';

const manager = new ConversationManager("gpt-4o", {
  systemPrompt: "You are a helpful assistant.",
  maxTokens: 8000,
  reserveOutput: 1000
});
```
## How It Works
When the conversation exceeds the token budget, the `ConversationManager` applies a priority-based retention strategy:
- System Prompt: Always retained (CRITICAL)
- Current Query: Always retained (HIGH)
- Recent History: Retained starting from the newest messages, working backwards (MEDIUM)
- Old History: Dropped when budget is exhausted
This ensures the LLM always has the instructions and the immediate context, while gracefully "forgetting" older turns.
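The retention strategy above can be sketched as a standalone function. This is a hypothetical illustration, not the library's actual implementation: `packConversation` and `approxTokens` are made-up names, and token counts are approximated as word counts where the real library would presumably use a model-specific tokenizer.

```typescript
// Sketch of priority-based retention: system prompt and current query are
// always kept; history is kept newest-first until the budget runs out.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Crude stand-in for a real tokenizer (assumption: 1 word ≈ 1 token).
function approxTokens(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

function packConversation(
  system: Message,
  history: Message[],
  current: Message,
  budget: number
): Message[] {
  // CRITICAL + HIGH: the system prompt and current query count first.
  let used = approxTokens(system.content) + approxTokens(current.content);
  const kept: Message[] = [];
  // MEDIUM: walk history from newest to oldest, keeping turns that fit.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = approxTokens(history[i].content);
    if (used + cost > budget) break; // older turns are dropped
    kept.unshift(history[i]);
    used += cost;
  }
  return [system, ...kept, current];
}
```

Note that the loop breaks at the first turn that doesn't fit, so history is forgotten oldest-first, exactly as described above.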
## Usage
```ts
// 1. User sends a message
manager.addUserMessage("Can you summarize the document?");

// 2. Add some RAG context for this turn only
manager.addEphemeralContext(documentText);

// 3. Assemble the prompt for the LLM
const prompt = manager.assemble();

// ... Call the OpenAI/Anthropic API ...
const llmResponse = await llmAPI(prompt);

// 4. Record the assistant's reply to maintain history
manager.addAssistantMessage(llmResponse.text);
```
## History Format
The assembled prompt renders each history turn with a role prefix. You can customize these prefixes if your completion API requires model-specific formatting:
```ts
const manager = new ConversationManager("llama-3", {
  rolePrefixes: {
    user: "<|start_header_id|>user<|end_header_id|>\n",
    assistant: "<|start_header_id|>assistant<|end_header_id|>\n"
  }
});
```
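To make the effect of `rolePrefixes` concrete, here is a hypothetical sketch of how history might be serialized; `formatHistory` is a made-up name, and the real `assemble()` may use different separators or add end-of-turn tokens.

```typescript
// Serialize history by prepending each message's role prefix.
type RolePrefixes = { user: string; assistant: string };
type Turn = { role: "user" | "assistant"; content: string };

function formatHistory(history: Turn[], prefixes: RolePrefixes): string {
  return history.map((m) => prefixes[m.role] + m.content).join("\n");
}
```

With plain `"User: "` / `"Assistant: "` prefixes this yields a classic chat transcript; with the Llama-3 header tokens above, each turn is wrapped in the template the model was trained on.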