Roadmap
Our vision is for LLM Context Forge to become the standard, infrastructure-grade tool for fully deterministic context management, regardless of the underlying LLM provider.
✅ Completed (v1.x)
- Exact Tokenization: cl100k, o200k, and Hugging Face tokenizers with no heuristics.
- Intelligent Chunking: 5 robust strategies (sentence, paragraph, semantic, code, fixed).
- Priority Packing: The `ContextWindow` algorithm, mimicking system-level queues.
- Pre-flight Estimation: Cross-model cost calculators.
- Dual Language Parity: 100% equivalent behavior in Python and TypeScript.
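To illustrate the priority-packing idea above, here is a minimal sketch in Python. The function name `pack_by_priority` and the whitespace-based token counter are illustrative stand-ins, not the library's actual API; the real tool uses exact tokenizers rather than approximations.

```python
import heapq

def pack_by_priority(items, budget, count_tokens):
    """Greedily pack the highest-priority items into a token budget.

    items: list of (priority, text) pairs; higher priority packs first.
    count_tokens: callable returning an item's token cost.
    Returns the texts that fit, in priority order.
    """
    # heapq is a min-heap, so negate priorities to pop highest first.
    heap = [(-priority, i, text) for i, (priority, text) in enumerate(items)]
    heapq.heapify(heap)
    packed, used = [], 0
    while heap:
        _, _, text = heapq.heappop(heap)
        cost = count_tokens(text)
        if used + cost <= budget:
            packed.append(text)
            used += cost
    return packed

# Stand-in tokenizer for the sketch: whitespace split.
approx = lambda s: len(s.split())
print(pack_by_priority(
    [(1, "low priority filler text"), (9, "system prompt"), (5, "user question")],
    budget=5, count_tokens=approx))
# → ['system prompt', 'user question']
```

Items that don't fit are simply skipped, so a large low-priority item never blocks a smaller one further down the queue.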
🚧 In Progress (v1.5)
- Streaming Context Window: Stream packed items to the client incrementally via async generators.
- Image Context Sizing: Exact token sizing for image inputs (e.g., resizing logic for GPT-4o vision tokens).
- Rust Core / FFI Bindings: Rewrite the heaviest math and chunking operations in Rust and bind them to both Python and Node for maximum speed.
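The streaming item above can be sketched with a plain Python async generator. This is a hypothetical illustration of the in-progress feature, not the shipped API: `stream_packed`, the item list, and the whitespace token counter are all assumptions.

```python
import asyncio

async def stream_packed(items, budget, count_tokens):
    """Yield packed items one at a time instead of building the full list.

    Stops as soon as the next item would exceed the token budget.
    """
    used = 0
    for text in items:
        cost = count_tokens(text)
        if used + cost > budget:
            break
        used += cost
        yield text
        await asyncio.sleep(0)  # yield control to the event loop

async def main():
    approx = lambda s: len(s.split())  # stand-in tokenizer
    async for chunk in stream_packed(["a b", "c d e", "f g"], 5, approx):
        print(chunk)

asyncio.run(main())
```

Because the generator yields as it packs, the client can start rendering or forwarding context before packing finishes.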
📋 Planned (v2.0)
- Multi-Modal Native: Native audio and video token estimation within context packing.
- KV-Cache State Tracking: Track which prompts have been sent to APIs that support context caching (e.g., Anthropic) to accurately predict cache-hit vs. cache-miss costs.
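The cost side of the KV-cache item reduces to splitting a prompt into a cached prefix and an uncached tail, billed at different per-token rates. A minimal sketch, assuming a hypothetical `predict_cost` helper and illustrative prices (real providers quote per-million-token rates that vary by model):

```python
def predict_cost(prompt_tokens, cached_prefix_tokens, rate_miss, rate_hit):
    """Estimate prompt cost when a prefix is served from the provider's cache.

    Cached tokens bill at the discounted hit rate; the remaining
    tail bills at the full miss rate. Rates are per token.
    """
    cached = min(cached_prefix_tokens, prompt_tokens)
    uncached = prompt_tokens - cached
    return cached * rate_hit + uncached * rate_miss

# Illustrative numbers: a 10,000-token prompt with an 8,000-token cached
# prefix, $3.00/MTok on a miss vs. $0.30/MTok on a hit.
cost = predict_cost(10_000, 8_000, 3.00 / 1e6, 0.30 / 1e6)
print(f"${cost:.4f}")  # cached 8,000 tail 2,000
```

Tracking *which* prefix is cached is the hard part; once known, the estimate itself is this one-liner.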