Roadmap
Our vision is for LLM Context Forge to become the standard, infrastructure-grade tool for fully deterministic context management, regardless of the underlying LLM provider.
✅ Completed (v1.x)
- Exact Tokenization: cl100k, o200k, and Hugging Face tokenizers with no heuristics.
- Intelligent Chunking: 5 robust strategies (sentence, paragraph, semantic, code, fixed).
- Priority Packing: The `ContextWindow` algorithm, mimicking system-level queues.
- Pre-flight Estimation: Cross-model cost calculators.
- Dual Language Parity: 100% equivalent behavior in Python and TypeScript.
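To illustrate the priority-packing idea above, here is a minimal sketch in Python. The function name `pack_by_priority` and the whitespace-based token counter are illustrative stand-ins, not the library's actual API; the real tool uses exact tokenizers rather than approximations.

```python
import heapq

def pack_by_priority(items, budget, count_tokens):
    """Greedily pack the highest-priority items into a token budget.

    items: list of (priority, text) pairs; higher priority packs first.
    count_tokens: callable returning an item's token cost.
    Returns the texts that fit, in priority order.
    """
    # heapq is a min-heap, so negate priorities to pop highest first.
    heap = [(-priority, i, text) for i, (priority, text) in enumerate(items)]
    heapq.heapify(heap)
    packed, used = [], 0
    while heap:
        _, _, text = heapq.heappop(heap)
        cost = count_tokens(text)
        if used + cost <= budget:
            packed.append(text)
            used += cost
    return packed

# Stand-in tokenizer for the sketch: whitespace split.
approx = lambda s: len(s.split())
print(pack_by_priority(
    [(1, "low priority filler text"), (9, "system prompt"), (5, "user question")],
    budget=5, count_tokens=approx))
# → ['system prompt', 'user question']
```

Items that don't fit are simply skipped, so a large low-priority item never blocks a smaller one further down the queue.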
🚧 In Progress (v1.5)
- Streaming Context Window: Stream packed items to the client incrementally via async generators.
- Image Context Sizing: Exact token sizing for image inputs (e.g., resizing logic for GPT-4o vision tokens).
- Rust Core / FFI Bindings: Rewrite the heaviest math and chunking operations in Rust and bind them to both Python and Node for maximum speed.
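The streaming item above can be sketched with a plain Python async generator. This is a hypothetical illustration of the in-progress feature, not the shipped API: `stream_packed`, the item list, and the whitespace token counter are all assumptions.

```python
import asyncio

async def stream_packed(items, budget, count_tokens):
    """Yield packed items one at a time instead of building the full list.

    Stops as soon as the next item would exceed the token budget.
    """
    used = 0
    for text in items:
        cost = count_tokens(text)
        if used + cost > budget:
            break
        used += cost
        yield text
        await asyncio.sleep(0)  # yield control to the event loop

async def main():
    approx = lambda s: len(s.split())  # stand-in tokenizer
    async for chunk in stream_packed(["a b", "c d e", "f g"], 5, approx):
        print(chunk)

asyncio.run(main())
```

Because the generator yields as it packs, the client can start rendering or forwarding context before packing finishes.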
📋 Planned (v2.0)
- Multi-Modal Native: Native audio and video token estimation within context packing.
- KV-Cache State Tracking: Track which prompts have been sent to APIs that support context caching (e.g., Anthropic) to accurately predict cache-hit vs. cache-miss costs.
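The cost side of the KV-cache item reduces to splitting a prompt into a cached prefix and an uncached tail, billed at different per-token rates. A minimal sketch, assuming a hypothetical `predict_cost` helper and illustrative prices (real providers quote per-million-token rates that vary by model):

```python
def predict_cost(prompt_tokens, cached_prefix_tokens, rate_miss, rate_hit):
    """Estimate prompt cost when a prefix is served from the provider's cache.

    Cached tokens bill at the discounted hit rate; the remaining
    tail bills at the full miss rate. Rates are per token.
    """
    cached = min(cached_prefix_tokens, prompt_tokens)
    uncached = prompt_tokens - cached
    return cached * rate_hit + uncached * rate_miss

# Illustrative numbers: a 10,000-token prompt with an 8,000-token cached
# prefix, $3.00/MTok on a miss vs. $0.30/MTok on a hit.
cost = predict_cost(10_000, 8_000, 3.00 / 1e6, 0.30 / 1e6)
print(f"${cost:.4f}")  # cached 8,000 tail 2,000
```

Tracking *which* prefix is cached is the hard part; once known, the estimate itself is this one-liner.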