Roadmap

Our vision is for LLM Context Forge to become the standard, infrastructure-grade tool for fully deterministic context management, regardless of the underlying LLM provider.

✅ Completed (v1.x)

  • Exact Tokenization: Exact counts via the cl100k_base, o200k_base, and Hugging Face tokenizers, with no heuristics.
  • Intelligent Chunking: 5 robust strategies (sentence, paragraph, semantic, code, fixed).
  • Priority Packing: The ContextWindow algorithm, which packs items by priority in the style of system-level queues.
  • Pre-flight Estimation: Cross-model cost calculators.
  • Dual Language Parity: 100% equivalent behavior in Python and TypeScript.
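
The priority-packing idea above can be sketched in a few lines. This is a minimal illustration, not the library's actual API: the `Item` fields, the `pack` function, and the greedy fill-by-priority strategy are all assumptions for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class Item:
    text: str
    priority: int  # lower value = higher priority (hypothetical convention)
    tokens: int    # pre-computed exact token count

def pack(items, budget):
    """Greedily admit items in priority order until the token budget is spent."""
    packed, used = [], 0
    for item in sorted(items, key=lambda i: i.priority):
        if used + item.tokens <= budget:
            packed.append(item)
            used += item.tokens
    return packed, used

items = [
    Item("system prompt", priority=0, tokens=40),
    Item("recent turn", priority=1, tokens=60),
    Item("old history", priority=2, tokens=120),
]
packed, used = pack(items, budget=110)
# The low-priority history is dropped because it would overflow the budget,
# leaving the two higher-priority items (100 tokens) in the window.
```

Because token counts are exact rather than estimated, a packer like this can fill the window right up to the budget without risking an over-limit request.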

🚧 In Progress (v1.5)

  • Streaming Context Window: Dynamically stream packed items down to the client via async generators.
  • Image Context Sizing: Adding exact token sizing for image inputs (e.g., resizing logic for GPT-4o vision tokens).
  • Rust Core / FFI Binding: Rewriting the heaviest math and chunking operations in Rust, with bindings for both Python and Node, for maximum speed.
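
The streaming-context-window item can be pictured as an async generator that yields packed items as long as the budget allows. This is a hypothetical sketch, not the planned implementation: `stream_packed` and its `(text, tokens)` item shape are invented for illustration.

```python
import asyncio

async def stream_packed(items, budget):
    """Yield packed items one at a time until the token budget would overflow."""
    used = 0
    for text, tokens in items:
        if used + tokens > budget:
            break  # budget exhausted; stop streaming
        used += tokens
        yield text
        await asyncio.sleep(0)  # cooperate with the event loop between items

async def main():
    items = [("a", 30), ("b", 40), ("c", 50)]
    # async-for consumes the generator as items become available
    return [chunk async for chunk in stream_packed(items, budget=80)]

out = asyncio.run(main())
# "c" is never yielded: 30 + 40 + 50 exceeds the 80-token budget.
```

An async generator lets the client start rendering the first packed items before packing finishes, which matters when chunking or token counting is slow.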

📋 Planned (v2.0)

  • Multi-Modal Native: Native audio and video token estimation within context packing.
  • KV-Cache State Tracking: Tracking which prompts have been sent to APIs that support context caching (e.g., Anthropic) to accurately predict cache-hit versus cache-miss costs.
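
The cache-hit versus cache-miss cost prediction reduces to simple arithmetic over two per-token rates. The function name and the rates below are hypothetical; providers publish their own prices, and cache reads are typically billed at a steep discount to fresh input tokens.

```python
def predict_cost(prompt_tokens, cached_prefix_tokens, *, input_rate, cache_read_rate):
    """Estimate input cost when a prefix is served from a provider's prompt cache.

    Cached-prefix tokens are billed at the cheaper cache-read rate;
    the remaining tokens are billed at the normal input rate.
    """
    fresh_tokens = prompt_tokens - cached_prefix_tokens
    return cached_prefix_tokens * cache_read_rate + fresh_tokens * input_rate

# Hypothetical pricing: cache reads cost 10% of the normal input rate.
miss = predict_cost(10_000, 0, input_rate=3e-6, cache_read_rate=3e-7)
hit = predict_cost(10_000, 8_000, input_rate=3e-6, cache_read_rate=3e-7)
# With an 8,000-token cached prefix, the predicted cost drops well below
# the full cache-miss price for the same 10,000-token prompt.
```

Tracking which prefixes have already been sent is what makes `cached_prefix_tokens` knowable ahead of time, turning the estimate into a pre-flight number rather than a post-hoc one.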