◆ v1.x — Now on PyPI & npm

Production-Grade Context Window Infrastructure for LLMs

Deterministic token counting, intelligent chunking, priority-based context packing, and pre-flight cost estimation. One API — Python & TypeScript.

PyPI version · npm version · MIT License · Python 3.9+ · TypeScript 5.0+

Exact Token Counting

Deterministic counting via tiktoken and js-tiktoken. No heuristics — exact cl100k_base, o200k_base, and model-specific encodings.

Intelligent Chunking

Five strategies — Sentence, Paragraph, Semantic, Code, and Fixed — that respect natural boundaries with configurable overlap.
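To make the paragraph strategy concrete, here is a stdlib-only sketch of boundary-respecting chunking with overlap. The function name, the whitespace token counter, and the overlap-in-paragraphs convention are illustrative assumptions, not the library's actual API:

```python
def chunk_paragraphs(text, max_tokens, overlap=0, count=lambda s: len(s.split())):
    """Greedy paragraph chunking sketch (hypothetical; not llm-context-forge's API).

    Splits on blank lines, packs whole paragraphs up to max_tokens, and
    carries the last `overlap` paragraphs into the next chunk for context.
    """
    paras = [p for p in text.split("\n\n") if p.strip()]
    chunks, cur, cur_tokens = [], [], 0
    for p in paras:
        n = count(p)
        if cur and cur_tokens + n > max_tokens:
            chunks.append("\n\n".join(cur))
            # Seed the next chunk with trailing paragraphs for overlap.
            cur = cur[-overlap:] if overlap else []
            cur_tokens = sum(count(x) for x in cur)
        cur.append(p)
        cur_tokens += n
    if cur:
        chunks.append("\n\n".join(cur))
    return chunks
```

Because chunks always end on paragraph boundaries, no paragraph is ever split mid-sentence; overlap trades some token budget for continuity between chunks.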

Priority Context Packing

CRITICAL → HIGH → MEDIUM → LOW. Guarantees system prompts survive; lower-priority items are dropped gracefully at the token-budget boundary.
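The idea behind priority packing can be sketched in a few lines: fill the budget from CRITICAL down, dropping whatever no longer fits. Names and the whitespace token counter below are illustrative assumptions, not llm-context-forge's actual interface:

```python
from enum import IntEnum

class Priority(IntEnum):
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3

def pack(items, budget, count_tokens=lambda s: len(s.split())):
    """Greedy priority packer sketch (hypothetical, not the library's API).

    items: list of (Priority, text). Higher-priority items are admitted
    first, so a CRITICAL system prompt is never displaced by LOW chatter.
    """
    kept, used = [], 0
    for prio, text in sorted(items, key=lambda it: it[0]):
        n = count_tokens(text)
        if used + n <= budget:
            kept.append((prio, text))
            used += n
    return kept
```

A real implementation would also restore the original document order of the surviving items; the sketch keeps only the admission logic.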

Cost Estimation

Pre-flight USD cost calculation with per-model pricing. Compare costs across GPT-4o, Claude, Gemini, and Llama models instantly.
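Pre-flight estimation is just tokens times price, applied before any API call. A self-contained sketch with made-up prices (the figures below are illustrative placeholders, not the library's pricing tables, and the function is not its real API):

```python
# Illustrative per-1M-token USD prices. Placeholder numbers only;
# real model pricing changes over time.
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request before sending it."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

With the placeholder prices above, a 1,000-token prompt expecting 500 output tokens on "gpt-4o" costs (1000 × 2.50 + 500 × 10.00) / 1,000,000 = $0.0075, and comparing models is a dict lookup away.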

Model Registry

Ships with 15+ production models. Register custom models with exact context windows, pricing, and tokenizer backends.
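A registry like this typically maps a model name to its window size, pricing, and tokenizer. The dataclass and function below are a hypothetical sketch of that shape, not llm-context-forge's actual types:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical model record: the fields the feature list implies."""
    name: str
    context_window: int          # max tokens the model accepts
    input_price_per_mtok: float  # USD per 1M input tokens
    output_price_per_mtok: float # USD per 1M output tokens
    tokenizer: str               # e.g. "cl100k_base" or "o200k_base"

REGISTRY: dict[str, ModelSpec] = {}

def register(spec: ModelSpec) -> None:
    """Add or override a model; lookups are then plain dict access."""
    REGISTRY[spec.name] = spec
```

Registering a custom fine-tune is then one call, and every downstream feature (counting, packing, cost estimation) can resolve its parameters from the same record.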

Cross-Platform Parity

Identical algorithms in Python and TypeScript. Same inputs produce the same token counts, chunks, and assembled contexts.

Install in seconds

Python
$ pip install llm-context-forge

See all install options →

TypeScript
$ npm install llm-context-forge

See TypeScript setup →