◆ v1.x — Now on PyPI & npm

Production-Grade Context Window Infrastructure for LLMs

Deterministic token counting, intelligent chunking, priority-based context packing, and pre-flight cost estimation. One API — Python & TypeScript.

PyPI version · npm version · MIT License · Python 3.9+ · TypeScript 5.0+

Exact Token Counting

Deterministic counting via tiktoken and js-tiktoken. No heuristics — exact cl100k_base, o200k_base, and model-specific encodings.

Intelligent Chunking

Five strategies — Sentence, Paragraph, Semantic, Code, and Fixed — that respect natural boundaries with configurable overlap.
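To make the paragraph strategy concrete, here is a stdlib-only sketch of boundary-respecting chunking with overlap. The function name, the whitespace token counter, and the overlap-in-paragraphs convention are illustrative assumptions, not the library's actual API:

```python
def chunk_paragraphs(text, max_tokens, overlap=0, count=lambda s: len(s.split())):
    """Greedy paragraph chunking sketch (hypothetical; not llm-context-forge's API).

    Splits on blank lines, packs whole paragraphs up to max_tokens, and
    carries the last `overlap` paragraphs into the next chunk for context.
    """
    paras = [p for p in text.split("\n\n") if p.strip()]
    chunks, cur, cur_tokens = [], [], 0
    for p in paras:
        n = count(p)
        if cur and cur_tokens + n > max_tokens:
            chunks.append("\n\n".join(cur))
            # Seed the next chunk with trailing paragraphs for overlap.
            cur = cur[-overlap:] if overlap else []
            cur_tokens = sum(count(x) for x in cur)
        cur.append(p)
        cur_tokens += n
    if cur:
        chunks.append("\n\n".join(cur))
    return chunks
```

Because chunks always end on paragraph boundaries, no paragraph is ever split mid-sentence; overlap trades some token budget for continuity between chunks.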

Priority Context Packing

CRITICAL → HIGH → MEDIUM → LOW. Guarantees system prompts survive; lower-priority items are dropped gracefully at the token-budget boundary.
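The idea behind priority packing can be sketched in a few lines: fill the budget from CRITICAL down, dropping whatever no longer fits. Names and the whitespace token counter below are illustrative assumptions, not llm-context-forge's actual interface:

```python
from enum import IntEnum

class Priority(IntEnum):
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3

def pack(items, budget, count_tokens=lambda s: len(s.split())):
    """Greedy priority packer sketch (hypothetical, not the library's API).

    items: list of (Priority, text). Higher-priority items are admitted
    first, so a CRITICAL system prompt is never displaced by LOW chatter.
    """
    kept, used = [], 0
    for prio, text in sorted(items, key=lambda it: it[0]):
        n = count_tokens(text)
        if used + n <= budget:
            kept.append((prio, text))
            used += n
    return kept
```

A real implementation would also restore the original document order of the surviving items; the sketch keeps only the admission logic.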

Cost Estimation

Pre-flight USD cost calculation with per-model pricing. Compare costs across GPT-4o, Claude, Gemini, and Llama models instantly.
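Pre-flight estimation is just tokens times price, applied before any API call. A self-contained sketch with made-up prices (the figures below are illustrative placeholders, not the library's pricing tables, and the function is not its real API):

```python
# Illustrative per-1M-token USD prices. Placeholder numbers only;
# real model pricing changes over time.
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request before sending it."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

With the placeholder prices above, a 1,000-token prompt expecting 500 output tokens on "gpt-4o" costs (1000 × 2.50 + 500 × 10.00) / 1,000,000 = $0.0075, and comparing models is a dict lookup away.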

Model Registry

Ships with 15+ production models. Register custom models with exact context windows, pricing, and tokenizer backends.
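A registry like this typically maps a model name to its window size, pricing, and tokenizer. The dataclass and function below are a hypothetical sketch of that shape, not llm-context-forge's actual types:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical model record: the fields the feature list implies."""
    name: str
    context_window: int          # max tokens the model accepts
    input_price_per_mtok: float  # USD per 1M input tokens
    output_price_per_mtok: float # USD per 1M output tokens
    tokenizer: str               # e.g. "cl100k_base" or "o200k_base"

REGISTRY: dict[str, ModelSpec] = {}

def register(spec: ModelSpec) -> None:
    """Add or override a model; lookups are then plain dict access."""
    REGISTRY[spec.name] = spec
```

Registering a custom fine-tune is then one call, and every downstream feature (counting, packing, cost estimation) can resolve its parameters from the same record.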

Cross-Platform Parity

Identical algorithms in Python and TypeScript. Same inputs produce the same token counts, chunks, and assembled contexts.

Install in seconds

Python
$ pip install llm-context-forge

See all install options →

TypeScript
$ npm install llm-context-forge

See TypeScript setup →