Deterministic token counting, intelligent chunking, priority-based context packing, and pre-flight cost estimation. One API — Python & TypeScript.
Deterministic counting via tiktoken and js-tiktoken. No heuristics — exact cl100k_base, o200k_base, and model-specific encodings.
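The model-to-encoding lookup behind that can be sketched as follows. The mapping covers a few encodings known from OpenAI's tokenizers (the gpt-4o family uses `o200k_base`; GPT-4 and GPT-3.5-turbo use `cl100k_base`); `encoding_for_model` is a hypothetical helper for illustration, not this library's API:

```python
# Sketch only: resolve a model name to its tiktoken encoding name.
# Actual encoding/decoding is delegated to tiktoken (Python) or
# js-tiktoken (TypeScript), which is what makes counts deterministic.
MODEL_ENCODINGS = {
    "gpt-4o": "o200k_base",        # 4o family
    "gpt-4o-mini": "o200k_base",
    "gpt-4": "cl100k_base",        # GPT-4 family
    "gpt-3.5-turbo": "cl100k_base",
}

def encoding_for_model(model: str, default: str = "cl100k_base") -> str:
    """Longest-prefix match, so dated names like gpt-4o-2024-08-06 resolve."""
    for prefix in sorted(MODEL_ENCODINGS, key=len, reverse=True):
        if model.startswith(prefix):
            return MODEL_ENCODINGS[prefix]
    return default
```

With longest-prefix matching, `encoding_for_model("gpt-4o-2024-08-06")` resolves to `o200k_base` rather than falling through to the `gpt-4` entry.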
Five strategies — Sentence, Paragraph, Semantic, Code, and Fixed — that respect natural boundaries with configurable overlap.
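A minimal sketch of the Sentence strategy with configurable overlap, assuming a greedy fill up to a token budget. Word count stands in for a real tokenizer here; `sentence_chunks` is an illustrative name, not the library's API:

```python
import re

def sentence_chunks(text: str, max_tokens: int, overlap: int = 1,
                    count=lambda s: len(s.split())) -> list[str]:
    """Greedy sentence-boundary chunking with an N-sentence overlap.

    `count` is a stand-in tokenizer (word count); a real implementation
    would plug in an exact tiktoken-backed counter. Assumes `overlap`
    sentences fit well under `max_tokens`.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], []
    for sent in sentences:
        # Flush when adding the next sentence would exceed the budget.
        if current and count(" ".join(current + [sent])) > max_tokens:
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap else []  # carry overlap
        current.append(sent)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each emitted chunk ends on a sentence boundary, and consecutive chunks share the last `overlap` sentences so no context is lost at the seam.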
CRITICAL → HIGH → MEDIUM → LOW. Guarantees system prompts survive. Lower-priority items are dropped gracefully at the token-budget boundary.
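The packing idea can be sketched as a greedy fill, highest tier first, with original order restored afterward. `Priority` and `pack` are assumed names, and word count again stands in for a real token counter; CRITICAL items survive as long as they fit the budget at all:

```python
from enum import IntEnum

class Priority(IntEnum):
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3

def pack(items: list[tuple[Priority, str]], budget: int,
         count=lambda s: len(s.split())) -> list[str]:
    """Fill the token budget highest priority first; within a tier,
    keep original order. Items that don't fit are dropped."""
    kept, used = [], 0
    for idx, (prio, text) in sorted(enumerate(items),
                                    key=lambda t: (t[1][0], t[0])):
        tokens = count(text)
        if used + tokens <= budget:
            kept.append((idx, text))
            used += tokens
    return [text for _, text in sorted(kept)]  # restore document order
```

Because CRITICAL sorts first, a system prompt claims budget before any conversational filler is even considered.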
Pre-flight USD cost calculation with per-model pricing. Compare costs across GPT-4o, Claude, Gemini, and Llama models instantly.
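The calculation itself reduces to token counts times per-model rates. The prices below are placeholders for illustration, not the library's pricing table (real prices change often):

```python
# Hypothetical per-million-token prices in USD: (input, output).
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet": (3.00, 15.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Pre-flight USD estimate from exact token counts."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000
```

For example, 10,000 input and 2,000 output tokens on the placeholder gpt-4o rates comes to $0.045 before the request is ever sent.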
Ships with 15+ production models. Register custom models with exact context windows, pricing, and tokenizer backends.
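A registry of that shape might look like the following sketch; `ModelSpec`, `register`, and the field names are assumptions, not the library's actual API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_window: int     # tokens
    input_price: float      # USD per 1M input tokens (placeholder)
    output_price: float     # USD per 1M output tokens (placeholder)
    tokenizer: str          # encoding / backend identifier

REGISTRY: dict[str, ModelSpec] = {}

def register(spec: ModelSpec) -> None:
    """Add or override a model; counting and costing look it up by name."""
    REGISTRY[spec.name] = spec

# A fine-tuned or self-hosted model slots in beside the built-ins.
register(ModelSpec("my-finetune", context_window=32_000,
                   input_price=1.00, output_price=3.00,
                   tokenizer="cl100k_base"))
```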
Identical algorithms in Python and TypeScript. Same inputs produce the same token counts, chunks, and assembled contexts.