
# Getting Started

LLM Context Forge is production-grade infrastructure for managing LLM context windows. It provides deterministic token counting, intelligent document chunking, priority-based context packing, and pre-flight cost estimation — available as identical, cross-platform libraries for Python and TypeScript.

## The Problem

Every team building on LLMs hits the same walls:

| Problem | Consequence |
| --- | --- |
| Context overflow | Silent truncation or 400 errors from the API |
| Heuristic token counting | `len(text) / 4` is wrong 15-30% of the time |
| Naive chunking | Splitting mid-sentence destroys retrieval quality |
| Pricing surprises | $200 bills from untested prompt pipelines |
| Platform inconsistency | Python prototype ≠ TypeScript production behavior |

## The Solution

LLM Context Forge eliminates all five problems with a single, dependency-light package:

```text
┌─────────────────────────────────────────────────────┐
│                  LLM Context Forge                  │
├──────────┬──────────┬──────────┬────────────────────┤
│  Token   │  Smart   │ Context  │        Cost        │
│ Counter  │ Chunker  │  Packer  │     Estimator      │
├──────────┴──────────┴──────────┴────────────────────┤
│             Model Registry (15+ models)             │
│    OpenAI · Anthropic · Google · Meta · Mistral     │
└─────────────────────────────────────────────────────┘
```
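The "Smart Chunker" component splits documents without ever breaking mid-sentence (the failure mode called out above). The core idea can be sketched in a few lines of plain Python — note that `chunk_text` and the whitespace word count are illustrative assumptions, not the library's actual API or tokenizer:

```python
import re

def chunk_text(text: str, max_tokens: int = 50) -> list[str]:
    """Greedily pack whole sentences into chunks under a token budget,
    so no chunk ever splits mid-sentence.

    A naive whitespace word count stands in for a real tokenizer here;
    the library itself counts tokens exactly.
    """
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for sentence in sentences:
        tokens = len(sentence.split())
        # Start a new chunk if adding this sentence would bust the budget.
        if current and current_tokens + tokens > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += tokens
    if current:
        chunks.append(" ".join(current))
    return chunks

doc = "First sentence here. Second sentence follows. A third one ends it."
for chunk in chunk_text(doc, max_tokens=8):
    print(chunk)
```

Each emitted chunk ends on a sentence boundary, which is what preserves retrieval quality downstream.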

## Quick Install

```bash
pip install llm-context-forge
```

## 30-Second Demo

```python
from llm_context_forge import TokenCounter, ContextWindow, Priority

# Count tokens exactly
counter = TokenCounter("gpt-4o")
print(counter.count("Hello, world!"))  # deterministic result

# Build a context window with priorities
window = ContextWindow("gpt-4o")
window.add_block("You are an expert assistant.", Priority.CRITICAL, "system")
window.add_block("User question here...", Priority.HIGH, "query")
window.add_block("Retrieved document...", Priority.MEDIUM, "rag_0")

prompt = window.assemble(max_tokens=4000)
stats = window.usage()
print(f"Used {stats.tokens_used} tokens, dropped {stats.excluded} blocks")
```
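Under the hood, assembling a window is priority-based packing: admit blocks from highest to lowest priority until the budget runs out, and exclude the rest. The following self-contained sketch illustrates that idea — the `Priority`, `Block`, and `pack` definitions here are standalone stand-ins (with a naive word-count tokenizer), not the library's internals:

```python
from dataclasses import dataclass
from enum import IntEnum

class Priority(IntEnum):
    CRITICAL = 0  # lower value = packed first
    HIGH = 1
    MEDIUM = 2
    LOW = 3

@dataclass
class Block:
    text: str
    priority: Priority
    label: str

def pack(blocks: list[Block], max_tokens: int) -> tuple[list[Block], list[Block]]:
    """Greedy packer: admit blocks in priority order until the budget
    is exhausted; everything that doesn't fit is excluded."""
    def count(block: Block) -> int:
        return len(block.text.split())  # naive token proxy

    included: list[Block] = []
    excluded: list[Block] = []
    used = 0
    # sorted() is stable, so blocks of equal priority keep insertion order.
    for block in sorted(blocks, key=lambda b: b.priority):
        tokens = count(block)
        if used + tokens <= max_tokens:
            included.append(block)
            used += tokens
        else:
            excluded.append(block)
    return included, excluded

blocks = [
    Block("You are an expert assistant.", Priority.CRITICAL, "system"),
    Block("User question here...", Priority.HIGH, "query"),
    Block("A long retrieved document " * 20, Priority.MEDIUM, "rag_0"),
]
included, excluded = pack(blocks, max_tokens=10)
print([b.label for b in included], [b.label for b in excluded])
# → ['system', 'query'] ['rag_0']
```

The system prompt and query fit within the 10-token budget; the long retrieved document is dropped rather than truncated mid-stream.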

## What's Next?

- Core Concepts — Understand how tokenization, chunking, and packing work under the hood
- Python SDK — Full Python setup with CLI, REST API, and advanced tokenizers
- TypeScript SDK — Node.js/browser setup with full type safety

:::tip Cross-Platform Parity
The Python and TypeScript editions produce identical results for the same inputs. You can prototype in Python and ship in TypeScript (or vice versa) with zero behavioral drift.
:::