Model Registry

The ModelRegistry acts as the single source of truth for all components. It maps model identifiers (e.g., gpt-4o) to their respective tokenization backends, context limits, and pricing rates.
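Conceptually, the registry is just a mapping from model ID to an immutable config record. The following is a minimal, self-contained sketch of that idea — a hypothetical stand-in with assumed field names, not the library's actual internals:

```python
from dataclasses import dataclass

# Hypothetical stand-in for a model config record (field names assumed).
@dataclass(frozen=True)
class ModelConfig:
    provider: str
    context_window: int
    input_price_1m: float   # USD per 1M input tokens
    output_price_1m: float  # USD per 1M output tokens

class ModelRegistry:
    """Single source of truth: model ID -> ModelConfig."""
    _models: dict[str, ModelConfig] = {}

    @classmethod
    def register(cls, model_id: str, config: ModelConfig) -> None:
        cls._models[model_id] = config

    @classmethod
    def get(cls, model_id: str) -> ModelConfig:
        try:
            return cls._models[model_id]
        except KeyError:
            raise KeyError(f"Unknown model: {model_id!r}") from None

# Any component can then resolve limits and rates from one place.
ModelRegistry.register(
    "gpt-4o",
    ModelConfig(provider="openai", context_window=128_000,
                input_price_1m=2.50, output_price_1m=10.00),
)
print(ModelRegistry.get("gpt-4o").context_window)  # 128000
```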

Built-in Models

LLM Context Forge ships with configurations for major production models.

| Model ID          | Provider  | Context Window | Input (per 1M) | Output (per 1M) |
|-------------------|-----------|----------------|----------------|-----------------|
| gpt-4o            | openai    | 128,000        | $2.50          | $10.00          |
| gpt-4o-mini       | openai    | 128,000        | $0.15          | $0.60           |
| gpt-4             | openai    | 8,192          | $30.00         | $60.00          |
| claude-3-5-sonnet | anthropic | 200,000        | $3.00          | $15.00          |
| claude-3-haiku    | anthropic | 200,000        | $0.25          | $1.25           |
| gemini-1.5-pro    | google    | 2,000,000      | $1.25          | $5.00           |
| gemini-1.5-flash  | google    | 1,000,000      | $0.075         | $0.30           |
| llama-3.1-405b    | meta      | 128,000        | $0.00          | $0.00           |
| mistral-large     | mistral   | 32,000         | $2.00          | $6.00           |

Note: Pricing and context windows are updated regularly to track provider changes.
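The per-million-token rates in the table make cost estimation plain arithmetic. For example, a request with 50,000 input tokens and 10,000 output tokens against gpt-4o (rates hard-coded from the table above; no library calls involved):

```python
# gpt-4o rates from the pricing table (USD per 1M tokens).
input_price_1m, output_price_1m = 2.50, 10.00

input_tokens, output_tokens = 50_000, 10_000
cost = (input_tokens / 1_000_000) * input_price_1m \
     + (output_tokens / 1_000_000) * output_price_1m
print(f"${cost:.4f}")  # $0.2250
```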

Registering Custom Models

You can register fine-tuned, self-hosted, or newly released models at runtime.

Basic Registration

If your model uses a standard encoding (like cl100k_base), registration is simple:

```python
from llm_context_forge import ModelConfig, ModelRegistry, TokenCounter

ModelRegistry.register(
    "company-finetune-v1",
    ModelConfig(
        provider="self-hosted",
        context_window=32000,
        encoding="cl100k_base",
        input_price_1m=0.0,
        output_price_1m=0.0,
    ),
)

# Now other components can use it
counter = TokenCounter("company-finetune-v1")
```

Custom Tokenizer Backend

If you rely on a specific HuggingFace tokenizer, you can map the model to it:

```python
ModelRegistry.register(
    "llama-3-custom",
    ModelConfig(
        provider="self-hosted",
        context_window=8192,
        backend="transformers",
        hf_repo_id="meta-llama/Meta-Llama-3-8B",
    ),
)
```
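Under this kind of multi-backend design, the config's backend field determines which tokenizer loader runs. Here is a hypothetical sketch of that dispatch pattern — the loader names and stub tokenizers below are invented so the example runs without tiktoken or transformers installed; real loaders would call into those packages instead:

```python
# Hypothetical backend dispatch: the config's "backend" key selects a loader.
# The loaders here return trivial stub tokenizers purely for illustration.

def load_tiktoken(config: dict):
    # Stub: a real loader would fetch the named tiktoken encoding.
    return lambda text: text.split()

def load_transformers(config: dict):
    # Stub: a real loader would download the tokenizer for config["hf_repo_id"].
    return lambda text: list(text)

BACKENDS = {
    "tiktoken": load_tiktoken,
    "transformers": load_transformers,
}

def tokenizer_for(config: dict):
    loader = BACKENDS[config.get("backend", "tiktoken")]
    return loader(config)

tok = tokenizer_for({"backend": "transformers",
                     "hf_repo_id": "meta-llama/Meta-Llama-3-8B"})
print(len(tok("abc")))  # 3
```

Keeping the backend choice inside the registry entry means callers like a token counter never need to know which tokenizer library backs a given model ID.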