
mycontext-ai v0.5.0: Async Execution, Token-Budget Assembly, and Production Hardening

· 3 min read

mycontext-ai v0.5.0 is live on PyPI. This release focuses on three capabilities that matter in production: non-blocking async execution, token-accurate context assembly, and validated structured output — plus a set of reliability improvements that make the library safer and more observable.

What's New

Async Execution

ctx.aexecute() is a native async coroutine backed by litellm.acompletion. It never blocks the event loop and integrates directly into FastAPI routes, agent loops, and any other async application:

result = await ctx.aexecute(provider="openai", model="gpt-4o-mini")

Fan out multiple independent contexts in parallel — total wall-clock time is bounded by the slowest call, not the sum of all calls:

results = await asyncio.gather(
    ctx_root_cause.aexecute(provider="openai"),
    ctx_risk.aexecute(provider="anthropic"),
    ctx_summary.aexecute(provider="openai"),
)

Token-Budget Assembly

assemble_for_model() builds a prompt guaranteed to fit within a model's context window. It counts tokens accurately with tiktoken, orders sections by priority, and trims the lowest-priority content when the budget is tight:

# Fits precisely into gpt-4o-mini's window
prompt = ctx.assemble_for_model(model="gpt-4o-mini")

# Custom budget — useful inside agentic loops
prompt = ctx.assemble_for_model(model="gpt-4o", max_tokens=4000)

The old character-based 6000-char truncation in to_prompt() is replaced by this model-aware, token-accurate approach.
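The priority-then-trim behavior can be sketched in a few lines. This is not the library's implementation — assemble_for_model() counts real tokens with tiktoken, while this sketch uses a whitespace word count as a crude stand-in, and the (priority, text) section shape is an assumption for illustration:

```python
def assemble(sections: list[tuple[int, str]], max_tokens: int) -> str:
    """Assemble sections into a prompt under a budget.

    sections: (priority, text) pairs; lower number = higher priority.
    Token cost is approximated by word count (stand-in for tiktoken).
    """
    ordered = sorted(sections, key=lambda s: s[0])
    kept, used = [], 0
    for _priority, text in ordered:
        cost = len(text.split())
        if used + cost <= max_tokens:
            kept.append(text)
            used += cost
        # Sections that would blow the budget are dropped,
        # lowest-priority content first.
    return "\n\n".join(kept)

sections = [
    (0, "system: you are a helpful assistant"),
    (1, "task: summarize the incident report"),
    (2, "background: " + "filler " * 50),  # large, low priority
]
prompt = assemble(sections, max_tokens=20)
# High-priority sections survive; the oversized background is trimmed.
```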

Pydantic-Validated Structured Output

All intelligence-layer LLM responses (suggest_patterns, generate_context, TemplateIntegratorAgent) now go through a three-tier parse pipeline:

  1. Instructor (when installed) — function-calling mode with automatic retry on validation failure
  2. Pydantic + JSON extraction — validates the schema before returning
  3. Regex fallback — field-by-field extraction for edge cases

A single transient LLM formatting error will not break your pipeline.
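The fallback idea can be sketched with the standard library alone. The Instructor tier is omitted here (it needs a live LLM), and the field names are illustrative, not the library's actual schema — this only shows the shape of "try strict JSON first, fall back to regex":

```python
import json
import re

def parse_response(raw: str) -> dict:
    """Parse an LLM reply, tolerating formatting noise.

    Tier 2: extract the first JSON object and check required keys.
    Tier 3: regex fallback, field by field.
    (Tier 1, Instructor, omitted in this offline sketch.)
    """
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            data = json.loads(match.group())
            if "pattern" in data and "confidence" in data:
                return data
        except json.JSONDecodeError:
            pass  # fall through to the regex tier
    pattern = re.search(r"pattern[\"':\s]+([\w-]+)", raw)
    conf = re.search(r"confidence[\"':\s]+([\d.]+)", raw)
    return {
        "pattern": pattern.group(1) if pattern else None,
        "confidence": float(conf.group(1)) if conf else 0.0,
    }

ok = parse_response('Sure! {"pattern": "map-reduce", "confidence": 0.9}')
bad = parse_response("pattern: map-reduce, confidence: 0.7 (not JSON)")
# Both the well-formed and the malformed reply yield usable fields.
```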

Reliability Improvements

  • Template injection prevention — safe_format_template blocks attribute/item access patterns ({obj.attr}, {obj[key]}) in all template substitutions
  • Structured logging — every exception in the intelligence layer is logged at WARNING with exc_info=True before falling back; nothing is silently swallowed
  • Execution tracing — every LLM call records model, tokens, cost, latency, and errors in an in-process Span
  • Retry logic — the LiteLLM provider retries transient errors with exponential backoff
  • Heuristic routing — smart_execute() classifies simple questions without an LLM call; only complex questions go through the full router
  • Lazy pattern registry — the 85-pattern registry loads once and is reused across all TransformationEngine instances
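To make the first bullet concrete, here is a minimal sketch of the idea behind safe_format_template: reject any placeholder containing attribute or item access before substituting. The name safe_format and the exact regex are assumptions for illustration; the real function's signature and checks may differ:

```python
import re

# A placeholder is unsafe if it contains a "." or "[" inside the braces,
# e.g. {obj.attr} or {obj[key]}.
_UNSAFE = re.compile(r"\{[^{}]*[.\[][^{}]*\}")

def safe_format(template: str, **values: str) -> str:
    """Format a template, refusing attribute/item access placeholders."""
    if _UNSAFE.search(template):
        raise ValueError("attribute/item access not allowed in templates")
    return template.format(**values)

safe_format("Hello {name}", name="world")   # plain substitution: allowed
# safe_format("{obj.attr}", obj="x")        # would raise ValueError
```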

Installing

pip install mycontext-ai==0.5.0

# With structured output parsing
pip install mycontext-ai==0.5.0 instructor

# With accurate token counting
pip install "mycontext-ai[tokens]==0.5.0"

Full CHANGELOG and API reference.