Rule reference
Every warning and error AIOptimize emits — with severity, what surface it fires on, why it matters, and the shape of the fix. KB v0.3.0-preview · last revised 2026-05-07.
Suppress on a single run with aioptimize scan . --ignore D003,D013. Suppress permanently per project via .aioptimize.toml.
At a glance
| ID |
Severity |
Surface |
Catches |
| D001 | warn | SDK call · Python + TS | Anthropic system prompt without cache_control |
| D002 | warn | SDK call · wrapper-aware | Deprecated Anthropic / OpenAI model id |
| D003 | info | SDK call · wrapper-aware | Missing max_tokens / maxTokens |
| D004 | info | SDK call · Python + TS | Interactive surface missing stream=True |
| D005 | info | SDK call · Python + TS | Batch-eligible workload not using Batch API |
| D006 | warn | SDK call · Python + TS | OpenAI prompt-hinted JSON instead of structured output |
| D007 | error | File scan | Hardcoded API key in source |
| D008 | warn | SDK call · Python + TS | System instruction placed in user role |
| D009 | warn | Skill / subagent / workflow YAML | Frontmatter pins a deprecated model |
| D010 | info | Skill / subagent | Empty or stub description: |
| D011 | warn | Subagent | Tools list grants more than the body uses |
| D012 | warn | settings.json hooks | Hook curls a provider endpoint with no --max-time |
| D013 | info | .mcp.json | Stdio MCP entry without a timeout |
| D015 | warn | Skill / subagent / command body | Prose recommends a deprecated model id |
| D016 | warn | TS / JS workflow code | Vercel Workflow step.run LLM call without fallback |
There is no D014. That ID was reserved for LangChain class-method detection (ChatAnthropic(...).invoke(...)), which requires variable-assignment tracking and is not yet implemented.
Details
Each card describes when the rule fires, why it matters, and the shape of the fix. For examples in real PRs, see the demo projects under examples/.
D001
warn
SDK call · Python + TS
Anthropic system prompt not cached
Fires on Anthropic messages.create calls with a string system= argument longer than ~1024 tokens and no cache_control={'type': 'ephemeral'} on a content block.
Why: ephemeral prompt caching returns 25–90% cost reduction on cache hits and ~85% lower first-token latency. Reused system prompts are the canonical trigger.
system=[{"type":"text","text":SYSTEM_PROMPT,"cache_control":{"type":"ephemeral"}}]
D002
warn
SDK call · wrapper-aware
Deprecated model version
Fires when the model= argument matches an entry in the KB's deprecated_models list. Wrapper-aware: also catches claude-agent-sdk.query(model=…) and Vercel AI SDK anthropic("…") / openai("…") factory calls.
Why: deprecated models continue to load but silently degrade — slower routing, higher cost, eventual hard failure. The KB ships the recommended replacement per id.
D003
info
SDK call · wrapper-aware
Missing max_tokens / maxTokens
Fires when neither max_tokens= (Python / Anthropic SDK) nor maxTokens: (Vercel AI SDK camelCase) is present.
Why: unbounded output combined with streaming or agent loops can produce four-figure runs. Setting it to the smallest value the feature needs is a hard cost ceiling.
D004
info
SDK call · Python + TS
Interactive surface missing stream=True
Fires on calls inside functions whose names suggest an interactive handler (chat, respond, stream, cli) and that have no stream=True.
Why: streaming cuts perceived first-token latency by 3–5x. Any surface a human is waiting on benefits.
D005
info
SDK call · Python + TS
Batch-eligible workload not using Batch API
Fires on calls inside for / while loops, or inside functions named backfill / bulk / batch.
Why: Anthropic and OpenAI Batch APIs return a 50% discount on standard token pricing in exchange for up to 24h of latency. Loop call sites are the canonical batch-eligible pattern.
D006
warn
SDK call · Python + TS
Prompt-hinted JSON instead of structured output
Fires on OpenAI chat.completions.create calls that instruct the model to "return JSON" in prose without setting response_format=.
Why: structured outputs guarantee schema validity, eliminate retry loops on malformed JSON, and drop tokens spent on format instructions.
Hardcoded API key in source
Fires on literals matching sk-ant-[A-Za-z0-9_\-]{20,} or sk-[A-Za-z0-9]{20,} in any scanned file.
Why: hardcoded keys leak through git history, logs, container images, and client bundles. Rotating a leaked key is expensive; an env lookup prevents the leak.
api_key=os.environ["ANTHROPIC_API_KEY"]
D008
warn
SDK call · Python + TS
System instruction placed in user role
Fires when the first user message starts with system-shaped phrasing ("You are…", "Act as…") and there is no system= argument.
Why: OpenAI prioritizes system messages differently than user messages. Putting persona / behavioral rules in user content dilutes instruction-following and blocks prompt-caching reuse.
Agentic-config detectors
Fire on Claude-Code-style configuration files: skills, subagents, slash commands, hook definitions, MCP server registries, and YAML workflows.
D009
warn
Skill / subagent / workflow YAML
Frontmatter pins a deprecated model
Fires when model: in a skill / subagent / YAML workflow step matches the KB's deprecated-models list.
Why: frontmatter pins are easy to set and forget. A pinned model that quietly enters deprecation continues to load, but the harness silently degrades.
D010
info
Skill / subagent
Empty or stub description
Fires when a skill or subagent has no description: or one that is empty / TODO / TBD / FIXME.
Why: the harness ranks skills against user intent using the description. Empty descriptions mean the skill is dead weight — never invoked, but still loaded into context.
Tool over-permission
Fires when the subagent's tools: list contains a tool that never appears as a literal token in its body. Implicit-tool exemptions: Read, Write, Edit, Glob, Grep are not flagged because they are commonly invoked from natural language.
Why: over-permissive subagents can fire destructive tools (Bash, Write, WebFetch) by accident, and bloat context with unused tool schemas.
D012
warn
settings.json hooks
Hook unbounded LLM call
Fires when a hook's command shells out to api.anthropic.com or api.openai.com via curl and does not pass --max-time or --connect-timeout.
Why: hooks fire on every harness lifecycle event. An unbounded LLM request inside a frequently-firing hook is a runaway-cost vector.
Stdio MCP server has no timeout
Fires when a server entry in .mcp.json with type: stdio (or no explicit type) has no timeout field.
Why: stdio servers run as child processes. Without a timeout, a stalled child blocks the harness's tool-call path indefinitely, making the agent appear hung.
D015
warn
Skill / subagent / command body
Body recommends a deprecated model in prose
Fires when the prose body of a skill / subagent / slash command mentions a deprecated model id from the KB list.
Why: prose recommendations propagate into derived prompts. The frontmatter check (D009) misses these. A line like "use claude-3-opus-20240229 for this" overrides the harness default and is invisible to D009.
D016
warn
TS / JS workflow code
Vercel Workflow step has no fallback
Fires when a step.run(...) block in a file that imports @vercel/workflow contains a generateText / streamText / generateObject / streamObject call and the same block has neither experimental_fallback, fallbackModel, fallback:, nor retries: / retry:.
Why: workflow steps are durable boundaries. A single-model failure inside a step fails the whole workflow with no recovery path.