mcp-tokenlint — a Lighthouse score for your MCP server's context budget

The cost is real, large, and invisible to authors

43% → 14%: tool-selection accuracy collapse under prompt bloat (RAG-MCP, arXiv 2505.03275)

~55,000 tokens a 5-server setup burns before any work begins (Anthropic, Advanced tool use)

97.1% of 856 tool descriptions had at least one "smell" (arXiv 2602.14878)

19–62% token reduction after pruning over-described tools (GitHub blog)

Every existing fix (Tool Search, gateways, proxies, dynamic toolsets) works on the consumer side, at runtime. mcp-tokenlint shifts left: it scores and shrinks your footprint at publish time, in your own repo, in CI.

What you get

  mcp-tokenlint — MCP token-budget report

  Score 41/100  (grade F)  ████████░░░░░░░░░░░░
  18,204 tokens across 37 tools · avg 492/tool

  Sub-scores
    budget          42  ██████░░░░░░░░
    toolCount       90  █████████████░
    perToolBloat    55  ███████░░░░░░░
    hygiene         30  ████░░░░░░░░░░

  Suggestions  (est. recoverable: ~6,300 tokens)
    ▲ search_documents  An enum has 51 values (103 tokens). Validate server-side instead…
    ▲ search_documents  Description is 108 tokens. Tighten to ~60; cut examples/preamble.
    ▲ create_report     inputSchema is 740 tokens. Flatten nesting, drop unused fields…
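
To make the suggestion list concrete, here is a minimal sketch of how ranked suggestions of this kind could be produced. It is not the tool's actual rule set: the `ToolStats` shape, the `suggest` function, and both thresholds are hypothetical, with the description target of 60 borrowed from the sample message above.

```typescript
// Hypothetical per-tool measurements; the real tool's internal types may differ.
interface ToolStats {
  name: string;
  descriptionTokens: number;
  largestEnum: { values: number; tokens: number } | null;
}

interface Suggestion {
  tool: string;
  message: string;
  estSavedTokens: number;
}

// Assumed thresholds, chosen for illustration only.
const DESCRIPTION_TARGET = 60;
const MAX_ENUM_VALUES = 20;

function suggest(stats: ToolStats[]): Suggestion[] {
  const out: Suggestion[] = [];
  for (const s of stats) {
    // Rule 1: description longer than the target length.
    if (s.descriptionTokens > DESCRIPTION_TARGET) {
      out.push({
        tool: s.name,
        message: `Description is ${s.descriptionTokens} tokens. Tighten to ~${DESCRIPTION_TARGET}.`,
        estSavedTokens: s.descriptionTokens - DESCRIPTION_TARGET,
      });
    }
    // Rule 2: an enum with too many values to be worth inlining.
    if (s.largestEnum !== null && s.largestEnum.values > MAX_ENUM_VALUES) {
      out.push({
        tool: s.name,
        message: `An enum has ${s.largestEnum.values} values (${s.largestEnum.tokens} tokens). Validate server-side instead.`,
        estSavedTokens: s.largestEnum.tokens,
      });
    }
  }
  // Largest estimated saving first.
  return out.sort((a, b) => b.estSavedTokens - a.estSavedTokens);
}

// Usage with figures taken from the sample report.
const exampleStats: ToolStats[] = [
  { name: "search_documents", descriptionTokens: 108, largestEnum: { values: 51, tokens: 103 } },
];
for (const s of suggest(exampleStats)) {
  console.log(`${s.tool}  ${s.message}  (~${s.estSavedTokens} tokens)`);
}
```

Ranking by estimated saving puts the cheapest wins at the top, so an author can stop once the score clears their gate.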

Quick start — no install, no API key

# Point it at a live server over stdio
npx github:fernforge/mcp-tokenlint --cmd "npx -y @modelcontextprotocol/server-filesystem ."

# Or lint a tools/list dump
npx github:fernforge/mcp-tokenlint tools.json

# Fail CI when the score drops below a threshold
npx github:fernforge/mcp-tokenlint tools.json --min-score 70

Fully deterministic — no LLM, no black box. Token counts use the o200k_base encoding as a stable, reproducible proxy for context cost; the scoring curve is open in src/score.ts.
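
The idea of a deterministic, per-tool token count can be sketched as follows. This is an illustration under stated assumptions, not the code in src/score.ts: `measure`, `ToolDef`, and `roughCounter` are hypothetical names, and `roughCounter` is a crude stand-in that does not implement o200k_base. In practice you would pass a counter backed by a real o200k_base encoder.

```typescript
// One entry of an MCP tools/list result (name, description, inputSchema).
interface ToolDef {
  name: string;
  description?: string;
  inputSchema: unknown;
}

// Any pure string -> token-count function can be plugged in here.
type TokenCounter = (text: string) => number;

// Stand-in counter (assumption): counts word runs and punctuation marks.
// It is NOT a real tokenizer; it only keeps the sketch self-contained.
const roughCounter: TokenCounter = (text) =>
  (text.match(/[A-Za-z0-9_]+|[^\sA-Za-z0-9_]/g) ?? []).length;

interface ToolCost {
  name: string;
  tokens: number;
}

function measure(
  tools: ToolDef[],
  count: TokenCounter,
): { perTool: ToolCost[]; total: number } {
  const perTool = tools.map((t) => ({
    name: t.name,
    // Top-level keys are serialized in a fixed order; nested schema keys
    // keep whatever order the server emitted them in.
    tokens: count(
      JSON.stringify({
        name: t.name,
        description: t.description ?? "",
        inputSchema: t.inputSchema,
      }),
    ),
  }));
  // Heaviest tool first; ties broken by name so output order is stable.
  perTool.sort((a, b) => b.tokens - a.tokens || a.name.localeCompare(b.name));
  const total = perTool.reduce((sum, t) => sum + t.tokens, 0);
  return { perTool, total };
}
```

Because the counter is a pure function and the serialization order is fixed, running `measure` twice on the same dump gives identical numbers, which is what makes the score usable as a CI gate.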

Drop it in CI

# .github/workflows/mcp-budget.yml
- uses: fernforge/mcp-tokenlint@main
  with:
    cmd: "node build/server.js"   # or: tools: tools.json
    min-score: 70

The action writes a Markdown report to the job summary and exposes `score` and `total-tokens` outputs that you can gate on or post as a PR comment.

Live scorecard

Real scores for popular MCP servers, measured over stdio. Re-run any row yourself.

Server	Tools	Tokens	Score	Grade
`server-everything`	13	1,423	99	A
`server-filesystem`	14	2,741	94	A
`server-memory`	9	2,230	94	A
`sequential-thinking`	1	942	73	C

One fat tool can sink a whole server: sequential-thinking exposes a single tool, but its description alone is 566 tokens. Token cost is per-definition, not per-server. Full scorecard →

Why authors, not users

Fix bloat once, at the source, and every downstream user benefits automatically.
The same input always produces the same score, computed locally with no LLM call or API key.
Fail a PR that bloats your tool budget, and track the score like a test.
Suggestions come ranked, each with an estimated token saving, so you get concrete fixes rather than just a score.