Your Copilot is quietly leaking 53,000 tokens per call

What I found when I parsed real GitHub Copilot chat exports — and the dashboard I built to fix it before June 1, 2026.

  • Copilot
  • Cost
  • MCP
  • Tooling

The setup

GitHub is moving Copilot to usage-based billing on June 1, 2026. Every chat, agent session and CLI interaction will draw from a monthly AI Credits pool. 1 credit = $0.01.

So I did what any reasonable engineer would do: I exported a few weeks of debug logs and started measuring.

The shock

In one fairly normal session, 96% of the tool schemas injected into every single request went completely unused — burning roughly 53,000 tokens of dead weight per API call.

That isn't a usage pattern problem. That's a configuration problem hiding in plain sight.
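
A minimal sketch of how an unused-schema figure like that could be measured. Everything here is an assumption for illustration: the schema shape (a list of dicts with a "name" key), the helper names, the sample data, and the rough 4-characters-per-token heuristic. It is not the real Copilot export format or a real tokenizer.

```python
import json

# Rough heuristic (an assumption): about 4 characters per token for JSON text.
CHARS_PER_TOKEN = 4


def estimate_tokens(obj) -> int:
    """Approximate the token count of a JSON-serializable object."""
    return len(json.dumps(obj, separators=(",", ":"))) // CHARS_PER_TOKEN


def unused_schema_report(tool_schemas, tools_called):
    """Return (unused_fraction, wasted_tokens) for one request.

    tool_schemas: list of dicts with a "name" key (hypothetical shape).
    tools_called: set of tool names actually invoked in the session.
    """
    unused = [t for t in tool_schemas if t["name"] not in tools_called]
    fraction = len(unused) / len(tool_schemas) if tool_schemas else 0.0
    wasted = sum(estimate_tokens(t) for t in unused)
    return fraction, wasted


# Made-up example data, not taken from a real export.
schemas = [
    {"name": "read_file", "description": "Read a file", "parameters": {"path": "string"}},
    {"name": "create_issue", "description": "Open an issue", "parameters": {"title": "string"}},
    {"name": "run_query", "description": "Run a SQL query", "parameters": {"sql": "string"}},
    {"name": "send_message", "description": "Post to chat", "parameters": {"text": "string"}},
]
fraction, wasted = unused_schema_report(schemas, {"read_file"})
print(f"{fraction:.0%} of schemas unused, ~{wasted} tokens of dead weight per call")
```

Because the schemas ride along on every request, the wasted figure is paid per call, not once per session.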

The three big cost drivers

  1. Conversation length. Each request typically carries the thread so far, so longer threads = more tokens = more credits. Context isn't free.
  2. Agentic mode. A single agent task can quietly trigger dozens of model calls internally.
  3. Model choice. Frontier models can cost 10× more per token than lighter ones.
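
The three drivers above compound, and a toy model makes that visible. This sketch assumes the simplest case: every model call resends the full history plus a fixed overhead (system prompt, tool schemas), with no caching or summarization. The function names and numbers are illustrative, not Copilot's actual accounting.

```python
def cumulative_input_tokens(calls: int, tokens_per_call: int, fixed_overhead: int = 0) -> int:
    """Total input tokens over a thread where call k resends k chunks of
    history plus a fixed overhead (simplifying assumption: no caching)."""
    total = 0
    for k in range(1, calls + 1):
        total += fixed_overhead + k * tokens_per_call
    return total


def relative_cost(total_tokens: int, price_multiplier: float = 1.0) -> float:
    """Scale a token total by a per-token price multiplier (1.0 = light model)."""
    return total_tokens * price_multiplier


short_thread = cumulative_input_tokens(10, 500)
long_thread = cumulative_input_tokens(20, 500)
with_schemas = cumulative_input_tokens(10, 500, fixed_overhead=53_000)

print(short_thread, long_thread, with_schemas)
print(relative_cost(long_thread, price_multiplier=10.0))
```

Under these assumptions, doubling the number of calls roughly quadruples the history tokens, and a fixed per-call overhead can dwarf the conversation itself. An agent task that makes dozens of internal calls behaves like a long thread here.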

What I built

A small dashboard — Copilot Cost Lens — that parses chat exports and breaks down:

  • Token bloat by MCP server.
  • Context accumulation across sessions.
  • Cache hit rate.
  • Estimated credit burn per interaction.

It immediately told me which MCP servers to disable per workspace, which tasks to route to lighter models, and where agentic loops were quietly hemorrhaging credits.
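
The breakdown above boils down to a few aggregations. The sketch below shows the idea on made-up records. The record fields (`schema_tokens`, `input_tokens`, `cached_tokens`) and the tokens-to-credits rate are hypothetical placeholders, not the real export schema or a published price. Only the $0.01-per-credit conversion comes from the billing change described earlier.

```python
from collections import defaultdict

USD_PER_CREDIT = 0.01        # from the post: 1 credit = $0.01
CREDITS_PER_1K_TOKENS = 0.5  # placeholder rate for illustration, NOT a real price


def summarize(calls):
    """Aggregate per-call records into dashboard-style totals."""
    schema_tokens_by_server = defaultdict(int)
    input_tokens = 0
    cached_tokens = 0
    for call in calls:
        for server, tokens in call["schema_tokens"].items():
            schema_tokens_by_server[server] += tokens
        input_tokens += call["input_tokens"]
        cached_tokens += call["cached_tokens"]
    credits = input_tokens / 1000 * CREDITS_PER_1K_TOKENS
    return {
        "schema_tokens_by_server": dict(schema_tokens_by_server),
        "cache_hit_rate": cached_tokens / input_tokens if input_tokens else 0.0,
        "estimated_credits": credits,
        "estimated_usd": credits * USD_PER_CREDIT,
    }


# Made-up records, not taken from a real export.
calls = [
    {"schema_tokens": {"github": 30_000, "database": 20_000},
     "input_tokens": 60_000, "cached_tokens": 45_000},
    {"schema_tokens": {"github": 30_000},
     "input_tokens": 40_000, "cached_tokens": 15_000},
]
summary = summarize(calls)
for key, value in summary.items():
    print(key, value)
```

Sorting `schema_tokens_by_server` by value gives a quick list of which servers to look at first.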

What you should do before June 1

  • Disable MCP servers your current project doesn't need.
  • Reset context between repetitive batch tasks.
  • Match model complexity to task complexity.
  • Know your token footprint before the meter starts running.
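
Two of these checks are easy to automate once you have usage counts. The sketch below is one possible shape, with hypothetical helper names, tier labels ("light", "frontier") and thresholds. Nothing in it is a Copilot setting or API.

```python
def servers_to_disable(call_counts: dict[str, int], min_calls: int = 1) -> list[str]:
    """List MCP servers whose tools were called fewer than min_calls times."""
    return sorted(name for name, n in call_counts.items() if n < min_calls)


def choose_model_tier(files_touched: int, needs_multistep_reasoning: bool) -> str:
    """Pick a model tier from crude task signals (thresholds are assumptions)."""
    if needs_multistep_reasoning or files_touched > 3:
        return "frontier"
    return "light"


# Made-up usage counts for one workspace.
usage = {"github": 12, "database": 0, "browser": 0}
print(servers_to_disable(usage))
print(choose_model_tier(files_touched=1, needs_multistep_reasoning=False))
```

The thresholds are worth tuning against your own exports rather than taking these defaults on faith.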

Treat your context window like a budget, not a buffet.
