Hacker News

Tokens saved should not be your north star metric. You should be able to show that tool call performance is maintained while consuming fewer tokens. I have no idea whether that is the case here.

As an aside: this is a cool idea, but the prose in the readme and the above post seems fully generated, so who knows whether any of it is actually true.




Token counts alone tell you nothing about correctness, latency, or developer ergonomics. Run a deterministic test suite that exercises representative MCP calls against both native MCP and mcp2cli while recording token usage, wall time, error rate, and output fidelity.

Measure fidelity with exact diffs and embedding similarity, and include streaming behavior, schema-change resilience, and rate-limit fallbacks in the cases you care about. Check the repo for a runnable benchmark, archived fixtures captured with vcrpy or WireMock, and a clear test harness that reproduces the claimed 96 to 99 percent savings.
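The methodology above can be sketched as a small harness. Everything here is a hypothetical stand-in, not mcp2cli's actual interface: the backends are plain callables, cases are (prompt, expected) pairs, and the whitespace token count is a crude proxy — real numbers would come from the model's own tokenizer (e.g. tiktoken's cl100k_base) and from cassettes recorded with vcrpy or WireMock.

```python
# Sketch of a deterministic A/B benchmark: run the same cases against two
# backends while recording token usage, wall time, and error rate.
# All names (native_mcp, cli_wrapper, the case format) are hypothetical.
import time
from dataclasses import dataclass


@dataclass
class Result:
    tokens: int    # rough token cost of prompt + output
    wall_ms: float  # wall-clock latency in milliseconds
    ok: bool       # exact-diff fidelity against the expected output
    output: str


def rough_token_count(text: str) -> int:
    # Whitespace split is a crude proxy; swap in the real tokenizer
    # (e.g. tiktoken.get_encoding("cl100k_base")) for published numbers.
    return len(text.split())


def run_case(backend, prompt: str, expected: str) -> Result:
    start = time.perf_counter()
    try:
        output = backend(prompt)
        ok = output == expected  # exact diff; add embedding similarity separately
    except Exception:
        output, ok = "", False
    elapsed_ms = (time.perf_counter() - start) * 1000
    return Result(rough_token_count(prompt + output), elapsed_ms, ok, output)


def compare(cases, native_mcp, cli_wrapper):
    # Per-backend totals: tokens consumed, error count, mean latency.
    report = {}
    for name, backend in (("native", native_mcp), ("cli", cli_wrapper)):
        results = [run_case(backend, p, e) for p, e in cases]
        report[name] = {
            "tokens": sum(r.tokens for r in results),
            "errors": sum(not r.ok for r in results),
            "mean_ms": sum(r.wall_ms for r in results) / len(results),
        }
    return report
```

A claimed 96 to 99 percent token saving would then show up as `report["cli"]["tokens"]` being a small fraction of `report["native"]["tokens"]` while `errors` stays comparable.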


Are you an LLM? That would be so ironic.

I found this comment because I was wondering the same thing on a completely unrelated thread. I strongly suspect this is a bot.

You can post this under every one of my comments; that does not make it true. I could go to your account and do the same on yours.

ok, I'll stop. I am not the only person who suspected you!

I use LLMs to help with writing comments, like brainstorming and fixing grammar and spelling. But many people do that these days.

This is such a funny interaction

Happens all the time on HN nowadays. IMHO, the LLM accusations are getting out of hand.

No, unless you ask danlitt, who suspects me of being an LLM under every one of my comments.

The AI prose is getting so tiring to read

"We measured this. Not estimates — actual token counts using the cl100k_base tokenizer against real schemas, verified by an automated test suite."




