Hacker News

Tokens saved should not be your north star metric. You should be able to show that tool call performance is maintained while consuming fewer tokens. I have no idea whether that is the case here.

As an aside: this is a cool idea, but the prose in the readme and the above post seems fully generated, so who knows whether any of it is actually true.




Token counts alone tell you nothing about correctness, latency, or developer ergonomics. Run a deterministic test suite that exercises representative MCP calls against both native MCP and mcp2cli while recording token usage, wall time, error rate, and output fidelity.

Measure fidelity with exact diffs and embedding similarity, and include streaming behavior, schema-change resilience, and rate-limit fallbacks in the cases you care about. Check the repo for a runnable benchmark, archived fixtures captured with vcrpy or WireMock, and a clear test harness that reproduces the claimed 96 to 99 percent savings.
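The methodology above can be sketched as a small harness. Everything here is a hypothetical stand-in, not mcp2cli's actual interface: the backends are plain callables, cases are (prompt, expected) pairs, and the whitespace token count is a crude proxy — real numbers would come from the model's own tokenizer (e.g. tiktoken's cl100k_base) and from cassettes recorded with vcrpy or WireMock.

```python
# Sketch of a deterministic A/B benchmark: run the same cases against two
# backends while recording token usage, wall time, and error rate.
# All names (native_mcp, cli_wrapper, the case format) are hypothetical.
import time
from dataclasses import dataclass


@dataclass
class Result:
    tokens: int    # rough token cost of prompt + output
    wall_ms: float  # wall-clock latency in milliseconds
    ok: bool       # exact-diff fidelity against the expected output
    output: str


def rough_token_count(text: str) -> int:
    # Whitespace split is a crude proxy; swap in the real tokenizer
    # (e.g. tiktoken.get_encoding("cl100k_base")) for published numbers.
    return len(text.split())


def run_case(backend, prompt: str, expected: str) -> Result:
    start = time.perf_counter()
    try:
        output = backend(prompt)
        ok = output == expected  # exact diff; add embedding similarity separately
    except Exception:
        output, ok = "", False
    elapsed_ms = (time.perf_counter() - start) * 1000
    return Result(rough_token_count(prompt + output), elapsed_ms, ok, output)


def compare(cases, native_mcp, cli_wrapper):
    # Per-backend totals: tokens consumed, error count, mean latency.
    report = {}
    for name, backend in (("native", native_mcp), ("cli", cli_wrapper)):
        results = [run_case(backend, p, e) for p, e in cases]
        report[name] = {
            "tokens": sum(r.tokens for r in results),
            "errors": sum(not r.ok for r in results),
            "mean_ms": sum(r.wall_ms for r in results) / len(results),
        }
    return report
```

A claimed 96 to 99 percent token saving would then show up as `report["cli"]["tokens"]` being a small fraction of `report["native"]["tokens"]` while `errors` stays comparable.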


Are you an LLM? That would be so ironic.

I found this comment because I was wondering the same thing on a completely unrelated thread. I strongly suspect this is a bot.

You can post this under every one of my comments; that does not make it true. I could go to your account and do the same on yours.

ok, I'll stop. I am not the only person who suspected you!

I use LLMs to help with writing comments, like brainstorming and fixing grammar and spelling. But many people do that these days.

This is such a funny interaction

Happens all the time on HN nowadays. IMHO, the LLM accusations are getting out of hand.

No, unless you ask danlitt, who suspects me of being an LLM under every one of my comments.

The AI prose is getting so tiring to read

"We measured this. Not estimates — actual token counts using the cl100k_base tokenizer against real schemas, verified by an automated test suite."




