I'm presently in the process of building (read: directing claude/codex to build)...

discreteevent · 2026-02-25T16:00:21 1772035221

> It's nothing short of invigorating to have this degree of control over something so powerful

Is this really that different to programming? (Maybe you haven't programmed before?)

h14h · 2026-02-25T20:31:12 1772051472

Fair point.

> It's nothing short of invigorating to have this degree of control over something so powerful

I'm a SWE w/ >10 years, and you're right, this part has always been invigorating.

I suppose what's "new" here is the drastically reduced amount of cognitive energy I need to build complex projects in my spare time. As someone who was originally drawn to software because of how much it lowered the barrier to entry of birthing an idea into existence (when compared to hardware), I am genuinely thrilled to see said barrier lowered so much further.

Sharing my own anecdotal experience:

My current day job is leading development of a React Native mobile app in TypeScript with a backend PaaS, and the bulk of my working memory is filled up by information in that domain. Given this is currently what pays the bills, it's hard to justify devoting all that much of my brain to deep-diving into other technologies or stacks merely for fun or to satisfy my curiosity.

But today, despite those limitations, I find myself having built a bespoke AI agent written from scratch in Go, using a janky beta AI Inference API with weird bugs and sub-par documentation, on a VPS sandbox with a custom Tmux & Neovim config I can "mosh" into from anywhere using finely-tuned Tailscale access rules.

I have enough experience and high-level knowledge that it's pretty easy for me to develop a clear idea of what exactly I want to build from a tooling/architecture standpoint, but prior to Claude, Codex, etc., the "how" of building it tended to be a big stumbling block. I'd excitedly start building, only to run into the random barriers of "my laptop has an ancient version of Go from the last project I abandoned" or "neovim is having trouble starting the lsp/linter/formatter" and eventually go "ugh, not worth it" and give up.

Frankly, as my career progressed and the increasingly complex problems at work left me with vanishingly little brain-space for passion projects, I was beginning to feel this crushing sense of apathy & borderline despair. I felt I'd never be able to make good on my younger self's desire to bring these exciting ideas of mine into existence. I even got to the point where I convinced myself it was "my fault" because I lacked the mettle to stomach the challenges of day-to-day software development.

Now I can just decide "Hmm.. I want a lightweight agent in a portable binary. Makes sense to use Go." or "this beta API offers super cheap inference, so it's worth dealing with some jank" and then let an LLM work out all the details and do all the troubleshooting for me. Feels like a complete 180 from where I was even just a year or two ago.

At the risk of sounding hyperbolic, I don't think it's overstating things to say that the advent of "agentic engineering" has saved my career.

afro88 · 2026-02-25T11:16:37 1772018197

What models and inference provider?

h14h · 2026-02-25T15:51:12 1772034672

I'm using kimi-k2-instruct as the primary model and building out tool calls that use gpt-oss-120b to allow it to opt-in to reasoning capabilities.
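A rough sketch of what this "opt-in reasoning" pattern can look like: the primary model is offered a tool, and when it calls that tool the agent forwards the question to a second model. This assumes an OpenAI-style function-calling schema. The tool name `think_hard` and the `ask_reasoner` callable are hypothetical placeholders, not taken from the actual agent (which is written in Go) or from Vultr's API; Python is used here only for brevity.

```python
import json
from typing import Callable

# Hypothetical tool definition in the OpenAI-style function-calling format.
# The primary model sees this and can choose to call it when it wants help.
REASONING_TOOL = {
    "type": "function",
    "function": {
        "name": "think_hard",
        "description": "Delegate a hard question to a slower reasoning model.",
        "parameters": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    },
}


def handle_tool_call(tool_call: dict, ask_reasoner: Callable[[str], str]) -> dict:
    """Turn one tool call from the primary model into a tool-result message.

    `ask_reasoner` is an assumed callable that sends the question to the
    reasoning model (e.g. gpt-oss-120b) and returns its answer as text.
    """
    fn = tool_call["function"]
    if fn["name"] != REASONING_TOOL["function"]["name"]:
        raise ValueError(f"unknown tool: {fn['name']}")
    # In this format the arguments arrive as a JSON-encoded string.
    args = json.loads(fn["arguments"])
    answer = ask_reasoner(args["question"])
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": answer}
```

The returned message would be appended to the conversation before calling the primary model again, so the primary model only pays for reasoning when it asks for it.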

Using Vultr for the VPS hosting, as well as their inference product which AFAIK is by far the cheapest option for hosting models of this class ($10/mo for 50M tokens, and $0.20/M tokens after that). They also offer Vector Storage as part of their inference subscription which makes it very convenient to get inference + durable memory & RAG w/ a single API key.

Their inference product is currently in beta, so not sure whether the price will stay this low for the long haul.

ac29 · 2026-02-25T22:28:26 1772058506

You can definitely get gpt-oss-120b for much less than $0.20/M on openrouter (cheapest is currently 3.9c/M in 14c/M out). Kimi K2 is an order of magnitude larger and more expensive though.
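For anyone wanting to put the prices quoted in this thread side by side, here is a small back-of-the-envelope calculator. It assumes the Vultr plan counts input and output tokens together against the 50M allowance (the thread doesn't say), and it ignores that the Vultr subscription also bundles vector storage and covers Kimi K2, so it is not an apples-to-apples comparison. Function and constant names are made up for illustration.

```python
# Prices as quoted in this thread; they may have changed since.
VULTR_BASE_USD = 10.00              # flat monthly fee
VULTR_INCLUDED_TOKENS = 50_000_000  # tokens covered by the flat fee
VULTR_OVERAGE_USD_PER_M = 0.20      # per million tokens beyond the allowance

OPENROUTER_IN_USD_PER_M = 0.039     # gpt-oss-120b, cheapest listed input price
OPENROUTER_OUT_USD_PER_M = 0.14     # gpt-oss-120b, cheapest listed output price


def vultr_monthly_cost(total_tokens: int) -> float:
    """Flat fee plus overage, assuming input and output tokens are pooled."""
    overage = max(0, total_tokens - VULTR_INCLUDED_TOKENS)
    return VULTR_BASE_USD + overage / 1_000_000 * VULTR_OVERAGE_USD_PER_M


def openrouter_monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Pure pay-per-token cost with separate input and output rates."""
    return (
        input_tokens / 1_000_000 * OPENROUTER_IN_USD_PER_M
        + output_tokens / 1_000_000 * OPENROUTER_OUT_USD_PER_M
    )
```

Calling `vultr_monthly_cost(60_000_000)` and `openrouter_monthly_cost(40_000_000, 20_000_000)` compares the two for the same 60M tokens of traffic.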

What other models do they offer? The web page is very light on details

h14h · 2026-02-26T12:59:54 1772110794

Oh dang I had no idea that gpt-oss-120b was that cheap these days.

And yeah, given Vultr inference is in beta, their docs ain't great. In addition to kimi-k2-instruct and gpt-oss-120b, they currently offer:

deepseek-r1-distill-llama-70b, deepseek-r1-distill-qwen-32b, qwen2.5-coder-32b-instruct

Best way to get accurate up-to-date info on supported models is via their api: https://api.vultrinference.com/#tag/Models/operation/list-mo...

K2 is the only one of the 5 that supports tool calling. In my testing, it seems like all five support RAG, but K2 loses knowledge of its registered tools when you access it through the RAG endpoint, forcing you to pick one capability or the other (I have a ticket open for this).

Also, the R1-distill models are annoying to use because reasoning tokens are included in the output wrapped in <think> tags instead of being parsed into the "reasoning_content" field on responses. Also also, gpt-oss-120b has a "reasoning" field instead of "reasoning_content" like the R1 models.
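One way to paper over these differences is a small normalizer that accepts any of the three shapes mentioned above (inline <think> tags, a "reasoning_content" field, or a "reasoning" field) and returns the reasoning and the final answer separately. This is only a sketch: it assumes the response message is an OpenAI-style dict with a "content" key, and `split_reasoning` is a made-up helper name, not part of any provider's SDK.

```python
import re

# Matches inline reasoning blocks, including ones that span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(message: dict) -> tuple[str, str]:
    """Return (reasoning, answer) from an assistant message dict."""
    content = message.get("content") or ""
    # Some models expose reasoning in a dedicated field under either name.
    reasoning = message.get("reasoning_content") or message.get("reasoning") or ""

    inline = THINK_RE.findall(content)
    if inline:
        # Strip the tagged blocks out of the visible answer...
        answer = THINK_RE.sub("", content).strip()
        # ...and merge them with any reasoning found in a dedicated field.
        parts = [reasoning] + [block.strip() for block in inline]
        reasoning = "\n".join(p for p in parts if p)
    else:
        answer = content.strip()
    return reasoning.strip(), answer
```

For example, passing `{"content": "<think>step 1</think>\nThe answer is 4."}` yields the reasoning text and the cleaned answer as two separate strings, regardless of which model produced the message.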