Great to see more sandboxing options. The next gap we'll see: sandboxes isolate ...

GrinningFool · 2026-02-17T16:03:38 1771344218

I think it's funny that we're moving in the direction of providing extremely fine-grained permissions models to serve AI and prevent it from accessing things it should not - but that's a level of control we will never have (or even expect to have) over third parties that use our sensitive data.

TheTaytay · 2026-02-16T23:55:39 1771286139

Yes please! I feel like we need filters for everything: file reading, network ingress/egress, etc. Starting with simpler filters and then moving up to the semantic ones…

ryanrasti · 2026-02-17T04:46:14 1771303574

Exactly! The key is making the filters composable and declarative. Which use cases or integrations would you be most interested in?

mlinksva · 2026-02-17T04:31:04 1771302664

ExoAgent (from your bio/past comments) looks really interesting. Godspeed!

subscribed · 2026-02-17T00:25:17 1771287917

So basically WAF, but smarter :)

beepbooptheory · 2026-02-17T01:03:34 1771290214

Maybe this is just me, but you'd think at some point it's not really a "sandbox" anymore.

dotancohen · 2026-02-17T06:12:47 1771308767

When the whole beach is in the sandbox, the sandbox is no longer the isolated environment it ostensibly should be.

ATechGuy · 2026-02-16T23:46:54 1771285614

And how are you going to define what ocaps/flows are needed when agent behavior is not defined?

ryanrasti · 2026-02-17T04:52:24 1771303944

This is a really good question because it hits on the fundamental issue: LLMs are useful because they can't be statically modeled.

The answer is to constrain effects, not intent. You can define capabilities where agent behavior is constrained within reasonable limits (e.g., can't post private email to #general on Slack without consent).

The next layer is UX/feedback: the system can compile additional policy as the user requests it (e.g., only this specific sender's emails can be sent to #general).

botusaurus · 2026-02-17T05:56:51 1771307811

but how do you check that an email is being sent to #general? agents are very creative at escaping/encoding, they could even paraphrase the email in words

decades ago secure OSes tracked the provenance of every byte (clean/dirty) to detect leaks, but it's hard if you want your agent to be useful

ryanrasti · 2026-02-17T06:36:39 1771310199

> decades ago secure OSes tracked the provenance of every byte (clean/dirty) to detect leaks, but it's hard if you want your agent to be useful

Yeah, you're hitting on the core tradeoff between correctness and usefulness.

The key differences here:

1. We're not tracking at byte level but at the tool-call/capability level (e.g., read emails) and enforcing at egress (e.g., send emails).

2. The agent can slowly learn approved patterns from user behavior and common exceptions to strict policy. You can be strict at the start and give more autonomy for known-safe flows over time.
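A minimal sketch of what capability-level tracking with an egress check could look like. All names here (`Session`, `read_emails`, `post_to_slack`, the `"private_email"` label) are hypothetical and only illustrate the idea; this is not the API of any particular project.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Tracks which sensitive sources the agent has touched so far."""
    taints: set[str] = field(default_factory=set)
    # (taint, sink) pairs the user has explicitly approved.
    approved_flows: set[tuple[str, str]] = field(default_factory=set)


def read_emails(session: Session) -> list[str]:
    # Hypothetical tool: reading private mail taints the whole session.
    session.taints.add("private_email")
    return ["Q3 numbers attached (confidential)"]


def post_to_slack(session: Session, channel: str, text: str) -> str:
    # Egress check: every taint carried by the session must be approved
    # for this sink. The message text itself is never inspected.
    sink = f"slack:{channel}"
    blocked = {t for t in session.taints if (t, sink) not in session.approved_flows}
    if blocked:
        raise PermissionError(f"{sink} blocked by taints: {sorted(blocked)}")
    return f"posted to {channel}: {text}"
```

Because the check looks at what the session has read rather than at the outgoing text, paraphrasing or encoding the email doesn't change the outcome. The cost is coarseness: after `read_emails`, every post to an unapproved channel is blocked, even a harmless one, until the user adds that flow to `approved_flows`.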

botusaurus · 2026-02-17T11:29:35 1771327775

what about the interaction between these 2 flows:

- summarize email to text file

- send report to email

the issue is tracking that the first step didn't contaminate the second step. i don't see how you can solve this in a way that isn't probabilistic, i.e. "works 99% of the time"

ryanrasti · 2026-02-17T20:11:00 1771359060

I think what you're saying is that the agent can write to an intermediate file, then read from it, bypassing the taint-tracking system.

The fix is to make all IO tracked by the system -- if you read a file it has taints as part of the read, either from your previous write or configured somehow.
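Roughly, tracked file IO could look like the sketch below. `TrackedFS` is a hypothetical in-memory stand-in, and the caller's taints are assumed to live in a plain set of labels.

```python
class TrackedFS:
    """In-memory stand-in for a filesystem that records taints per path."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}
        self._taints: dict[str, frozenset[str]] = {}

    def write(self, path: str, text: str, ctx_taints: set[str]) -> None:
        # The file inherits every taint the writer carried at write time.
        self._data[path] = text
        self._taints[path] = frozenset(ctx_taints)

    def read(self, path: str, ctx_taints: set[str]) -> str:
        # Reading merges the file's taints back into the reader's context.
        ctx_taints.update(self._taints.get(path, frozenset()))
        return self._data[path]
```

With this, "summarize email to text file" followed by "send report to email" carries the email taint through the file: the second step picks it up on read, so an egress check sees it even if the two steps run in separate contexts.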

gostsamo · 2026-02-17T06:13:00 1771308780

you can restrict the email send tool so that the to/cc/bcc addresses are hardcoded in a list, and an agent-independent channel should be the only one that can add items to it. basically the same for other tools. You cannot rewire the llm, but you can enumerate and restrict the boundaries it works through.

exfiltrating info through get requests won't be 100% stopped, but will be hampered.
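A sketch of the allowlist idea, assuming a hypothetical `make_send_tool` factory: the operator keeps the `allowed` set, and the agent is only ever handed the returned function. Actual delivery is stubbed out.

```python
from typing import Callable, Iterable


def make_send_tool(allowed: set[str]) -> Callable[..., str]:
    """Build the send tool; `allowed` holds lower-case addresses."""

    def send_email(
        to: Iterable[str],
        cc: Iterable[str] = (),
        bcc: Iterable[str] = (),
        body: str = "",
    ) -> str:
        recipients = [*to, *cc, *bcc]
        # Every address in every field must already be on the list.
        rejected = [r for r in recipients if r.lower() not in allowed]
        if rejected:
            raise PermissionError(f"recipients not on allowlist: {rejected}")
        # Real delivery would happen here; the sketch only reports a count.
        return f"sent to {len(recipients)} recipient(s)"

    return send_email
```

The agent has no tool that edits `allowed`; new addresses are added by the operator out of band. This limits where data can go, not what the body contains.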

botusaurus · 2026-02-17T06:21:20 1771309280

parent was talking about a different problem. to use your framing, how do you ensure that the email sent to the proper to/cc/bcc, as you said, contains no confidential information from another email that shouldn't be sent/forwarded to those to/cc/bcc?

gostsamo · 2026-02-17T06:58:31 1771311511

The restricted list means that it is much harder for someone to social engineer their way in on the receiving end of an exfiltration attack. I'm still rather skeptical of agents, but with a pattern where the agent is allowed mostly read-only access, its output is mainly user directed, and the rest of the output is user approved, you cut down the possible approaches for an attack to work.

If you want more technical solutions, put a dumber classifier on the output channel and freeze the operation if it looks suspicious, instead of failing it and provoking the agent to try something new.

None of this is a silver bullet for a generic solution and that's why I don't have such an agent, but if one is ready to take on the tradeoffs, it is a viable solution.
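One way to read the "dumber classifier plus freeze" suggestion, as a sketch only: the regex patterns and the `OutputGate` name are made up for illustration, and returning the same status for held and delivered messages is an assumption about what "freeze instead of fail" means.

```python
import re
from dataclasses import dataclass, field

# Hypothetical "dumb" patterns; a real deployment would tune these.
SUSPICIOUS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # SSN-shaped number
    re.compile(r"(?i)\b(password|api[_ ]?key)\b"),   # credential keywords
    re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),         # long base64-like run
]


@dataclass
class OutputGate:
    delivered: list[str] = field(default_factory=list)
    held: list[str] = field(default_factory=list)  # waits for human review

    def submit(self, message: str) -> str:
        if any(p.search(message) for p in SUSPICIOUS):
            # Freeze: park the message for review, do not deliver it.
            self.held.append(message)
        else:
            self.delivered.append(message)
        # The agent sees the same status either way, so a held message
        # gives it no error to route around.
        return "accepted"
```

A pattern list like this will miss paraphrased leaks and will hold some harmless messages, which fits the tradeoff described above.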

ATechGuy · 2026-02-17T06:17:46 1771309066

TBH, this looks like an LLM-assisted response.

zmmmmm · 2026-02-17T06:51:20 1771311080

and then the next:

> you're hitting on the core tradeoff between correctness and usefulness

The question is: is it a completely unsupervised bot, or is a human in the loop? I kind of hope a human is not in the loop, with it being such a caricature of LLM writing.

amne · 2026-02-17T09:55:27 1771322127

you have to reference Royal food tasting somehow. just saying