
This seems to be the source report: https://openai.com/index/disrupting-malicious-ai-uses/ (since it would of course kill CNN, like almost all media outlets, to link to a non-affiliated primary source...)

Does this level of detail seem strange to anybody else? Shining such a strong light on OpenAI's moderation/manual review efforts seems like it would draw unwanted attention to the fact that ChatGPT conversations are anything but private, and seems somewhat at odds with their recent outrage about the subpoena for user chats in the NYT case.

Manual reviews of sensitive data are OK as long as their own employees are the reviewers, I suppose?




From Anthropic's recent blog post: https://www.anthropic.com/news/detecting-and-preventing-dist...

> By examining request metadata, we were able to trace these accounts to specific researchers at the lab.

> The volume, structure, and focus of the prompts were distinct from normal usage patterns

Clearly some employees of Anthropic personally looked at individual inputs and outputs of their API.


I thought that was pretty open? Even their more privacy-oriented Zero Data Retention agreement (which isn't so easy to get on your business account) includes an exception "where needed to comply with law or combat misuse".

This feels very planted. Wouldn't be surprised if this is some attempt to look patriotic, with the DoW turning up the heat against Anthropic.

Given that OpenAI just nabbed the contract, I'd say that's spot on.

That creepy feeling of "being watched" has mostly kept me from taking advantage of any SOTA models; I only dabble in a few local ones.

The level of detail does not seem surprising. They're both charged with maintaining a facade of privacy while eliminating any and all misuse. Certainly they heavily analyze basically everything given to them.

And generally, as a society, we've been OK with basically zero privacy as long as the data we send stays inside the company we sent it to. Google reads all your emails? Sure thing, read away, just don't send them to the popo. Apple knows when you're ovulating? No problem, just don't tell Amazon. Etc.


Same here. My assumption is that anything sent to a hosted model is public information: it will be trained on, and it will be collated with your identity. And even if public models have guardrails that prevent that information from being regurgitated as slop, every CEO/owner/investor/government/etc. will have access to the uncensored models that include everything.

If you've ever run a SaaS business, you know this, and you know you can have "God Mode" access to everything, even if you swear up and down that you don't/won't.

The owners of these models aren't your friends, they see you as objects. They want to take as much value as they possibly can from you and will starve you if/when the option appears. That includes selling and sharing whatever data they have on you to the highest bidders, and some of those bidders want scapegoats to parade around as domestic terrorists.

The fact that companies are willing to send their IP and business processes to entities that can easily launder it and outcompete them is mind-boggling, as well.


If you own a store and people walk in, you observe this and take a mental note. You know who visits the store. If you sell tokens, you know who buys and what people buy. It's just that now the metadata (what I buy, when I buy, what I look like) and the content itself (my data) are one and the same. I send tokens, get tokens back. If there were a way to round-robin across vendors to control who gets what, I'd do it (rough sketch below).

When contracting out manufacturing, it's common sense to spread across manufacturers, so no single manufacturer has everything. They may have half a shell. Or a peripheral module without the core. Or a core without anything around it.
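Nothing stops you from doing the same client-side today. A minimal sketch in Python, where send_prompt is a hypothetical stub standing in for the real vendor SDKs and the vendor names are placeholders:

    import itertools

    # Hypothetical vendor list; in practice, real API clients go here.
    VENDORS = ["vendor_a", "vendor_b", "vendor_c", "local"]
    _rotation = itertools.cycle(VENDORS)

    def send_prompt(vendor: str, prompt: str) -> str:
        # Stub: replace with the actual API call for each vendor.
        return f"[{vendor}] response to: {prompt[:40]}"

    def dispatch(prompt: str) -> str:
        # Round-robin: each request goes to the next vendor in turn,
        # so no single vendor sees the full stream of what you're building.
        return send_prompt(next(_rotation), prompt)

    for p in ["design the shell", "now the peripheral module", "now the core"]:
        print(dispatch(p))

The obvious caveat: payment and network metadata would still tie the shards back to you unless you split those across vendors as well.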


The Signal founder is doing private LLMs at confer.to

Have you tried it? I’ve been meaning to.

Yes. Somewhat expensive given it's web-only (no API), but it works very well and new features are added continuously.

I use my local models to generate input for the SOTA models, so there is enough noise that the companies don't know what is real or not :)
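For what it's worth, the mechanics are trivial. A rough sketch, where local_generate is a stand-in for whatever local model you run and the 4:1 decoy ratio is arbitrary:

    import random

    def local_generate(n: int) -> list[str]:
        # Stub: in practice, ask a local model for plausible decoy prompts.
        return [f"decoy question #{i} about some unrelated topic" for i in range(n)]

    def send_with_noise(real_prompt: str, noise_ratio: int = 4) -> None:
        # Mix the real prompt with locally generated decoys and submit
        # them in random order, so the provider can't tell which request
        # reflects genuine intent.
        batch = [real_prompt] + local_generate(noise_ratio)
        random.shuffle(batch)
        for prompt in batch:
            print("-> sending:", prompt)  # stub: hosted-model API call goes here

    send_with_noise("the one question I actually care about")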

Get a list of your inputs mixed with the generated ones and ask some model to tell you which ones are yours.

Other than that, the approach in general is weak: most people likely generate lots of noise themselves. It's just about that one time you asked about X.


Yes, it is either a lie or an admission that OpenAI is a global surveillance mechanism.

Alas! My vision of One Fed Per Child hath come to pass!

In the year 2026, is there really anyone out there who still thinks that anything they do online is private in any way?



They could literally just have someone working at the embassy roleplay on their lunch break in a cafe to generate this evidence.



