[flagged] New ChatGPT Models Seem to Leave Watermarks on Text (rumidocs.com)
19 points by hosein88 9 months ago | 28 comments


Students cheating with LLMs shows a weakness of the system more than anything else. Rich folks have had other people do their assignments for ages; now that has been democratized. Educational institutions are mostly uninterested in doing what they should: teaching critical thinking and checking whether students actually understood. Grading take-home assignments is only a small part of that (in theory). It's not LLMs, it's terrible schools.


I studied CS and Philosophy in college. The philosophy department at the time was struggling to get people to declare so they started a campaign with the slogan “Thinking of a major? Major in thinking”. Which I’ve always thought was both clever and accurate. Such departments become more valuable as the harder skills are more automatable. I doubt that will translate into more philosophy majors but one can dream.


Teachers can spot LLM-generated homework from a mile away. (Well, unless the homework assignment itself comes from a stock or autogenerated source.)


They can when they can. Sometimes they just won't like a kid, or they'll be tired, or simply wrong, and falsely accuse them; there have been plenty of such cases already. For a few years I had a kid in my class who would brag to other kids that his mom wrote all of his longer, non-math homework. You only set yourself back by cheating: when the time comes for proper university placement exams, you won't have that help available, and you won't take anyone else's place. Most places nowadays won't even hold kids back a grade, so it doesn't even matter.


Looks to me like these "watermarks" appear in monetary figures, acronyms, etc. Maybe to stop them from breaking across lines?


U+202F in the screenshot, between "FY" and "2024", is the Narrow No-Break Space. Similarly, U+00A0 is the No-Break Space (a.k.a. &nbsp;).

It's not watermarks, it's just scraped typography.


The examples they give all look like valid uses of different non-breaking spaces, with width hints suited to their use/location. That might be a little overzealous if written by a human, but perhaps not for a machine.

https://en.wikipedia.org/wiki/Non-breaking_space

Some display apps may "display them identically", but others, and typesetting/printing apps, might not.
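
For anyone who wants to check their own pasted text, here's a small Python sketch (mine, not from the article) that flags these codepoints; the list of suspect characters is just a guess at the usual offenders:

    import unicodedata

    # Space-like codepoints that most renderers display just like U+0020.
    SUSPECT_SPACES = {"\u00a0", "\u2007", "\u2009", "\u202f"}

    def flag_odd_spaces(text):
        """Yield (index, codepoint, Unicode name) for each suspect space character."""
        for i, ch in enumerate(text):
            if ch in SUSPECT_SPACES:
                yield i, f"U+{ord(ch):04X}", unicodedata.name(ch)

    # Sample string modeled on the article's screenshot (not copied from it).
    sample = "Revenue grew to $2.5\u00a0billion in FY\u202f2024."
    for i, cp, name in flag_odd_spaces(sample):
        print(i, cp, name)
    # 20 U+00A0 NO-BREAK SPACE
    # 34 U+202F NARROW NO-BREAK SPACE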


Yeah, it's probably not an intentional watermark, just something the model has been trained to do. Maybe some professionally written news articles already use them for the same purpose?

Still hope HN adds a filter to block any comment with those characters in it :)


It is very easy to filter those out of GPT's output, though, using basic UNIX utilities. In fact, many such methods don't survive reformatting or copy-pasting, so no filtering is needed at all.

It is a very basic watermarking technique (text steganography), if it is indeed meant to be one.

A more advanced one would be a linguistic (grammar-based) one, but I am not going to give any more ideas. :D
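
On the first point, a minimal Python sketch of that kind of cleanup (the same effect as a tr/sed one-liner; the set of codepoints to normalize is my own choice, not anything OpenAI documents):

    # Map the usual exotic-space suspects to a plain ASCII space (or drop them).
    EXOTIC_SPACES = str.maketrans({
        "\u00a0": " ",   # no-break space
        "\u202f": " ",   # narrow no-break space
        "\u2009": " ",   # thin space
        "\u2007": " ",   # figure space
        "\u200b": None,  # zero-width space: delete outright
    })

    def normalize_spaces(text: str) -> str:
        """Replace exotic whitespace so the text survives plain-text tools unchanged."""
        return text.translate(EXOTIC_SPACES)

    print(normalize_spaces("FY\u202f2024 revenue was $2.5\u00a0billion"))
    # FY 2024 revenue was $2.5 billion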


It's easy to remove those characters, but that still requires being aware of them, and an intent to deceive. So many people just copy LLM output here because they (wrongly) believe it adds something of value to a discussion.


I don't think pasting LLM output typically adds anything to the conversation either. It might, but usually it does not.


This is the most likely explanation.

I mean, sure, these characters could be used to help estimate the likelihood that a text was machine-generated (because human writers are probably less likely to add proper non-breaking spaces), but I doubt they are watermarks.


As others mention, these do not look like watermarks; it looks like the model is just emitting exotic whitespace in certain cases.


Exotic whitespace that destroys a programmer's free time.


Probably won't show up in source code at all, though nothing of value would be lost if they did!


Those characters do matter, though. They are the difference between

    $2.5
    billion
and

    $2.5 billion


You mean typographically correct whitespace


But should probably be removed when copying, especially code


Why? If it's typographically correct, it is not going to appear in places where it could cause problems.


This looks like an attempt at content marketing, rather than a valid blog post.

A shocking title brought this to the HN front page, but the claim does not hold up. The characters all look legitimate in the positions where they are used.


You can use this tool to remove watermarks: https://gptwatermark.com


Ideally, such watermarks are much more subtle. For example, they can be embedded in the statistical distribution of word sequences, which is difficult to remove or even to identify.
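
To sketch the general idea (a toy take on published "green list" schemes like Kirchenbauer et al. and Google's SynthID-Text, not anything OpenAI is known to use): the sampler nudges the model toward a keyed, pseudorandom subset of tokens at each step, and a detector re-derives that subset and checks whether the text lands in it far more often than chance. The key and the word-level tokenization below are hypothetical:

    import hashlib, math

    GAMMA = 0.5              # expected green-token fraction in unwatermarked text
    KEY = b"toy-secret-key"  # hypothetical key shared by generator and detector

    def is_green(prev_token: str, token: str) -> bool:
        """Pseudorandomly assign `token` to the green list, seeded by the previous token."""
        digest = hashlib.sha256(KEY + prev_token.encode() + b"|" + token.encode()).digest()
        return digest[0] < 256 * GAMMA

    def green_z_score(tokens: list[str]) -> float:
        """How far the green-token count deviates from what chance (GAMMA) would predict."""
        hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
        n = len(tokens) - 1
        return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

    # A consistently large positive score suggests sampling was biased toward green tokens.
    print(green_z_score("the company reported revenue of 2.5 billion dollars".split()))

A real scheme works on the model's actual token IDs at sampling time and tunes the bias so output quality isn't hurt; the relevant point here is that the signal lives in the statistics of word choice, so stripping whitespace doesn't remove it.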


Nah, this is bs.

Google does do it with Gemini, though:

https://lifearchitect.ai/watermarking/


Oof. Time to share this with the rest of my classmates lmao


I don't have a ChatGPT account, and I can't reproduce this without being logged in. They should say whether it requires an account or not, because as it stands it's not very conclusive to me.


Why were you expecting to use ChatGPT without an account?


Because you can use ChatGPT without an account.


Because it's possible, and because none of the proxies that give you free access to those models produce these so-called watermarks, which makes me suspicious of this story.

And why the fuck was I downvoted to hell for asking a simple question and giving my opinion about the whole thing? Am I on reddit or what?



