Hacker News | Terr_'s comments

IMO a better approach would be individualized addresses.

So someone visiting your blog who wants to e-mail you can burn some CPU cycles to "earn" an address that hasn't been given out to anybody else, e.g. user+TOKEN@example.com, where it is algorithmically-unlikely for them to be able to guess a different TOKEN that will work. Then if abuse occurs, you can just retire that one address.
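To make that concrete, here is a rough sketch of how such a scheme might work, not a description of any existing mail service. The visitor solves a small hash puzzle, the server then hands out a token carrying a truncated HMAC tag, and abuse is handled by adding the token to a revocation set. `SECRET_KEY`, `DIFFICULTY_BITS`, the token layout, and all function names are assumptions made up for illustration.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"demo-only-secret"  # hypothetical server-side key
DIFFICULTY_BITS = 12              # hypothetical work factor, kept low for a demo
revoked = set()                   # tokens retired after abuse


def _meets_difficulty(challenge: str, nonce: int) -> bool:
    """True if sha256(challenge:nonce) starts with DIFFICULTY_BITS zero bits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - DIFFICULTY_BITS) == 0


def solve_challenge(challenge: str) -> int:
    """Visitor side: burn CPU cycles until a nonce satisfies the puzzle."""
    nonce = 0
    while not _meets_difficulty(challenge, nonce):
        nonce += 1
    return nonce


def issue_address(challenge: str, nonce: int,
                  user: str = "user", domain: str = "example.com") -> str:
    """Server side: verify the work, then hand out a signed one-off address."""
    if not _meets_difficulty(challenge, nonce):
        raise ValueError("proof of work not satisfied")
    serial = secrets.token_hex(4)  # 8 hex chars, unique per address
    tag = hmac.new(SECRET_KEY, serial.encode(), hashlib.sha256).hexdigest()[:12]
    return f"{user}+{serial}{tag}@{domain}"


def accepts(address: str) -> bool:
    """Mail side: accept only addresses with a valid, unrevoked token."""
    local, _, _domain = address.partition("@")
    _user, sep, token = local.partition("+")
    if not sep or len(token) != 20 or token in revoked:
        return False
    serial, tag = token[:8], token[8:]
    expected = hmac.new(SECRET_KEY, serial.encode(), hashlib.sha256).hexdigest()[:12]
    return hmac.compare_digest(tag.encode(), expected.encode())
```

Retiring an address is then just `revoked.add(token)`. A real deployment would also need to stop a solved challenge from being reused, which this sketch leaves out.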

Naturally, this would be best with an e-mail client that is aware of the scheme, and with a mail-service that has some API for generating new addresses, such as if you want to cold e-mail somebody and use a new from/return address.

Some years ago I had the fanciful idea of doing it with a phone-app, where it manages creating new addresses as-needed, disabling them, and keeping notes about who you gave them to.


I suppose the unfamiliarity of the term in English shouldn't be too surprising if it's a translation. Looks like the Wikipedia entry was created in 2023, and the oldest citation in its list is from 2010.

On a marginally lighter note, I wonder if there are any connections to the "Managed Democracy" of the Helldivers games, where it is used as a hyper-normalized euphemism by a brutal regime.


I'm a native Russian speaker and this is the first time I've ever heard this term.

It's hard to explain without spoilers, but Isaac Asimov's The Feeling of Power (1958) is relevant to this concept of warfare.

It's huge when you consider all the data humans have stored and transferred orally over the millennia.

Music, meter, and rhyme are all (among other things) algorithms for indexing and error-correction, tools very suitable to the squishy hardware.


> He simply needs to

I think a lot of people struggle to imagine the kinds of dirty-deeds ("ratf***ing") that are both possible and effective, especially when the perpetrators don't (feel) constrained by an implicit baseline of plausible consistency or morality. Being unable to brainstorm them up is, perhaps, a kind of backhanded compliment.

Imagine trying to warn someone in 2010 that in a few years an outgoing President, stung at an election loss, could foment a violent mob that would break into the Capitol to hunt and chase legislators that were formalizing that loss, issue blanket pardons for everyone involved, and his party would still protect him from being impeached over it.

For that matter, some people are still surprised to learn about the "Brooks Brothers Riot" [0] of 2000, where a crowd of Republican campaign staffers threatened workers into stopping a recount of certain ballots.

[0] https://www.theguardian.com/us-news/2020/sep/24/us-elections...


> You can't tell me with a straight face that all of the thousands of developers who develop these products/services care deeply about the quality of the product.

What about caring and being depressed because quality comes from systems rather than (just) individuals?


And boats, and submerged drones, and mines...

There's probably a strong self-selection factor going on, in terms of the kind of person that typically seeks out that kind of experience.

Recycling an old post:

> We had the first 4+ years to learn that "malice or incompetence" is not the right question. There's been more than enough pathological input to show it becomes a denial-of-service attack on observers.

> The correct answer is both, until and unless the perpetrators wish to come forward and defend themselves as just malicious or just incompetent.

One might also view it as a kind of politically-flavored nerd-sniping. [0] Sometimes the only winning move is not to play.

[0] https://xkcd.com/356/


> OCR for construction documents does not work

I'm reminded of the Xerox JBIG2 bug back in ~2013, where certain scan settings could silently replace numbers inside documents, and bad construction-plans were one of the cases that led to it being discovered. [0]

It wasn't overt OCR per se; end users weren't intending to convert pixels to characters or vice-versa.

[0] https://www.youtube.com/watch?v=c0O6UXrOZJo&t=6m03s


If I recall it was an artifact of the compression algo.

Full context and details: https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres...


JBIG2 does glyph binning, as you say not exactly OCR, but similar. So chunks of the image that look sufficiently similar get replaced with a reference to a single instance.

> not exactly OCR, but similar. So chunks of the image that look sufficiently similar get replaced with a reference to a single instance.

How can we describe OCR that wouldn't match this definition exactly?


Glyph binning looks for any chunks in the image that are similar to each other, regardless of what they are: letters, eyeballs, pennies, triangles, etc. OCR looks specifically to try and identify characters (i.e. it starts with a knowledge of an alphabet, then looks for things in the image that look like those).

If the image is actually text, both of them can end up finding things. Binning will identify "these things look almost the same", while OCR will identify "these look like the letter M"
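A toy sketch of that difference, assuming tiny 3x3 bitmaps and a plain pixel-difference count as the similarity measure. Neither function is how a real codec or OCR engine works; `bin_patches`, `ocr`, and the sample shapes are made-up names for illustration. Binning only says which patches resemble each other, while OCR names each patch from a known alphabet.

```python
from typing import Dict, List, Tuple

Patch = Tuple[str, ...]  # a tiny bitmap, one string per row, '#' = ink


def distance(a: Patch, b: Patch) -> int:
    """Number of differing pixels between two same-sized patches."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def bin_patches(patches: List[Patch], max_diff: int) -> List[int]:
    """Binning: label each patch by the first earlier patch it resembles."""
    representatives: List[Patch] = []
    labels: List[int] = []
    for p in patches:
        for i, rep in enumerate(representatives):
            if distance(p, rep) <= max_diff:
                labels.append(i)
                break
        else:
            # Nothing similar seen so far, so this patch starts a new bin.
            representatives.append(p)
            labels.append(len(representatives) - 1)
    return labels


def ocr(patches: List[Patch], alphabet: Dict[str, Patch]) -> List[str]:
    """OCR: name each patch after the closest template in a known alphabet."""
    return [min(alphabet, key=lambda ch: distance(p, alphabet[ch])) for p in patches]


L_SHAPE = ("#..", "#..", "###")
T_SHAPE = ("###", ".#.", ".#.")
L_NOISY = ("#..", "#..", "##.")  # one pixel off from L_SHAPE

page = [L_SHAPE, T_SHAPE, L_NOISY, T_SHAPE]
print(bin_patches(page, max_diff=1))            # [0, 1, 0, 1]
print(ocr(page, {"L": L_SHAPE, "T": T_SHAPE}))  # ['L', 'T', 'L', 'T']
```

Note that `bin_patches` never needs to know what an "L" is, and would group eyeballs or pennies just as happily.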


It's not too hard: while they share some mechanics, the underlying use-cases and requirements are very different.

_______ Optical character recognition:

1. You have a set of predefined patterns of interest which are well-known.

2. You're trying your best to find all occurrences of those patterns. If a letter appears only once, you still need to detect it.

3. You don't care much about visual similarity within a category. The letter "B" written in extremely different fonts is the same letter.

4. You care strongly about the boundaries between categories. For example, "B+" must resolve to two known characters in sequence.

5. You want to keep details of exactly where something was found, or at the least in what order they were found. You're creating a layer of new details, which may be added to the artifact.

_______ "Glyph compression":

1. You don't have a predefined set of patterns; the algorithm is probably trying to dynamically guess at patterns which are sufficiently similar and frequent.

2. You aren't trying to find all occurrences, only sufficiently similar and common ones, to maximize compression. If a letter appears only once, it can be ignored.

3. You do care strongly about visual similarity within a category, you don't want to mix-n-match fonts.

4. You don't care about clear category lines, if "B+" becomes its own glyph, that's no problem.

5. You're discarding detail from the artifact, to make it smaller.


JBIG2 dynamically pulls reference chunks out of the image, which makes it more likely to have insufficient separation between the target shapes.

It also gives a false sense of security when it displays dirty pixels that still clearly show a specific digit, since you think you're basically looking at the original.
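A toy illustration of that failure mode, assuming 3x5 digit bitmaps and a simple pixel-difference threshold. This is not the actual JBIG2 matching logic, and `compress`, `decompress`, and `max_diff` are hypothetical names. With a loose threshold, an "8" gets stored as a reference to an earlier "6", and the decoded page shows a perfectly crisp "6" in its place.

```python
def pixel_diff(a, b):
    """Count differing pixels between two same-sized bitmaps."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def compress(patches, max_diff):
    """Build a dictionary of reference patches pulled from the page itself."""
    dictionary, refs = [], []
    for patch in patches:
        match = next(
            (i for i, ref in enumerate(dictionary) if pixel_diff(patch, ref) <= max_diff),
            None,
        )
        if match is None:
            dictionary.append(patch)
            match = len(dictionary) - 1
        refs.append(match)
    return dictionary, refs


def decompress(dictionary, refs):
    """Rebuild the page by pasting the referenced patch at each position."""
    return [dictionary[i] for i in refs]


SIX = ("###", "#..", "###", "#.#", "###")
EIGHT = ("###", "#.#", "###", "#.#", "###")  # differs from SIX by one pixel

dictionary, refs = compress([SIX, EIGHT], max_diff=1)
print(decompress(dictionary, refs) == [SIX, SIX])   # True: the 8 silently became a 6

dictionary, refs = compress([SIX, EIGHT], max_diff=0)
print(decompress(dictionary, refs) == [SIX, EIGHT])  # True: exact matching keeps both
```

Nothing in the output hints that a substitution happened, which is the false sense of security described above.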


That's a description of JBIG2, not a description of OCR.

JBIG2 is an OCR algorithm that doesn't assume the document comes from a pre-existing alphabet.


You asked what the difference was, and I said the difference. Was it unclear that to fit the phrasing of your question, we add "OCR doesn't"? I would not personally call JBIG2 OCR.

> You asked what the difference was, and I said the difference.

Take another look at my comment.


Let me try rephrasing to make the response to your original comment as clear as possible.

Question: "How can we describe OCR that wouldn't match this definition exactly?"

Answer: This definition largely fits OCR, but "reference to a single instance" is a weird way to phrase it. A better definition of OCR would include how it uses builtin knowledge of glyphs and text structure, unlike JBIG2 which looks for examples dynamically. And that difference in technique gives you a significant difference in the end results.

Is that better?

The definition you quoted is not an "exact" fit to OCR, it's a mildly misleading fit to OCR, and clearing up the misleading part makes it no longer fit both.

