Anecdotes of occasional problems, even at a low or unquantified rate, are valid & useful evidence that something negative is happening.
Anecdotes that those problems sometimes don't occur are nearly worthless. Of course that's true – the original anecdotal complaint already implicitly relies on, & grants, the idea that there's some default, "hoped-for" ideal from which their experience has fallen short.
To chime in, "never had your problems" thus adds no info. Yes, people lucky enough not to hit those Signal limits that cause others to lose data exist, of course. But how does that testimony help those with problems? Should their frustration be considered less important or credible, because of your luck?
That as-if portrayal – as if others' frustration should matter less because of your luck – is one way your anecdote will be perceived, even if that wasn't your intent.
Why do you need a message history? The only use I can think of is if someone uses it against you in court. I don't remember looking up anything in history.
I look up old iMessages, emails, group chat comments, and so forth constantly, often finding valuable gems of wit, reference-material, recommendations, or media that I dimly remember from years or even decades ago.
Signal and other messaging apps offer a 'search' bar across all sessions & history, so I doubt I'm the only one.
It's hard for me to imagine being so present-focused that such a history wouldn't be personally useful.
Or, so worried about "someone [using] it against [me] in court" that I'd need more than the occasional auto-expiration, and specifically my messenger "protecting" me with intermittently-enforced loss-of-histories (on just theft/loss/hard-failure of primary device).
Plot these against the rate of similar attempts (or successes) over the past century if you want to convince others of anything other than your own subjective presentist perspective.
The Wikipedia page is useful, and as you've identified the 2025 MN Representative Hortman murder as the "first assassination of a sitting legislator at the state or federal level in my lifetime" – not counting the 2015 murder of SC Senator Pinckney – is it safe to assume you're a precociously-posting 10-year-old?
I was born in 1970; per your reference, there've been a bunch of state & federal legislators (or recently-former legislators) killed for political (or pseudo-political deranged) motives "in my lifetime" – and far more in the 1970s than in the last 10 years.
In my lifetime, one sitting President was shot at & missed (Ford in 1975), and one was shot at & hit by a ricochet (Reagan in 1981) – again, more in the past than the shots that grazed candidate Trump in 2024.
The Wikipedia-listed murders of other officeholders, like mayors or judges, are also more frequent in the past than recently – especially going before either of our lifetimes.
So trend impressions are very subject to frames of reference & familiarity with history.
I suspect if people in general had a deeper & broader sense of how common political violence has been, both in US history & worldwide, they'd be, on the one hand, less prone to panic over recent events & rhetoric (even though it is concerning), but also on the other hand more appreciative of the relative peace of recent decades (even with the last few years' events).
> not counting the 2015 murder of SC Senator Pinckney
fair enough. not sure how I skipped over that one.
> So trend impressions are very subject to frames of reference & familiarity with history.
I don't disagree with this. but nonetheless in my lifetime (< 30 years) I have mostly lived through only the "relative peace of recent decades" so the increase in political violence over the last few years is very scary.
Not sure you can judge whether these modern models do well on the 'arithmetic analogization' task based on absolute similarity values – & especially L2 distances.
That it ever worked was simply that, among the universe of candidate answers, the right answer was closer to the arithmetic-result-point than other candidates – not necessarily close on any absolute scale. Especially in higher dimensions, everything gets very angularly far from everything else - the "curse of dimensionality".
But the relative differences may still be just as useful/effective. So the real evaluation of effectiveness can't be done with the raw value diff(king-man+woman, queen) alone. It needs to check if that value is less than that for every other alternative to 'queen'.
(Also: canonically these exercises were done as cosine-similarities, not Euclidean/L2 distance. Rank orders will be roughly the same if all vectors normalized to the unit sphere before arithmetic & comparisons, but if you didn't do that, it would also make these raw 'distance' values less meaningful for evaluating this particular effect. The L2 distance could be arbitrarily high for two vectors with 0.0 cosine-difference!)
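For concreteness, here's a minimal sketch of that relative-ranking style of evaluation, assuming a hypothetical `vecs` dict of word → numpy array (the function name & I/O format are just for illustration):

```python
import numpy as np

def analogy_rank(vecs, a, b, c, target):
    """Rank `target` among all vocabulary words by cosine similarity to the
    point (a - b + c), with the query words themselves excluded by rule --
    e.g. a='king', b='man', c='woman', target='queen'.
    `vecs` is a hypothetical dict mapping word -> 1-D numpy array."""
    # Unit-normalize everything so cosine similarity is just a dot product.
    unit = {w: v / np.linalg.norm(v) for w, v in vecs.items()}
    query = unit[a] - unit[b] + unit[c]
    query /= np.linalg.norm(query)

    sims = {w: float(u @ query) for w, u in unit.items() if w not in {a, b, c}}
    ranked = sorted(sims, key=sims.get, reverse=True)
    return ranked.index(target) + 1, ranked[:5]

# rank, top5 = analogy_rank(vecs, 'king', 'man', 'woman', 'queen')
# The analogy "works" if rank == 1, no matter how large the raw
# similarity/distance values happen to be.
```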
> It needs to check if that value is less than that for every other alternative to 'queen'.
There you go: the closest 3 words (by L2 distance) to the output vector for the following models, out of the 2265 most common spoken English words (a list which also includes "queen"):
voyage-3-large: king (0.46), woman (0.47), young (0.52), ... queen (0.56)
ollama-qwen3-embedding:4b: king (0.68), queen (0.71), woman (0.81)
text-embedding-3-large: king (0.93), woman (1.08), queen (1.13)
All embeddings are normalized to unit length first, so the L2 distances are bounded and rank candidates the same way cosine similarity would.
So of those 3, despite the superficially "large" distances, 2 of the 3 are just as good at this particular analogy as Google's 2013 word2vec vectors, in that 'queen' is the closest word to the target, when query-words ('king', 'woman', 'man') are disqualified by rule.
But also: to really mimic the original vector-math and comparison using L2 distances, I believe you might need to leave the word-vectors unnormalized before the 'king'-'man'+'woman' calculation – to reflect that the word-vectors' varied unnormalized magnitudes may have relevant translational impact – but then ensure the comparison of the target-vector to all candidates is between unit-vectors (so that L2 distances match the rank ordering of cosine-distances). Or, just copy the original `word2vec.c` code's cosine-similarity-based calculations exactly.
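If it helps, here's a small sketch of the two conventions I mean – raw-magnitude vs. unit-vector arithmetic – with the result unit-normalized either way, so later L2 comparisons against unit-length candidates rank the same as cosine comparisons (`vecs` again a hypothetical word → numpy-array dict):

```python
import numpy as np

def analogy_target(vecs, a, b, c, unit_before_arithmetic=False):
    """Build the a - b + c target vector under either convention, then
    unit-normalize the result: for unit vectors x & y,
    ||x - y||^2 == 2 - 2*cos(x, y), so L2 & cosine rank candidates identically."""
    def get(w):
        v = np.asarray(vecs[w], dtype=float)
        return v / np.linalg.norm(v) if unit_before_arithmetic else v

    t = get(a) - get(b) + get(c)
    return t / np.linalg.norm(t)

# t_raw  = analogy_target(vecs, 'king', 'man', 'woman')
# t_unit = analogy_target(vecs, 'king', 'man', 'woman', unit_before_arithmetic=True)
# These two targets can sit at different angles, so they can produce
# different nearest-neighbor rankings.
```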
Another wrinkle worth considering, for those who really care about this particular analogical-arithmetic exercise, is that some papers proposed simple changes that could make word2vec-era (shallow neural network) vectors better for that task, and the same tricks might give a lift to larger-model single-word vectors as well.
For example:
- Levy & Goldberg's "Linguistic Regularities in Sparse and Explicit Word Representations" (2014), suggesting a different vector-combination ("3CosMul" – sketched in code just after this list)
- Mu, Bhat & Viswanath's "All-but-the-Top: Simple and Effective Postprocessing for Word Representations" (2017), suggesting recentering the space & removing some dominant components
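To illustrate the first of those, a rough sketch of the 3CosMul scoring rule – cosines shifted into [0, 1] via (1 + cos)/2 before multiplying/dividing, as I believe gensim's implementation also does for dense vectors; `vecs` is once more a hypothetical word → numpy-array dict:

```python
import numpy as np

def three_cos_mul(vecs, pos1, neg, pos2, eps=1e-3, topn=5):
    """Levy & Goldberg (2014) 3CosMul scoring for the 'pos1 - neg + pos2'
    analogy pattern, e.g. pos1='king', neg='man', pos2='woman' should put
    'queen' near the top of the returned candidates."""
    unit = {w: v / np.linalg.norm(v) for w, v in vecs.items()}

    def shifted_sims(word):
        q = unit[word]
        # shift cosine from [-1, 1] into [0, 1] so products/quotients behave
        return {w: (1.0 + float(u @ q)) / 2.0 for w, u in unit.items()}

    s_pos1, s_neg, s_pos2 = shifted_sims(pos1), shifted_sims(neg), shifted_sims(pos2)
    scores = {w: s_pos1[w] * s_pos2[w] / (s_neg[w] + eps)
              for w in unit if w not in {pos1, neg, pos2}}
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# e.g. three_cos_mul(vecs, 'king', 'man', 'woman')
```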
> you might need to leave the word-vectors unnormalized before the 'king'-'man'+'woman' calculation – to reflect that the word-vectors' varied unnormalized magnitudes may have relevant translational impact
I believe translation should be scale-invariant, and scale should not affect rank ordering
> I believe translation should be scale-invariant, and scale should not affect rank ordering
I don't believe this is true with regard to the resulting angles after addition steps between vectors of varying magnitudes.
Imagine just in 2D: vector A at 90° & magnitude 1.0, vector B at 0° & magnitude 0.5, and vector B' at 0° but normalized to magnitude 1.0.
The vectors (A+B) and (A+B') will be at both different magnitudes and different directions.
Thus, cossim(A,(A+B')) will be notably less than cossim(A,(A+B)), and more generally, if imagining the whole unit circle as filled with candidate nearest-neighbors, (A+B) and (A+B') may have notably different ranked lists of cosine-similarity nearest-neighbors.
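A tiny numeric version of that example, in plain numpy:

```python
import numpy as np

def cossim(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

A      = np.array([0.0, 1.0])   # 90 degrees, magnitude 1.0
B      = np.array([0.5, 0.0])   #  0 degrees, magnitude 0.5
B_unit = np.array([1.0, 0.0])   # B rescaled to magnitude 1.0

print(cossim(A, A + B))       # ~0.894 -- the sum stays closer to A's direction
print(cossim(A, A + B_unit))  # ~0.707 -- rescaling B swings the sum further from A
```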
If by 'doc2vec' you mean the word2vec-like 'Paragraph Vectors' technique: even though that's a far simpler approach than the transformer embeddings, it usually works pretty well for coarse document similarity. Even the famous word2vec vector-addition operations kinda worked, as illustrated by some examples in the followup 'Paragraph Vector' paper in 2015: https://arxiv.org/abs/1507.07998
So if for you the resulting doc-to-doc similarities seemed nonsensical, there was likely some process error in model training or application.
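For anyone who wants to sanity-check their own setup, a minimal sketch of the usual gensim-style Paragraph Vectors workflow – `corpus` here is a hypothetical list of raw document strings, and the common process errors are too few epochs, a tiny corpus, or tokenizing differently at training vs. inference time:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# `corpus`: hypothetical list of raw document strings
tagged = [TaggedDocument(words=doc.lower().split(), tags=[i])
          for i, doc in enumerate(corpus)]

model = Doc2Vec(vector_size=100, min_count=2, epochs=40)  # epochs matter a lot
model.build_vocab(tagged)
model.train(tagged, total_examples=model.corpus_count, epochs=model.epochs)

# Use the SAME tokenization at inference time as at training time.
query_vec = model.infer_vector("some new document text".lower().split())
print(model.dv.most_similar([query_vec], topn=5))
```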
When you describe a tax that is "paid annually and only if you win", that's plain generic income tax.
That's not the kind of gambling-activity-specific tax that Stoller's article discusses – those are typically applied to gambling businesses' revenues, not to bet winners specifically.