
This explanation is wrong, as I've already said (256 is not the result of any conversion to text), but no one has to take my word for it.

From the Gemini report: https://arxiv.org/abs/2312.11805

>The visual encoding of Gemini models is inspired by our own foundational work on Flamingo (Alayrac et al., 2022), CoCa (Yu et al., 2022a), and PaLI (Chen et al., 2022), with the important distinction that the models are multimodal from the beginning and can natively output images using discrete image tokens (Ramesh et al., 2021; Yu et al., 2022b).

These are the papers Google says Gemini's multimodality is based on.

Flamingo - https://arxiv.org/abs/2204.14198

PaLI - https://arxiv.org/abs/2209.06794

The images are encoded directly. The encoder turns each image into a sequence of image tokens, and the transformer is trained to predict text from both the text tokens and the image tokens.

There is no conversion to text for Gemini. That's not where the token number comes from.
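
To make that concrete, here's a minimal PyTorch sketch of the ViT-style patch embedding those papers build on. The image size, patch size, and embedding width below are illustrative assumptions, not Gemini's actual values; the point is that the image token count falls out of the encoder's geometry, not out of any conversion to text.

    import torch
    import torch.nn as nn

    class PatchEmbedder(nn.Module):
        """ViT-style patch embedding: an image becomes a fixed-length
        sequence of continuous 'image tokens', one per patch."""

        def __init__(self, image_size=256, patch_size=16, embed_dim=768):
            super().__init__()
            # A strided convolution slices the image into non-overlapping
            # patches and projects each patch to the embedding dimension.
            self.proj = nn.Conv2d(3, embed_dim,
                                  kernel_size=patch_size, stride=patch_size)
            self.num_tokens = (image_size // patch_size) ** 2  # 16 * 16 = 256

        def forward(self, images):                # (batch, 3, H, W)
            x = self.proj(images)                 # (batch, embed_dim, H/ps, W/ps)
            return x.flatten(2).transpose(1, 2)   # (batch, num_tokens, embed_dim)

    embedder = PatchEmbedder()
    image = torch.randn(1, 3, 256, 256)           # dummy 256x256 RGB image
    print(embedder(image).shape)                  # torch.Size([1, 256, 768])

The 256 image tokens here are fixed by the geometry (a 16x16 grid of patches), and they go into the transformer alongside the text tokens.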



Stewing so much you had to double-dip reply? Ouch.

As much as I would love to waste my time replying again to your magical thinking, instead I'll just politely chuckle and move on. Good luck.


>As much as I would love to waste my time replying again to your nonsense, instead I'll just politely chuckle and move on. Good luck.

You have your head so far up your ass that even direct confirmation from the model builders themselves won't sway you. The comment wasn't for you. It's linked sources for the original poster and for the curious.

You see, I don't have to hide behind a veneer of "Trust me bro, it works like this."


>even direct confirmation from the model builders themselves

Linking papers that you clearly haven't read and can't contextually apply -- as with the ViT or your misunderstanding of image tiling -- is not the sound strategy you hope it is. It doesn't confirm your claims.

I'm not asking anyone to "Trust me bro". So...have you called the Gemini Pro 1.5 API and tokenized an image or a video yet?

There is a certain element of this that is just spectacularly obvious to anyone who has spent even a moment of critical thought on it -- if they're so capable. Your claim is that a high-resolution image is tiled into a 16x16 array... and that the magic model can, at some later point, magically extract any and all details on demand, such as OCR, from that 16x16 grid. This betrays a fundamental ignorance of even the most basic of information theory.
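
To put rough numbers on that (the codebook size is an assumed example, not a figure I'm attributing to Gemini):

    import math

    # Back-of-the-envelope counting argument. Assumed, illustrative numbers.
    num_tokens = 16 * 16            # a 16x16 grid of discrete image tokens
    codebook_size = 8192            # assumed vocabulary of the image tokenizer

    bits_per_token = math.log2(codebook_size)    # 13 bits per token
    budget_bits = num_tokens * bits_per_token    # 256 * 13 = 3328 bits

    # Raw information in a modest 1024x1024 RGB image, before any compression:
    raw_bits = 1024 * 1024 * 3 * 8               # ~25 million bits

    print(f"token budget: {budget_bits:.0f} bits")
    print(f"raw image:    {raw_bits} bits ({raw_bits / budget_bits:.0f}x larger)")

A few thousand bits against tens of millions is the gap I'm talking about.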

Again, I would love to just block you and avoid the defensive insults you keep hurling, but this site doesn't have that feature. Stop replying to me, no matter how many more contextually nonsensical citations you think will save face. Thanks.


>So...have you called the Gemini Pro 1.5 API and tokenized an image or a video yet?

You continue to blow my mind. Have you... have you even used the Gemini Pro API before? You can't use the API to get the image tokens.
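
The closest the public API gets you is a total count, for example via the Python SDK (a sketch assuming the google-generativeai package; it returns a number, not the image tokens themselves):

    import google.generativeai as genai
    from PIL import Image

    # Sketch: count_tokens reports only a total for the request.
    # There is no endpoint that exposes the discrete image tokens.
    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel("gemini-1.5-pro")

    image = Image.open("example.png")
    response = model.count_tokens(["Describe this image.", image])
    print(response.total_tokens)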

>This betrays a fundamental ignorance of even the most basic of information theory.

Wow, something else you don't understand. Go figure.



