Hacker News | smokel's comments

> AI really wants to use Project Panama

It would help if you briefly specified the AI you are using here. There are wildly different results between using, say, an 8B open-weights LLM and Claude Opus 4.6.


I've been using several. LM Studio and any of the open-weights models that can fit in my GPU's RAM (24 GB) are not great in this area. The Claude models are slightly better, but not worth the extra cost most of the time, since I typically have to spend almost the same amount of time reworking and re-prompting, plus it's very easy to exhaust credits/tokens. I mostly bounce back and forth between the Codex and Gemini models right now, and this includes using pro models with high reasoning.

> Imagine that we made an LLM out of all dolphin songs ever recorded, would such LLM ever reach human level intelligence? Obviously and intuitively the answer is NO.

Not so fast. People have built pretty amazing thought frameworks out of a few axioms, a few bits, or a few operations in a Turing machine. Dolphin songs are probably more than enough to encode the game of life. It's just how you look at it that makes it intelligence.


I've written an Obsidian clone for myself, which has proper Emacs keybindings. Took me a few hours too many to get in all the features that I need.

What I find interesting is that I have little motivation to open source it. Making it usable for others requires a substantial amount of time, which would otherwise be just a fraction of the development time.


I was thinking about doing the same. Build a clone with AI custom tailored for my own quirks. And not bothering to open source it because it's too bespoke for anyone else. How hard was this? Can you share any advice?

It turned out to be pretty hard in some places. I'm using CodeMirror as the basic building block, which is great, but it does not support WYSIWYG table editing out of the box. Getting that to work requires one to use a separate CodeMirror instance for the cell editor, which makes things rather complicated. For the LLM as well :)

I think I've spent ~20 hours and a couple hundred dollars of Claude Opus tokens in Cursor. So it's not cheaper or easier, but the amount of frustration saved by having proper Emacs keybindings might delay catastrophic global warming by a few days.

Oh, and of course I'm not compatible with all the Obsidian extensions, nor do I have proper hosting for server-side sync yet. All in all, a fool's errand, but I'm having fun.


I'm doing the exact same thing, but I'm building my Obsidian clone with Rust and gpui, and primarily with Codex. So far I estimate I've been solely vibe coding it for ~15 hours now, with only one small change made by hand. I'd be interested in comparing notes/our different approaches to this. Feel free to shoot me an email at jerlendds at osintbuddy dot com if you want to chat.

I have a small demo video of yesterday's work here: https://github.com/jerlendds/mdi

There have since been many additions; I'll update the video tonight.


Thank you! Re extensions: my thinking was that if you build a clone, then extensions become irrelevant. Just build what you need directly into the software. Extension systems always seemed to me to be a second-class citizen. I think I read an old story of Linus Torvalds using an old fork of MicroEMACS, and whenever he disliked something he would just go tweak its C code (e.g. key bindings). I'm kind of thinking that, but done with an LLM. Software could in theory be smaller and more bespoke. And if you want it to work differently, you just prompt an LLM to change the actual source code. Then you don't need higher-level configuration/customization interfaces. Simpler software.

This is an interesting topic. I always loved the idea of extensions, for multiple reasons. But they do have their disadvantages, and I'm eager to find out how extension systems will hold up in the time of LLMs.

A major advantage of (certain) extension mechanisms is that you can update them in real-time. For example, in Emacs you can change functions without losing the current state of the application. In Processing or live coding environments, you can even update functions that affect real-time animation or audio.

Another advantage is that they can expose a very nice API that allows other people to learn an abstraction of the core application. If you are the sole developer, and if you can spend the time to keep an active memory of the core application, this does not help much. But it can certainly help others to build upon your foundation. GIMP and Emacs are great examples of this.

A disadvantage is that you have to keep supporting the extension mechanism, or otherwise extensions will break. That makes an ecosystem somewhat slower to adapt. Emacs is the prime example here. We're still stuck with single-threaded text mode :)


I have a theory (and I'm sure I am far from the first one to voice it) that the number of useful open source projects released to the public will be on the decline now that anyone can scratch their own itch with a few hours of vibe coding. Why would I spend hours evaluating a dozen different note-taking applications and _maybe_ find one that is _kinda close_ to what I want, if I can instead have Claude vibe me one up _exactly_ the way I want it?

(I actually did write my own note-taking application, but that was before LLMs were any good at writing code.)


Because when it eventually and inevitably corrupts your data, you won't know what to do or have any recourse?

Surely any sane person vibe coding a note taking app just has it save all the notes as markdown files to disk? At that point making a backup is trivial and they're unlikely to get corrupted.

So why vibe code a version of a thing that already exists in a dozen different permutations, and with actual eyes on the codebase?

In a typical open source project only one person has had a look at a particular piece of code. Only in the larger and more mature projects do people actually spend time reviewing code. Also, if you don't pay for the free code, there is often no serious recourse to recover your data either.

As stated in my first comment, Obsidian does not support Emacs keybindings properly, nor is it open source. Writing an extension to add Emacs keybindings is not at all trivial, because you have to work around a lot of existing and undocumented functionality.

There are other reasons for not vibe coding your own alternative, but as LLMs keep progressing, these reasons may become less relevant.


> This outperforms the majority of online llm services

I assume you mean outperforms in speed on the same model, not in usability compared to other more capable models.

(For those who are getting their hopes up on using local LLMs to be any replacement for Sonnet or Opus.)


Obviously it's not going to match the quality of a paid-tier, 2T-parameter-sized SOTA model, but it can probably roughly match Haiku at the very least. And for tasks that aren't super complex, that's already enough.

Personally though, I find Qwen useless for anything but coding tasks because of its insufferable sycophancy. It's like 4o dialed up to 20: every reply starts with "You are absolutely right", with zero self-awareness. And for coding, only the best model available is usually sensible to use, otherwise it's just wasted time.


That's why I start any prompt to Qwen 3.5 with:

persona: brief rude senior


I'm using:

persona: drunken sailor

Because then at least the tone matches the quality of the output and I'm reminded of what I can expect.


But then what do you do with it early in the morning?

For starters, shave his belly with a rusty razor, obviously ;)

Does it tend to break out into sea shanties?

Yo, ho, ho, and a bottle of rum.


This also works

persona: emotionless vulcan


Does "persona: air traffic controller" work?

If I could set up a voice assistant that actually verifies commands, instead of assuming it heard everything correctly 100% of the time, it might even be useful.


persona: fair witness

https://fairwitness.bot/


You just paste in that YAML? Is this an official llm config format that is parsed out?

Yeah, just paste it in there - the LLM will figure it out. Play with it if you want to tweak the formatting - you could try JSON instead, but for readability I went with YAML.

wow I had no idea you could do that. this changes everything for me.

persona: party delegate in a rural province who doesn't want to be there

gamechanger

>for coding, only the best model available is usually sensible to use otherwise it's just wasted time.

I had the opposite experience. I gave a small model and a big model the same 3 tasks. The small model was done in 30 sec. The large model took 90 sec, 3x longer, and cost 3x more. Depending on the task, the benchies just tell you how much you are over-paying and over-waiting.


If you use the models the way we execute coding tasks, older models outperform the latest models. There's this prep tax that happens even before we start coding: extract requirements from tools, context from code, comments and decisions from conversations, ACs from Jira/Notion, stitch them together, design tailored coding standards, and then code. If you automate the prep tax, the generated code is close to production-ready and may require 1-2 iterations max. I gave it a try and compared the results, and found the output to be 92% accurate, while the same task done with Claude Code gave 68% accuracy. The prep tax is the key here.

oh? I used it in t3 chat before, with traits `concise` `avoid unnecessary flattery/affirmation/praise` `witty` `feel free to match potential user's sarcasm`

and it does use that sarcasm permission at times (I still dislike the way it generally communicates)


> I find Qwen useless for anything but coding tasks because if its insufferable sycophancy

We use Qwen at work since 2.0 for text/image/video analysis (summarization, categorization, NER, etc), I think it's impressive. We ask for JSON and always ask "do not explain your response".


You can replace Sonnet and Opus with local models, you just need to run the larger ones.

Apparently, there is no scientific evidence that ANC does or does not cause tinnitus.

ANC reduces background noise, which typically allows users to listen at lower volumes, thereby reducing total sound exposure to the ear. So if the user adapts their volume, that would lead to less risk of tinnitus. This works for me :)
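To put rough numbers on the volume argument: occupational noise guidelines (e.g. NIOSH's recommended limit of 85 dB(A) for 8 hours) commonly use a 3 dB exchange rate, where every 3 dB increase halves the recommended exposure time. A minimal sketch of that model (the function name and defaults are mine, not from any standard library):

```python
def safe_listening_hours(level_db, reference_db=85.0,
                         reference_hours=8.0, exchange_rate_db=3.0):
    """Recommended daily exposure under a 3 dB exchange-rate model:
    every 3 dB above the reference level halves the allowed duration."""
    return reference_hours * 2 ** ((reference_db - level_db) / exchange_rate_db)

print(safe_listening_hours(85))  # 8.0 hours at the reference level
print(safe_listening_hours(79))  # 32.0 hours: 6 dB quieter -> 4x the time
```

So if ANC lets you turn the volume down even a few dB, the model says the reduction in total exposure is substantial.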

But there are lots of people on forums suggesting that there is a link between tinnitus and ANC. One reason could be that ANC headphones allow you to listen very accurately to inner auditory signals, and if you already had some tinnitus, you might start to notice it.


What you can do depends highly on your skill set, your network, and your willingness to spend effort on this.

If you feed this into a decent chatbot, or in an Ask HN, you might be surprised.


For the uninitiated: interestingly, it is not advisable to take this to the extreme and set the temperature to 0.

That would seem logical, as the results are then completely deterministic, but it turns out that a suboptimal token may result in a better answer in the long run. Also, allowing for a little bit of noise gives the model room to talk itself out of a suboptimal path.
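As a concrete illustration of the trade-off, here is a minimal sketch of temperature-scaled sampling (the function name is mine; real inference stacks typically also apply top-k/top-p filtering on top of this):

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token index from raw logits, scaled by temperature.

    temperature == 0 degenerates to greedy argmax (fully deterministic);
    higher temperatures flatten the distribution, letting locally
    'suboptimal' tokens through, which can lead to better continuations.
    """
    if temperature == 0:
        # Greedy decoding: always pick the single most likely token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return random.choices(range(len(logits)), weights=[e / total for e in exps])[0]

logits = [2.0, 1.5, 0.5]
print(sample_with_temperature(logits, 0))    # greedy: always index 0
print(sample_with_temperature(logits, 0.8))  # usually 0, sometimes 1 or 2
```

At temperature 0 the second-best token never gets a chance; at a small positive temperature it occasionally does, which is exactly the "room to talk itself out of a suboptimal path" described above.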


I like to think of this like tempering the output space. With a temperature of zero, there is only one possible output and it may be completely wrong. With even a low temperature, you drastically increase the chances that the output space contains a correct answer, through containing multiple responses rather than only one.

I wonder if determinism will be less harmful to diffusion models because they perform multiple iterations over the response rather than having only a single shot at each position that lacks lookahead. I'm looking forward to finding out and have been playing with a diffusion model locally for a few days.


Yup. I think of it as how off the rails do you want to explore?

For creative things or exploratory reasoning, a temperature of 0.8 lends itself to all sorts of excursions down the rabbit hole. However, when coding and needing something precise, a temperature of 0.2 is what I use. If I don't like the output, I'll rephrase or add context.


Setting the temperature to zero does not make the LLM fully deterministic, although it is close; batching and the order of parallel floating-point operations can still vary between runs.


Ooh, let's spend next weekend doing this with my acoustic piano!


The author introduces the term "Supervision Paradox", but IMHO this is simply one instance of the "Automation Paradox" [1], which has been haunting me since I started working in IT.

Interestingly, most jobs don't incentivize working harder or smarter, because it just leads to more work, and then burn-out.

[1] https://en.wikipedia.org/wiki/Automation#Paradox_of_automati...


Phrases like: "identity crisis", "burnout machine", "supervision paradox", "acceleration trap", "workload creep" are just AI slop.


You seem to be right. The author is pumping out one such article per day. I think I've spent more time in forming my comment than they did in generating the article. Oh well :)


> single bit neural networks are decision trees.

I didn't exactly understand what was meant here, so I went out and read a little. There is an interesting paper called "Neural Networks are Decision Trees" [1]. The thing is, this does not imply a nice mapping of neural networks onto decision trees. The trees that correspond to the neural networks are huge. And I get the idea that the paper is stretching the concept of decision trees a bit.

Also, I still don't know exactly what you mean, so would you care to elaborate a bit? :)

[1] https://arxiv.org/pdf/2210.05189


Closest thing I found was:

Single Bit Neural Nets Did Not Work - https://fpga.mit.edu/videos/2023/team04/report.pdf

> We originally planned to make and train a neural network with single bit activations, weights, and gradients, but unfortunately the neural network did not train very well. We were left with a peculiar looking CPU that we tried adapting to mine bitcoin and run Brainfuck.


> I still don't know exactly what you mean

Straightforward quantization, just to one bit instead of 8 or 16 or 32. Training a one-bit neural network from scratch is apparently an unsolved problem though.
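For illustration, a minimal sketch of this kind of post-training one-bit quantization in the style of BinaryConnect/XNOR-Net: keep only the sign of each weight, plus a single per-tensor scale (the mean absolute value). The helper names are mine:

```python
def binarize(weights):
    """Post-training one-bit quantization: keep only the sign of each
    weight, plus one shared per-tensor scale (mean absolute value)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale

def dequantize(signs, scale):
    """Reconstruct approximate weights from the signs and shared scale."""
    return [s * scale for s in signs]

signs, scale = binarize([0.31, -0.12, 0.05, -0.44])
print(signs)  # [1, -1, 1, -1] -- one bit per weight
print(scale)  # every weight is now +/- this single value
```

Inference then only needs the signs and one multiply per tensor, which is the appeal. The hard part alluded to above is training: the sign function has zero gradient almost everywhere, so standard backprop has to be worked around (e.g. with straight-through estimators).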

> The trees that correspond to the neural networks are huge.

Yes, if the task is inherently 'fuzzy'. Many neural networks are effectively large decision trees in disguise and those are the ones which have potential with this kind of approach.


> Training a one bit neural network from scratch is apparently an unsolved problem though.

It was until recently, but there is a new method which trains them directly without any floating point math, using "Boolean variation" instead of Newton/Leibniz differentiation:

https://proceedings.neurips.cc/paper_files/paper/2024/hash/7...


Nice!


Unfortunately the paper seems to have been mostly overlooked. It has only a few citations. I think one practical issue is that existing training hardware is optimized for floating-point operations.


>Many neural networks are effectively large decision trees in disguise and those are the ones which have potential with this kind of approach.

I don't see how that is true. Decision trees look at one parameter at a time and potentially split to multiple branches (aka more than 2 branches are possible). Single input -> discrete multi valued output.

Neural networks do the exact opposite. A neural network neuron takes multiple inputs and calculates a weighted sum, which is then fed into an activation function. That activation function produces a scalar value where low values mean inactive and high values mean active. Multiple inputs -> continuous scalar output.

Quantization doesn't change anything about this. If you have a 1 bit parameter, that parameter doesn't perform any splitting, it merely decides whether a given parameter is used in the weighted sum or not. The weighted sum would still be performed with 16 bit or 8 bit activations.
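To make the structural contrast concrete, a toy sketch (the function names are mine): a tree node branches on a single feature against a threshold, while a neuron mixes all features in a weighted sum before the activation.

```python
import math

def tree_node(x, feature, threshold):
    """A decision-tree split: inspect ONE feature, branch on a threshold."""
    return "left" if x[feature] <= threshold else "right"

def neuron(x, weights, bias):
    """A neuron: combine ALL features in a weighted sum, then squash it
    through a sigmoid into a continuous value in (0, 1)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

x = [0.2, 0.9]
print(tree_node(x, feature=0, threshold=0.5))    # "left": only x[0] matters
print(neuron(x, weights=[1.0, -1.0], bias=0.0))  # ~0.33: both inputs matter
```

A tree can of course approximate the neuron's decision boundary, but only by stacking many single-feature splits, which is why the equivalent trees blow up in size.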

I'm honestly tired of these terrible analogies that don't explain anything.


> I'm honestly tired of these terrible analogies that don't explain anything.

Well, step one should be trying to understand something instead of complaining :)

> Single input -> discrete multi valued output.

A single node in a decision tree is single input. The decision tree as a whole is not. Suppose you have a 28x28 image, each 'pixel' being eight bits wide. Your decision tree can query 28x28x8 possible inputs as a whole.

> A neural network neuron takes multiple inputs and calculates a weighted sum, which is then fed into an activation function.

Do not confuse the 'how' with 'what'.

You can train a neural network that, for example, tells you if the 28x28 image is darker at the top or darker at the bottom or has a dark band in the middle.

Can you think of a way to do this with a decision tree with reasonable accuracy?


> Training a one bit neural network from scratch is apparently an unsolved problem though.

I don't think it's correct to call it unsolved. The established methods are much less efficient than those for "regular" neural nets but they do exist.

Also note that the usual approach when going binary is to make the units stochastic. https://en.wikipedia.org/wiki/Boltzmann_machine#Deep_Boltzma...


Interesting.

By unsolved I guess I meant: this looks like it should be easy and efficient but we don't know how to do it yet.

Usually this means we are missing some important science in the classification/complexity of problems. I don't know what it could be.


Perhaps. It's also possible that the approach simply precludes the use of the best tool for the job. Backprop is quite powerful and it just doesn't work in the face of heavy quantization.

Whereas if you're already using evolution strategies or a genetic algorithm or similar then I don't expect changing the bit width (or pretty much anything else) to make any difference to the overall training efficiency (which is presumably already abysmal outside of a few specific domains such as RL applied to a sufficiently ambiguous continuous control problem).

