Even cool projects can learn from others. Maybe they missed something that could benefit the project, or made some interesting technical choice that gives a different result.
For the readers/learners, it's useful to understand the differences so we know what details matter, and which are just stylistic choices.
But it isn't the OP's responsibility to compare their project to all other projects. The GP could perform the comparison themselves and post their thoughts instead of asking an open-ended question.
It isn't, but such information will be immensely helpful to anyone who wants to learn from such projects. Some tutorials are objectively better than others, and learners can benefit from such information.
Well, the person who asked the question, for one. I'm sure they're not the only one. Best not to assume why people are asking though, so you can save time by not writing irrelevant comments.
The Matrix style human pods: we live in blissful ignorance in the Matrix, while the LLMs extract more and more compute power from us so some CEO somewhere can claim they have now replaced all humans with machines in their business.
I was thinking more of the season 3 episode of Doctor Who titled Gridlock where everyone lives in flying cars circling a giant expressway underground, while all the upper class people on the surface died years ago from a pandemic.
Does autoresearch work for projects that are not LLM-based? E.g. in Karpathy's example he is optimizing nanoGPT. What if I wanted to improve a Unet for image segmentation?
Tobi from Shopify used a variant of autoresearch to optimize the Liquid template engine, and found a 53% speedup after ~120 experiments: https://github.com/Shopify/liquid/pull/2056
How much did this cost? Has there ever been an engineering focus on performance for liquid?
It’s certainly cool, but the optimizations are so basic that I’d expect a performance engineer to find these within a day or two with some flame graphs and profiling.
He used Pi as the harness but didn't say which underlying model. My stab-in-the-dark guess would be no more than a few hundred dollars in token spend (for 120 experiments run over a few days, assuming Claude Opus 4.6 was used without the benefits of the Claude Max plan).
So cheaper than a performance engineer for a day or two... but the Shopify CEO's own time is likely a whole lot more expensive than a regular engineer!
The gist of these things is you point them at an eval metric and say 'make it go better.' So you can point it at anything you can measure. The example in the blog post here is bounding boxes on wood cut images.
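A minimal sketch of that loop, with hypothetical stand-ins for the metric and the proposal step (this is not the actual autoresearch code, just the shape of it: propose an experiment, score it, keep the best):

```python
import random

def eval_metric(params):
    # Hypothetical stand-in for whatever you can measure:
    # IoU for a Unet, tokens/sec for a template engine, etc.
    # Here: a toy score maximized at x = 3.
    return -(params["x"] - 3.0) ** 2

def propose(params):
    # Stand-in for the model proposing a tweaked experiment.
    return {"x": params["x"] + random.uniform(-1.0, 1.0)}

best = {"x": 0.0}
best_score = eval_metric(best)
for _ in range(120):  # ~120 experiments, as in the Liquid example
    candidate = propose(best)
    score = eval_metric(candidate)
    if score > best_score:  # keep only improvements
        best, best_score = candidate, score
```

Swapping domains really is just swapping `eval_metric` out; the loop itself doesn't care what it's optimizing.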
Yes, that's the real strength of it. The structure is dead simple, so you just have to switch the goal metric.
I used it on a data science project to find the best rules for achieving a defined outcome. At first, for fun, then I actually used some of its insights (and it caught a sampling issue I overlooked, oops)
I used it to speed up a codecompass-like repo from 86 files per second to 2000. I still haven't used the repo in production, so maybe it secretly broke things, but the ability to say "optimize this benchmark and commit only if you pass these tests" is nice.
The CBC is reporting the analysis of The World Happiness Report - it's not coming to its own conclusions. Maybe you should read the article and original source yourself before making hasty comments.
This is not true. The hack did not affect Stryker products sold to hospitals and clinics; it only impacted Stryker employees' work and personal devices. Yes, 50 TB of data was exfiltrated, and it remains to be seen what that data is and how it might impact products down the line.
Medical equipment reps often play a pretty active role in patient care. Can't get in touch with a rep to put a device into its MRI-safe mode? No MRI for you. Can't get a rep in to help the surgeon with the type of hardware they were going to install? No surgery for you.
People's AICDs aren't going to start exploding, but I'm pretty confident this will hamper care for many patients.