I mean... I'm one of the staunchest skeptics of LLMs as agents, but they're amazing as supercharged autocomplete and I don't see anything wrong with them in that role. There's a Pareto-optimal position somewhere between handwritten and slopped.
It's actually quite easy to spot whether LLMs were used or not: very few commits in total, and AI-like documentation and code comments.
But even if LLMs were used, the overall project does feel steered by a human, given some decisions like not using bloated build systems. If this actually works then that's great.
The first commit was 17k lines. So this was either developed without using version control or at least without using this gh repo. Either way I have to say certain sections do feel like they would have been prime targets for having an LLM write them. You could do all of this by hand in 2026, but you wouldn't have to. In fact it would probably take forever to do this by hand as a single dev. But then again there are people who spend 2000 hours building a cpu in minecraft, so why not. The result speaks for itself.
> The first commit was 17k lines. So this was either developed without using version control or at least without using this gh repo.
Most of my free-time projects are developed either by me shooting the shit with code on disk for a couple of months until it's in a working state, and then I make one first commit. Alternatively, I commit a bunch iteratively, but before making it public I fold it all into one commit, which becomes the init. 20K lines in the initial commit is not that uncommon; it depends a lot on the type of project, though.
I'm sure I'm not alone in this sort of workflow.
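For what it's worth, the "fold it all into one commit" step is mechanical with a soft reset. A minimal sketch in a throwaway repo (the user name/email, file name, and messages are all placeholders):

```shell
#!/bin/sh
# Sketch: collapse an iterative local history into a single "init" commit
# before publishing. Runs entirely in a throwaway temp repo.
set -e
cd "$(mktemp -d)"
git init -q
git config user.email dev@example.com
git config user.name dev
for i in 1 2 3; do
  echo "draft $i" > main.c
  git add main.c
  git commit -qm "WIP $i"
done
# Move the branch pointer back to the root commit, keeping the final tree
# staged, then rewrite that root commit with the final content and message:
git reset -q --soft "$(git rev-list --max-parents=0 HEAD)"
git commit -q --amend -m "init"
git rev-list --count HEAD   # history is now a single commit
```

The soft reset keeps the index at the latest state, so the amended root commit contains the finished code with none of the intermediate mess.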
Can you explain the philosophy behind this? Why do this, what is the advantage? Genuinely asking, as I'm not a programmer by profession. I commit often, irrespective of the state of the code (it may not even compile). I understand a git commit as a snapshot. I don't expect each commit to be a pristine, working version.
A lot of people in this thread have argued for squashing, but I don't see why one would do that for a personal project. In large-scale open source or corporate projects I can imagine they would like to have clean commit histories, but why for a personal project?
I do that because there's no point in anyone seeing the pre-release versions of my projects. They're a random mess that changed the architecture 3 times. Looking at that would not give anyone useful information about the actual app. It doesn't even give me any information. It's just useless noise, so it's less confusing if it's not public.
I don't care about backing up unfinished hobby projects; I just write/test until I arbitrarily share it, or, if I'm completely honest, potentially abandon it. I may not `git init` for months, let alone make any commits or push to any remotes.
Reasoning: skip the SCM 'cost' of making commits I'd squash and ignore anyway. The project lifetime and iteration loop are both short enough that I don't need history, bisection, or redundancy. Yet.
Point being... priorities vary. Not to make a judgement here; I just don't think the number of commits makes for a very good LLM purity test.
You should push to a private working branch, and frequently. But when merging your changes to a central branch, you should squash all the intermediate commits and provide just one commit with the asked-for change.
Enshrining "end of day commits", "oh, that didn't work" mistakes, etc. is not only demoralizing for the developer(s), but it makes tracing changes all but impossible.
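For concreteness, one way to do exactly that with stock git is `merge --squash`. A sketch in a throwaway repo (branch names, file names, and messages are placeholders):

```shell
#!/bin/sh
# Sketch: land a messy working branch on main as a single clean commit.
# Runs entirely in a throwaway temp repo.
set -e
cd "$(mktemp -d)"
git init -q -b main
git config user.email dev@example.com
git config user.name dev
echo base > app.txt
git add app.txt
git commit -qm "base"
git checkout -qb feature
echo step1 > app.txt; git commit -qam "end of day commit"
echo step2 > app.txt; git commit -qam "oh, that didn't work"
echo done  > app.txt; git commit -qam "actually works now"
git checkout -q main
git merge -q --squash feature          # stage the combined diff, no commit yet
git commit -qm "Add the asked-for change"
git log --oneline                      # main: base + one squashed commit
```

The intermediate "oh, that didn't work" commits stay on the private branch; the central branch only ever sees the one finished change.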
> I don't expect each commit to be pristine, working version.
I guess this is the difference: I expect the commit to represent a somewhat working version, at least once it's upstream; locally it doesn't matter that much.
> Why do this, what is the advantage?
Cleaner, I suppose. It doesn't make sense to have 10 commits where 9 are broken and half-finished and the 10th is the only one that works; I'd rather just have one larger commit.
> they would like to have clean commit histories but why for a personal project?
Not sure why it'd matter if it's personal, open source, corporate or anything else; I want my git log clean so I can do `git log --oneline` and actually understand what I'm seeing. If there are 4-5 commits saying "WIP almost working" between each proper commit, that's too much noise for me, personally.
But this isn't something I'm demanding everyone follow, just my personal preference after all.
Fair enough, thanks for the clarification. Personally I think everything before a versioned release (even something like 0.1) can be messy. But from your point I can see that a cleaner history has advantages.
Further, I guess if the author is expecting contributions to the code in the future, it might be more "professional" for the history to contain only the commits which are relevant.
I consider my own projects to be just for my own learning and understanding, so I never cared about this, but I do see the point now.
Regardless, I think it still remains a reasonable sign of someone doing one-shot agent-driven code generation.
One point I missed, which might be the most important, since I don't care about looking "professional", only about how useful and usable something is: if you have commits with the codebase in a broken state, then `git bisect` becomes essentially useless (or very cumbersome to use), which makes it tricky to track down regressions unless you'd like to go back to hunting them down manually.
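To make that concrete, here's a toy repo where `git bisect run` pinpoints a regression automatically. The "test" here is just a grep standing in for a real build-and-test command, which is exactly the part that falls apart when half the commits don't compile (all contents are made up):

```shell
#!/bin/sh
# Sketch: automated bisect in a throwaway temp repo. Commits 1-3 are
# "good", commits 4-5 carry the regression; bisect should find commit 4.
set -e
cd "$(mktemp -d)"
git init -q
git config user.email dev@example.com
git config user.name dev
for i in 1 2 3 4 5; do
  if [ "$i" -lt 4 ]; then echo "ok $i" > app.txt; else echo "broken $i" > app.txt; fi
  git add app.txt
  git commit -qm "commit $i"
done
# Mark HEAD bad and the root commit good, then let bisect drive the check:
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
git bisect run sh -c 'grep -q "^ok" app.txt'   # reports "commit 4" as first bad
git bisect reset
```

`git bisect run` treats exit 0 as "good" and nonzero as "bad"; a commit that doesn't even build exits nonzero for the wrong reason, which is how broken WIP commits poison the search.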
> Regardless, I think it still remains a reasonable sign of someone doing one-shot agent-driven code generation.
Yeah, why change your perception in the face of new evidence? :)
Regarding changing the perception: I think you did not understand the underlying distrust. I will try to use your examples.
It's a moderate-size project. There are two scenarios: the author used git/some VCS, or they did not. If they did not use it, that's quite weird, but maybe fine. If they did use git, then perhaps they squashed commits. But at some point those commits did exist. Let's assume all of them were pristine. It's 16K loc, so there must have been a decent number of pristine commits that were squashed. What was the harm in leaving them?
So the squashed history must have consisted of both clean commits and broken commits. But we have seen that this author likes to squash commits. Hmm, so why would they squash only at the end and not along the way?
Yes, I have been introduced to a new perspective, but the world does not work on "if X, then not Y" principles. And this is a case where the two things being discussed are not mutually exclusive, as you are assuming. But I appreciate this conversation, because I learnt the importance and advantages of keeping a clean commit history, and I will take that into account next time before concluding that something is just another one-shot LLM-generated project. But nevertheless, I will always consider the latter a reasonable possibility.
> I guess this is the difference, I expect the commit to represent a somewhat working version,
On a solo project I do the opposite: I make sure there is an error where I stopped last. Typically I put in a call to the function that is needed next, so I get a linker error.
Six months later, when I go back to the project, that linker error tells me all I need to know about what comes next.
Or the first thousand commits were squashed. The first public commit tells nothing about how this was developed. If I were to publish something that I had worked on alone for a long time, I would definitely squash all early commits into a single one, just to be sure I don't accidentally leak something that I don't want to leak.
For example, when the commits were made: I would not like to share with the whole world exactly when I worked on some project of mine. The commits themselves, or the commit messages, could also contain something you don't want to share.
At least I approach stuff differently depending on whether I am sharing it with the whole world, with just myself, or with people I trust.
Scrubbing git history when going from private to public should be seen as totally normal.
Hmm I can see that. Some people are like that. I sometimes swear in my commit messages.
For me it's quite funny to sometimes read my older commit messages. To each their own.
But my opinion on this is the same as with other things that have become tell-tale signs of AI-generated content: if something you used to do starts getting labelled as AI-generated, and you find that label offensive, it's better to change that approach.
If you have, for example, a personal API key or credentials that you are using for testing, you throw it in a config file or hard-code it at some point. Then you remove it. If you don't clean your git history, those secrets are now exposed.
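That's easy to demonstrate: deleting the secret in a later commit doesn't remove it from the repository; anyone who clones it can read the old blob back out. A sketch in a throwaway repo (the key and file name are made up):

```shell
#!/bin/sh
# Sketch: a "removed" secret is still one command away in git history.
# Runs entirely in a throwaway temp repo.
set -e
cd "$(mktemp -d)"
git init -q
git config user.email dev@example.com
git config user.name dev
echo 'API_KEY=hunter2' > config.ini
git add config.ini
git commit -qm "add config"
echo 'API_KEY=read-from-env' > config.ini
git commit -qam "remove hardcoded key"
# HEAD is clean, but the old version is still reachable:
git show HEAD~1:config.ini   # prints API_KEY=hunter2
```

Actually purging it requires rewriting history (e.g. with a history-rewriting tool) and rotating the key anyway, since anyone may already have a clone.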
Hello, not the poster, but I am BarraCUDA's author. I didn't use git for this. This is just one of a dozen compiler projects sitting in my folder, hence the one large initial commit. I was only posting on GitHub to get feedback from r/compilers and friends I knew.
The original test implementation of this for instance was written in OCaml before I landed on C being better for me.
The project owner is talking about LLVM, a compiler toolkit, not an LLM.