
The biggest problem with commit messages: They lie.

The main reason for this is that most people tend to put more than one atomic change into a single commit, but in the message they often mention only the "main change" they introduced with the commit. This in turn makes it much harder to find out which commit introduced a given problem into the code base, since people will usually read the (incomplete) message of a given commit and think "the problem can't be here because this commit only did X" while it actually did Y & Z as well. So, personally, I think we might as well abandon writing commit messages entirely and instead make sure that individual changes are small enough that we can figure out what happened by looking at the actual code.

As an example, if you look at individual commits in the Django project (such as this one: https://github.com/django/django/commit/e7e8d30cae9457339eb4...), it is often much easier to figure out what happened by looking at the file diff than the commit message.

A tool that could summarize changes in source code using advanced (semantic) diffing instead of line-by-line diffs would make this much easier of course.



With magit, it's easy to only stage individual 'hunks' so that you can have only the relevant parts of a file staged. And so this problem goes away.

I also imagine it's easy to do this in fugitive as well.

It's probably also possible on the cli, but I imagine that would be too 'hard' or time consuming, hence you will probably start to be lazy about doing it correctly.


>It's probably also possible on the cli, but I imagine that would be too 'hard' or time consuming, hence you will probably start to be lazy about doing it correctly.

I do it all the time, you just use `git add -p` and it steps through hunks letting you stage them or not. As long as you don't make too many changes without committing, it's not particularly tedious.


This was the 'killer app' that compelled me to switch to Git from Subversion back in the day, rather than anything else about Git intrinsically. The ability to organise my commits logically rather than temporally (i.e. not just as a stupid log of change over time) was like night and day.


> ability to organise my commits logically rather than temporally

I'm just getting started with Git; how do you do that? I googled it but nothing jumped out at me.


He referred to what the GP said - use "git add -p". Here's some introduction (I haven't watched it myself and can't comment on the quality):

http://johnkary.net/blog/git-add-p-the-most-powerful-git-fea...


I recommend setting up a shell alias to do this. I have all of my frequently used git commands aliased to two- or three-letter mnemonics.
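
Something along these lines, as a sketch for `~/.bashrc` or `~/.zshrc`. The alias names are hypothetical examples, not a convention; pick whatever you will remember.

```shell
# Hypothetical mnemonics; check they don't shadow commands you already use.
alias gap='git add -p'        # stage hunk by hunk
alias gun='git reset -p'      # unstage hunk by hunk
alias gst='git status -sb'    # short status with branch info
```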


Sweet! I never bothered to check how to do it on the cli, and instead of being interactive, I just assumed you would have to reference the hunks by line number or something. Naive of me.


It also works with reset and checkout, BTW.


Wow thank you. I had no idea that was something you could do!


You can simply use `git add -p` to stage hunks individually (or `git reset -p` to unstage some hunks).


"git add -p" can be fun, but it's a little scary because it's super easy to end up making commits that don't actually stand on their own. Super simple to miss an import here or a new field there. If you do a whole series of them, it might actually be worse for others to come back to (or bisect in) if they don't realize the original developer never actually compiled and tested each commit as-is.


Well ideally you could use `git stash -k -u` (`-k` is `--keep-index`, `-u` is `--include-untracked`) to set your working directory to the state you're actually committing, and then run some tests.
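
A rough sketch of that sequence in a scratch repository follows. The file names are invented and the "test" is only a stand-in check; in a real project you would build and run your test suite at that point.

```shell
#!/bin/sh
# Sketch: test exactly what is staged before committing it.
set -eu

repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"        # hypothetical identity for the demo
git config user.email "demo@example.com"

echo "v1" > a.txt
echo "v1" > b.txt
git add a.txt b.txt
git commit -q -m "Initial state"

echo "v2" > a.txt                       # change we want to commit now
echo "v2" > b.txt                       # unrelated change, not ready yet
echo "scratch" > c.txt                  # untracked file
git add a.txt

# Hide everything that is not staged, including untracked files.
git stash push -q -k -u

# The working tree now matches the pending commit: run the tests here.
b_during_test=$(cat b.txt)              # b.txt is back to its committed state
test ! -e c.txt                         # the untracked file is out of the way

git commit -q -m "Update a.txt"
git stash pop -q                        # bring the unfinished work back
echo "b.txt during test: $b_during_test, after pop: $(cat b.txt)"
```

One caveat: if the staged and unstaged edits touch the same lines, `git stash pop` can report a conflict after the commit, so this works best when the changes are well separated.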

For the most part though, I just try to avoid having so many hunks to step through that this is even an issue.


And `git checkout -p` to erase hunks from the working directory. (i.e. delete without saving!)


That's a little dangerous! I use git stash -p instead, and only drop it after I'm absolutely sure.


> It's probably also possible on the cli, but I imagine that would be too 'hard' or time consuming, hence you will probably start to be lazy about doing it correctly.

It's extremely easy. git add -p and then you can select to add the hunk or not.


The UI for this in SourceTree is also really nice. Actually my favorite thing about ST is that I learned so much more about git from using it.


This, in my opinion, is one of the very compelling reasons to use a GUI for these types of operations.

As an example, refactoring something that touches many files often leads to several related/required changes that aren't part of the main refactor. When you first do the change, you're not 100% sure it will stay around, and committing at this point can be a pain later. As a result, you can end up with many files changed and several logical units of work done, and some may be [parts of] a single file, while some may be [parts of] many files. For 5 or 6 hunks, git CLI is usable. Beyond that, for say, a hundred, a UI where you can jump around is basically essential in order to make usable commits.

I know there are still people that snobbily look down on and dismiss GUI tools, but some things lend themselves well to GUI, so I'd suggest giving them a try.

SourceTree in particular works seamlessly with CLI. When I first started with it (being used to git CLI), I jumped back and forth quite a bit with no issue. Now I really only use git CLI for remote branch operations or viewing reflog, and occasionally for a 'git commit -am' if I happen to already be in a shell.


The other thing for me is that git is inherently very very stateful. There's tons of detail to keep in mind as you execute commands - your branch, what's staged or isn't, the state of the remotes, whether you have anything stashed, etc. To me that's a recipe for a tool that should be used through a GUI.


> most people tend to put more than one atomic change into a single commit

That's a people problem. Correct it through proper training and mentoring, not letting it continue just because "that's what people do."

Historically, some people were afraid of "wasting" commit numbers (CVS, SVN, mock revision numbers in hg), but git has no concept of an incremental commit number, so you can burn through as many commits as you want without feeling guilty about running up an auto-incrementing counter.


How do you "waste" commit numbers? They're just numbers. I believe the concern is about log space. (And git has a log too.)

Hopefully someone reads that log and they shouldn't be bothered by a thousand trivial changes, the reasoning goes.

Which is true to some extent; it's just that not everyone gets it right ... and that's where your comment about mentoring and training comes in. It must also be OK to make mistakes, so that people don't try to hide slip-ups in the next commit.


I have worked with people who are obsessive about keeping auto-increment numbers in databases "tidy". It's obviously nonsense, but some people aren't logical. The same thing applies to some projects' fears of actually following semantic versioning. Numbers are infinitesimally cheap; there should be no fear about burning them.


If you're using pull requests and reviewing them before they're merged, you could make inaccurate or incomplete commit summaries grounds for rejection of the request. Ask them to fix it using interactive rebase.


I'm curious how many of us interactive rebase every patchset before merging? For me it's critical because I tend to commit too frequently.
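
For anyone who hasn't tried it, one common way to tidy up frequent commits is `git commit --fixup` followed by `git rebase -i --autosquash`. Below is a minimal sketch in a scratch repository; the file names and messages are invented, and `GIT_SEQUENCE_EDITOR=true` is there only so the todo list is accepted without opening an editor.

```shell
#!/bin/sh
# Sketch: fold a late follow-up commit into the commit it belongs to.
set -eu

repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"        # hypothetical identity for the demo
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
base=$(git rev-parse HEAD)

echo "parser v1" > parser.txt
git add parser.txt
git commit -q -m "Add parser"
parser_commit=$(git rev-parse HEAD)

echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add docs"

# A late fix that logically belongs to "Add parser".
echo "parser v2" > parser.txt
git add parser.txt
git commit -q --fixup="$parser_commit"  # creates "fixup! Add parser"

# Replay everything after $base. --autosquash moves the fixup! commit next to
# its target and marks it "fixup". Drop GIT_SEQUENCE_EDITOR=true to review
# the todo list in your editor instead.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$base"

git log --format=%s                     # history after the rebase
```

Since this rewrites commits, it is best done on a branch nobody else has based work on yet.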


For what it's worth, microcommits + rebase for fast-forward merges are the basis of the workflow used in the GNOME project.

As far as I know, it's more or less also what kernel people use before their patches hit a maintainer's tree (and it's non-ff merges from there).


I would think that scanning through commits is a pretty inefficient way to find out which commit introduced a problem anyway.

"git bisect", "git blame" and "git log -S" are my tools of choice.

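To illustrate two of the tools named above, here is a small sketch that plants a marker string in the third of five commits and then finds that commit with both `git log -S` and `git bisect run`. The repository, the file name and the "BUG" marker are all made up for the demo.

```shell
#!/bin/sh
# Sketch: find the commit that introduced a string, two ways.
set -eu

repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"        # hypothetical identity for the demo
git config user.email "demo@example.com"

# Five commits; the third one sneaks in the marker string "BUG".
for n in 1 2 3 4 5; do
    echo "line $n" >> app.txt
    if [ "$n" -eq 3 ]; then echo "BUG" >> app.txt; fi
    git add app.txt
    git commit -q -m "Change $n"
done

# Pickaxe: commits that changed the number of occurrences of "BUG".
by_pickaxe=$(git log -S"BUG" --format=%s)

# Bisect: HEAD is bad, the root commit is good; the command decides each
# step (exit 0 = good, exit 1 = bad).
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null
by_bisect=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "pickaxe found: $by_pickaxe"
echo "bisect found:  $by_bisect"
```

The pickaxe only helps when you know a string to search for; bisect works with any command that can tell a good checkout from a bad one.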

You don’t need a magic tool to do this - what we need is a way to block commits that don’t have corresponding documentation of the changes. A tool that made you write a message for each block of changed code would get the job done - if you commit 10 changes, then you would have to write 10 messages explaining each of the changes.
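
A rough approximation of that idea is possible today with a `commit-msg` hook. The sketch below assumes a made-up policy that is not an existing convention: the message must contain at least one `- ` bullet line per staged diff hunk. The repository and file names are invented for the demo.

```shell
#!/bin/sh
# Sketch: reject commits whose message has fewer bullets than staged hunks.
set -eu

repo=$(mktemp -d)                       # throwaway repository
cd "$repo"
git init -q
git config user.name "Demo User"        # hypothetical identity for the demo
git config user.email "demo@example.com"

mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file holding the proposed commit message.
hunks=$(git diff --cached -U0 | grep -c '^@@')
bullets=$(grep -c '^- ' "$1")
if [ "$bullets" -lt "$hunks" ]; then
    echo "commit-msg: $hunks hunk(s) staged, only $bullets bullet(s) in message" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/commit-msg

printf 'one\n' > a.txt
printf 'two\n' > b.txt
git add a.txt b.txt                     # two new files = two hunks

# A bare summary line is refused by the hook.
if git commit -q -m "Add fixtures" 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi

# One bullet per hunk is accepted.
git commit -q -m "Add fixtures" -m "- a.txt: first fixture
- b.txt: second fixture"
accepted=$(git rev-list --count HEAD)
echo "bare summary rejected: $rejected, commits accepted: $accepted"
```

A hook like this can only count bullets, not judge whether they are accurate, and anyone can skip it with `git commit --no-verify`, so it is a nudge rather than a guarantee.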


> more than one atomic change into a single commit

There is no such thing as an "atomic change". Sometimes, fixing a single bug or adding a single feature requires editing multiple files or even making complex changes. I personally don't like these projects with 1 commit per file change; that's ridiculous and it's noisy.


I think he means a single feature.

Like a single commit "Fix bug XYZ", that in reality also contains "Fix typo in error messages", "Change rendering of status page", "Fix test framework DSL to prevent infinite loop".

Naively you could say that each of those should go in their own commit, but reality is that they may actually be quite small and necessary and may not even be seen as a feature by the author or reviewer. Only 6 months down the line, you read the code and wonder why the f*ck there is a change in the test DSL in order to fix bug XYZ.


>Sometimes, fixing a single bug or adding a single feature requires editing multiple files

Congrats, that's one atomic change. He didn't say "one commit per file change", he said "one commit per atomic change", and yes, sometimes those atomic changes can spread across multiple files.



