What makes good/bad commit messages?
I am bad at writing commit messages.
I worry a lot about other people's opinion of me and my work. I want my work to be good enough (perfect would be an exaggeration) so that others can work with my source code without thinking that I suck at what I am doing.
I guess most of us developers want to be seen as worth the money that other people pay us for our work. For those reasons, I am very reluctant to commit something that isn’t working properly or that I have doubts about.
But not committing leads to a much bigger problem in the end. When I finally have to commit, I commit a whole bunch of changes, and most of the time I have forgotten why I made them or even that I made them at all. In the end I either write a whole novel or two sentences, which is barely enough.
So to quit writing bad commit messages, I try to force myself to commit regularly and to write enough to inform my colleagues and myself about the changes that were made.
But why bother?
When it comes to commit messages, I was never much concerned about the message itself. Our team rarely looked at the commit messages because we knew who committed something and who was working on something, so why bother, right?
I am afraid I was wrong
The longer the commit history gets, the harder it gets to keep track of it. If commit messages are bad or don’t follow a set of style guidelines, this is even more true. Good books are structured in a way that makes reading easy, and source code is structured in a way that makes it human readable. Humans seem to prefer structure when it comes to information. So if we want to keep track of the changes in our source code, we have to write well-structured and meaningful commit messages. This means we should care about how the commit messages are written¹.
There is more to developing than source code. Sometimes I wondered why source code was written in a particular manner, and I had to get the information by talking to the person who wrote it.
An example conversation
There are situations where you find a very bizarre "check and conversion" combination in the source code that seems unnecessary. When you ask the developer who wrote this weird check, he tells you:
"Ok, listen: a customer employs a blind person who uses this special device, and this device sends every message with a trailing section sign (§). This symbol is there because the device has to send messages to another program by another company, and the developers came up with this weird protocol. Long story short, the other company wouldn’t fix it, and I had to check for it so that I don’t send it to a library in our backend which for some reason only supports the ASCII charset. But never ever change this check, because the customer is extremely important. Otherwise we would never have done all that nonsense."
What do we conclude from this conversation?
A commit message with this information would have been very helpful. Of course, I could always ask the developer responsible for this change, but what if he isn’t employed at the company anymore? Well, then I am screwed.
I think I speak for most developers when I say, I don’t like being screwed.
So now that we have established at least one good reason to write commit messages, and that is to preserve important information that you wouldn’t expect to see in the source code itself, let’s look at the distinction between good and bad commit messages.
I structured this in the form of three distinct factors. Sometimes those factors overlap but I think we can nonetheless observe them one at a time.
There is a fourth factor, but it seems to me that it is present in all three of the others, so I will add it at the end and let you decide whether it is a factor on its own.
The form factor
When searching for the term "good commit messages" you cannot get around the 50/72 rule. This rule describes the form a commit message should have. A very brief description would be:
It must have a summary of at most 50 characters as its first line,
then a blank line,
followed by the description of the commit. No line should be wider than 72 characters, so that in a terminal, which is normally 80 characters wide, a commit message can be read without scrolling left and right.
Tim Pope (the rule originated on his blog) explains the last point as follows:
On an 80 column terminal, if we subtract 4 columns for the indent on the left and 4 more for symmetry on the right, we’re left with 72 columns.
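To make the rule concrete, here is a minimal sketch of how it could be checked in Python. The function name check_50_72 and its parameters are made up for this illustration; they are not part of Git or of any existing tool.

```python
def check_50_72(message: str, summary_max: int = 50, body_max: int = 72) -> list:
    """Return a list of problems; an empty list means the message fits the rule."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["missing summary line"]

    problems = []
    # First line: the summary, limited to summary_max characters.
    if len(lines[0]) > summary_max:
        problems.append(
            f"summary is {len(lines[0])} characters, limit is {summary_max}"
        )
    # Second line: must be blank to separate summary and description.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    # Remaining lines: the description, each limited to body_max characters.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > body_max:
            problems.append(
                f"line {number} is {len(line)} characters, limit is {body_max}"
            )
    return problems
```

Because both limits are parameters, a team that likes the summary line but not the exact numbers could pass its own values.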
I like the idea of a headline or a summary at the beginning of a commit message, so that it is possible to skim the messages to find the commit I am searching for. But I am opposed to the 50 and 72 character limits. The rule applies to Git (to be fair, the title of the blog post is "A Note About Git Commit Messages"), but there are more version control systems around. I remember a time when SVN was the way to go, I think TFS is a valid alternative in the Microsoft world, and Mercurial, although not as widely used as Git, is still in use.
With this in mind, I think there is no clear form factor. I would also keep in mind that different tools show commit messages in different ways. If your log tool differs from the standard one, this might also change what a well-formatted commit message looks like to you.
So I would argue that there is no clear form factor which makes one commit message better than another, BUT if it is possible to write a short summary and a more detailed description, you should do so. It gives the reader the chance to decide whether the commit is relevant or not.
The Tools / The Environment
The tools you use make a big difference in how you provide information. Some tools provide the option to write a summary, others don’t. Sometimes you have great tooling with a very pleasant way of showing the message and its subject line, and sometimes you just have tools that run in the terminal, cmd or PowerShell.
Nonetheless, I am a strong advocate of tool-agnostic commit messages. If the team I am working in changed its tools, should this break our past commit messages? No, it shouldn’t! By tool, I don’t mean the software that is the underlying system, like Git, SVN or TFS. If you change that system, you will certainly have a break in the commit history. What I mean is this: if you use SVN, for example, and are used to TortoiseSVN on Windows, then you are used to a graphical user interface that shows you the messages. If you are not used to TortoiseSVN, or you prefer cmd/PowerShell, then the formatting of the message might look different, but it shouldn’t.
So we have three options at hand.
Write commit messages in a way that all tools will show them properly, and don’t use tool-specific formatting.
Stick to one tool and use it throughout your whole team.
Write your own tool or commit message styler for the terminal.
All three options are valid, but I would prefer option 1 or 3. Option 2 is my least favourite because I have encountered teams who changed their tools and then had problems reading the old messages.
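For option 3, a styler does not have to be big. The following is only a sketch under my own assumptions: the message is stored as plain text with a summary line, a blank line and paragraphs separated by blank lines. The name render_for_terminal and its parameters are hypothetical.

```python
import textwrap


def render_for_terminal(message: str, width: int = 80, indent: int = 4) -> str:
    """Format a plain-text commit message for a fixed-width terminal."""
    # Everything up to the first line break is treated as the summary.
    summary, _, body = message.strip().partition("\n")
    out = [summary.strip()]

    # Paragraphs are separated by blank lines; each one is re-wrapped
    # so that no line (including the indent) is wider than `width`.
    paragraphs = [p for p in body.split("\n\n") if p.strip()]
    for paragraph in paragraphs:
        out.append("")
        out.append(
            textwrap.fill(
                " ".join(paragraph.split()),
                width=width,
                initial_indent=" " * indent,
                subsequent_indent=" " * indent,
            )
        )
    return "\n".join(out)
```

The point of the design is that the stored message stays tool-agnostic, and the wrapping happens only at display time.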
Let’s come to the single most important factor when writing commit messages: what information should a good commit message contain, and what makes a message bad?
I will dive into the bad commit messages first because this will help us understand later why a commit message is good.
Bad commit messages
Bad commit messages are at best equivocal or (and this is mostly the case) have no discernible meaning at all.
| Commit message | What is wrong with it |
| --- | --- |
| Fixed an error | Here you can see that something is missing. You shouldn’t be left with questions that the commit message should answer. The first example doesn’t provide a short description of which error was fixed and why. |
| Cleanup code | The second also doesn’t give us any further information about why the cleanup was necessary and what "cleanup" means. Did the person remove commented-out code, fix spelling or fix the formatting? |
| (no message) | The third commit message isn’t even there. The person didn’t provide anything. It’s like he is saying "I did something and your first objective is to figure out what, because I am a lazy ass and I don’t care about you, your time and the company". I know that the message is saying this because this is one commit message I wrote. The reasons are plenty: I had to go home, I had to catch a bus, I had an important appointment, you name an excuse. The fact of the matter is, I was simply too lazy. |
| Changed naming removed bad behavior | Changed naming and removed bad behavior are also bad examples because they don’t tell us anything about the reasoning. |
| What have I done? | The last one at least shows us the same question we had in the first place. Reading this commit message, I feel a certain connection with the person who wrote it. This doesn’t help, but it gives me a warm feeling. Jokes aside (I am German, I am inherently not funny, neither do I have warm feelings), all of those commit messages lack the "why" behind the change and a brief description of what was changed. |
Good commit messages
Now that we have had a small insight into bad commit messages, let’s see what a commit message should have.
A good commit message…
tells us why we changed something
Good commit messages, first of all, contain the "why". What was the reason behind a change that was made? Software doesn’t change without a reason. Either your boss wanted a new feature, you found a bug, a colleague found a bug, or you got an error report or a ticket, etc. There is a reason behind a change and you should name it.
provides a simple description of what was changed
It is good to let the person reading the commit know what you have done in simple words. The important part is that it is written simply and summarizes the changes you have made. If you made a tricky change to the software, you could ask yourself: what should the other developers switching to this commit know?
Some developers prefer this information in the source code as a comment. Where it is placed best depends on how you and your team handle those kinds of matters, which also leads us to the last tip.
sticks to the guidelines
Good commit messages stick to the guidelines provided by the company someone is working for or by the author(s) of an open source project. Guidelines produce uniformity and make it easy to read commit messages. Guidelines also make it easier for the person writing the commit message, who doesn’t have to think about how to write the message and can instead concentrate on the information.
The previous three points are sufficient to write a good commit message but there are some options that teams might consider.
If you have a ticket system and you want to inform your reader which tickets the commit has fixed, then you can write this in the commit message. Although this sounds like a very good idea, there are still some points to consider.
Where to place the ticket number? I have seen messages with the ticket number in front of the summary line, at the end of the summary line or at the bottom of the description.
This question is so tricky because of the form and the environment factors. If your team sticks to a fixed-length summary, then the ticket number leaves you with even less room for a descriptive summary. This is especially true if your commit fixes two, three or even more tickets!
If your team switches ticket systems, then the old ticket number is stuck in the commit message.
If the ticket number is absolutely required, then I would place it at the bottom of the description with a prefix like Fixes:. If you later need to know which commit fixed a ticket, you can at least filter for that information.
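Filtering for such a prefix could look roughly like the sketch below. It assumes that the ticket numbers sit on a line of their own starting with Fixes: and are separated by commas. The ticket ids, the function names and the idea of passing the history as a dictionary are all invented for the example and don’t come from any particular version control or ticket system.

```python
import re

# A line that starts with "Fixes:" followed by one or more ticket ids.
FIXES_LINE = re.compile(r"^Fixes:[ \t]*(.+)$", re.MULTILINE)


def fixed_tickets(message: str) -> list:
    """Return the ticket ids named on 'Fixes:' lines of a commit message."""
    tickets = []
    for match in FIXES_LINE.finditer(message):
        for ticket in match.group(1).split(","):
            if ticket.strip():
                tickets.append(ticket.strip())
    return tickets


def commits_fixing(ticket: str, history: dict) -> list:
    """Return the commit ids whose message says it fixes `ticket`.

    `history` maps a commit id to its full commit message.
    """
    return [
        commit_id
        for commit_id, message in history.items()
        if ticket in fixed_tickets(message)
    ]
```

Keeping the ticket numbers out of the summary line means the summary stays free for the description, and a commit that fixes several tickets simply lists them all on one line.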
Information for the tester
Does your team have testers who need to see what was changed? If you don’t have an extra system for that, then it might be possible to provide this information in the ticket system. I wouldn’t recommend it, but I have at least heard that some companies do this.
The audience is a detail that extends the information factor and form factor.
The audience is a huge point when it comes to commit messages.
It begins with the language you write your commit messages in. If your company is based in just one country and this country is not an English-speaking country, then chances are high that your commit messages are written in that country’s language.
If your audience or you don’t speak English very well, then you should stick to the language that they or you speak. I know this is controversial because most developers want their messages to be in English, but the point is the amount of information provided. You should be able to write proper messages and you should be able to understand the messages of other developers.
Speaking of developers, does your audience consist solely of developers? I work for a company where designers, testers, managers and wood technicians have access to the version control and write commit messages. You should write your commit message with your audience in mind. Did you change something that affects the wood technicians? Don’t expect everybody to know what the "Pull up field" or "Pull up method" refactorings are. Instead write "Moved functionality from one class to another to enable reuse in other parts of the application". If your team consists solely of developers, then please write "Pull up method", because this is shorter and provides the same amount of information to the appropriate audience.
What’s your workflow?
How often do you commit? As I already stated, I commit far too rarely, which makes my commits too big. I then have to provide too much information and sometimes forget everything I have done. Even though I fail at this tip myself, I would encourage everybody to commit often and to make it part of the daily workflow.
I hope this gave you some new insights. Have a great day, and hopefully until next time.
¹ In my early years in programming, a colleague of mine justified bad commit messages and even bad naming in source code with the phrase "Words are just sound and air...": you can write whatever you like and mislead people with the written word, so it is a bad medium for useful information, and everything depends on the knowledge you can gain from how the source code works.
I can find much truth in his saying, but just as much misleading guidance. We have to consider what our alternatives for providing information are. We don’t have any other way of delivering information besides the written word. Of course words can mislead, but without any meaningful pieces of information in our source code or commit messages we would be searching endlessly. We would have to compile the source code and watch how it behaves to find out what it means. This would be the most reliable form of information, but also absolutely impractical.