Wednesday, May 4, 2016

Code is the language, formatting is the dialect.

When working in a team environment with a large codebase, it quickly becomes apparent that the code itself is the primary manner in which the team communicates on a day-to-day basis. The code embodies the ideas and thoughts of the author. On the Google Chrome team, no change is ever committed until it is reviewed by the "owners" of the code. Owners are defined on a per-directory basis, such that any change to a file within that directory must be approved by one of the owners.


While I was working on the Embarcadero RAD Studio team, code reviews were also a critical part of the process, albeit not quite as strictly enforced as they are on the Chrome team. Because of that, moving to the Chrome team has felt familiar and natural.

I've always been a strong advocate for enforcing a strict code formatting style across the entire codebase. Unfortunately, over the years I've encountered many developers who felt that consistent code formatting merely slowed them down, or who felt their own style was "superior" and resisted any effort to conform to the team's style. There will almost always be some "wiggle-room" in a style standard to allow for variance or personal preference, but those allowances usually exist only on the fringe and don't interfere with the style as a whole.

So why should a team adopt a consistent style? To the compiler, the format is largely irrelevant (except for languages such as Python), right? While that is strictly true, the compiler isn't the only consumer of the code. As the title of this post says, the code is the language through which the team communicates. It's not the only way the team will or even should interact, but it is one of the most significant communication mediums. I remember countless times when team members were discussing a feature or idea and eventually someone would just say, "I don't really understand. Can you show me some code?" Then, once an example piece of code or even a working prototype was presented, the discussion would proceed with everyone on the same page.

If you can agree that code is one of the critical ways in which a team communicates, then the formatting of that code is just as important and can be seen as the "dialect" of that language. By strictly enforcing a common coding style, the team members can focus on what the code is saying, rather than how it is saying it. Within the last several years, I've found more evidence of this very concept in, ironically, the open-source community. I can say, without reservation, that most open-source projects require strict adherence to a specific coding style. The Google Chromium project is clearly one such project. I can also report that even throughout the non-public Google code, a very strict coding style and standard is kept; this is true for C++, Java, Go and even Python code.

So what is so important about maintaining a common coding style? If you're a good developer, you can read the code no matter what, right? Yes, I've had people tell me that. Almost without exception, this person has never worked on a large enough team or codebase. However, that's not always the case.

I remember a specific case on the C++ team while I was at Borland. It was memorable enough that many of the old-timers still talk about it today. Essentially, a developer was working on a rather large, complicated feature for the C++ compiler. When this developer finally went to commit their changes, they had also reformatted most of the codebase into their preferred coding style. This included files in which no other changes had been made!

Imagine the next developer coming along, pulling down the latest changes from the source control system, and trying to merge them into their own local changes. Because of all these format-only changes, it became nearly impossible to merge anything without going through every conflicted file and painstakingly reconciling the differences. At the time this happened, source-control systems weren't nearly as good as they are today; there was no Subversion, Git, Mercurial, or even StarTeam or Perforce. Source code merging was avoided at all costs. Reverting a change (as teams would do today) was as painful as the original commit, if not more so. Had there been a formal code-review process, the reviewers could have pushed back on all the formatting changes.
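
As a rough illustration of why format-only changes are so disruptive, here is a minimal Python sketch. It is not taken from any real codebase: the small C function and the helper names `changed_lines` and `tokens` are hypothetical, and the whitespace-based token comparison is only a crude stand-in for "the code says the same thing". The point is that a line-based diff, which is what merge tools work with, reports changes even when nothing meaningful has changed.

```python
import difflib

# Hypothetical snippet as it exists in the repository.
ORIGINAL = """\
int add(int a, int b)
{
    return a + b;
}
"""

# The same code after a format-only change (brace placement, indentation).
REFORMATTED = """\
int add(int a, int b) {
  return a + b;
}
"""


def changed_lines(before, after):
    """Count lines that a line-based comparison sees as removed or added."""
    a, b = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += (i2 - i1) + (j2 - j1)
    return changed


def tokens(source):
    """Split on whitespace only; a crude approximation of the code's content."""
    return source.split()


if __name__ == "__main__":
    print("same tokens:", tokens(ORIGINAL) == tokens(REFORMATTED))
    print("lines touched:", changed_lines(ORIGINAL, REFORMATTED))
```

Every line the diff counts as touched is a line that can conflict with someone else's local edit, so a reformat applied across most of a codebase turns nearly every pending change into a manual merge.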

It's been a bit of a role reversal to be on the receiving end of code-review remarks that admonish me to correct my code style and formatting. Those remarks are taken just as seriously as any other comment or question about the actual functionality of the code. While I'm still getting used to the formatting and style for the Chrome codebase, it's refreshing to see that the whole team is committed to ensuring that it is consistently applied.