Monday, April 30, 2007

Parallel or Serial...

Steve Trefethen has been posting some great information about our use of the Subversion source code control system and CruiseControl.NET.  One thing that I've noticed is that over the years version control systems have become more and more sophisticated and have finally begun to realize that version control is not just about files.  It is also about configurations.  They're also starting to realize that files are not islands unto themselves.  How many larger projects have you worked on where adding a feature, fixing a defect, or performing a refactoring operation affected only one file?  Project changes come in the form of “change-sets”: groups of interrelated files and configurations.

Prior to coming to work for CodeGear (and Borland), I was using a stand-alone version of PVCS and was manually doing check-ins and check-outs on a per-file basis.  So I already knew the value of version control systems, even ancient and early systems like PVCS and its predecessor, RCS.  One of my first tasks on the day I arrived at work was to “check out” the development tree from the PVCS archives.  What was different was that there was a whole suite of internally developed tools and wrappers around the PVCS command line, along with other command-line tools.  These tools allowed “scripting” of the check-in process, whereby the developer would fill in a “script” text file that contained references to all the files to be checked in, along with a space for a comment associated with each one.  There was also a global section for adding a summary comment that was supposed to describe in higher-level terms what was included in the check-in.  This information was appended to a running log file that QA used to figure out what R&D was actually checking in.

The one key point to all of this was that we never employed the kind of locking model that seems to be so prevalent among the vast majority of version control systems, including PVCS.  So I learned the modify-at-will and merge-when-needed mode of working with the source trees.  I remember getting into philosophical discussions with folks regarding the overall merits of the lock-modify-checkin model compared to the modify-at-will model.  I came to refer to these models as the Serial and Parallel models, respectively.  Over the years, there was a lot of effort put into layering tools over the existing version control systems in an attempt to “fix” many of their drawbacks.  When working in a parallel model, nobody is ever blocked from making changes, but that does require a little bit more from the tools.  The issue of an atomic check-in becomes way more important, since the archives can be changed by one person while another is in the midst of a bulk check-out.  Without atomicity, the chance of picking up a partial check-in is greatly increased.  Some folks tend to use these “drawbacks” as a reason for staying in the stone ages of lock-modify-checkin.  However, with the proper discipline, toolsets, and overall mindset, the parallel model is very liberating for a team.
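
To make the parallel model and the need for atomic check-ins a little more concrete, here is a minimal sketch in Python.  It is a toy, not how PVCS, Subversion, or our internal tools actually work; the names ToyRepo, checkout, and commit are made up for illustration.  The idea it tries to show is that nobody takes a lock, and a change-set is either accepted as a whole or rejected as a whole.

```python
# Toy model of the parallel (modify-at-will) workflow with atomic check-ins.
# ToyRepo, checkout, and commit are hypothetical names for illustration only;
# this is not how Subversion or PVCS is implemented.

class ConflictError(Exception):
    """Raised when a change-set touches a file that changed since check-out."""


class ToyRepo:
    def __init__(self):
        self.revision = 0   # one number identifies the state of the whole tree
        self.files = {}     # path -> (content, revision it last changed in)

    def checkout(self):
        """Return a working copy: current contents plus the base revision."""
        contents = {path: content for path, (content, _) in self.files.items()}
        return contents, self.revision

    def commit(self, base_revision, changes):
        """Apply every change in the change-set, or none of them."""
        # Validate the whole change-set before touching anything.
        stale = [path for path in changes
                 if path in self.files and self.files[path][1] > base_revision]
        if stale:
            raise ConflictError(f"merge needed for: {sorted(stale)}")
        # Only now mutate state, so nobody can ever observe half a check-in.
        self.revision += 1
        for path, content in changes.items():
            self.files[path] = (content, self.revision)
        return self.revision


repo = ToyRepo()
repo.commit(0, {"unit1.pas": "v1", "unit1.dfm": "v1"})   # becomes revision 1

# Two developers check out the same revision; nobody takes a lock.
_, alice_base = repo.checkout()
_, bob_base = repo.checkout()

alice_rev = repo.commit(alice_base, {"unit1.pas": "alice", "unit1.dfm": "alice"})

try:
    # Bob's change-set overlaps Alice's on unit1.pas, so all of it is rejected,
    # including the brand-new readme.txt.
    repo.commit(bob_base, {"unit1.pas": "bob", "readme.txt": "bob"})
    bob_rejected = False
except ConflictError:
    bob_rejected = True
```

In this sketch Bob was never blocked from working; he only finds out at check-in time that he has to merge, and the repository never ends up holding part of his change-set.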

It seems that after many years of debate, the parallel model is beginning to hit the mainstream.  It kind of started with the introduction of CVS.  However, CVS suffered from one major problem: the lack of an atomic check-in model.  It also suffered from not being able to handle binary files in a very efficient manner.  Subversion was introduced to solve these (and other) issues.  In fact, Subversion's main charter was to supplant CVS as the dominant version control system, open source or otherwise.  I must say that after using Subversion on the Developer Studio team for a little over a year, I think it has a real shot at retiring CVS.

So, how does your team work?  Serial or Parallel, and why?  I do understand some of the virtues of the lock-modify-checkin model.  But as your team grows and/or becomes more geographically dispersed, you may begin to notice that team members are forcefully breaking locks in order to get their stuff checked in.  One bit of advice I can give you and your team, should you ever move to a parallel model, is to get yourself a very good source code differencing/merge tool such as Beyond Compare or Araxis Merge.  Also make sure *all* your .dfm files are checked in as text (this allows easy diff'ing and merging of changes).

So, there's your assignment.  Let me know how your team works, what version control system you use, and why.  One's choice of VCS can sometimes elicit nearly the same passionate responses as one's choice of development tool.  I'm looking for objective analysis and an understanding of why you came to that decision.  Things like, “I use XXX because everything else sucks!” are simply not helpful.

I'll start with why we chose Subversion.  It closely matched our overall development process, had true atomic check-ins and an extensive and growing number of supporting tools, and handled renames, moves, and deletes very well.  Every check-in is represented by a single revision number, so the entire source-code tree can be checked out using only that one number.  No need to use error-prone dates or create a tag/label for every check-in.  It was also a natural fit for our extensive use of the CruiseControl.NET continuous integration server.
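
As a rough illustration of why one revision number per check-in is so handy, here is a small Python sketch.  It assumes a made-up history list of change-sets and a hypothetical tree_at helper; it says nothing about how Subversion actually stores its data.  The point is only that a single integer is enough to name a complete, consistent snapshot of the tree.

```python
# Hypothetical history: each entry is one atomic check-in (a change-set).
# The position in the list, counting from 1, plays the role of the revision number.
history = [
    {"main.pas": "v1", "form.dfm": "v1"},      # revision 1: initial import
    {"main.pas": "v2"},                        # revision 2: edit one file
    {"form.dfm": None, "about.dfm": "v1"},     # revision 3: delete one file, add another
]


def tree_at(revision):
    """Rebuild the whole tree as it looked right after the given revision."""
    tree = {}
    for changeset in history[:revision]:
        for path, content in changeset.items():
            if content is None:
                tree.pop(path, None)   # None marks a deletion in this sketch
            else:
                tree[path] = content
    return tree


snapshot_2 = tree_at(2)
snapshot_3 = tree_at(3)
```

A build server only has to record which revision it built; anyone can later ask for exactly that tree without relying on a date or a label.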