I recently read two academic papers that quantify release engineering practices in open source communities. Very interesting.
The first one, “An Empirical Study of Build Maintenance Effort” (PDF), looks at how the time spent maintaining the build impacts the developers in a project. It examines build coupling, which refers to how often changes in source code require changes in build code, as well as build ownership, which is the proportion of developers on the team who are responsible for maintaining the build. The projects studied were ArgoUML, Hibernate, Eclipse (SDK), Jazz, GCC, Git, Linux, Mozilla, PLplot and PostgreSQL.
Here are some snippets I found interesting in this paper. Note that when they refer to “Eclipse-core” they mean the Eclipse project’s build. In this paragraph, they are talking about build coupling and how to reduce it:
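To make the two metrics a bit more concrete, here is a rough sketch in Python. It is not the method the authors used; it is a simplified reading of the definitions above, assuming a commit history is available as a list of (author, changed files) records. The `Commit` class, the `BUILD_SUFFIXES` list and both function names are my own inventions for illustration.

```python
from dataclasses import dataclass

# Hypothetical list of file names treated as "build code" (an assumption,
# not a list taken from the paper).
BUILD_SUFFIXES = ("build.xml", "pom.xml", "Makefile", "build.properties")


@dataclass(frozen=True)
class Commit:
    author: str
    files: tuple  # paths changed by this commit


def is_build_file(path):
    """Return True if the path looks like a build specification."""
    return path.endswith(BUILD_SUFFIXES)


def touches_build(commit):
    return any(is_build_file(f) for f in commit.files)


def touches_source(commit):
    return any(not is_build_file(f) for f in commit.files)


def build_coupling(commits):
    """Fraction of source-changing commits that also change build files."""
    source_commits = [c for c in commits if touches_source(c)]
    if not source_commits:
        return 0.0
    coupled = [c for c in source_commits if touches_build(c)]
    return len(coupled) / len(source_commits)


def build_ownership(commits):
    """Fraction of all authors who have changed a build file at least once."""
    authors = {c.author for c in commits}
    if not authors:
        return 0.0
    build_authors = {c.author for c in commits if touches_build(c)}
    return len(build_authors) / len(authors)
```

With a history where one of four source commits also edits `build.xml`, `build_coupling` returns 0.25. A low value like that is the kind of result the paper reports for builds where developers rarely have to touch build code.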
“In Eclipse-core, the coupling is reduced to 16% and in Jazz the observed coupling is a mere 4%. Eclipse-core and Jazz both leverage the automated Eclipse Plugin Development Environment (PDE) build technology. ….. Since the developer must only maintain the high-level build.properties file (via the IDE), the daily build maintenance is reduced”.
Thank you PDE family. Dynamically generating Ant scripts at build time to compile and package our bundles makes us look good when compared with other open source projects.
“Other researchers have found that the Linux build engineers have spent a considerable amount of time to make their build system as simple as possible for developers, at the expense of a very complex and hard to maintain core build system machinery.”
People always assume that writing software is hard, but for some reason assume that building software should be easy. Build systems are complex beasts, and hiding that complexity from the user is just as difficult as it is when writing a full-scale application. Writing good software is difficult, and so is constructing an elegant and effective build system. It’s just a different skill set.
“Up to 79% of source code developers and 89% of test code developers are significantly impacted by build maintenance, yet investment in build experts can reduce the proportion of impacted developers by 22% of source code developers and 24% of test code developers.”
So it looks like if you hire a release engineer, the productivity of your developers will increase. You would buy a new machine to make the build faster, so why not hire a fantastic release engineer and make the build better? The numbers indicate a great return on investment.
Image ©algonquin_college, http://www.flickr.com/photos/algonquin_college/4971004199/in/photostream/ licensed under Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0)
The second paper of interest is entitled “The Evolution of Java Build Systems” (PDF).
Again it looks at open source build systems that use either Ant (ArgoUML, Tomcat, JBoss, Eclipse-core) or Maven (Hibernate and Geronimo). The authors find that the number of source lines of code being built (SLOC) is strongly correlated with the number of build lines of code (BLOC).
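For anyone who wants to try a similar check on their own project, here is a minimal sketch of a correlation calculation in Python. I am assuming a plain Pearson coefficient over per-release SLOC and BLOC counts; the paper may well use a different statistic, and the `pearson` helper and the sample numbers are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Invented numbers, one pair per release; not data from the paper.
    sloc_per_release = [120_000, 150_000, 210_000, 260_000, 340_000]
    bloc_per_release = [2_100, 2_600, 3_900, 4_400, 6_000]
    print(f"SLOC/BLOC correlation: {pearson(sloc_per_release, bloc_per_release):.3f}")
```

A value close to 1 would mean the build specifications grow in step with the source code, which is the pattern the authors describe.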
“Similar to Lehman’s first law of software evolution, build system specifications tend to grow over time unless explicit effort is put into refactoring them.”
“The Halstead complexity of a build system is highly correlated with the build system’s size (BLOC), indicating the BLOC is a good approximation of build system complexity”.
Lots of build code means increasing complexity. If you are only building a few bundles, your build is easier to understand. Makes sense.
From their analysis, they found that both Ant and Maven based builds evolved in a similar fashion.
I like this sentence:
“Despite the crucial role of build systems and their non-trivial maintenance effort, software engineering research rarely focuses on them”.
I’d like to thank the authors for conducting this research and look forward to reading more in the future. Build systems are complex systems and I welcome the efforts of these researchers to quantify ways they can be improved. And it’s extra special when they look at the projects that we work on every day!
It will never work in theory
Queen’s University Software Analysis and Intelligence Lab
Great post! I agree that more companies should invest in build and release engineers!
Makes a lot of sense. I remember the days before PDE build when we really did just have a few plugins to build. Then we all hand coded the compile-time class path. Got it wrong several times then finally got it working only to have to add some new dependency. PDE Build certainly has grown with time. In most cases though, due to specific needs.
I also agree that people most often see build as “the easy part”. How hard could it be? Compile a few classes, JAR them up. Voila! That's why for years I have challenged other build tech teams (e.g., Maven) to do a complete build of a product sdk with features, fragments, source, signing, zipping, p2 publishing, version management, tagging, … It's great to see that Tycho is pretty darn close now from what I hear and bnd build does some of the stuff. It's great for releng teams to have a choice of great technologies.
Anyone can compile a JAR. Few can build a system. My hat is off to those that do.
Thanks Jeff – I really liked your comment. I agree that different build technologies are great but at the same time they must prove that they can handle all our requirements.
Hope all is well on the west coast and you are enjoying your new job and city 🙂