One of the plan items for Eclipse 3.7 I’ve assigned is to migrate the Eclipse and Equinox build to run on the Hudson installation at eclipse.org. So I’ve spent a lot of quality time with Hudson lately. (Soon to be called Jenkins).
Today, I attended a webinar about optimizing Hudson in production hosted by Kohsuke Kawaguchi of CloudBees. Since my handwriting is similar to the last slide in this Oatmeal cartoon, I thought it would be better to jot my notes down in a blog post. As I was listening to his suggestions, I thought about how his seven best practices apply to our use of Hudson at Eclipse.
- Backup your Hudson install to protect from disaster and accidental configuration changes.
Image © davemorris, http://www.flickr.com/photos/davemorris/2395980976/sizes/z/in/photostream/ licensed under Creative Commons by-nc-sa 2.0
- $HUDSON_HOME on the master stores everything that needs to be backed up. The slaves doesn’t need to be backed up.
- Live backups are fine as configuration changes are atomic
- In addition to the backup of the configuration, a backup of the filesystem is necessary to provide consistent snapshot.
- Backups are great but can you restore? On a regular basis, try restoring your hudson configuration and starting up a test hudson instance with the restored hudson.war file.
- Prepare for disk utilization growth
- Space is more important than speed when ordering new drives
- Plan for expandable disk volumes
- If it’s too late, symlink is your friend
3. Use native packages
- For instance, *.deb, *.rpm. Easy to update and install, and they include init scripts.
- Configuration is stored in /etc/default/hudson and /etc/sysconfig/hudson
- Existing $HUDSON_HOME can be migrated
- Native Windows package is a work in progress
@Eclipse: Backups and disk configuration are managed by the webmasters, along with upgrading and installing hudson.
4. Distributed builds
- You will grow beyond a single system
- Increasing load on a single machine is not the only issue
- It’s cheaper to grow horizontally (additional machines) versus vertically (more memory and CPUs on a single machine)
- It’s also good to have isolation between build processes in case you want to diversify the test and build platforms
- Let master launch slaves via SSH or DCOM. This makes it easier to keep the cluster up and running.
- SSH public key authentication makes it easy to connect to slaves on disparate platforms.
@Eclipse: Yes, we have distributed builds and several platforms available for test purposes.
Treat build machines like livestock, not like pets
Image © davegroth, http://www.flickr.com/photos/davegroth/9107888/sizes/m/in/photostream/ licensed under Creative Commons by-nc-sa 2.0
- Build machines should be interchangeable
- Don’t tie builds to a specific build machine, unless there are platform limitations. This approach will yield better resource utilization and less downtime. Queued builds can use the next available machine instead of waiting for a specific machine.
- When using labels for your Hudson slaves, name them to reflect their capability. For instance, if the slave is Mac or Windows hardware, the label should reflect this.
- If there is a real reason to limit build to run on a single machine (such as wanting to run platform specific tests), tou can use boolean expressions with labels. For instance, this Hudson job shouldn’t run the slaves running Mac or Windows
@Eclipse: We use label our build machines as livestock. No pets for us. However, I’m quite envious that the Mozilla foundation has adopted red pandas (firefoxes).
6. Invest in a good URL
- By default Hudson runs on port 8080. This is difficult for users to remember.
- Use the host’s alias instead of the machine name. Thus if the machine names changes, the users won’t have to go to a new URL.
- Share port 80 with other applications using Apache reverse proxy with which will also allow you to run Hudson as a non-root user.
@Eclipse: https://hudson.eclipse.org is short and simple.
7. Try to keep the build records under control.
- Discard old build records if you can
- Removing old builds reduces start-up time and memory usage.
- Settings are on a per-project basis. For instance, here are settings for one of my Eclipse builds.
@Eclipse: Yes, we do clean up old builds. Also, the webmasters have created a cron job to send a friendly email every week listing of the Hudson disk utilization to remind the release engineers to clean house.
The slides are available here and obviously provide a lot more detail than my short summary. If you’re interested the best practices for configuring Hudson for a production environment that has room to grow, this presentation is provides very useful suggestions.