The use of bleeding-edge technology in the enterprise can be a daunting prospect. There are bugs to deal with, nuances to learn and third-party libraries to overcome. Our team has been dealing with all of these issues over the past few months since one of our clients decided to use node for an upcoming project.

Node provides an event-driven JavaScript engine for the development of server-side applications. If the git logs are to be believed, the first commits to node were made on the 16th of February, 2009. That’s about a year and a half ago as I write this. As such, the surrounding community is still quite young. It’s this immaturity that has presented us with some of our biggest challenges while using node.

One early issue we ran into was template engines: it became apparent rather early on that our choice of template engines for node was somewhat limited. Although there are plenty of projects dedicated to delivering template engines for node, many are of a similar style: clones of HAML, cTemplate, ERB/EJS and a few other weird variants.

Our team started out using json-template, but found it cumbersome to structure our template data in the same way it was to be rendered (amongst other things). We then tried various other template engines and encountered similar issues. The nail in the coffin was that the client wasn’t too keen on the idea of using EJS or HAML style templates.

So we took matters into our own hands and churned out the initial version of jazz over the course of a few evenings.

Jazz is a simple but powerful templating engine that compiles down to JavaScript at runtime. It supports many of the features expected of a modern templating engine, its syntax is simple, and there are plenty of examples of how to use it. Feedback from the client has been positive, with jazz “just working” where other template engines were causing grief.

Of course, this isn’t to say that there is something wrong with all existing template engines for node: many people on the node mailing list are happy with things like mustache.js and the many HAML-esque engines out there. Our needs/wants are just a little different to what is currently popular in the node community.

So if you’re using node and the more popular template engines weird you out, consider giving jazz a try. You can get the (GPLed) source code using the following git command:

$ git clone git://github.com/shinetech/jazz.git jazz

If you like jazz, please help us make it better! We’re always looking for patches, so drop me a line (thomas [dot] lee [at] shinetech.com) with any improvements.

Session 3: Build In Enterprise, Ant, Maven, Dependency Management


  • ‘Better Builds with Maven’ has misled a lot of developers by encouraging the bad practice of using child modules. It was suggested that the book should be renamed to ‘Broken Builds with Maven’. There’s a new Maven book being written about using Maven the right way.
  • A good thing about Maven is that it encapsulates procedural steps in a declarative manner.
  • Teams that use Ant successfully tend to employ a certain level of standards and conventions.
  • Easy Ant - Ant with standards.
  • If Ant works for you, there is no reason to adopt Maven.
  • It’s always easier to figure out a Maven POM than to understand an Ant build script.
  • Builds are never easy. They look easy, but they’re not.
  • You want everyone in your team to be able to modify the code, but you don’t want everyone to modify the build.
  • You shouldn’t avoid maintaining the build script; the technical debt will be too much in the end.
  • The quality of artifacts available from the Maven central repository can be improved by making tests part of the development culture, a la Perl testing and CPAN.
  • The Maven central repository needs to improve its metadata quality; there are lots of broken POMs.
  • There’s an opportunity for an open quality dashboard system in the continuous integration marketplace.
  • How you build your product is often part of the audit process, especially when another company wants to acquire your product.
  • Sonar is handy for calculating your technical debt.
  • Crap4j scores code quality with its CRAP metric, which combines cyclomatic complexity with test coverage.
  • Avoid setting latest/SNAPSHOT as a dependency version so that bad releases don’t creep into your build; use a fixed version or a version range instead.

Session 4: Continuous Integration Tools

  • The pricing of CI tools can be roughly grouped into developer budget (i.e. free ones like Hudson, CruiseControl, Luntbuild, Buildbot), manager budget (i.e. not free but not expensive like Atlassian Bamboo), and director++ budget (e.g. Anthill Pro, IBM Rational BuildForge).
  • The price gap between the manager budget and the director budget is pretty wide because you can’t sell moderately expensive software.
  • Interestingly, AnthillPro finds projects which already use Hudson to be a good starting point to migrate from.
  • Deployment should be scripted: use the same script for all environments, and start using it as early as possible (in the dev environment).
  • When in doubt about Hudson scalability, add more slaves.
  • You’ll be surprised at how many projects out there still lack a proper build script.
  • Git bisection is handy for identifying the point of failure and rolling back.

Session 5: Continuous Deployment

  • IMVU deploys to production fifty times a day.
  • Flickr deploys 10+ times per day.
  • When you deploy to production on each commit, you care less about zero defects and more about having really good tests. And if there’s a defect, you handle it exactly once by adding more tests around it.
  • Deploying on commit allows developers to concentrate on changes in small, more manageable chunks.
  • Feature flags give you the opportunity to push bug fixes to production with half-baked new features disabled. You can then enable those features in the future when they’re ready.
  • Feature flags are also handy for A/B testing.
  • Deploying straight to production requires more care when database changes are involved, especially for backward patches.
  • Continuous deployment is not to be confused with cowboy development. Cowboy development is not about frequency but about a lack of quality. Continuous deployment cares about having good tests.
  • A large-corporate culture of having management sign off on every little change can be a blocker to employing continuous deployment. One solution is to have management sign off on the process instead of on each change.
  • Continuous deployment is not for all projects. It won’t work for legacy systems lacking good tests. It won’t work for projects with large data sets that require time-consuming post-deployment processing.
  • It’s questionable how well continuous deployment will work in the finance industry, where a single error can have legal implications.
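
The feature flag idea from the notes above can be sketched in a few lines of Ruby. This is only an illustration: the flag name :new_checkout and both helper methods are hypothetical, and a real system would normally read its flags from configuration or a database rather than from an in-memory hash.

```ruby
# Hypothetical helper: unknown flags count as disabled, so half-finished
# code stays dark until somebody explicitly switches it on.
def feature_enabled?(flags, name)
  flags.fetch(name, false)
end

# Hypothetical page code with an unfinished feature hidden behind a flag.
def checkout_page(flags)
  feature_enabled?(flags, :new_checkout) ? 'new checkout' : 'old checkout'
end

flags = { new_checkout: false }
puts checkout_page(flags)   # the old code path is served while the flag is off

flags[:new_checkout] = true # flip the flag once the feature is ready
puts checkout_page(flags)   # the new code path is served from now on
```

Because the flag is checked at runtime, the bug fix and the unfinished feature can ship in the same deployment, and enabling the feature later needs no new release.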

Day 2 ended with a retrospective session where many of us highlighted how well the open space format worked, and how good the quality of the discussions was. My favourite sessions were the ‘Evolution of Agile and Continuous Integration’ talk facilitated by Jeff, and the ‘Continuous Deployment’ talk facilitated by Nigel McNie. I learned a great deal from those discussions, I had my a-ha moments, and there were a number of ideas I could pass on to others in my organisation/team.

I believe that the main reason open space worked so well was the participants: everyone who was there was there because they cared, and the level of passion was higher than at other events I have attended in the past. CITCON rocked!

Day 2 started at 9am with the early attendees ‘refactoring’ the talks to fit into the schedule. This involved moving the most voted talks into larger rooms, combining talks on similar topics, and at the same time making sure that no facilitator ended up with multiple sessions running in the same time period. There were about 30 proposed talks for 20 time slots, with 4 sessions running in parallel per time period.



Refactoring session.


It was raining all day: typical Wellington weather, I was told.

And here are some short notes from the sessions I attended:

Session 1: ATDD and BDD


  • A good way to make developers realise the value of BDD is to involve them in a project they’re not familiar with that already has well-defined tests/behaviours. Some open source projects can be good examples.
  • BDD can be seen as a subset of ATDD; ATDD involves turning requirements into executables.
  • FitNesse works really well for some teams, others totally dislike it.
  • 100% control of large test data (e.g. in the finance industry) is often not possible, especially when the data can’t be easily copied (e.g. due to legacy system restrictions), or when multiple teams use the same set of data.
  • Some people strongly believe that web testing should cover the presentation layer only and hence can be kept minimal, with time better spent on heavily testing the data/model. Others disagree and believe that complete end-to-end web testing is always needed, and that nothing can replace humans.

Session 2: Evolution of Agile and Continuous Integration

  • Check out Rapid Development by Steve McConnell, published in 1996. You’ll soon realise that there are still many companies whose development practices are 15 years behind.
  • The pre- and post-chasm diagram: having started employing CI in 2002/3, did that make me an early adopter?
  • At the beginning, most people identified agile with XP: “Yea, we do XP!” Now, most people are able to differentiate between Scrum and XP.
  • Use CI Maturity Model to identify where your organisation is at in terms of CI practices.
  • That maturity model is also an excellent resource to convince others in your organisation that some existing policies need to be updated to allow transition to the next level.
  • Back then, only developers adopted agile practices. By now, QA is already involved. Next up are operations and infrastructure. My personal take is that this will happen 3-5 years from now.
  • Nowadays agile is _understood_ by development team and QA, the problem is with governance.
  • Jeff emphasized that the future will be in lean management. The key to lean is to reduce cycle time.
  • Sometimes a rubber chicken is all that’s needed to do continuous integration.
  • Would be nice to have a push-button deployment to production.
  • Someone from operations group must be brought into the team, be involved in the iterations/cycles/sprints, just like QA/testing is now.
  • Broken builds often get ignored. This is where the team must figure out what works for them, whether it be lava lamps, flashing lights, IRC notifications, or other feedback mechanisms.
  • The main problem with CI in the enterprise is when it gets centralized and development teams suddenly lose visibility. Instead of teams running their own CI tools (e.g. a build server under someone’s desk), it’s now the build engineers who control those tools and often restrict access.
  • When operations team is against the idea of transitioning CI to the next step, start with empathy and try to look at the situation from their point of view. Try showing them the business value behind the transition, involve someone from higher up in management.
  • The culture of an organization is often a blocker to further transition in the maturity model.
  • Know when to stop. When the culture resists further transition and you can’t change the culture, stop, you’re fighting a losing battle.

Continue reading CITCON Australia/New Zealand 2010 - Day 2 (Part 2).

I attended CITCON A/NZ 2010 in Wellington, New Zealand. CITCON is an unconference on Continuous Integration and Testing. It uses an open space format (which worked really, really well), and CITCON rocked! (more about this later)




CITCON breaks Wellington :)

My flight arrived at 3.30pm, and day 1 activities started at 6.30pm. Paul Julius and Jeffrey Fredrick kicked off the show by introducing what CITCON was about and how it was going to work. The rest was then up to us, the attendees.




PJ and Jeff.

The attendees then suggested the topics they would like to facilitate and voted on the ones they would like to attend. Given that it was my first open space conference, I wasn’t sure how open space could possibly work out. I read about how it’s _supposed_ to work, but I still wondered “really?”




I was introduced to the awesomeness of Sharpie.

The voting then continued during social hour at Foxglove. I had a chance to talk to Bruce Chapman, who suggested an informal Hudson BOF session on the pain and pleasure of using Hudson plugins. In summary, we agreed that Hudson core is solid, but that there’s a need for better orchestration of its myriad plugins. The question is who’s going to do it, because leaving it to the plugin maintainers clearly hasn’t worked out (myself included, guilty).

Does a plugin need to have five features when most people only need one? Should a plugin do one thing only and leave it to the users to combine the plugins? There’s also clearly a need for better plugin information/metadata: it’s currently hard to find out which plugins to use if you need to do certain things. Perhaps a set of Hudson recipes needs to be documented and shared on the wiki. Those were the questions and ideas that I would like to raise with others in the Hudson community.

That’s it for day 1, we basically prepared the topics followed by some informal discussions over drinks.
Continue reading CITCON Australia/New Zealand 2010 - Day 2 (Part 1)

On an early Rails project I elected to use instantiated fixtures. My rationale was that it would make my test code cleaner and easier to understand. Sure, test_helper.rb warned me that they would be slow, but how bad could it be?

Well, pretty bad as it turns out. The average execution time of my test suite (with 1052 tests) using instantiated fixtures was around 590 seconds. That’s almost 10 minutes.

Suspecting that it might be the instantiated fixtures that were slowing things down, I modified a couple of tests to see if they could be sped up by switching to standard fixtures. The results were promising, so I switched everything.

The average execution time fell to 123 seconds. That’s a factor of 4.77 difference. In other words, my test time dropped from almost 10 minutes to just over two minutes.

The funny part is that instantiated fixtures didn’t make the code that much cleaner anyway, the reason being that model names would often end up embedded in fixture names. For example, if I had a User model, I might end up referencing a fixture as follows:

@user_in_more_than_one_group

The fixture was named this way because it wouldn’t make much sense if the name wasn’t prefixed with ‘user’. But when you think about it, there’s not much difference between this and the following:

users(:in_more_than_one_group)

So when I shifted away from instantiated fixtures, my code got only slightly more verbose, but the test execution time dropped by almost 80%.
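
As a quick check of those figures, here is a small Ruby sketch using the two timings quoted above. The 590 second figure is approximate, which presumably explains why the result differs slightly from the 4.77 computed from the unrounded measurements.

```ruby
# Timings quoted in the post, in seconds (the first figure is approximate).
instantiated_seconds = 590.0
standard_seconds     = 123.0

# How many times faster the suite became.
speedup = instantiated_seconds / standard_seconds

# Share of the original running time that was saved, as a percentage.
reduction = (instantiated_seconds - standard_seconds) / instantiated_seconds * 100

puts format('speedup: %.1fx', speedup)
puts format('time saved: %.0f%%', reduction)
```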

Now it could be argued that these aren’t particularly good fixture names, and that if they are for testing specific scenarios, I’d be better off using something like Factory Girl or Machinist. That’s a good point, but unfortunately I’m not in a position to make that change at the moment.

The bottom line is that I got caught out by just how slow instantiated fixtures are. Your results may differ, but it’s definitely worth investigating if your tests are slow and you’re using instantiated fixtures.

You’ve probably heard it somewhere already: NoSQL is the new hotness. There are a growing number of weirdly named storage engines out there purporting to be part of the NoSQL movement. This post is the first of a small series about some recent work we’ve been doing with CouchDB. The project is still ongoing, but now seems like an appropriate time to cover what I’ve learned so far. If I’ve made any newbie mistakes, please feel free to flame me in the comments.

Anyway, let’s get on with it then. In this post, I’ll talk a little about the bits of CouchDB that have instantly appealed to me. The good bits.

CouchDB fundamentals are easy to understand

People rave about CouchDB’s performance compared with traditional RDBMSs and about its powerful replication features, but the thing that really stands out for me is how easy it is to understand and start using CouchDB. I love CouchDB’s simplicity, and I’m not just talking about its neat little web-based user interface, Futon. If you can largely comprehend JSON, JavaScript and HTTP, you’re well on the way to understanding the basics of CouchDB.

In a single CouchDB instance you can have any number of databases. Each database acts as an isolated namespace for a collection of documents. Each document is represented in CouchDB as a JSON object. Documents are not associated with one another in any meaningful way as they might be in an RDBMS. CouchDB databases may also have any number of “design documents”, which are special documents where you may write functions for views and validation functions, among other things.

Users define views in CouchDB by writing functions that map a document to one or more key/value pairs and, optionally, functions to somehow combine the aggregate values associated with each key. This approach is more popularly known as Map/Reduce. Out of the box, CouchDB supports JavaScript for writing Map/Reduce functions using Mozilla SpiderMonkey.
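
To make the idea concrete, here is a toy model of Map/Reduce in plain Ruby. It is only a sketch of the concept: real CouchDB views are JavaScript functions stored in a design document and run by the database, and the documents, field names and run_view helper below are all hypothetical.

```ruby
# Hypothetical documents, as they might look after JSON parsing.
docs = [
  { '_id' => 'a', 'type' => 'post',    'author' => 'tom' },
  { '_id' => 'b', 'type' => 'post',    'author' => 'ann' },
  { '_id' => 'c', 'type' => 'comment', 'author' => 'tom' },
  { '_id' => 'd', 'type' => 'post',    'author' => 'tom' }
]

# Map step: called once per document, emits zero or more [key, value] pairs.
map_fn = lambda do |doc|
  doc['type'] == 'post' ? [[doc['author'], 1]] : []
end

# Reduce step: combines all of the values emitted under a single key.
reduce_fn = ->(values) { values.sum }

# Toy view engine: map every document, group the pairs by key, then reduce.
def run_view(docs, map_fn, reduce_fn)
  emitted = docs.flat_map { |doc| map_fn.call(doc) }
  grouped = emitted.group_by(&:first)
  grouped.transform_values { |pairs| reduce_fn.call(pairs.map(&:last)) }
end

posts_per_author = run_view(docs, map_fn, reduce_fn)
posts_per_author.each { |author, count| puts "#{author}: #{count}" }
```

The map function decides which documents contribute to the view and under which key, while the reduce function only ever sees the values grouped under one key.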

With all this talk about JSON and JavaScript, you might be thinking that most of your interactions with CouchDB are going to involve JavaScript in one form or another — and you probably wouldn’t be far off the mark. By default, CouchDB supports JavaScript for Map/Reduce functions, but is capable of supporting any language for which a query server implementation exists. See CouchDB’s documentation on the topic for supported languages.

CouchDB’s API is HTTP and JSON

Another thing that makes CouchDB great for developers is its RESTful HTTP API and the use of JSON objects to represent documents and collections of documents. This gives us a lot of flexibility when choosing tools — flexibility that we probably wouldn’t have if we were working with custom, binary or closed protocols and/or data formats.

For example, I recently ran into something that many mainstream CouchDB client libraries seem to struggle with: stream-based parsing of JSON arrays. It was a simple matter to quickly hack together a solution to my particular problem using Jackson, JRuby and plain old HTTP. Running into this kind of limitation in a library supporting a closed protocol or even an open-but-unfamiliar binary protocol (as in MongoDB) would have required significantly more effort on my part.
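
As a rough illustration of how little tooling the HTTP and JSON API needs, the sketch below builds the requests for storing and fetching a document using nothing but Ruby’s standard library. The database name, document id and helper methods are hypothetical, and the base URL assumes a CouchDB instance on its default port (5984) on localhost. Nothing is sent over the network here.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Build (but do not send) a request that creates or updates a document.
# base_url assumes a local CouchDB instance on its default port.
def put_document_request(db, id, doc, base_url = 'http://localhost:5984')
  uri = URI.parse("#{base_url}/#{db}/#{id}")
  request = Net::HTTP::Put.new(uri.request_uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(doc)
  request
end

# Build a request that reads the same document back.
def get_document_request(db, id, base_url = 'http://localhost:5984')
  Net::HTTP::Get.new(URI.parse("#{base_url}/#{db}/#{id}").request_uri)
end

request = put_document_request('blog', 'first-post', { 'title' => 'Hello' })
puts "#{request.method} #{request.path}"
puts request.body

# To actually send it, with CouchDB listening on localhost:5984:
#   Net::HTTP.start('localhost', 5984) { |http| puts http.request(request).body }
```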

No Architectural Lock-in

CouchDB doesn’t force you into any arrangement of nodes. All CouchDB instances are equal, with no explicit master/slave relationships. Any CouchDB instance can push to any other CouchDB instance. This means you are free to pick an arrangement of Couch nodes that works for your particular application. This is probably to be expected; after all, NoSQL is about choice.

Easy Replication

Replicating data from one CouchDB database to another is a snap, even if the target database is over a network. Further, replication is very easy to set up and use. I’ve not yet had the opportunity to use this feature in the work we’re doing beyond a few simple tests, but it’s already quite clear that CouchDB’s support for replication is going to make our lives a lot easier when it comes time to scale out.
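
Replication is triggered through the same HTTP API, by posting a JSON body that names a source and a target to the _replicate endpoint. The sketch below only builds that request. The database name and remote host are hypothetical, and the helper method is not part of any client library.

```ruby
require 'net/http'
require 'json'

# Build (but do not send) a replication request for CouchDB's _replicate
# endpoint. Source and target may be local database names or full URLs.
def replication_request(source, target)
  request = Net::HTTP::Post.new('/_replicate')
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate({ 'source' => source, 'target' => target })
  request
end

# Hypothetical example: push the local 'blog' database to another node.
request = replication_request('blog', 'http://couch-b.example.com:5984/blog')
puts "#{request.method} #{request.path}"
puts request.body
```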

The Stuff You Get For Free

There’s a lot of stuff in CouchDB that you probably don’t experience directly, but that you will often hear touted as benefits. Some of these features are really “behind the scenes” and don’t necessarily jump out to slap you in the face. As a result, I think to a degree we take some of these more subtle CouchDB features for granted.

Append-only writes ensure that data corruption is not an issue.

Automatic conflict resolution ensures that peers can replicate between one another without needing too much hand-holding.

A Summary of the Good Bits

The system is conceptually simple. The JSON/HTTP API is beautiful and easy. Replication between nodes is a walk in the park. As a result, you can arrange CouchDB nodes in a way that suits you and your problem. Then, of course, there are all the little things that CouchDB does which you will likely take for granted until they save your skin.

There’s a lot to like about CouchDB, but it hasn’t been all roses. In the next part of this series, I’ll talk a little about the aspects of the system that have caused us some pain on a real-world project.

Late last December, Shine allowed me to spend a couple of days working on some Hudson contributions. I was planning to finish 3-4 plugins, but used up too much time trying various approaches and testing various scenarios. I actually ended up with unfinished implementations back then.

Fast forward to last week: I finally managed to spend a little of my own time finishing the plugin(s), and I also accidentally found a bug where Recorder#perform’s return value was ignored when determining build status. I contacted the dev list and Kohsuke has since fixed it in r29836; the fix will be included in Hudson 1.354.

The first plugin I worked on was SiteMonitor Plugin, which is used for monitoring a web site’s up/down status. At work we used to do this with Ant’s Conditions task, and I know other people who use wget and various other tools to achieve the same result. I just thought it would be much simpler for a Hudson user to be able to install a plugin and add some URLs to monitor without the need to implement any script.

Here’s what SiteMonitor Plugin looks like:

the report

job configuration

global configuration

This plugin is still at an early stage and currently only reports response code and up/down status of a web site. It doesn’t have any roadmap per se, but I think future versions of the plugin should have more Pingdom-like features and reports. If you have any request or suggestion, feel free to leave a comment on this blog or raise a JIRA issue and assign it to me.

The rest of my December effort went into JSLint reporting, which ended up as a simple enhancement to Violations Plugin. It was initially implemented as a standalone plugin that attempted to include a simple Java wrapper for JSLint, a parser for JSLint’s text output, and a report generator. That initial implementation was ditched when I found out that jslint4java already existed.

So I decided to leave the JSLint handling up to the project being built itself, and the plugin implementation only takes care of output parsing (jslint4java provides a nice XML output on top of JSLint’s default text output) and generating a report page. I used Checkstyle Plugin as a base for my implementation. But this effort was also then scrapped after I found out that Violations Plugin already provides all the groundwork necessary for code violations (duh!) reporting.

Adding JSLint as another type of violations was way too easy in Violations Plugin.

Here’s what the JSLint report looks like:

All in all, Hudson’s large array of plugins is proof of how extensible it is as a platform. Creating new plugins isn’t that hard and the Hudson API is quite pleasant to work with. I hope these two little contributions will be useful for someone out there.

Those who have completed the SCJP exam would recall some of the challenges faced while tackling weird looking code problems aimed at thoroughly testing your understanding of the Java language. While the SCWCD exam has minimal code related questions, which are easier to interpret, it does have its own series of challenges to overcome. Preparation is one of them and can be quite daunting for those new to JEE.

Before I begin, I would like to highlight some of the advantages I experienced after obtaining the certification. Firstly, it provides you with an in-depth knowledge of the JEE platform. Your mind may not retain much of what you studied over the passing months but the strong fundamentals do help you to question best practices when working on a project. Secondly, it also allows you to formulate solutions based on what JEE has to offer and thirdly, the fundamentals could also aid you to spot similarities with other web technologies on the market and eventually become proficient in them.

To ensure that your preparation is effective, it is very important to choose a good study guide. For this, I chose the Head First: Servlets and JSPs book. There are two reasons that made me use this book. Number one, it was recommended by my colleague Glen Worsley, who had already taken and passed the exam. Number two, Head First books market themselves as “Brain Friendly” guides. I cannot stress enough the importance of having a brain friendly guide for something as complex as JEE. Most textbooks are crammed with words and too much detail. The simple diagrams and humorous examples (it’s good to be idempotent!) do help the reader to retain important bits of information. There is plenty to remember, and selecting an exam date is as tough as hitting a moving target, as I came to learn later.

Progress was slow. The original plan was to complete a chapter a week. An ideal plan it was, but I only managed to read through about 2 or 3 chapters within a month. I dutifully did the questions in the book after each chapter and found them rather tough which is for my own good. I did reasonably well on the first few but I was having a really difficult time when I got to the chapters on expression language and JSTL. There are several ways to accomplish the same thing and one of the difficulties is in trying to remember the JEE APIs taught in the earlier chapters as well as the various JSTL/action tags, which exist for similar goals. A handy tip would be to memorise and understand the various possible tags within the deployment descriptor by heart. Doing so would mean that you have studied a few portions of several chapters in the book. Six months had passed by the time I finished the book and the chapter questions.

While searching the Internet for good SCWCD training tools, I eventually came across Enthuware’s software. It’s a really affordable exam simulator with about 8 mock exams and various questions for each chapter. Questions are categorized by their level of difficulty. I attempted the questions for all the chapters and found them not as tough as the ones in the book. Nevertheless, those questions were good practice as they helped me to gauge my strengths and weaknesses. These were presented as graphs, which are a good indicator of which chapters to re-study.

The mock exam questions found at the back of the Head First book are the penultimate gauge of your level of preparedness. The mock exam should be done only once, and only when you truly feel that you know JEE kung-fu. Doing the mock exam too many times can lead to inaccurate results and a false sense of security, which may lead to a severe defeat on exam day. Another technique I developed while revising for the exam was to tackle questions from random chapters in the book. Being able to do this indicates that you are well trained to handle any surprise attacks which you may not see coming. It also means that you have kept the study material from both the early and late chapters close to your heart.

I eventually picked a good date to sit for the exam and passed with a good score. As usual, I marked the unsure questions and did the ones I knew. I had about an hour of extra time by the time I finished my first run of the questions. I then used the remaining time to ponder on the marked questions. Months of preparation had finally paid off and I was able to enjoy a healthy balance of work and personal life once again. As with most exams, the preparation does require some amount of sacrifice to personal time, but the returns and benefits are rewarding.

This is a short one, but we’ve had to do it a couple of times so I thought I’d put it up.

As you may be aware, Rails 2.2 introduced a new format for test names. Where you once might have had:

def test_should_do_stuff
  ...
end

You can now have:

test 'should do stuff' do
  ...
end

We’ve found this much easier to read and type - especially when your test names start to get big, or would be more readable if they contained characters that aren’t valid in Ruby method names.

However, if you’ve got lots of existing tests and want to shift them across to this format, it can be a pain to do it manually. Unless you’ve got a Ruby script, of course:

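# Convert old-style `def test_foo_bar` methods into `test 'foo bar' do` blocks.
# Run from the 'test' directory; converted copies go into '<directory>.new'.
directory = 'unit'
Dir.mkdir("#{directory}.new")
Dir.entries(directory).each do |entry|
  original_filename = File.join(directory, entry)
  next if File.directory?(original_filename) # skips '.', '..' and subdirectories

  File.open(File.join("#{directory}.new", entry), 'w') do |new_file|
    File.foreach(original_filename) do |line|
      # Only rewrite method names that really start with 'test_'
      new_file.puts(line.sub(/def test_(\w+)/) { "test '#{$1.tr('_', ' ')}' do" })
    end
  end
end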

..which you’ll also find in this gist.

If you go into your ‘test’ folder and run this script, it’ll create a new directory called ‘unit.new’, that contains copies of all of your original unit tests, converted to the new format. Change ‘unit’ to ‘functional’, and it’ll do the same for your functional tests. Note that this will not do tests in subdirectories. YMMV. Feel free to steal and modify as you see fit.
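
If you do need subdirectories, one possible variation is sketched below. It is not the script from the gist: the two helper methods and the ‘*_test.rb’ naming convention are assumptions, and unlike the original it only copies files that match that pattern.

```ruby
require 'fileutils'

# Rewrite one line of old-style test code; other lines pass through unchanged.
def convert_test_line(line)
  line.sub(/def test_(\w+)/) { "test '#{$1.tr('_', ' ')}' do" }
end

# Convert every *_test.rb file under source_dir into target_dir,
# recreating the same subdirectory layout along the way.
def convert_test_tree(source_dir, target_dir)
  Dir.glob(File.join(source_dir, '**', '*_test.rb')).each do |path|
    relative = path.sub(/\A#{Regexp.escape(source_dir)}\/?/, '')
    destination = File.join(target_dir, relative)
    FileUtils.mkdir_p(File.dirname(destination))
    converted = File.readlines(path).map { |line| convert_test_line(line) }
    File.write(destination, converted.join)
  end
end

puts convert_test_line("  def test_should_do_stuff\n")

# From the 'test' directory you might then run:
#   convert_test_tree('unit', 'unit.new')
```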

Not that long ago I gave a demo in which I showed how it was possible to control iTunes from my laptop using native Ruby code.

This was all possible because of a great little gem called rb-appscript.

rb-appscript is a really neat Ruby to AppleScript bridge that gives you the power to effortlessly control any AppleScript aware applications from within your very own Ruby applications.

Launching iTunes and the DVD Player via irb

Assuming you have already installed the rb-appscript gem, open up a Terminal shell and launch irb. In just a few lines of code, we are going to open up iTunes from ruby.

>> require 'appscript'
=> true (notice above we don’t need to prefix the gem with rb-)
>> it = Appscript.app('iTunes')
>> it.run

Now how easy was that! iTunes started up as soon as we ran it.run. Obviously, if you already had iTunes running you would have been disappointed, because nothing would have happened.
You can use the same code for any application. To launch the DVD Player, for example, you could just as easily have run:

>> Appscript.app('DVD Player').run

Let’s tell iTunes what track we want to play

In the following example I am going to show you how you can retrieve a list of tracks from your Music library and tell iTunes to play our selection.

>> require 'appscript'
=> true (notice above we don’t need to prefix the gem with rb-)
>> it = Appscript.app('iTunes')
>> it.run

Print out our tracks we have available to us

>> track_count = 0
>> it.playlists["Music"].tracks.get.each do | track |
?> puts "#{track_count += 1}. #{track.artist.get} - #{track.name.get}"
>> end

OK, play the 3rd track

>> it.playlists["Music"].tracks[3].play

Magic happens: the song starts playing!

Script Editor + ASTranslate is your friend

The examples I have given you are very basic. If you want to do more, launch Script Editor and load the application’s dictionary, which lists all of the commands and properties available for that application.

ASTranslate is a great developer tool that lets you write normal AppleScript commands and attempts to translate them into the rb-appscript equivalent! So if you come across any cool AppleScripts out there, you may be lucky and be able to translate them into ruby code!

I was amazed at how easy it was for me to command iTunes to do what I wanted it to do all from within my ruby application.

Useful references

http://appscript.sourceforge.net
http://www.apeth.com/rbappscript/10examples.html