Archive for the Tools Category

The use of bleeding-edge technology in the enterprise can be a daunting prospect. There are bugs to deal with, nuances to learn and third-party libraries to overcome. Our team has been dealing with all of these issues over the past few months, ever since one of our clients decided to use node for an upcoming project.

Node provides an event-driven JavaScript engine for the development of server-side applications. If the git logs are to be believed, the first commits to node were made on the 16th of February, 2009. That’s about a year and a half ago as I write this. As such, the surrounding community is still quite young. It’s this immaturity that has presented us with some of our biggest challenges while using node.

One early issue we ran into was template engines: it became apparent rather early on that our choice of template engines for node was somewhat limited. Although there are plenty of projects dedicated to delivering template engines for node, many are of a similar style: clones of HAML, cTemplate, ERB/EJS and a few other weird variants.

Our team started out using json-template, but found it cumbersome to structure our template data in the same way it was to be rendered (amongst other things). We then tried various other template engines and encountered similar issues. The nail in the coffin was that the client wasn’t too keen on the idea of using EJS or HAML style templates.

So we took matters into our own hands and churned out the initial version of jazz over the course of a few evenings.

Jazz is a simple but powerful templating engine that compiles templates down to JavaScript at runtime. It supports many of the features expected of modern templating engines with a simple syntax, and there are plenty of examples of how to use it. Feedback from the client has been positive, with jazz “just working” where other template engines were causing grief.

Of course, this isn’t to say that there is something wrong with all existing template engines for node: many people on the node mailing list are happy with things like mustache.js and the many HAML-esque engines out there. Our needs/wants are just a little different to what is currently popular in the node community.

So if you’re using node and the more popular template engines weird you out, consider giving jazz a try. You can get the (GPLed) source code using the following git command:

$ git clone git://github.com/shinetech/jazz.git jazz

If you like jazz, please help us make it better! We’re always looking for patches, so drop me a line (thomas [dot] lee [at] shinetech.com) with any improvements.

You’ve probably heard it somewhere already: NoSQL is the new hotness. There is a growing number of weirdly named storage engines out there purporting to be part of the NoSQL movement. This post is the first of a small series about some recent work we’ve been doing with CouchDB. The project is still ongoing, but now seems like an appropriate time to cover what I’ve learned so far. If I’ve made any newbie mistakes, please feel free to flame me in the comments.

Anyway, let’s get on with it then. In this post, I’ll talk a little about the bits of CouchDB that have instantly appealed to me. The good bits.

CouchDB fundamentals are easy to understand

People rave about CouchDB’s performance compared with traditional RDBMSs and about its powerful replication features, but the thing that really stands out for me is how easy it is to understand and start using. I love CouchDB’s simplicity, and I’m not just talking about its neat little web-based user interface, Futon. If you have a reasonable grasp of JSON, JavaScript and HTTP, you’re well on the way to understanding the basics of CouchDB.

In a single CouchDB instance you can have any number of databases. Each database acts as an isolated namespace for a collection of documents. Each document is represented in CouchDB as a JSON object. Documents are not associated with one another in any meaningful way as they might be in an RDBMS. CouchDB databases may also have any number of “design documents”, which are special documents where you may write functions for views and validation functions, among other things.

Users define views in CouchDB by writing functions that map a document to one or more key/value pairs and, optionally, functions to somehow combine the aggregate values associated with each key. This approach is more popularly known as Map/Reduce. Out of the box, CouchDB supports JavaScript for writing Map/Reduce functions using Mozilla SpiderMonkey.

With all this talk about JSON and JavaScript, you might be thinking that most of your interactions with CouchDB are going to involve JavaScript in one form or another — and you probably wouldn’t be far off the mark. By default, CouchDB supports JavaScript for Map/Reduce functions, but is capable of supporting any language for which a query server implementation exists. See CouchDB’s documentation on the topic for supported languages.
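To make the Map/Reduce idea a little more concrete, here is a minimal sketch of what a view conceptually does. It is plain Python running over an in-memory list, not CouchDB’s own JavaScript view API, and the document shape (`type`, `tags`) and the function names are assumptions made up purely for illustration. Real CouchDB views have extra details (such as re-reducing partial results) that this ignores.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit zero or more (key, value) pairs for one document."""
    if doc.get("type") == "post":
        for tag in doc.get("tags", []):
            yield tag, 1


def reduce_values(key, values):
    """Reduce step: combine all the values emitted for one key."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Apply map_fn to every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Hypothetical documents, shaped like the JSON objects a database might hold.
docs = [
    {"_id": "a", "type": "post", "tags": ["couchdb", "nosql"]},
    {"_id": "b", "type": "post", "tags": ["couchdb"]},
    {"_id": "c", "type": "comment"},
]

tag_counts = run_view(docs, map_doc, reduce_values)
print(tag_counts)  # {'couchdb': 2, 'nosql': 1}
```

Note that the comment document simply emits nothing: a map function is free to ignore documents it doesn’t care about.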

CouchDB’s API is HTTP and JSON

Another thing that makes CouchDB great for developers is its RESTful HTTP API and the use of JSON objects to represent documents and collections of documents. This gives us a lot of flexibility when choosing tools — flexibility that we probably wouldn’t have if we were working with custom, binary or closed protocols and/or data formats.

For example, I recently ran into something that many mainstream CouchDB client libraries seem to struggle with: stream-based parsing of JSON arrays. It was a simple matter to quickly hack together a solution to my particular problem using Jackson, JRuby and plain old HTTP. Running into this kind of limitation in a library supporting a closed protocol or even an open-but-unfamiliar binary protocol (as in MongoDB) would have required significantly more effort on my part.
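To give a feel for how little machinery “plain old HTTP” needs, here is a rough sketch using only Python’s standard library. It builds (but deliberately does not send) the request that would store a JSON document. The host, database name, document ID and document contents are all hypothetical; 5984 is CouchDB’s usual default port.

```python
import json
import urllib.request
from urllib.parse import quote


def build_put_document(base_url, database, doc_id, document):
    """Build (but do not send) an HTTP PUT that stores `document` as JSON."""
    url = f"{base_url}/{quote(database, safe='')}/{quote(doc_id, safe='')}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical server, database and document.
request = build_put_document(
    "http://localhost:5984", "blog", "first-post", {"title": "Hello"}
)
print(request.get_method(), request.full_url)

# Actually sending it would be a single extra call:
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```

Because the request is just a URL plus a JSON body, swapping in a different HTTP client or JSON parser (a streaming one, say) doesn’t require any CouchDB-specific library at all.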

No Architectural Lock-in

CouchDB doesn’t force you into any arrangement of nodes. All CouchDB instances are equal, with no explicit master/slave relationships. Any CouchDB instance can push to any other CouchDB instance. This means you are free to pick an arrangement of Couch nodes that works for your particular application. This is probably to be expected, after all: NoSQL is about choice.

Easy Replication

Replicating data from one CouchDB database to another is a snap, even if the target database is on the other side of a network, and it takes very little to set up. I’ve not yet had the opportunity to use this feature in the work we’re doing beyond a few simple tests, but it’s already quite clear that CouchDB’s support for replication is going to make our lives a lot easier when it comes time to scale out.
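As a rough illustration of why replication feels so easy, triggering it is a single HTTP POST to the `_replicate` resource with a small JSON body naming a source and a target. The sketch below only builds that request with Python’s standard library; the host names and database name are hypothetical, and a real deployment may well need authentication on top.

```python
import json
import urllib.request


def build_replication_request(base_url, source, target):
    """Build (but do not send) a POST asking a CouchDB node to replicate."""
    payload = {"source": source, "target": target}
    return urllib.request.Request(
        f"{base_url}/_replicate",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical: push the local "blog" database to a remote node.
replication = build_replication_request(
    "http://localhost:5984",
    "blog",
    "http://backup.example.com:5984/blog",
)
print(replication.get_method(), replication.full_url)
```

The source or the target can be a plain database name on the local node or a full URL to a database on another node, which is what makes the “any node can push to any other node” arrangement possible.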

The Stuff You Get For Free

There’s a lot of stuff in CouchDB that you probably don’t experience directly, but that you will often hear touted as benefits. Some of these features are really “behind the scenes” and don’t necessarily jump out to slap you in the face. As a result, I think to a degree we take some of these more subtle CouchDB features for granted. Two examples:

Append-only writes ensure that data corruption is not an issue.

Automatic conflict resolution ensures that peers can replicate between one another without needing too much hand-holding.

A Summary of the Good Bits

The system is conceptually simple. The JSON/HTTP API is beautiful and easy. Replication between nodes is a walk in the park. As a result, you can arrange CouchDB nodes in a way that suits you and your problem. Then, of course, there are all the little things that CouchDB does which you will likely take for granted until they save your skin.

There’s a lot to like about CouchDB, but it hasn’t been all roses. In the next part of this series, I’ll talk a little about the aspects of the system that have caused us some pain on a real-world project.

Late last December, Shine allowed me to spend a couple of days working on some Hudson contributions. I was planning to finish 3-4 plugins, but used up too much time trying various approaches and testing various scenarios. I actually ended up with unfinished implementations back then.

Fast forward to last week: I finally managed to spend a little bit of my own time finishing the plugin(s), and along the way I accidentally found a bug where the return value of Recorder#perform was being ignored when determining the build status. I contacted the dev list and Kohsuke has since fixed it in r29836; the fix will be included in Hudson 1.354.

The first plugin I worked on was SiteMonitor Plugin, which is basically used for monitoring a web site’s up/down status. At work we used to do this with Ant’s Conditions task, and I know other people who use wget and various other tools to achieve the same result. I just thought it would be much simpler for a Hudson user to be able to install a plugin and add some URLs to monitor, without the need to write any scripts.
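For anyone curious what the script-based approach boils down to, here is a minimal sketch of an up/down check in Python. It is not the plugin’s code (the plugin is a Java Hudson plugin), and the rule that any 2xx or 3xx response counts as “up” is my own assumption for illustration.

```python
import urllib.error
import urllib.request


def is_up(status_code):
    """Assumed rule: any 2xx or 3xx response code counts as 'up'."""
    return 200 <= status_code < 400


def check_site(url, timeout=10, opener=urllib.request.urlopen):
    """Return (status_code, 'UP' or 'DOWN') for one URL.

    status_code is None when no HTTP response was received at all.
    `opener` is injectable so the function can be tried without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            code = response.status
    except urllib.error.HTTPError as error:
        code = error.code  # the server answered, but with a 4xx/5xx code
    except (urllib.error.URLError, OSError):
        return None, "DOWN"  # DNS failure, refused connection, timeout, ...
    return code, ("UP" if is_up(code) else "DOWN")


# Example (performs a real request, so it is left commented out):
#   print(check_site("http://example.com/"))
```

Even a sketch this small has to decide how to treat redirects, timeouts and error codes, which is part of why a ready-made plugin is appealing.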

Here’s what SiteMonitor Plugin looks like:

the report

job configuration

global configuration

This plugin is still at an early stage and currently only reports the response code and up/down status of a web site. It doesn’t have a roadmap per se, but I think future versions of the plugin should have more Pingdom-like features and reports. If you have any requests or suggestions, feel free to leave a comment on this blog or raise a JIRA issue and assign it to me.

The rest of my December effort went to JSLint reporting, which ended up as a very simple enhancement to Violations Plugin. It was initially implemented as a standalone plugin. My first attempt included a simple Java wrapper for JSLint, a parser for JSLint’s text output, and a report generator. That initial implementation was ditched when I found out that there’s already jslint4java.

So I decided to leave the JSLint handling up to the project being built, with the plugin only taking care of output parsing (jslint4java provides a nice XML output on top of JSLint’s default text output) and generating a report page. I used Checkstyle Plugin as a base for my implementation. But this effort was also scrapped after I found out that Violations Plugin already provides all the groundwork necessary for code violations (duh!) reporting.

Adding JSLint as another type of violations was way too easy in Violations Plugin.

Here’s what the JSLint report looks like:

All in all, Hudson’s large array of plugins is proof of how extensible it is as a platform. Creating new plugins isn’t that hard and the Hudson API is quite pleasant to work with. I hope these two little contributions will be useful to someone out there.

The following is a write-up of the highlights of the Sun Developer Day, which I just attended. The day kicked off early with the usual registration and light refreshments before moving on to the ballroom for the opening keynote by Sun’s Director of Technology Outreach, Reginald Hutcherson.

The keynote addressed the possibilities of JavaFX in the non-PC arena, such as televisions and mobile phones or, according to Reginald, “any screen you will ever come across”. That is a bold claim for a new technology in a world with existing competitors such as Flex, Silverlight and GWT, so it will be interesting to see how JavaFX performs in 2009. The keynote wrapped up by advocating the use of open-source technology, which was not surprising.

The demo shootout showcased the more interesting bits of the event. The first demo showed off the capabilities of JavaFX: widget animation, a Flickr demo where clickable images are downloaded on the fly while floating across a canvas and, finally, a video puzzle game with the video playing in the background. Performance was great and seamless, considering that the demos were all run on MacBook Pros. The next demo showcased a compiz/beryl-like desktop for OpenSolaris and virtualization using VirtualBox, which is now owned by Sun. Next up was a quick 5-minute walkthrough of using NetBeans to produce a JavaFX app. The walkthrough showed off tools within NetBeans such as drag-and-drop code generation and on-the-fly coding and previewing, which is really impressive from a usability point of view. A feature that allows importing of Illustrator-created graphics was also mentioned, although it was never demonstrated.

The next session discussed the direction in which Java SE is heading and covered topics such as closures, Java Modules (JAM - JSR 277) and the use of annotations in Swing for things like event handling. While the morning covered the main highlights, the rest of the day was spent on code demonstrations on various topics such as REST, more JavaFX, SOA, MySQL, DTrace and xVM (VirtualBox). We also got to see many of the NetBeans features that were used during the demos. Sun is trying to market NetBeans to a wider range of developers (PHP, Ruby, etc.), which may be a challenge given how many have already got used to the Eclipse environment.

In general, the demos were well conducted and interesting. However, a few were done in a hurry or needed more time for explanation, as they covered heavy topics (e.g. SOA with OpenESB and Java CAPS). Unfortunately, there was not enough time for a Q&A session. It’s understandable that Q&As can take up a lot of time, but I’m sure there is always time for 1 or 2 questions.

Personally, my main aim in attending the event was to learn more about JavaFX. I was impressed with what I saw and I appreciate the fact that Sun is stepping up its game in the IDE market. This is important if it is to compete with the likes of Microsoft’s Visual Studio. However, one nagging question remains: will JavaFX be able to compete with other web technologies in terms of performance on older hardware, or will it be another slow Swing app?

One of my pet hates in software development is repetitive tasks: a complicated deployment process, tricky configuration of an application, repetitive editing motions that are just a little too messy for a find/replace. All of the above have, in the past, killed my concentration, dulled my senses and otherwise numbed my brain. I quickly realized that learning the ins and outs of a few powerful tools can take away a lot of the drudgery that sometimes crops up in my day-to-day work. Further, combining these tools makes it possible to automate just about everything that I need to do on a regular (or even not-so-regular) basis.

What’s important about everything that follows isn’t necessarily the process or the tools themselves, but what you get out of learning them. This is just what works for me, accumulated over a few years.

0. Pick a Good Editor (or IDE)

This first point is a half-hearted gesture because I don’t actually expect anyone to listen to me.

Use vim.

In all seriousness, if your editor doesn’t support basic macros and/or keyboard recording you’re going to encounter situations where you get slowed down by dull, repetitive editing. Even in an IDE that supports refactoring, macros will save you time and brain-dead copy/pastes. Automating repetitive tasks in editing source code is one of the simplest ways you can help yourself.

1. Learn Your Shell (or Learn Your IDE)

I think the Pragmatic Programmer said it well before I even knew what a “shell” was: the command-line is powerful.

Becoming very familiar with a powerful shell is one of the best things you can do for yourself as a developer. Yes, the syntax can be funky and the semantics confusing but over time you come to love it. For me, it’s the Unix shell: in *nix, the shell becomes the glue between all the other tools you use to get work done — including custom scripts and programs that take care of the heavy lifting for you. Windows users might be interested in PowerShell, which is a similar offering from Microsoft — you have the power of the .NET framework at your fingertips.

I understand, too, that folks accustomed to IDEs tend to get a little irked by shell programming. I still strongly recommend giving it a go, but scripting or extending your IDE to automate certain tasks you perform on a regular basis is a great start. If you find you lose a lot of time uploading and modifying configuration files to various servers, you may want to write an Eclipse plugin to make that process as simple as the click of a button.

Whatever path you choose — learning the more exotic aspects of a shell or the plugin API of your favourite IDE — learn it well. The only way to learn it well is to use it all the time.

2. Learn a Dynamically Typed Programming Language

If your shell scripts start to look like a train wreck when you need to perform complex operations, you may want to consider falling back to a scripting language: Perl, Python and/or Ruby are all great choices — bonus points if you mix the three to meet your needs. Languages like these will give you a more structured environment in which to express what you’re trying to achieve (at the cost of it being slightly more difficult to invoke external processes than, say, Bash — a standard Unix shell).

I think it’s important to note that I mention “dynamically typed” languages here because they generally don’t have a separate compilation step: bytecode compilation occurs as part of the execution of a Perl/Python/Ruby program. Thus, a small change doesn’t require an ant/make call to pull your code together again. Remember: you’re trying to automate a process, not create more process. Build management is overhead.

By the same token, if your program needs low-level access to hardware or raw speed for processing large quantities of data — don’t be afraid to fall back to a language like C or Java if you’re sure of your reasons for doing so. Needless to say, situations such as this are rare but not impossible.

3. Appreciation of the Supporting Cast

ssh, scp, rsync, svn/cvs/git, diff, patch, wget, sed, grep, find … the list of handy *nix command-line tools goes on and on. These are smaller tools that, with a good shell, can easily be combined to automate more complicated processes. The output of these programs can also be piped into a script of your own to perform more complicated operations; the output of that script can then be piped into another program, and so on.
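As a small sketch of the “script of your own” end of such a pipeline, the Python below filters lines and then summarises them, much like a tiny grep followed by a counting step. The log format (host name first, then a level) is invented for the example, and the sample data stands in for whatever an upstream command would have produced.

```python
import re


def matching_lines(lines, pattern):
    """Yield only the lines that match `pattern`, like a tiny grep."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def count_by_first_word(lines):
    """Count lines by their first whitespace-separated word."""
    counts = {}
    for line in lines:
        words = line.split()
        if words:
            counts[words[0]] = counts.get(words[0], 0) + 1
    return counts


# Hypothetical log lines: "<host> <level> <message>".
sample = [
    "web1 ERROR disk full\n",
    "web2 INFO started\n",
    "web1 ERROR timeout\n",
    "web3 ERROR timeout\n",
]

errors_per_host = count_by_first_word(matching_lines(sample, r"\bERROR\b"))
print(errors_per_host)  # {'web1': 2, 'web3': 1}
```

Both functions accept any iterable of lines, and `sys.stdin` is one too, so passing `sys.stdin` instead of `sample` would let the same script sit at the end of a shell pipe.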

This is the big thing I miss when working in Windows: even with PowerShell in play to alleviate the pain that is DOS, it has terrible tool support out of the box. Having the .NET framework on hand is great, but even that’s no substitute for the *nix command-line tools mentioned above. Rant aside, there are ways around most of the shortcomings in Windows and it’s possible to emulate some of the more useful command-line *nix tools. It just means you may need to put in more effort than a *nix counterpart at times.

For folks interested in automating with their IDE, your “supporting cast” might mean third party Java/.NET libraries which your automation plugin(s)/scripts can use to perform more complicated operations.

4. Invest Time in Learning Your Tools

Really, it all comes down to knowing the tools from steps 1-3 backwards. Whether you work on Windows, Linux or OS X, you should be an expert in how your tools work and ideally use them on a daily basis. Really getting to know your tools might mean stepping out of your comfort zone a little to learn the nuts and bolts of how your shell ticks, or taking some time out of your day to learn certain esoteric API calls, or the semantics of some new construct in your scripting language of choice.

The goal of such practice is to surround yourself with an easily extensible suite of tools where every action feels like it’s an obvious choice. These tools will help to take away some of the unavoidable “noise” on the fringes of any software development project, and let you get on with getting your job done.

5. Write Lots of Scripts to Automate Common Tasks

This is at the core of automation: combine the above tools to get things done. Copy files to a server using scp, have your script extract and install it via ssh, manipulate configuration files using sed, grep and perl before finally starting the application and storing the process ID in a file with a quick shell command. All without you typing much more than “deploy”.

Automated tasks should be easy to kick off, require little-to-no input from you and — ideally — be fast, so you don’t have to wait long to determine if anything went haywire.
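Here is a rough sketch of what such a “deploy” script might look like in Python. Everything specific in it is hypothetical: the host, the paths, the `app.conf` setting being rewritten and the `start.sh` launcher are made up for illustration, and `sed -i` assumes GNU sed on the server. It defaults to a dry run that only prints the commands it would execute.

```python
import os
import shlex
import subprocess


def deploy_commands(archive, host, remote_dir):
    """Return the argv list for each deployment step, in order."""
    name = os.path.basename(archive)
    q_dir = shlex.quote(remote_dir)
    q_archive = shlex.quote(f"{remote_dir}/{name}")
    return [
        # 1. Copy the build to the server.
        ["scp", archive, f"{host}:{remote_dir}/"],
        # 2. Extract it in place.
        ["ssh", host, f"tar xzf {q_archive} -C {q_dir}"],
        # 3. Tweak a (hypothetical) configuration setting.
        ["ssh", host, f"sed -i 's/^env=.*/env=production/' {q_dir}/app.conf"],
        # 4. Start the app in the background and record its process ID.
        ["ssh", host,
         f"cd {q_dir} && (nohup ./start.sh > app.log 2>&1 < /dev/null"
         " & echo $! > app.pid)"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it too unless this is a dry run."""
    for argv in commands:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


commands = deploy_commands(
    "build/myapp.tar.gz", "deploy@app.example.com", "/opt/myapp"
)
run(commands)  # dry run: prints the four commands without executing them
```

Using `check=True` means a failed step aborts the whole deployment straight away, which fits the goal of finding out quickly if anything went haywire.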

Automate at the First Sign of Resistance

Once you become accustomed to your tools (and again, by no means am I implying that those described here are the best or the only ones), you’ll step back when you start to feel some resistance performing a task during your day:

Hmm. I don’t want to have to go through that 12-step deployment process all over again.

It becomes second nature to whip up a script that not only does the complicated deployment but verifies that the application is up and running as expected, checking a process exists and scanning the logs for hints indicating that everything’s fine. Over time you should find your automated tasks are easier to write and “feel” much more reliable than any manual process.
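The verification half of such a script can be sketched in a few lines of Python. The failure hints and the “Server started” marker are assumptions; a real application will have its own log vocabulary. The process check relies on sending signal 0, which is a POSIX technique and should not be used as-is on Windows.

```python
import os

# Assumed substrings that suggest something went wrong.
FAILURE_HINTS = ("ERROR", "Exception", "Traceback")


def process_exists(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: check for existence, deliver nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True


def log_looks_healthy(lines, ready_marker="Server started"):
    """True if the ready marker appears and no failure hint does."""
    started = False
    for line in lines:
        if any(hint in line for hint in FAILURE_HINTS):
            return False
        if ready_marker in line:
            started = True
    return started


def verify(pid, log_lines):
    """Combine both checks into a single yes/no answer."""
    return process_exists(pid) and log_looks_healthy(log_lines)


# Example: check this very process against a hypothetical log.
print(verify(os.getpid(), ["booting", "Server started on port 8080"]))
```

In a real deployment the PID would be read from the file written at start-up and the lines from the application’s log file.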

And, hopefully, you’ll be happier for it. :)

This was just too painful! Unfortunately, I have to recommend that people who use Subversion for version control stay away from Ganymede until the connectors, update sites etc. are sorted out. After installing on 4 machines (3 Macs, 1 Linux) I’ve finally got 3 out of 4 working, but one of them (one of the identical Macs!) just won’t work, and the others were hit-and-miss affairs. In theory, all you need to do is add the update sites as detailed on the Polarion site:

Update Sites - Direct your Eclipse update manager to both the following update sites…
* Subversive plug-in update site :
http://download.eclipse.org/technology/subversive/0.7/update-site/

* Subversive SVN Connectors update site
http://www.polarion.org/projects/subversive/download/eclipse/2.0/update-site/

I realise that it’s to do with license issues etc., but Eclipse really has to “just work” out of the box with Subversion soon. The update manager really needs work too. Whoever put the “Close” button where they did while hiding the “Install” button needs to go on a design course. I’m so fed up with clicking “Close” just when I’ve chosen “Install”!