Author Archive

On an early Rails project I elected to use instantiated fixtures. My rationale was that it would make my test code cleaner and easier to understand. Sure, test_helper.rb warned me that they would be slow, but how bad could it be?

Well, pretty bad as it turns out. The average execution time of my test suite (with 1052 tests) using instantiated fixtures was around 590 seconds. That’s almost 10 minutes.

Suspecting that it might be the instantiated fixtures that were slowing things down, I modified a couple of tests to see if they could be sped up by switching to standard fixtures. The results were promising, so I switched everything.

The average execution time fell to 123 seconds. That’s a factor of 4.77 difference. In other words, my test time dropped from almost 10 minutes to just over two minutes.
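For anyone who wants to check the arithmetic, here is a small sketch that works from the rounded timings quoted above. Because those timings are approximate, the result may differ slightly from a figure calculated from the exact measurements. The variable names are just for illustration.

```ruby
# Approximate average suite times quoted in the post, in seconds.
instantiated_seconds = 590.0
standard_seconds     = 123.0

# How many times faster the suite became.
speedup = instantiated_seconds / standard_seconds

# Percentage of the original run time that was saved.
reduction = (1 - standard_seconds / instantiated_seconds) * 100

puts format('Speed-up: %.1fx', speedup)      # prints "Speed-up: 4.8x"
puts format('Time saved: %.0f%%', reduction) # prints "Time saved: 79%"
```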

The funny part is that instantiated fixtures didn’t make the code that much cleaner anyway, the reason being that model names would often end up embedded in fixture names. For example, if I had a User model, I might end up referencing a fixture as follows:

@user_in_more_than_one_group

The fixture was named this way because it wouldn’t make much sense if the name wasn’t prefixed with ‘user’. But when you think about it, there’s not much difference between this and the following:

users(:in_more_than_one_group)

So when I shifted away from instantiated fixtures, my code got only slightly more verbose, but the test execution time dropped by almost 80%.

Now it could be argued that these aren’t particularly good fixture names, and that if they are for testing specific scenarios, I’d be better off using something like Factory Girl or Machinist. That’s a good point, but unfortunately I’m not in a position to make that change at the moment.

The bottom line is that I got caught out by just how slow instantiated fixtures are. Your results may differ, but it’s definitely worth investigating if your tests are slow and you’re using instantiated fixtures.

This is a short one, but we’ve had to do it a couple of times so I thought I’d put it up.

As you may be aware, Rails 2.2 introduced a new format for test names. Where you once might have had:

def test_should_do_stuff
  ...
end

You can now have:

test 'should do stuff' do
  ...
end

We’ve found this much easier to read and type - especially when your test names start to get big, or would be more readable if they contained characters that aren’t valid in Ruby method names.

However, if you’ve got lots of existing tests and want to shift them across to this format, it can be a pain to do it manually. Unless you’ve got a Ruby script, of course:

# Run this from inside your 'test' directory.
directory = 'unit'
Dir.mkdir("#{directory}.new")
Dir.new(directory).entries.each do |entry|
  original_filename = "#{directory}/#{entry}"
  # Skip '.', '..' and subdirectories.
  next if File.directory?(original_filename)
  File.open("#{directory}.new/#{entry}", 'w') do |new_file|
    File.readlines(original_filename).each do |line|
      # "def test_should_do_stuff" becomes "test 'should do stuff' do".
      new_file.puts(line.gsub(/def test_+(.*)/) { "test '#{$1.gsub('_', ' ')}' do" })
    end
  end
end

...which you’ll also find in this gist.

If you go into your ‘test’ folder and run this script, it’ll create a new directory called ‘unit.new’, that contains copies of all of your original unit tests, converted to the new format. Change ‘unit’ to ‘functional’, and it’ll do the same for your functional tests. Note that this will not do tests in subdirectories. YMMV. Feel free to steal and modify as you see fit.
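If you want to try the conversion on a single line before letting it loose on a whole directory, here is a minimal sketch. The method name convert_test_line is made up for this example, and it uses a slightly stricter pattern than the script (word characters only), so treat it as an illustration and not as a drop-in replacement.

```ruby
# Convert one old-style test definition to the block style.
# Lines that do not define a test method are returned unchanged.
def convert_test_line(line)
  line.gsub(/def test_+(\w+)/) { "test '#{$1.tr('_', ' ')}' do" }
end

puts convert_test_line('  def test_should_do_stuff') # prints "  test 'should do stuff' do"
puts convert_test_line('  assert_equal 1, 1')        # prints the line unchanged
```

Note that the leading indentation survives, because only the matched part of the line is replaced.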

I don’t understand why you’d use Interface Builder to create a UI for an iPhone application.

When I started building my first iPhone application at Shine, my colleagues advised me to avoid using Interface Builder. They’d tried using it when they were first starting out, but found that it just got in the way and made the learning curve steeper than it needed to be.

The problem for me was that many of the available books and tutorials for iPhone development used Interface Builder. Many of the sample apps on Apple’s site use it as well. I was having trouble figuring out how to not use it, so I took a deep breath and gave it a go.

I got confused pretty quickly. After thrashing around for a day or so, a colleague took pity on me and showed me how to bootstrap an iPhone user interface in code. I ditched Interface Builder and never looked back.

Sure, I probably would have figured it out, but why make life harder than it needs to be? As a new iPhone developer, I was already trying to get my head around Objective-C, Cocoa and Xcode. Why add Interface Builder and NIB files to the list, for very little apparent benefit?

My Theory

Perhaps all the iPhone books and tutorials have been written by people who already had experience developing with Interface Builder and Nibs for Mac OS X.

Don’t get me wrong - I don’t have a problem with Interface Builder. It’s just that I wonder whether Interface Builder is more suited to its original purpose: building complex user interfaces that are to be used on a desktop computer.

iPhone applications, on the other hand, have a very limited set of widgets and layouts to choose from. Furthermore, there’s a limited amount of stuff you should put on a single screen.

Consequently it seems like overkill to crack out Interface Builder for an iPhone application.

More controversially, in my experience with GUI builders I’ve found that as soon as you try and build anything non-trivial, you’re going to have to code it by hand anyway. Furthermore, if an interface is so simple that you could build it with a GUI builder, I’ve found that it’s probably quicker to code it yourself. I’m not sure that Interface Builder is any exception to this observation.

To support these assertions, I’d like to point out that one of the more complex (and useful) sample iPhone applications that Apple provide - ‘TheElements’ (which navigates the periodic table) - doesn’t use NIB files.

How to do it

So how does one bootstrap an iPhone interface without a NIB file? It turns out that it’s very easy to do, but there aren’t many examples out there on how to do it. So for the sake of knowledge-dissemination, here’s how you write a main.m that does it:

#import <UIKit/UIKit.h>

int main(int argc, char *argv[]) {
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
    int retVal = UIApplicationMain(argc, argv, nil, @"MyAppDelegate");
    [pool release];
    return retVal;
}

The key part is that you pass the name of your app delegate class to the UIApplicationMain function, instead of leaving it as nil.

You’d then just code your AppDelegate to bootstrap the UI however you see fit:

#import "MyAppDelegate.h"

@implementation MyAppDelegate

- (void)applicationDidFinishLaunching:(UIApplication *)application {
    UIWindow *window = [[UIWindow alloc] initWithFrame:[[UIScreen mainScreen] bounds]];
    // ...
    // Set up your controllers and views in here.
    // ...
    // myViews stands for whatever root view you created above.
    [window addSubview:myViews];
    [window makeKeyAndVisible];
}

@end

Finally, remove the property with the key ‘Main nib file base name’ (the raw key name is ‘NSMainNibFile’) from your Info.plist file.

What do you think?

Of course, as a newbie to iPhone development, perhaps I’m missing something here. If you’re new to iPhone development, have you found Interface Builder useful? If so, I’d like to hear about it. I can only speak from my own experiences (and those of my colleagues), so I would be interested in hearing about the experiences of others.

As a relative newbie to the world of Cocoa programming (on the iPhone in particular), I have spent some time trying to understand if and when you’d use a run loop instead of launching a separate thread. I was unable to find any definitive answer on the web, so ended up joining the dots myself. What follows is my understanding of when you’d want to use one or the other. Cocoa experts are welcome to comment if I’ve got it wrong.

The Problem

Touches aren’t the only source of input to an iPhone application. For example, another source can be a socket - sometimes you want to listen to a socket for data. But you don’t want the UI to lock up whilst it’s listening - you still want input from the user to be dealt with promptly. Similarly, you might want events to be triggered automatically at certain time intervals, but without locking up the application in the interim.

Coming from other UI frameworks, you might think that the way to deal with this is to use a separate thread. That way, the thread can block on the socket or sleep for a particular time interval. However, as we all know, the introduction of multiple threads immediately introduces a bunch of potential defects that are difficult to reproduce and fix.

The Solution

Enter run loops. Or more specifically, the run loop - each iPhone application has one by default and for our purposes, this is all we need.

So what exactly is a run loop?

Well, first consider this assertion: the vast majority of the time that your Cocoa application is running, it’s doing nothing. More specifically, it’s waiting for input. However, as soon as you touch the screen, an event gets triggered, which may in turn result in some of your code being executed. If some data comes into a socket, or a timer fires, the same applies.

The key thing is that once this code has been executed, the application goes back to waiting for input. Furthermore, in many cases the execution time of your code will be very small relative to the time the application spends waiting for input.

I think of run loops as a mechanism that exploits this fact.

A run loop is essentially an event-processing loop running on a single thread. You register potential input sources on it, pointing it to the code that it should execute whenever input is available on those sources.

Then when input comes into a particular source, the run loop will execute the appropriate code, then go back to waiting for input to come in again to any of its registered sources. If input comes into a registered source whilst the run loop is executing another piece of code, it’ll finish executing the code before it handles the new input.

The upside of this is that whilst you mightn’t know exactly what order things are going to come in, at least you know that they’ll be processed one after the other instead of in parallel. This means that you avoid all of those nasty multi-threading issues that were described earlier. And that’s why run loops are useful.
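To make the idea concrete, here is a toy run loop written in plain Ruby. It is emphatically not Cocoa’s NSRunLoop, and the class and method names (ToyRunLoop, register, post, run_until_idle) are invented for this sketch. All it tries to show is the property described above: input that arrives while a handler is running waits its turn, so handlers never run in parallel.

```ruby
# A toy, single-threaded run loop. It only mirrors the idea, not Cocoa's API.
class ToyRunLoop
  def initialize
    @handlers = {} # source name => code to run when input arrives
    @pending  = [] # inputs that have arrived but not yet been handled
  end

  # Register a source and the code to execute when input is available on it.
  def register(source, &handler)
    @handlers[source] = handler
  end

  # Record that input arrived on a source (possibly while a handler runs).
  def post(source, data)
    @pending << [source, data]
  end

  # Handle pending inputs strictly one after the other, in arrival order.
  def run_until_idle
    until @pending.empty?
      source, data = @pending.shift
      @handlers.fetch(source).call(data)
    end
  end
end

log = []
run_loop = ToyRunLoop.new
run_loop.register(:socket) { |bytes| log << "socket: #{bytes}" }
run_loop.register(:touch) do |point|
  log << "touch start: #{point}"
  run_loop.post(:socket, 'late data') # arrives while this handler is running
  log << "touch end: #{point}"
end

run_loop.post(:touch, '(10, 20)')
run_loop.post(:socket, 'hello')
run_loop.run_until_idle
puts log
```

The touch handler always finishes before either socket input is handled, even though one of them arrived halfway through it.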

Run loop scheduling in action

By default, all touch events received by an iPhone application are queued for processing by the application’s main run loop, so there’s nothing special you need to do for UI components. However, other sources of input require additional coding.

To schedule an NSInputStream on a run loop, you’d do something like this:


[iStream setDelegate:self];
[iStream scheduleInRunLoop:[NSRunLoop currentRunLoop]
                   forMode:NSDefaultRunLoopMode];

This code sets it up so that whenever input is available on ‘iStream’, a ‘stream:handleEvent:’ message will be sent to ‘self’. Note that the stream could be from any sort of source, including a socket.

Another object that can be scheduled on a run loop is a timer. For example:

[NSTimer scheduledTimerWithTimeInterval:2.0
                                 target:self
                               selector:@selector(doStuff)
                               userInfo:nil
                                repeats:YES];

will schedule a timer on the current run loop to send a ‘doStuff’ message to ‘self’ every two seconds.

When not to use a run loop

So when wouldn’t you use a run loop? Well, if you had some event-handling code that was going to take a long time to execute (for example, performing some CPU-intensive calculation), then everything else in the event-handling queue won’t get handled until it’s finished. This would cause your application to become unresponsive until the processing has finished. In that sort of scenario, you might want to consider using a separate thread to do the processing.

However, for the vast majority of cases, our code for handling events - be they from the screen, sockets or timers - takes a very short time to execute. And that’s why it’s easier (and safer) to just use the main run loop to handle those events.

The trade-offs

The only downside to using a run loop instead of a thread is that instead of just whacking a thread around a whole section of code that you know will block in one or more places, you have to go to each potential blocking point, register the source on the run loop, and implement a callback to process events that are generated from that source.

Whilst this may seem like some effort, it pales in comparison to the pain that can result from poorly-considered threading. So next time you’re tempted to use a thread to read from a blocking input source, consider taking the time to use a run loop. It could well save you a lot of time in the long run.

A while back Mark expressed interest to me in using git and git-svn for version control on his own machine, against a remote Subversion repository. However, when I followed up with him recently, he admitted that in the small amount of time he’d spent looking at it, he hadn’t really got his head around how it was all going to hang together.

I can relate. I found Git to have a steep learning curve. It took me a while - and some assistance (thanks Tom) - to figure out the magical incantations to accomplish what I wanted. But now that I know them, I’ve found Git very useful for local version control.

I was going to walk Mark through my rough git-svn workflow, figuring it’d either get him started or scare him off for good ;) Then I decided I might as well share it with the world.

I’ve covered my motivations for using git for local version control in a previous post, so I won’t repeat them here. Nor am I going to provide a tutorial on git and git-svn - there’s plenty of those out there already. Instead, I’m going to run through an example taken from my day-to-day work to try and show how I use git and git-svn in real-life.

The Basics

  1. The first step is something I normally only do very occasionally: create a local Git clone of my remote Subversion repository. So say that I want to create a git clone of the ‘remote_maintenance’ project on the Shine Subversion repository:
    git svn clone svn+ssh://subversion.shinetech.com/home/svn/remote_maintenance/trunk remote_maintenance

    This checks out the contents of the ‘trunk’ branch into a local directory called ‘remote_maintenance’. This local checkout is known as a ‘working tree’.

    The most significant thing I can say about ‘git svn clone’ is that it will take a while if your SVN repository has a big history, as git will create an entire replica of this history. This might sound like a drag but can be very useful later if you’re working offline.

  2. Having cloned the repository, I merrily jump into the code in the working tree and start modifying, adding, deleting and moving files. Note that if I want to move a file and have git track a move, I use ‘git mv’. For example, to rename the ‘README’ file to ‘README.txt’, I’d do:
    bent:remote_maintenance bent$ git mv README README.txt

    I could just move it using ‘mv’, but then git would see a deleted file and a new untracked file until I staged both changes myself.

  3. When I’m curious to know what my changes to the working tree have been since my last commit, I use ‘git status’. For example, having renamed a file and added a new file, I’d get the following output:
    bent:remote_maintenance bent$ git status
    # On branch master
    # Changes to be committed:
    # (use "git reset HEAD <file>..." to unstage)
    #
    # renamed: README -> README.txt
    #
    # Untracked files:
    # (use "git add <file>..." to include in what will be committed)
    #
    # LICENCE.txt

    If I want to look at my changes in more detail, I run:

    git diff

    Which gives me a line-by-line diff.

  4. When I’m ready to commit my changes to git, I first run:
    git add .

    This will add any new files to the local index. I’ll cover what this means in more detail in the next step. After I’ve run this command, running ‘git status’ again will yield the following:

    bent:remote_maintenance bent$ git status
    # On branch master
    # Changes to be committed:
    # (use "git reset HEAD <file>..." to unstage)
    #
    # new file: LICENCE.txt
    # renamed: README -> README.txt
    #
  5. Next, I commit all my changes by running:
    git commit -a -m 'Miscellaneous changes'

    The ‘-a’ means that git will automatically add any changed files to the local index before doing the commit. For git newbies, git differs from Subversion in that shepherding changes into a git repository is a two-stage process - first you add a changed file to a local index, then you commit the changes to the repository itself. ‘git commit -a’ is a convenient way of combining these two steps into one.

    Unfortunately, if I have added brand new files to my working tree, ‘git commit -a’ won’t pick them up. Thus the need to explicitly add them to the repository using ‘git add’ in the previous step.

  6. Having done my commit, I can return to step 2 and repeat as often as I want: make changes to my working tree, add them to the local index, and commit them.

Branching

  1. So what if I want to branch? Well firstly, let’s examine the history of my commits by using ‘git log’:
    bent:remote_maintenance bent$ git log
    commit 76b1ebf06586843e990a29423df39f2def2492b5
    Author: Ben Teese
    Date: Tue Jan 6 12:04:49 2009 +1100

    Miscellaneous changes

    commit 21873ad305a06b97504c0c2270549a9a20238596
    Author: Ben Teese
    Date: Wed Jan 7 16:09:34 2009 +1100

    Made it that ground server binds TCP server to provided IP address only, not all network interface

    commit 69b5b7421bc989538689711198510702f129d8f6
    Author: Ben Teese
    Date: Wed Jan 7 14:06:05 2009 +1100

    Made it that transfer service TCP server only binds to IP address provided. Added logging.

  2. To see what branches we’ve got at the moment, we use ‘git branch’:
    bent:remote_maintenance bent$ git branch
    * master

    We see that we currently only have one branch called ‘master’ - the default one that you get when you create a new repository.

  3. Say that I want to try an alternate approach to the change I just committed. To do that, I’d create a local branch from the previous commit and check it out. I can do this in one step using:
    git checkout -b new_branch 21873ad305a06b97504c0c2270549a9a20238596

    The long string is a SHA that represents the commit we want to branch from. There are other, shorter, ways to refer to commits, but I find it just as easy to copy the SHA from the output of ‘git log’ directly into the ‘git checkout’ command.

  4. Now let’s use ‘git branch’ to see what branches we’ve now got:
    bent:remote_maintenance bent$ git branch
    master
    * new_branch

    We see that our new branch appears and that it has an asterisk next to it. This means it is the currently checked-out branch.

  5. Now, if I do a ‘git log’:
    commit 21873ad305a06b97504c0c2270549a9a20238596
    Author: Ben Teese
    Date: Wed Jan 7 16:09:34 2009 +1100

    Made it that ground server binds TCP server to provided IP address only, not all network interface

    commit 69b5b7421bc989538689711198510702f129d8f6
    Author: Ben Teese
    Date: Wed Jan 7 14:06:05 2009 +1100

    Made it that transfer service TCP server only binds to IP address provided. Added logging.

    I see that my new branch only goes as far as the previous commit. You could now commit changes to this branch, and they wouldn’t appear in the ‘master’ branch.

  6. We’ll switch back to the ‘master’ branch to continue with this demo:
    bent:remote_maintenance bent$ git checkout master
    Switched to branch "master"

Synchronizing with Subversion

  1. When I want to synchronize my local git branch with the remote Subversion repository, I run:
    git svn rebase

    It’ll spit out something like this:

    M test/unit/csdb_inventory_content_test.rb
    M test/unit/resource_entity_test.rb
    M test/unit/ground_server_test.rb
    r3743 = 64471395cf084217f7eab91e07abb03297492c83 (git-svn)
    M test/unit/csdb_inventory_content_test.rb
    M test/unit/resource_entity_test.rb
    M test/unit/ground_server_test.rb
    r3744 = 1828baa8e7b91cd8861dfaaecc26eb66b0f9264e (git-svn)
    First, rewinding head to replay your work on top of it…
    Applying: Made it that transfer service TCP server only binds to IP address provided. Added logging.
    Applying: Miscellaneous changes
    /Users/bent/NetBeansProjects/remote_maintenance/.git/rebase-apply/patch:31: trailing whitespace.

    A rebase temporarily winds back the commits that you’ve made on the branch since the last time you rebased, applies the commits from Subversion to the branch, then reapplies your commits to the branch. The great thing about this is that git-svn keeps track of the last rebase you did, so you never have to tell it that you only want to rebase from a particular point. This avoids some of the problems that Subversion has when you repeatedly merge from one branch to another.

  2. But what if somebody has changed a file that I’ve changed and we have a merge conflict? Git will report the problem as it is trying to reapply the local commits:
    Auto-merged test/unit/command/reader_test.rb
    CONFLICT (content): Merge conflict in test/unit/command/reader_test.rb
    Failed to merge in the changes.
    Patch failed at 0002.

    When you have resolved this problem run "git rebase --continue".
    If you would prefer to skip this patch, instead run "git rebase --skip".
    To restore the original branch and stop rebasing run "git rebase --abort".

  3. To resolve the merge conflict, I open up test/unit/command/reader_test.rb, resolve the merge conflicts, then run:

    git add test/unit/command/reader_test.rb
  4. I repeat the previous step for any other file that reported a conflict during the re-application of that particular commit. When I’m done, I run:
    git rebase --continue
  5. Additional merge conflicts may occur when git reapplies later commits, in which case I repeat steps 3 and 4.
  6. Now I have a look at my log on the current branch:
    bent:remote_maintenance bent$ git log
    commit 21873ad305a06b97504c0c2270549a9a20238596
    Author: Ben Teese
    Date: Wed Jan 7 16:09:34 2009 +1100

    Made it that ground server binds TCP server to provided IP address only, not all network interface

    commit 69b5b7421bc989538689711198510702f129d8f6
    Author: Ben Teese
    Date: Wed Jan 7 14:06:05 2009 +1100

    Made it that transfer service TCP server only binds to IP address provided. Added logging.

    commit 1828baa8e7b91cd8861dfaaecc26eb66b0f9264e
    Author: danielw
    Date: Tue Jan 6 22:13:34 2009 +0000

    converted the first letter of the test names to be lower case

    git-svn-id: svn+ssh://svn.shinetech.com/home/svn/remote_maintenance/trunk@3744 25b05753-7f2e-0410-

    We can see that the changes in my local git repository have been applied on top of the latest change from Subversion.
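If the three stages of a rebase are hard to picture, the following toy model may help. It treats a branch as a plain Ruby array of commit labels, oldest first (the reverse of what ‘git log’ prints). This is not how git stores history, the method name toy_rebase is invented, and ‘r3742’ is an assumed label for the last revision I had already synced. The commit descriptions are shortened for readability.

```ruby
# Toy model of a rebase: commits are strings, a branch is an array (oldest first).
def toy_rebase(branch, last_synced, upstream_commits)
  base_index = branch.index(last_synced)
  raise ArgumentError, "#{last_synced} is not on this branch" if base_index.nil?

  shared = branch.take(base_index + 1) # history already shared with Subversion
  local  = branch.drop(base_index + 1) # 1. wind back my local commits
  shared + upstream_commits + local    # 2. apply upstream, 3. replay mine on top
end

my_branch = ['r3742', 'transfer service fix', 'ground server fix']
rebased = toy_rebase(my_branch, 'r3742', ['r3743', 'r3744'])
puts rebased.inspect
# prints ["r3742", "r3743", "r3744", "transfer service fix", "ground server fix"]
```

The local commits end up on top of the newest Subversion revisions, which matches the log output above.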

Squashing Changes Together

  1. Now I want to squash together the first two of the changes I’ve made to my local git repository into a single commit. To do this, I run:
    bent:remote_maintenance bent$ git rebase -i git-svn

    My local terminal editor will appear with the following:

    pick 69b5b74 Made it that transfer service TCP server only binds to IP address provided. Added logging.
    pick 21873ad Made it that ground server binds TCP server to provided IP address only, not all network interfaces.
    pick 76b1ebf Added dvd stub that responds correctly to ProtocolRequest from iPhone.

    # Rebase 1828baa..76b1ebf onto 1828baa
    #
    # Commands:
    # p, pick = use commit
    # e, edit = use commit, but stop for amending
    # s, squash = use commit, but meld into previous commit
    #
    # If you remove a line here THAT COMMIT WILL BE LOST.
    # However, if you remove everything, the rebase will be aborted.
    #

  2. To squash the second commit into the first, I edit the text as follows:
    pick 69b5b74 Made it that transfer service TCP server only binds to IP address provided. Added logging.
    squash 21873ad Made it that ground server binds TCP server to provided IP address only, not all network interfaces.
    pick 76b1ebf Miscellaneous changes

    # Rebase 1828baa..76b1ebf onto 1828baa
    #
    # Commands:
    # p, pick = use commit
    # e, edit = use commit, but stop for amending
    # s, squash = use commit, but meld into previous commit
    #
    # If you remove a line here THAT COMMIT WILL BE LOST.
    # However, if you remove everything, the rebase will be aborted.
    #

    and save and exit the editor.

  3. git immediately pops up another editor that allows me to merge the comments for these two commits that I have said I want to squash together:
    # This is a combination of two commits.
    # The first commit’s message is:

    Made it that transfer service TCP server only binds to IP address provided. Added logging.

    # This is the 2nd commit message:

    Made it that ground server binds TCP server to provided IP address only, not all network interfaces.

    # Please enter the commit message for your changes. Lines starting
    # with ‘#’ will be ignored, and an empty message aborts the commit.
    #
    # Committer: Ben Teese
    #
    # Not currently on any branch.
    # Changes to be committed:
    # (use "git reset HEAD <file>..." to unstage)
    #
    # new file: lib/dvd_stub.rb
    # modified: lib/ground_server.rb
    # modified: lib/rmu_stub.rb
    #

  4. Which I do accordingly:
    Made it that TCP servers only uses IP address provided. This ensures it only binds to the network inte
    rface of that IP address, not all network interfaces. Added logging.
    # Please enter the commit message for your changes. Lines starting
    # with ‘#’ will be ignored, and an empty message aborts the commit.
    #
    # Committer: Ben Teese
    #
    # Not currently on any branch.
    # Changes to be committed:
    # (use "git reset HEAD <file>..." to unstage)
    #
    # new file: lib/dvd_stub.rb
    # modified: lib/ground_server.rb
    # modified: lib/rmu_stub.rb
    #

    and save and exit. Git will then squash the two commits together.

  5. If we do a ‘git log’, we see that git has merged the two commits into one, complete with a new message:
    bent:remote_maintenance bent$ git log
    commit 6a29acd801f280360699ac5278fcb63bb9c8744a
    Author: Ben Teese
    Date: Tue Jan 6 12:04:49 2009 +1100

    Miscellaneous changes

    commit 1fb107af4c765d2220d6674d61edca3fb13f5dc2
    Author: Ben Teese
    Date: Wed Jan 7 14:06:05 2009 +1100

    Made it that TCP servers only uses IP address provided. This ensures it only binds to the network

    commit 1828baa8e7b91cd8861dfaaecc26eb66b0f9264e
    Author: danielw
    Date: Tue Jan 6 22:13:34 2009 +0000

    converted the first letter of the test names to be lower case

    git-svn-id: svn+ssh://svn.shinetech.com/home/svn/remote_maintenance/trunk@3744 25b05753-7f2e-0410-
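For the curious, here is a rough sketch of what the ‘pick’ and ‘squash’ instructions in that editor amount to. It is a simplification written in plain Ruby, with invented names (Commit, apply_todo) and shortened commit messages. Real git also gives the squashed commit a new SHA and lets you edit the combined message, neither of which is modelled here.

```ruby
Commit = Struct.new(:id, :message)

# Apply a simplified rebase todo list ("pick" and "squash" only), oldest first.
def apply_todo(todo_lines, commits_by_id)
  result = []
  todo_lines.each do |line|
    # Blank lines and comment lines are ignored, as in the real editor.
    next if line.strip.empty? || line.strip.start_with?('#')

    action, id = line.split
    commit = commits_by_id.fetch(id)
    case action
    when 'pick', 'p'
      result << Commit.new(commit.id, commit.message)
    when 'squash', 's'
      raise ArgumentError, 'nothing to squash into' if result.empty?

      # Meld this commit into the previous one by joining the messages.
      result.last.message += "\n#{commit.message}"
    else
      raise ArgumentError, "unsupported action: #{action}"
    end
  end
  result
end

commits = {
  '69b5b74' => Commit.new('69b5b74', 'Transfer service binds to provided IP'),
  '21873ad' => Commit.new('21873ad', 'Ground server binds to provided IP'),
  '76b1ebf' => Commit.new('76b1ebf', 'Miscellaneous changes')
}
todo = [
  'pick 69b5b74 Transfer service binds to provided IP',
  'squash 21873ad Ground server binds to provided IP',
  'pick 76b1ebf Miscellaneous changes',
  '# Rebase 1828baa..76b1ebf onto 1828baa'
]

squashed = apply_todo(todo, commits)
squashed.each { |commit| puts "#{commit.id}: #{commit.message.inspect}" }
```

Three commits go in and two come out, with the first carrying both of the original messages.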

Committing to Subversion

Having squashed some of our commits together, committing to Subversion is relatively straightforward:

bent:remote_maintenance bent$ git svn dcommit
Committing to svn+ssh://svn.shinetech.com/home/svn/remote_maintenance/trunk …
A lib/dvd_stub.rb
M lib/ground_server.rb
M lib/rmu_stub.rb
Committed r3745
M lib/ground_server.rb
M lib/rmu_stub.rb
A lib/dvd_stub.rb
r3745 = fd5fbaf8ad9a71ad997f61e39fc2f3446c47b848 (git-svn)

This will create one commit in Subversion for each git commit.

Keeping Branches Synchronized

So what if you’ve made changes in one branch and you want to get them into another branch? The key piece of advice I can give is to use the Subversion repository as the transport mechanism. Once you’ve finished making changes in one branch, check them into the Subversion repository. Then switch to the other branch and rebase from the Subversion repository.

But what if you want to move code from one branch to another, without committing to Subversion? Well, I have never done a direct git merge from one branch to another, and the ‘Caveats’ section of the git-svn man page recommends against it if you’re using git-svn. To be perfectly honest, it’s never been an issue for me. I tend to work on distinct features in separate branches so there is little code overlap. If there is some code overlap, I just have to fix the merge conflicts when I rebase from Subversion. However, given that I made both sets of changes, this usually isn’t too difficult.

Conclusion

From this demo you’ll see that my day-to-day usage of git and git-svn essentially boils down to 10 commands. I’ve listed them below, along with a quick summary of what they do:

  • git status: See an overview of the files I’ve changed.
  • git diff: See a diff of the changes I’ve made.
  • git add: Add new or merged files to the index.
  • git commit -a: Add changed files to the index and commit.
  • git log: See the history of commits I’ve made.
  • git branch: See what branch is currently checked out.
  • git checkout -b: Check out a new branch from a particular point.
  • git rebase -i: Squash together a bunch of git commits.
  • git svn rebase: Get the changes in the Subversion repository.
  • git svn dcommit: Send git commits to the Subversion repository.

Once I figured these out, I never looked back. I hope that you find them as useful as I did when using git for local version control with a remote Subversion repository.

The Problem

A while back we had a Flex client that needed to be able to display search results received from a server. The server was designed RESTfully, returning XML results to the client. The Flex client would display these results nicely to the user, and when the user clicked on a result, their browser would be directed to a HTML page for editing the actual item.

The problem was that, when a client clicked on a result, we were manually constructing the resource URLs in the Flex client from the IDs buried in the search results. The Flex code looked a bit like this:

navigateToURL(new URLRequest("/item/" + result.id + "/edit"), "_self")

Furthermore, we had an impending need to support lots of different types of resources, so our logic for assembling URLs was about to become rather convoluted.

A Solution

Put the URLs into the search results. It’s blindingly simple (when you think about it, we do it all the time with pure HTML sites), but requires a subtle shift in mindset when you’re writing a fat client.

The solution was prompted by an article on REST Anti-Patterns I had read on InfoQ, in particular an anti-pattern called ‘Forgetting hypermedia’:

The first indicator of the “Forgetting hypermedia” anti-pattern is the absence of links in representations. There is often a recipe for constructing URIs on the client side, but the client never follows links because the server simply doesn’t send any.

It goes on to suggest that:

a client should have to know a single URI only; everything else…should be communicated via hypermedia, as links within resource representations.

You wouldn’t want to take this to an extreme, but it’s worth aiming for.

This approach helped simplify our code. We tweaked our server to write the resource URL into our XML, then had the Flex client simply navigate to the URL.

navigateToURL(new URLRequest(result.edit_path), "_self")

Furthermore, this solution was somewhat polymorphic - a link to any sort of resource could be passed back and the client would always be able to navigate to it.
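The difference between the two approaches can be sketched outside of Flex. The Ruby below is only a stand-in for the client logic: the method names, the ‘folder’ resource type and the example paths are all made up for illustration, and the search results are shown as hashes that have already been parsed from the XML.

```ruby
# Before: the client needs a URL recipe for every type of resource.
def build_edit_url(result)
  case result[:type]
  when 'item'   then "/item/#{result[:id]}/edit"
  when 'folder' then "/folder/#{result[:id]}/edit"
  else raise ArgumentError, "no URL recipe for #{result[:type]}"
  end
end

# After: the client just follows the link that the server put in the result.
def edit_url_for(result)
  result.fetch(:edit_path)
end

results = [
  { type: 'item',   id: 42, edit_path: '/item/42/edit' },
  { type: 'folder', id: 7,  edit_path: '/folder/7/edit' },
  { type: 'report', id: 3,  edit_path: '/report/3/edit' }
]

results.each { |result| puts edit_url_for(result) }
```

The third result would make build_edit_url raise, because the client has no recipe for it, whereas edit_url_for handles it without any change to the client.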

Caveats

The only catch was that we had to tweak the server to manually write the paths into the XML. As our server was a Rails app, we found it easiest to just add a ‘.xml.builder’ file to our views directory:

xml.results do
  @results.each do |result|
    xml.result do
      xml.tag! :name, result.name
      xml.tag! :description, result.description
      xml.tag! :edit_path, edit_polymorphic_path(result)
    end
  end
end

Note the use of edit_polymorphic_path to ensure that the path to any sort of resource can be written out.
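To make the idea concrete outside of Rails, here is a minimal plain-Ruby sketch of a server writing links into its search results. Everything in it is an assumption for illustration: Item and Group are stand-in models, edit_path_for is a hypothetical and much simplified substitute for edit_polymorphic_path (it just appends an ‘s’ to the class name), and the XML is assembled by hand rather than with Builder.

```ruby
# Hypothetical stand-ins for two ActiveRecord models.
Item  = Struct.new(:id, :name, :description)
Group = Struct.new(:id, :name, :description)

# Simplified stand-in for edit_polymorphic_path: the path is derived from
# the record's class, so the caller never needs to know the resource type.
# The naive "add an s" pluralisation is an assumption of this sketch.
def edit_path_for(record)
  "/#{record.class.name.downcase}s/#{record.id}/edit"
end

# Render one XML element, escaping the text content.
def xml_tag(name, value)
  "    <#{name}>#{value.to_s.encode(xml: :text)}</#{name}>\n"
end

# Render the search results, including a ready-made link for each one.
def results_xml(results)
  body = results.map do |result|
    "  <result>\n" +
      xml_tag(:name, result.name) +
      xml_tag(:description, result.description) +
      xml_tag(:edit_path, edit_path_for(result)) +
      "  </result>\n"
  end
  "<results>\n#{body.join}</results>\n"
end

puts results_xml([Item.new(7, "Widget", "Nuts & bolts"),
                  Group.new(3, "Admins", "Staff")])
```

The client only ever reads edit_path, so adding a new resource type on the server needs no change to the client.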

The Problem

As I work, I like to be able to do very regular check-ins to our Subversion server (ensuring the tests pass beforehand, of course). This gives me an easy fallback position if I make some change that causes test breakages. Sometimes it’s just easier to roll back and start again than to try and figure out how I broke it. This is especially the case when refactoring.

We use Crucible for our code reviews, connected to our Subversion server. The problem is that when I want somebody to review a new feature or bug fix, the reviews might need to span a bunch of these fine-grained changesets. Crucible allows you to have multiple changesets in a review, but I don’t want people to have to review every little changeset - I want them to be able to just look at the difference between the start point and the end point.

A Solution

To get around this, I’ve been using Git to track my local changes, a custom script to bundle my changes into a single changeset, then git-svn to commit them to my Subversion server when I am ready.

I’m not going to give you a tutorial on Git here, but here are the basic steps involved:

  1. Get Git and git-svn.
  2. Clone your remote SVN repository into a local Git repository. If your SVN repository has a lot of history, this may take a while - but this is a one-off event. Go get a coffee whilst you wait.
  3. Once it’s done, start coding. At this point it might be worth reading some of the introductory Git documentation to get an understanding of how it works. Commit your changes to your local Git repository as per normal Git usage.
  4. Periodically rebase to the remote SVN repository using:
    git svn rebase

    This will ensure you’ve got the latest changes in the SVN repository.

  5. Repeat steps 3 and 4 until you’re happy with your changes. Create branches if you want, but don’t merge between them (we’ll discuss why later).
  6. When you’re ready to check-in to SVN, first use a custom bash script that was written by my colleague Tom Lee. Why do this? Well, to commit from Git to Subversion, you’d normally just use:
    git svn dcommit

    However, the problem with this is that it’ll commit a new changeset to SVN for each Git commit you’ve done. We don’t want this. Instead, we want to bundle everything up into a single commit first.

    The way to do this is to create a temporary Git branch, and merge all of your changes into it. This will squash them into a single changeset, which you can then commit to SVN in one hit. Whilst you can do this manually if you want, Tom’s script does this for you automatically.

    Put the script in the same directory as your other Git executables and make it executable too. You can then run it as follows:

    git-prepare-svn-commit -m 'Added some new feature'

    It’ll bundle everything up into a single changeset with the provided commit message. Note that this script has only been tested with Ubuntu - your mileage may vary.

  7. Now you can check it in:
    git svn dcommit

    and it’ll just commit a single bundled changeset to your Subversion repository.
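For anyone who wants to do the squash by hand, the manual route mentioned in step 6 might look roughly like the sketch below. This is not Tom’s script, just my reading of the steps. It only builds and prints the Git commands and runs nothing. The branch names are assumptions: ‘remotes/git-svn’ is the ref a plain git svn clone typically creates (yours may be a trunk ref instead), and ‘svn-squash’ is a made-up name for the temporary branch.

```ruby
require "shellwords"

# Build the command sequence for bundling a work branch into one commit
# on a temporary branch, ready for a single dcommit. Nothing is executed.
# upstream and temp_branch are assumed names; adjust them for your clone.
def squash_for_svn(work_branch:, message:,
                   upstream: "remotes/git-svn", temp_branch: "svn-squash")
  [
    # Start a temporary branch from the last revision fetched from SVN.
    ["git", "checkout", "-b", temp_branch, upstream],
    # Stage the combined changes from the work branch without committing.
    ["git", "merge", "--squash", work_branch],
    # Record them as one commit with the supplied message.
    ["git", "commit", "-m", message],
    # Send that single commit to Subversion.
    ["git", "svn", "dcommit"]
  ]
end

commands = squash_for_svn(work_branch: "new-feature",
                          message: "Added some new feature")
commands.each { |cmd| puts Shellwords.join(cmd) }
```

Printing the commands rather than running them makes it easy to check them against your own repository layout before trying anything for real.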

Additional Benefits of using Git

Git has a number of features that are very appealing. One that stood out for me is the ability to very quickly and easily create branches locally and then instantly switch between them. Often I would have a number of branches going at the same time, each one for different code enhancements or defect fixes.

In theory I could probably do all of this using Subversion branches, but Git offered one additional advantage - I don’t require a network connection to use it. I regularly work with it offline, only having to find a network connection when I’m ready to upload my changes to Subversion. I also have the added bonus that I don’t end up with a bazillion branches hanging around on the Subversion server.

It’s worth noting that I’ve never merged between local Git branches that originated from the Subversion server - indeed, the git-svn documentation recommends against it (see the ‘Caveats’ section of the git-svn Manual Page). However, this wasn’t ever a problem for me; by committing finished work to Subversion, switching to a Git branch containing unfinished work and then rebasing against the Subversion repository, code could easily be transmitted between branches.

Drawbacks of using Git
I found Git to have a steep learning curve. In particular, its two-stage approach to commits took a little getting used to. This problem was exacerbated by the fact that I was using git-svn as well. The best example of this was the extreme bewilderment I experienced when I first encountered merge conflicts during a rebase against the Subversion repository.

The best piece of advice I can give you if you encounter such a conflict during a rebase is to do what Git tells you. Git will provide a number of ‘what you can do next’ instructions when it encounters a conflict that it needs you to resolve. Read these instructions carefully and follow them. I tried to short-cut them and ended up hopelessly tangled up.

Finally, there’s not much Git tool support out there at the moment, so you’re pretty much gonna be doing it all from the command-line.

Credits
Thanks to Tom Lee for breaking a trail to this solution - I knew what I wanted and thought Git might provide a solution, but Tom was the one who actually figured it out and put it all together. He also pulled me out of a few holes that I dug myself into during the learning process.

Following on from Tom’s post on Handling AJAX Errors using Prototype & Rails, I thought I’d take a look at how to ensure that all Ajax calls are subject to decent error handling - or any other sort of handling you want to do.

Towards the end of his post, Tom showed how he overrode form_remote_tag to include his error handling hooks. The only catch is that form_remote_tag isn’t the only way to trigger an Ajax call - there’s also form_remote_for, link_to_remote and a bunch of other methods in ActionView::Helpers::PrototypeHelper.

You could override all of them if you wanted to, but it might just be easier to override the granddaddy of them all: remote_function. As far as I can tell, all of the PrototypeHelper methods use it, and you might even have to use it directly yourself for particularly sticky problems (that’s why it’s part of the public API). It’d look something like this in your helper class:

def remote_function_with_user_feedback(options)
  # .. Do extra stuff...
  remote_function_without_user_feedback(options)
end
alias_method_chain :remote_function, :user_feedback

Of course, that does raise the question: what exactly is the ‘extra stuff’?

Well, you could set up a callback that notifies the user if a server error occurs:

options[:failure] = (options[:failure] || '') + 'alert("An error has occurred on the server. Please try again later."); '

Or you could even try out the more sophisticated network-outage detection code in Tom’s original blog entry.

Alternately, have you ever clicked a button in an Ajax app and been left wondering why nothing seems to be happening? Wouldn’t it be good if something started spinning to tell you that ‘stuff’s happening’? Well here’s how you could do it:

options[:before] = (options[:before] || '') + "$('spinner').src = '#{image_path "#active-spinner.gif"}';"
options[:complete] = (options[:complete] || '') + "$('spinner').src = '#{image_path "#inactive-spinner.gif"}';"

Note that in each case, it’s important not to overwrite any existing Javascript callbacks that may have been provided to remote_function - instead we just concatenate our own Javascript code to them. Thanks to Tom for pointing this out.
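Putting the pieces together, here is a rough sketch you can run outside of Rails to see the wrapping and the callback concatenation in action. The FakePrototypeHelper module and HelperSketch class are made up for this example, and the fake remote_function only mimics the general shape of the generated Javascript. The two alias_method lines are, as far as I understand it, what alias_method_chain does for you.

```ruby
# Made-up stand-in for ActionView's PrototypeHelper, just enough to run
# this sketch without Rails. The generated Javascript is only illustrative.
module FakePrototypeHelper
  def remote_function(options)
    "new Ajax.Request('#{options[:url]}', " \
      "{onFailure: function(request){#{options[:failure]}}})"
  end
end

class HelperSketch
  include FakePrototypeHelper

  FAILURE_JS = 'alert("An error has occurred on the server. ' \
               'Please try again later."); '

  def remote_function_with_user_feedback(options)
    # Work on a copy so the caller's hash is left untouched.
    options = options.dup
    # Append to any existing callback rather than overwriting it.
    options[:failure] = (options[:failure] || '') + FAILURE_JS
    remote_function_without_user_feedback(options)
  end

  # Plain-Ruby equivalent of:
  #   alias_method_chain :remote_function, :user_feedback
  alias_method :remote_function_without_user_feedback, :remote_function
  alias_method :remote_function, :remote_function_with_user_feedback
end

puts HelperSketch.new.remote_function(url: "/items",
                                      failure: "logFailure(); ")
```

The caller’s own logFailure() callback still runs first, with the alert tacked on after it.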

It’s also worth remembering that you’re extending a pretty important Rails method here so make sure you get it right…and by that I mean make sure you’ve got some tests :)

I recently did a presentation to the Melbourne Ruby Users Group entitled ‘Flex and Rails’. It demonstrates how we’ve used Flex to build a rich web interface for a real-world Rails application. It includes a demo and code snippets. I’ve made a streaming screencast of the presentation - you’ll find it here (requires QuickTime). Check it out and let me know what you think.

A while ago Mark blogged on his experiences learning Rails using screencasts. Around this time he also confessed to me that he accomplished many things in Rails simply by doing a Google search for it. I thought I’d try a similar approach and it worked remarkably well.

For example, I might type in ‘how do I use fixtures when my table name is different from my model name?’. Google would usually figure out which words had the most weight and I’d get lots of useful results.

To me, it’s the ultimate answer to the question we pose when we say ‘Somebody must have had this problem before’. I even found myself using a phrase to describe it: ‘Google Programming’.

A related phenomenon is what I call ‘Google Debugging’: say my Rails application crashes with some mystifying error message from Ruby or Rails. In the past, I would have cranked up a debugger or started digging through documentation and code to try to figure out what’s going on. No more - now I just Google for the error message. I literally copy and paste the whole thing into Google (minus any stuff specific to my app) and search for it. If I don’t find anything, I strip it down until I do.

The most serious consequence of Google Programming is Cargo Culting - doing something you don’t actually understand. This can lead to crufty code or even bugs. 37signals’ Jamis Buck recently railed against it (no pun intended), but it’s another way of describing an old problem - ‘best-practices’ applied indiscriminately and without consideration, leaving a trail of systems over-engineered in all the wrong places and under-engineered everywhere else. My favorite example of this was JEE Blueprints.

So how does one avoid the perils of Cargo-Culting whilst experiencing the productive joy of Google Programming? I’ve found that the best way for me is to strike a balance between doing and learning. As I work, I do whatever I can to get things going. If I find myself asking ‘why?’ about something, I make a mental note of it (or even an actual note). When I’ve got a couple of minutes of downtime between tasks, I grab the appropriate reference book and seek the answer to the question. If I’m short of time, I’ll take the book to lunch or take it home, and make a point of reading it there.

It only takes minutes at a time, but when I do this over a period of months I end up accumulating a bunch of knowledge. Furthermore, the learning process has been driven by real-world experience - the best way to learn. So by all means embrace Google Programming, but balance it with learning if you want to avoid being a member of a Cargo Cult.