Sunday, August 2, 2009

Patching Mercurial and TortoiseHG for CUWebAuth

For a few months now I've been using a modified version of Mercurial (hg) to push/pull/clone repositories protected with Cornell's CUWebAuth. CUWebAuth is a single sign-on system implemented with Kerberos 5 which is primarily used to allow non-Kerberos-aware web browsers to use a trusted site (something like a web front end for kinit) to generate Kerberos credentials and store them in a cookie. The upshot of this arrangement is that an end user's credentials (username and password) aren't passed to the service needing the identifying information, but instead are only sent to the trusted server. The trusted server sends back Kerberos credentials to the browser, which sends them to the service needing the identification.

Ideally, I'd generate Kerberos credentials in Python at the client (hg/tortoisehg), but I don't yet have the code in place to do that. What I did build was a urllib2 RedirectHandler which observes the CUWebAuth2 HTTP redirect, prompts the user for their username and password, sends these to the trusted server, acquires the Kerberos credentials and then replays the original HTTP request. A combination of Python features made this relatively simple. Python's simple, consistent module and namespacing system, and the fact that it is interpreted, allowed me to replace a function in one of hg's core libraries at runtime:

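import urllib2
import mercurial.url
import cornell.auth.CUWA

# Replacement for mercurial.url.opener
def opener(ui, authinfo=None):
    # ....
    # Mercurial's stock auth handlers, plus the CUWebAuth redirect handler
    handlers.extend((urllib2.HTTPBasicAuthHandler(passmgr),
                     mercurial.url.httpdigestauthhandler(passmgr),
                     cornell.auth.CUWA.CUWARedirectHandler(passmgr, cookiejar)))
    opener = urllib2.build_opener(*handlers)
    # ....
    return opener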
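For readers who want the general shape of the technique, here is a minimal sketch. It is not the CUWA library's code. It uses Python 3's urllib.request (the successor to urllib2), and the names LoginRedirectHandler, authenticate, patched_opener, fake_url and the host login.example.edu are all made up for illustration. The real CUWebAuth exchange (prompting, posting credentials, storing the cookie) is left as a stub.

```python
import types
import urllib.request
from urllib.parse import urlsplit


class LoginRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Intercept redirects to a login host, authenticate, then replay."""

    def __init__(self, login_host, authenticate):
        super().__init__()
        self.login_host = login_host      # hypothetical trusted login server
        self.authenticate = authenticate  # hypothetical callable(login_url)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if urlsplit(newurl).hostname != self.login_host:
            # Ordinary redirect: keep the standard library's behaviour.
            return super().redirect_request(req, fp, code, msg, headers, newurl)
        # Redirect to the login server: obtain credentials out of band...
        self.authenticate(newurl)
        # ...then replay the original request instead of following the redirect.
        return urllib.request.Request(
            req.full_url,
            data=req.data,
            headers=dict(req.headers),
            method=req.get_method(),
        )


def authenticate(login_url):
    # Stub: a real version would prompt the user and talk to the login server.
    raise NotImplementedError("site-specific login exchange goes here")


def patched_opener():
    """Replacement factory that installs the custom redirect handler."""
    handler = LoginRedirectHandler("login.example.edu", authenticate)
    return urllib.request.build_opener(handler)


# Stand-in for a library module (NOT the real mercurial.url).
fake_url = types.ModuleType("fake_url")
fake_url.opener = urllib.request.build_opener

# The runtime replacement: rebind the module attribute to the new function.
# Callers that look up fake_url.opener afterwards get the patched version.
fake_url.opener = patched_opener
```

Because the handler subclasses HTTPRedirectHandler, build_opener uses it in place of the default redirect handler rather than alongside it. Note that code which imported the original function by name before the rebinding (from module import opener) would keep the old one.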

It works even though it isn't the ideal solution (that being generating credentials locally). Cornell folks can get a copy of this library, and the prebuilt executable cuhg which you use in place of hg, from http://code.cals.cornell.edu.

One problem with this solution is that not all of my team members use the command line version of Mercurial. Some use TortoiseHG on Windows. Before I could deploy CUWebAuth-protected Mercurial HTTP repositories for my group, I needed my CUWebAuth patch to work with their client.

I spent about 10 hours over the past few months getting the TortoiseHG installer to build in one of my Windows VMs. It has a lot of dependencies (all free downloads), and the build is documented reasonably well, with only a few rough edges. Most of that time went into understanding each step of the build process, including PyGTK, Inno Setup, py2exe and several other tools. In the end I was able to build a version of the installer for TortoiseHG 0.8.1 with Mercurial 1.3.1 which functioned properly on a fresh Windows install.

But it turns out I didn't need to go through all that hassle! The GUI components of TortoiseHG interact with Mercurial by behaving as if one were executing hg commands at the command line. As part of the build process, py2exe compiles all of the project's dependent libraries (including Mercurial itself) to .pyc files and bundles them into a zip file called Library.zip. Adding Library.zip/mercurial/CUWA.pyc and replacing Library.zip/mercurial/url.pyc with a version containing my modified opener() function was all it took to get my CUWebAuth redirect handler injected into TortoiseHG's HTTP client library!
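Swapping files inside a zip archive can be scripted. The following is a minimal sketch using Python's standard zipfile module, not the procedure actually used for this post. The function name patch_zip is made up for illustration. Since zipfile cannot overwrite a member in place, the sketch writes a new archive and then swaps it in. It also assumes the replacement .pyc files were compiled with the same Python version that the installed build bundles, since .pyc files carry a version-specific magic number.

```python
import os
import tempfile
import zipfile


def patch_zip(archive_path, replacements):
    """Rewrite archive_path, replacing or adding members.

    replacements maps member names (e.g. "mercurial/url.pyc") to bytes.
    """
    archive_dir = os.path.dirname(os.path.abspath(archive_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=archive_dir)
    os.close(fd)
    try:
        with zipfile.ZipFile(archive_path) as src, \
                zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            # Copy every member that is not being replaced.
            for info in src.infolist():
                if info.filename not in replacements:
                    dst.writestr(info, src.read(info.filename))
            # Write the new and replaced members.
            for name, data in replacements.items():
                dst.writestr(name, data)
        # Swap the rewritten archive into place.
        os.replace(tmp_path, archive_path)
    finally:
        # Clean up the temporary file if the swap did not happen.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

A call might look like patch_zip("Library.zip", {"mercurial/url.pyc": new_url_bytes, "mercurial/CUWA.pyc": cuwa_bytes}), where the two byte strings are read from the freshly compiled files. Keeping a backup copy of the original archive first is a sensible precaution.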

I'm reminded of the old tale of the engineer and the factory. There's this factory which produces special boxes for packing delicate scientific equipment. Every hour the manufacturing line is not running the company loses thousands of dollars. They have staff on-site for dealing with day-to-day cleaning, lubricating and other maintenance tasks, but when something really goes wrong they have to call up the company that built the machine for support.

So one day, after that day's regular maintenance has been performed, the machine won't start up again. The foreman is stressing over how much money they're losing while the line is down, so he calls the manufacturer and asks them to send over an engineer. The engineer inspects the line, then walks over to a maintenance panel. She opens it and turns an adjustment screw a quarter turn. The line starts back up and everything is back to normal.

A few weeks later the foreman gets a bill for $10,000. Furious, he calls up the engineer and asks why it cost $10,000 for a half hour of her time. He demands an itemized bill. A few weeks later the itemized bill arrives. It has two line items on it:


.5 hour labor: $150
knowing where to put the screwdriver and turn: $9,850

The point is that sometimes knowing exactly where to turn the screwdriver can be valuable both in terms of time saved and money spent.

1 comment:

jdwcornell said...

UPDATE:

Mercurial 1.4 includes a mechanism that lets extensions hook into the creation of the urllib2 opener. I've therefore repackaged what's described here as a proper Mercurial extension. Details at http://code.cals.cornell.edu. The process for patching TortoiseHG was similar, but not identical, to the one described in the original post.