On the topic of reviews...

Over on the OpenStack mailing list, this article from IBM Developer Works has been making the rounds.

The primary objection to the article is the prescriptions it makes on code size for review. No more than 200 to 400 lines of code is said to make a manageable review size. This measurement of "lines of code" (abbreviated LOC) harkens back to the bad old days of programming back when a developer might find herself paid on the number of lines of code she wrote. These were very early days.

Over the last intervening century we've attempted to introduce more sophisticated ways of measuring what LOC measures. We tried function point analysis for a time. Since the process of creating function points and then performing analysis on these involved lots of math there was a presumption of meaningfulness around these numbers. The problem was that in order to perform a real function point analysis you had to be able to measure something called "business utility" and that was something that was often very hard to measure. It was a bit like trying to measure "beauty" or "mood" and there have been equally number driven approaches at measuring both of these and nothing anyone would point to as definitive.

So LOC has stuck around in programming circles because it is easy to count and at some point the numbers make sense. Is 200 or 300 lines too many? Maybe. Is 2,000 or 3,000 lines too many? Almost definitely, but there's an outside chance that it's not. This highlights the problem with hard and fast rules. Only those who aren't really aware yet of what they are actually doing really need them, once you know what you are doing and why there's a high chance you'll end up sticking to 200 or 300 line reviews just because it happens to serve the same purpose.

In the end, LOC is an indicator that kind of works but for the wrong reasons. It is inevitable that more complex ideas take more characters to express. What can't be ascertained is the direct correlation between lines, characters, and ideas. In code review, that's the real metric we're trying to manage ... ideas per review... concepts per patch.

The one thing I do agree with from this article on reviews is a review should take no more than 60 to 90 minutes. Note I'm not saying a review must take 60 minutes. Just that when you prepare your code for review, think of how long it will take someone to comprehend what you've done, consider it, and offer constructive criticism. If the process takes more than 60 minutes then you may have too big of a review. That's a signal to you to break it up.

Denser languages are going to convey more concepts per line than other languages. That may mean that a 10 line patch might take a good 30 minutes to fully evaluate in some very expressive languages. Then, there are reviews which might alter a method signature across an entire code base and 90% of the review is just checking that the method signature has changed, so a few thousand lines is no big deal. (Albeit, that a change like that speaks to other rather gross issues in your code but let's not dwell on that.)

One of my college professors was fond of a quote I've repeated often to my own students and colleagues through the years:
A good programmer is a good communicator.

I often view a failure to understand a change, not as the failure of the reviewer, but a failure on my part for either setting up the reviewer for success or a failure to break up the patches/changes in such a way that they are easy to comprehend. In the last OpenStack review cycle I had a perfectly working 10 line patch that we grew to nearly 100 lines to make it more clear that the effect in the code was not accidental, but deliberate. In the end we changed no outward function in the code but only how it was stated.

These things happen. It's important to know why. It is important to make intention clear. And, that is what writing good code is about.


OpenStack + vSphere 5.0 - how to solve the WSDL parsing issue

If you're attempting to use the OpenStack nova-compute driver for vCenter or ESXi and you're running version 5.0 of the API's on your VMware products, you might see a stack trace like this...

Example trace:
2013-08-09 19:12:36.184 CRITICAL nova [-] imported schema (urn:reflect) at (, failed
2013-08-09 19:12:36.184 TRACE nova Traceback (most recent call last):
2013-08-09 19:12:36.184 TRACE nova   File "/usr/local/bin/nova-compute", line 10, in 
2013-08-09 19:12:36.184 TRACE nova     sys.exit(main())
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/cmd/compute.py", line 68, in main
2013-08-09 19:12:36.184 TRACE nova     db_allowed=False)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/service.py", line 260, in create
2013-08-09 19:12:36.184 TRACE nova     db_allowed=db_allowed)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/service.py", line 142, in __init__
2013-08-09 19:12:36.184 TRACE nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/compute/manager.py", line 395, in __init__
2013-08-09 19:12:36.184 TRACE nova     self.driver = driver.load_compute_driver(self.virtapi, compute_driver)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/driver.py", line 1003, in load_compute_driver
2013-08-09 19:12:36.184 TRACE nova     virtapi)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/openstack/common/importutils.py", line 52, in import_object_ns
2013-08-09 19:12:36.184 TRACE nova     return import_class(import_value)(*args, **kwargs)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 399, in __init__
2013-08-09 19:12:36.184 TRACE nova     super(VMwareVCDriver, self).__init__(virtapi)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 173, in __init__
2013-08-09 19:12:36.184 TRACE nova     api_retry_count, scheme=scheme)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 483, in __init__
2013-08-09 19:12:36.184 TRACE nova     self._create_session()
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 520, in _create_session
2013-08-09 19:12:36.184 TRACE nova     raise exception.NovaException(excep)
2013-08-09 19:12:36.184 TRACE nova NovaException: imported schema (urn:reflect) at (, failed

The key trace here is:
NovaException: imported schema (urn:reflect) at (https://<IP_ADDRESS>/sdk/reflect-messagetypes.xsd), failed

... this issue might crop up in Grizzly or Havana releases.

The issue is with a very picky Python parser. Fortunately the vSphere 5.1 based WSDL have been carefully curated so they work with Python, but for 5.0 there's nothing to be done except to work around the problem.

So, here's how to prevent the problem from popping up.

The driver has had a WSDL location parameter in it since the early days of Grizzly. That parameter is still present today. You just have to set the WSDL location to a local file system cache of the XML files that compose the WSDL. Some very clever people are posting work-arounds involving Tomcat and Apache servers, but seriously, we don't need to be quite that smart.

What we're going to do is pull down the WSDL files from the vSphere server and put them in a local directory. Then we're going to add a couple missing files. Next, we'll set wsdl_loc for the driver in nova.conf and finally, we'll start nova watching for parse errors (if there are any) and we'll clean the files by hand as needed (you can usually skip the last part).

If you don't feel like doing this manually, you can always download the vSphere 5.0 SDK and pull out the WSDL files, hosting them locally.

Okay, so here's what we do step-by-step.
$ cd ~/devstack
$ source localrc

... should echo your vCenter or ESXi host IP address ...

Now we need a local directory to store the WSDL in.
$ mkdir -p /op/stack/wsdl
$ cd /opt/stack/wsdl

Copy the WSDL files off of the server (assuming $VMWAREAPI_IP holds the correct IP address)
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/vimService.wsdl
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/vim.wsdl
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/core-types.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/query-messagetypes.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/query-types.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/reflect-messagetypes.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/reflect-types.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/vim-messagetypes.xsd

Next you'll have to create the missing files reflect-messagetypes.xsd and reflect-types.xsd in the /opt/stack/wsdl directory. If even a *single* character is out of place (even invisible characters) then the parser will fail with a cryptic error like :2:79 where is the filename the parser *thinks* its looking at (may not be real), the line number and the column number where the parser failed.

Contents for reflect-messagetypes.xsd and reflect-types.xsd ...
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="urn:reflect" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">

Finally! We're ready to use the override in /etc/nova/nova.conf and how you set it depends on how new your driver is. The latest generation driver uses a
configuration section in the driver.
Add to the end of nova.conf pre-Havana
vmwareapi_wsdl_loc = file:///opt/stack/wsdl/vimService.wsdl

Add to the [vmware] section of the nova.conf post-Havana
wsdl_loc = file:///opt/stack/wsdl/vimService.wsdl


The patch dance - OpenStack style.

If you are working on patches for OpenStack and you find yourself building patches that rely on other patches... what do you do? Well, you could maintain all your patches in one big directory and hope not to screw something up or "cross the files" but that would be down right lazy.

Seriously... lazy in the bad way... not at all lazy in the good way.

 The right way to do things is a git branch per bug (or blueprint related patch) and then to rebase each "down-stream" bug from your appropriate "up-stream" bugs. That means you need to create a kind of "bug dependency" chart maintaining a tree like relationship between the various bugs.

As a prerequisite, I'll assume you followed:

  • http://docs.openstack.org/developer/nova/devref/development.environment.html 
  • https://wiki.openstack.org/wiki/GerritWorkflow

If you were a dummy and you managed to submit your first review on master (oh you fool!) then how do you fix this? Assuming your patch is safely up in gerrit for review already...

  1. roll back your master to head
  2. create a new local branch for your bug number
  3. use the "patch" link for your patch in gerrit
  4. after applying the patch, commit again and paste your old message in as the new commit message
The trick that makes this work is the special Change-Id:xxxxxxxx line at the end of the original message. This will tell gerrit to take your incoming patch as a new version of the old change. 

What happens if you need to rebase your patch?
  1. git check out the master
  2. git fetch --all
  3. check out your bug branch
  4. git rebase master
 What happens if you need to have bug B based on bug A?
  1. git checkout master
  2. git fetch --all
  3. git branch bug/A
  4. git checkout bug/A
  5. do work for bug/A and possibly submit it to git review
  6. git checkout bug/A
  7. git branch bug/B
  8. git checkout bug/B
  9. do work for bug/B and possibly submit separately
Need to make sure the master hasn't drifted under bug A and B?
  1. git checkout master
  2. git fetch --all
  3. git checkout bug/A
  4. git rebase master
  5. ... conflicts better result in your changing only code touched in bug/A or see the trick for how to "move" a patch to a branch above ... 
  6. git checkout bug/B
  7. git rebase bug/A
The key is to keep the dependencies clean, feeding the master, then sliding up the branch dependency tree one bug at a time pulling forward the rebase from the last branch. Keeping each branch and patch separate should make other aspects of your life easier.


OpenStack Meeting times

If you are interested in joining in on the OpenStack discussions the meetings are documented here. They are conducted online in IRC forums or alternatively in Valhalla. I've created this Google Calendar feed from the wiki docs... but I make no guarantee that this feed will stay up to date other than this is the one I personally use to keep on top of things. Also, Valhalla is a bit of a schlep.


OpenStack and you...

I'm only a few weeks old in the OpenStack community and I can tell you that this is by far the most democratic open-source project I've ever seen. Typically there is a project owner who for better or worse is the guy you have to please. Let's call him "benevolent dictator for life" or BDL for short.

In most of my experience the trick is to get the BDL's attention and approval. If the BDL doesn't like you for whatever reason, you can GTFO. That's not how OpenStack works.

The OpenStack community has a lovely welcome guide (warning: pdf) and they also give you free candy. In the community process, anyone can bring proposals to the design summit and the community both online and physically present discuss change proposals and enhancements for the coming release. What's special about this process is anyone could propose a design change to OpenStack.

When I mean anyone, I mean anyone. You could represent a billion-dollar behemoth or you could be from a local shop that runs its own datacenter which just happens to want to donate developer time.

When I've been lucky enough to get to do open-source work as my day-job that's the kind of situation I would have been in myself: the day job is deliver web-site X but to do that you shave the yak of open-source bug-fixing. Much of my Grails work came out of me using the framework during my day job (living on the JVM and wanting python-like good-ness) and shaving the yaks of qrcode support, database migrations, SSL certificate issues, audit-log creation, and so on. That's how a lot of the smaller open source contributions (like mine) get done.

That kind of rubber-meets-the-road contributions keeps an open source project "real" and keeps it serving users where they live. For the first time I'm working full-time on the vSphere Nova Compute Driver. I'm taking over for Sean Chen and I want to stay on top of issues.

I'm officially soliciting feedback and input from the OpenStack + vSphere community. Let me know what's going on and what I can do to help you. And in open-source spirit, I don't care if you are a tiny shop or a huge one... let's see your code.


Attending OpenStack Summit 2013 in Portland, Oregon

I've officially joined the OpenStack effort here at VMware. To that end I'll be shifting my OpenSource efforts to support OpenStack related tools. That means I'll be shifting my posting habits and be talking about more OpenStack related work here in this blog.

If you are interested in bumping into me in Portland, here's my Summit Profile. It's early days for me yet on OpenStack and there is a lot of momentum building. I'm sure we're in for a wild ride!


Groovy and SOAP and the JVM

As I'm working out SOAP + Langauge du jour problems I continue to be impressed with the utility embedded in the JVM.

Rarely do you get to develop software in a completely greenfield project. Most of us have to integrate our software with other systems and that often means dealing with SOAP. I asked about what people are doing with Groovy + SOAP over at Google+ and had a few side conversations with folks.

It turns out that when forced to deal with the abject insanity that is SOAP, most Groovy users drop back and pick up those Java tools like JAX-WSIt's tools like these that keep developers coming back to the JVM and that Java to Groovy integration story is still one of the best stories going for Groovy.

In part, you have to wonder if ... in order to understand SOAP's madness you don't need a little madness to get the job done. This insanity might have to continue to exist in practical long-lived projects for quite some time. Keeping that madness bottled up in some pre-baked libraries is honestly the best thing you can do for your own continued sanity and the sanity of your own projects.