2013-08-26

On the topic of reviews...

Over on the OpenStack mailing list, this article from IBM Developer Works has been making the rounds.

The primary objection to the article is the prescriptions it makes on code size for review. No more than 200 to 400 lines of code is said to make a manageable review size. This measurement of "lines of code" (abbreviated LOC) harkens back to the bad old days of programming back when a developer might find herself paid on the number of lines of code she wrote. These were very early days.

Over the last intervening century we've attempted to introduce more sophisticated ways of measuring what LOC measures. We tried function point analysis for a time. Since the process of creating function points and then performing analysis on these involved lots of math there was a presumption of meaningfulness around these numbers. The problem was that in order to perform a real function point analysis you had to be able to measure something called "business utility" and that was something that was often very hard to measure. It was a bit like trying to measure "beauty" or "mood" and there have been equally number driven approaches at measuring both of these and nothing anyone would point to as definitive.

So LOC has stuck around in programming circles because it is easy to count and at some point the numbers make sense. Is 200 or 300 lines too many? Maybe. Is 2,000 or 3,000 lines too many? Almost definitely, but there's an outside chance that it's not. This highlights the problem with hard and fast rules. Only those who aren't really aware yet of what they are actually doing really need them, once you know what you are doing and why there's a high chance you'll end up sticking to 200 or 300 line reviews just because it happens to serve the same purpose.

In the end, LOC is an indicator that kind of works but for the wrong reasons. It is inevitable that more complex ideas take more characters to express. What can't be ascertained is the direct correlation between lines, characters, and ideas. In code review, that's the real metric we're trying to manage ... ideas per review... concepts per patch.

The one thing I do agree with from this article on reviews is a review should take no more than 60 to 90 minutes. Note I'm not saying a review must take 60 minutes. Just that when you prepare your code for review, think of how long it will take someone to comprehend what you've done, consider it, and offer constructive criticism. If the process takes more than 60 minutes then you may have too big of a review. That's a signal to you to break it up.

Denser languages are going to convey more concepts per line than other languages. That may mean that a 10 line patch might take a good 30 minutes to fully evaluate in some very expressive languages. Then, there are reviews which might alter a method signature across an entire code base and 90% of the review is just checking that the method signature has changed, so a few thousand lines is no big deal. (Albeit, that a change like that speaks to other rather gross issues in your code but let's not dwell on that.)

One of my college professors was fond of a quote I've repeated often to my own students and colleagues through the years:
A good programmer is a good communicator.

I often view a failure to understand a change, not as the failure of the reviewer, but a failure on my part for either setting up the reviewer for success or a failure to break up the patches/changes in such a way that they are easy to comprehend. In the last OpenStack review cycle I had a perfectly working 10 line patch that we grew to nearly 100 lines to make it more clear that the effect in the code was not accidental, but deliberate. In the end we changed no outward function in the code but only how it was stated.

These things happen. It's important to know why. It is important to make intention clear. And, that is what writing good code is about.

2013-08-09

OpenStack + vSphere 5.0 - how to solve the WSDL parsing issue

If you're attempting to use the OpenStack nova-compute driver for vCenter or ESXi and you're running version 5.0 of the API's on your VMware products, you might see a stack trace like this...

Example trace:
2013-08-09 19:12:36.184 CRITICAL nova [-] imported schema (urn:reflect) at (https://192.168.14.22/sdk/reflect-messagetypes.xsd), failed
2013-08-09 19:12:36.184 TRACE nova Traceback (most recent call last):
2013-08-09 19:12:36.184 TRACE nova   File "/usr/local/bin/nova-compute", line 10, in 
2013-08-09 19:12:36.184 TRACE nova     sys.exit(main())
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/cmd/compute.py", line 68, in main
2013-08-09 19:12:36.184 TRACE nova     db_allowed=False)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/service.py", line 260, in create
2013-08-09 19:12:36.184 TRACE nova     db_allowed=db_allowed)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/service.py", line 142, in __init__
2013-08-09 19:12:36.184 TRACE nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/compute/manager.py", line 395, in __init__
2013-08-09 19:12:36.184 TRACE nova     self.driver = driver.load_compute_driver(self.virtapi, compute_driver)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/driver.py", line 1003, in load_compute_driver
2013-08-09 19:12:36.184 TRACE nova     virtapi)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/openstack/common/importutils.py", line 52, in import_object_ns
2013-08-09 19:12:36.184 TRACE nova     return import_class(import_value)(*args, **kwargs)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 399, in __init__
2013-08-09 19:12:36.184 TRACE nova     super(VMwareVCDriver, self).__init__(virtapi)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 173, in __init__
2013-08-09 19:12:36.184 TRACE nova     api_retry_count, scheme=scheme)
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 483, in __init__
2013-08-09 19:12:36.184 TRACE nova     self._create_session()
2013-08-09 19:12:36.184 TRACE nova   File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 520, in _create_session
2013-08-09 19:12:36.184 TRACE nova     raise exception.NovaException(excep)
2013-08-09 19:12:36.184 TRACE nova NovaException: imported schema (urn:reflect) at (https://192.168.14.22/sdk/reflect-messagetypes.xsd), failed

The key trace here is:
NovaException: imported schema (urn:reflect) at (https://<IP_ADDRESS>/sdk/reflect-messagetypes.xsd), failed

... this issue might crop up in Grizzly or Havana releases.

The issue is with a very picky Python parser. Fortunately the vSphere 5.1 based WSDL have been carefully curated so they work with Python, but for 5.0 there's nothing to be done except to work around the problem.

So, here's how to prevent the problem from popping up.

The driver has had a WSDL location parameter in it since the early days of Grizzly. That parameter is still present today. You just have to set the WSDL location to a local file system cache of the XML files that compose the WSDL. Some very clever people are posting work-arounds involving Tomcat and Apache servers, but seriously, we don't need to be quite that smart.

What we're going to do is pull down the WSDL files from the vSphere server and put them in a local directory. Then we're going to add a couple missing files. Next, we'll set wsdl_loc for the driver in nova.conf and finally, we'll start nova watching for parse errors (if there are any) and we'll clean the files by hand as needed (you can usually skip the last part).

If you don't feel like doing this manually, you can always download the vSphere 5.0 SDK and pull out the WSDL files, hosting them locally.

Okay, so here's what we do step-by-step.
$ cd ~/devstack
$ source localrc
$ echo $VMWAREAPI_IP 

... should echo your vCenter or ESXi host IP address ...

Now we need a local directory to store the WSDL in.
$ mkdir -p /op/stack/wsdl
$ cd /opt/stack/wsdl

Copy the WSDL files off of the server (assuming $VMWAREAPI_IP holds the correct IP address)
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/vimService.wsdl
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/vim.wsdl
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/core-types.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/query-messagetypes.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/query-types.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/reflect-messagetypes.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/reflect-types.xsd
$ wget  --no-check-certificate https://$VMWAREAPI_IP/sdk/vim-messagetypes.xsd

Next you'll have to create the missing files reflect-messagetypes.xsd and reflect-types.xsd in the /opt/stack/wsdl directory. If even a *single* character is out of place (even invisible characters) then the parser will fail with a cryptic error like :2:79 where is the filename the parser *thinks* its looking at (may not be real), the line number and the column number where the parser failed.

Contents for reflect-messagetypes.xsd and reflect-types.xsd ...
<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="urn:reflect" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
</schema>

Finally! We're ready to use the override in /etc/nova/nova.conf and how you set it depends on how new your driver is. The latest generation driver uses a
[vmware]
configuration section in the driver.
Add to the end of nova.conf pre-Havana
vmwareapi_wsdl_loc = file:///opt/stack/wsdl/vimService.wsdl

Add to the [vmware] section of the nova.conf post-Havana
wsdl_loc = file:///opt/stack/wsdl/vimService.wsdl