2014-07-18

What every developer should know about testing - part 1

The short version of this blog post is that this week and next week on pyVmomi will be spent on radically improved testing code and processes. This testing process will become part of the commit cycle. And, it's about damn time.

Once these measures are in place over the next few days the project will be better able to absorb new commits from interested parties.

Overview 

Over the last three weeks I've been working on pyVmomi's next release. If you've not been following along, it's a Python client for a SOAP server. The code base dates back to at least Python 2.3, but it has only been in the wild as OpenSource since December 2013. The pyVmomi library has a long internal history at VMware and has served many teams well over the years. But, it does have problems.

The OpenSource version of the library has *no* unit testing shipping with it. This makes it hard for interested third parties to contribute. It's time to change that. But, it's a client library for a network service. How do you test such a beast?

In this post I will cover what unit testing is and what integration testing is and how this impacts the design choices made on libraries. This discussion is directly applicable to the long-term evolution of a library like pyVmomi and is generally applicable to Software Design. I'm bothering to write all this down because I want everyone to be on the same page and I don't particularly want to repeat myself too much. Over the years I expect to be involved with this project I expect to point at this post frequently.

The Problem

Work on pyVmomi has been rather painful. For much of it, I have had to spend vast amounts of time deep in the debugger. Testing on this library involves building a VCSIM and simulating a vCenter environment. This in turn means the creation of a suitable inventory and potentially setting up a suitable Fault to work with. This is a lot of yak shaving to get to the point you can even start considering doing development work.

The root of the specific problem

The problem in specific is that pyVmomi as a library speaks to a server. No other thing can completely simulate all the inputs, outputs, and exposed states that such a thing achieves except the big complex thing itself. This problem is routinely solved by developers in this entire cloud infrastructure space by spinning up virtual environments to create the scenarios they need.

This is a natural inclination. Since you have a beautiful hammer, why not nail all the bugs with this beautiful hammer? Virtualization is powerful and has transformed our industry. One day I will be an old man and tell stories of how infrastructure and development worked in the bad old days of the Dot-Com Boom. But this inclination is an example of the hype-cycle in full detrimental effect.

The problem in general

Because client library code development starts at the integration phase, the units that end up defined by the client library programmer are inherently integrations. How do you test integrations? With integration tests. But, how do you do integration testing when the thing you are testing isn't even on your build machine? If it's a server (as in our case) you have to fall back to either a simulator or you have to stand up a whole new copy of your environment just for testing.

Unsurprisingly, this is fairly standard practice for every step of IaaS and PaaS development. You stand up the universe to author a new function then you retest the whole thing on a fresh copy of the universe. Then, you wash-rinse-repeat for the whole integrated system. It's so easy. It's also so very wrong. Because code that is hard to test (or completely untestable) in isolation is poorly designed. If you're defending the fact that it's tested, you're missing the point.



This isn't just a problem with the one library I'm working on now. I've seen this repeatedly in development environments of every kind at huge shops and tiny shops. You build up a pile-o-software that glues systems together and to test it you build a pile-o-infrastructure that you bring to a pile-o-state so you can validate the right calls and responses.

When you test this way (bringing a whole simulated universe into existence to test your new 'if' statement), invariably something's state gets out of sync and what do you do? You have to test the test environment to validate that you don't have false positives for your failure report, then you have to retest and you re-start the whole process which typically grows into hours. This is, frankly, an extremely expensive way to develop software.

And, for the record, I've seen this in JEE, Spring, Grails, Python, Bash, Perl, C, and C++ projects on Solaris, Linux, Irix, BSD, and now ESX based environments. This is not a problem unique to those crappy developers on that crappy environment. This is an intrinsic integration development problem that crops up when you routinely write code that takes big complex systems and makes them work together. It's a far too easy trap to fall into and far too difficult a pit to climb out of.

Unit Testing?

So the story so far is that we have a library, and maybe that library talks to things "not of this machine". Maybe it speaks over the wire and talks to other things we can't see or directly control. These are things well outside of anything we could define as our unit of code. So if that's our fundamental unit (because what *else* is something like pyVmomi?), how the heck do we test it?


http://youtu.be/G2y8Sx4B2Sk

The term unit is deliberately ambiguous in this context. Did we mean class? Did we mean method? The answer is it depends. Getting the logical border of the unit is hard. It's actually human intelligence hard. It's "why AI do not yet write code" hard. Why is it hard? It's hard in the same way that making beautiful art is hard: it's a fuzzy problem that requires aesthetics.

The definition of where a unit is, is hard and simultaneously critical to get right. Define the wrong unit and pay the price. This doesn't mean testing is wrong, it just means testing is a programming-hard problem. Looking for easy answers here just means you don't know what you're doing.

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
        -- Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style.

I mock your Mocks!

A simple answer to the problem is to Mock and Stub everything. (Mock objects are not Stubs, and you should know the difference. Edit: this is a whole topic unto itself and I will cover it separately at a later date.) The problem is that when you work with sufficiently complex interactions you will be forced to write sufficiently complex mocks and stubs. You will back your way into the simulator problem I mentioned before. In our specific case this means essentially re-inventing something like VCSIM except all in Python mock objects, and that's absurd.

What are you forging, o' library author?

Consider also, where is the unit boundary when it comes to a client library? The library absolutely has a boundary at its public interfaces. If the bit-o-code you just wrote is private why did you write it? The private method is not part of the interface and so therefore it's not part of the library's unit definition. The unit in this context only makes sense as a test-first tested component if it's going to be exposed. By definition a private method isn't exposed so it's an implementation detail and we don't test our programs to make sure implementation details work. We don't test if the 'if' statement works. So where is the detail and where is the interface?

This means your test-first tests should be your surface. To develop a library that you intend on providing to people to interact with you should model sets of expected interactions. Each interaction should then be codified as a set of calls to the library.

Code as little as possible, test as little as possible

Tests are code. The mark of a good programmer is not how much code they write, but how much they can accomplish with how little. If you are following the aesthetic of minimalist code, then this attitude should also follow into your tests. Your tests should cover as many lines as is needed to validate the code and should do this as effectively as possible. Ideally, you should have an efficiency to your tests. No two tests should cover exactly the same unit. Covering a unit multiple times is effectively wasted effort.

This is a much harder philosophy and practice to follow than the lazy 'cover all the lines' strategy. It requires you to understand the functional cases your code can cover. In a rarified ideal world, this might mean you get 100% coverage anyway but the percentage isn't what we care about. You can have 100% code coverage and still have horrible comprehension of what your project even is.

Breaking down your tests to cover all the methods, even the private ones, is a horrible idea. If you cover your private methods you will be tying your tests that matter to what (by defining them as private methods) you have decided are implementation details. That equals tight coupling; tight coupling is bad.
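
To make the point about testing the surface concrete, here is a minimal sketch. Everything in it is hypothetical (the `InventoryClient` class, its `vm_names` method, and the canned transport are mine for illustration and are not pyVmomi code). The test only drives the public method, so the private `_parse` helper can be rewritten at will without touching the test.

```python
import unittest


class InventoryClient(object):
    """Hypothetical client: one public method, one private helper."""

    def __init__(self, transport):
        # transport is any callable that takes a call name and returns text.
        self._transport = transport

    def vm_names(self):
        """Public surface: return the sorted VM names from the service."""
        return sorted(self._parse(self._transport("list_vms")))

    def _parse(self, payload):
        # Implementation detail; free to change without breaking tests.
        return [name.strip() for name in payload.split(",") if name.strip()]


class InventoryClientSurfaceTest(unittest.TestCase):
    def test_vm_names_interaction(self):
        # The canned transport stands in for the remote service.
        client = InventoryClient(lambda call: " web01, db01 ,")
        self.assertEqual(client.vm_names(), ["db01", "web01"])


def run_surface_tests():
    """Run the surface test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(InventoryClientSurfaceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Note that nothing in the test names `_parse`. If I later swap it for a real parser, the test still describes the same interaction.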

How do you test something where it provides very little function other than basic interactions with a service? How do you exercise a library that is arguably mostly private and hidden code?

Introducing Fixtures

The concept of a testing fixture is a very old testing concept. It even predates software as a thing and yet I rarely if ever see a shop using fixtures. The truly sad thing is that for most development languages fixtures are old as the hills. So, WHY are so few projects using them?

A Specific Solution: vcrpy

I reviewed several Python fixture libraries this week and was fairly well impressed with vcrpy for our purposes. The description may mention 'mocking' but the function of the library is to provide you with testing fixtures at the socket level. In libraries like pyVmomi we are effectively a skin over a very complex back end web service. This 'skin' nature of ours means that a simple set of library interactions may hide dozens of network conversations.

Manually creating dozens of HTTP interaction mocks to explore a single high-level test can be so painful that you are likely to just not do it. Fortunately tools like vcrpy exist and can record your HTTP traffic. Now you can do the lazy thing and toy with your client-server a bit, record the on-the-wire interactions, and then later (and more importantly) edit the conversations to represent the larger API cases you want to cover.

With the recorded HTTP fixtures at our disposal we can now work with the binding in much more predictable and controlled ways.
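
The record-then-replay idea can be sketched in a few lines of plain Python. To be clear, this is not vcrpy's API and not how vcrpy is implemented; it is a hand-rolled illustration that assumes a made-up transport taking a request string and returning a response string. The names `RecordingTransport`, `ReplayTransport`, and `fake_server` are all hypothetical.

```python
import json


class RecordingTransport(object):
    """Wraps a real transport and remembers every request/response pair."""

    def __init__(self, real_send):
        self._real_send = real_send
        self.interactions = []

    def send(self, request):
        response = self._real_send(request)
        self.interactions.append({"request": request, "response": response})
        return response

    def dump(self):
        # Plain JSON so the recorded conversation can be edited by hand.
        return json.dumps(self.interactions, indent=2)


class ReplayTransport(object):
    """Serves recorded responses in order; never touches the network."""

    def __init__(self, cassette_text):
        self._pending = json.loads(cassette_text)

    def send(self, request):
        # Fail loudly if the code under test strays from the recording.
        if not self._pending or self._pending[0]["request"] != request:
            raise AssertionError("unexpected request: %r" % (request,))
        return self._pending.pop(0)["response"]


def fake_server(request):
    # Stand-in for the live service used during the recording session.
    return {"GetVersion": "5.5", "ListVms": "web01,db01"}[request]


# Record once against the "live" service...
recorder = RecordingTransport(fake_server)
recorder.send("GetVersion")
recorder.send("ListVms")
cassette = recorder.dump()

# ...then replay the conversation with no server at all.
replay = ReplayTransport(cassette)
print(replay.send("GetVersion"))
print(replay.send("ListVms"))
```

The recorded text is the fixture. Because it is just data, it can live in the repository next to the tests and be edited to cover cases that are awkward to provoke on a real server.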


More on that next week...

2014-07-11

pyVmomi and the road to Python3 - part 3

The Good News

It works! And without major surgery!

I currently have a working branch of the pyVmomi code base that runs on Python 3.x interpreters. We are working to get this version usable under Python 2.6, 2.7, 3.3, and 3.4 as well and hope to post on that next week. I'd also like to thank Rackspace for offering to host our CI servers. Kudos to Michael Rice for getting that to happen.

The bad bits...

One of the most frustrating aspects of porting pyVmomi on to Python 3.x has been the fact that many of the language changes caused pyVmomi to fail to load or fail silently. This is in part due to design considerations made in the library early on, and in part due to how the language has changed.

In particular, the way imports are done was instrumental in basically masking the failures. The try-except-swallow pattern is frequently used and it tended to hide why a vim.* or vmodl.* data type wasn't loading into memory.
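
For anyone who hasn't been bitten by it, here is a small sketch of the try-except-swallow pattern next to a less secretive alternative. This is my own illustration and not the loading code in pyVmomi; the function names and the logger name are hypothetical.

```python
import importlib
import logging

log = logging.getLogger("binding_loader")


def load_swallowing(module_name):
    """The masking pattern: any failure at all becomes a silent None."""
    try:
        return importlib.import_module(module_name)
    except Exception:
        # A SyntaxError raised by a Python 2 only module vanishes here too.
        return None


def load_reporting(module_name):
    """Same fallback, but only for ImportError, and the reason is logged."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        log.warning("could not load %s: %s", module_name, err)
        return None
```

Both functions return None for a missing module, but only the second one tells you why, and only the second one lets an unexpected error surface instead of hiding it.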



Simple porting techniques and tools also helped mask the problems. However, now that the problems are known, I may back off some of the changes in favor of cleaner, more generic solutions.

Simple 2 to 3

A number of changes are rather trivial, take the case of exceptions for example...

Replacing a ',' with the keyword 'as' is enough to 'fix' a Python 2 to 3 exception block in some cases.

There were a number of these trivial cases where tools like 2to3 could help. But, in general, libraries like six and tools like 2to3 tended to mask underlying problems with more complex issues such as unicode and type inheritance in the library.
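
As an illustration of the comma-to-'as' change described above, here is a tiny example of my own (the `parse_port` function is hypothetical and not from pyVmomi). The old spelling is shown only in a comment because it is a syntax error on Python 3.

```python
def parse_port(text):
    """Return text as an int port, or None if it is not a number."""
    try:
        return int(text)
    # Python 2 only spelling (a SyntaxError on Python 3):
    #     except ValueError, err:
    # Spelling accepted by both Python 2.6+ and Python 3:
    except ValueError as err:
        print("bad port %r: %s" % (text, err))
        return None
```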

When is a Type not a Type?

Inheriting from built-in base classes is considered very bad practice. It tends to gain you very little and tends to complicate code unnecessarily. Consider the Link class in pyVmomi's VmomiSupport.py, which very naughtily extends the core unicode class in Python.


The Link class here serves the sole purpose of allowing a string to be recognized as a type temporarily so that when we do type mapping during SOAP deserialization we have a type to latch on to in key branching logic.
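
Here is a rough sketch of the trick being described, not the actual code from VmomiSupport.py. I've assumed Python 3 and subclassed `str` where the library extends `unicode`, and the `describe` function is a hypothetical stand-in for the type-mapping branch.

```python
class Link(str):
    """Marker type: behaves like a string but is distinguishable by type."""


def describe(value):
    # Check the subclass first, because every Link is also a str.
    if isinstance(value, Link):
        return "link"
    if isinstance(value, str):
        return "string"
    return "other"


key = Link("vm-42")
print(describe(key))         # takes the Link branch
print(describe("vm-42"))     # takes the plain string branch
print(type(key + "-clone"))  # string operations hand back a plain str
```

The last line is the catch: as soon as you do anything string-like with a Link, the result is an ordinary string again and the marker is gone.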

This is clearly not within the use cases most people are thinking about when they are thinking about Python 2to3 and while the practice is naughty, it does provide an interesting orthogonality to the mapping code.

So, because of that, while it's bad practice to extend a built-in type I can't think of something cleaner that wouldn't immediately ugly up the code-base. However, I'm also not sure how that bit of logic currently performs since any use of Link is going to return ostensibly a string (albeit a unicode one) and said string is never going to be of type Link. So the first point still stands... extending a base class is more confusing than it is useful.

That's just out of scope for strictly Python portability work, so for the time being the class stays in. It also seems mostly benign at the moment. There's a significant lack of testing around the lib and that needs to be addressed before I will be confident doing any major refactoring work. IMHO: The python 3 support work is pushing our luck as it is.

Unicode uncoded

Of particular interest: in testing, our Unicode-related code no longer makes sense in Python 3.
Hopefully, we can do away with the uglier bits I've hacked on here to make things work. We will also need to back-track the assumptions made when the 'unicode' and 'basestring' symbols were used and make sure they still hold.
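
One common way to keep the 'unicode' and 'basestring' assumptions in a single place is a small compatibility shim. This is a sketch under my own naming (`text_type`, `string_types`, `is_text`), not what currently sits in the pyVmomi branch; the six library offers similarly named aliases.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3: str is the only text type.
    text_type = str
    string_types = (str,)
else:
    # Python 2: these names exist only on that interpreter line.
    text_type = unicode  # noqa: F821
    string_types = (basestring,)  # noqa: F821


def is_text(value):
    """True for any string-like value on either interpreter line."""
    return isinstance(value, string_types)
```

With the aliases in one module, every old `isinstance(x, basestring)` check can be audited and pointed at `string_types` one at a time.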

Special thanks to the friendly folks on #python for their help with understanding the issues around Python and unicode.

Next steps: testing, testing, testing

My next set of tasks will be to clean up and refine the Python 3 related changes to try and make them as targeted and small as possible. I can't realistically do this with much confidence if I can't make assertions about the library's operation. While VMware has internal build systems and tests that validate and verify the library against a variety of product builds, I can't realistically expose these since they are intimately related to shipping product itself.

The pyVit (pronounced pi-vət) project is a first step toward creating a test suite for pyVmomi that a 3rd party could easily consume to validate the library and their environments. This will help us move the library from its current status of beta to a 'stable' category. I also hope this will enable broader stewardship of the pyVmomi library allowing it to take on a life of its own. Ideally, this will also enable anyone to fearlessly refactor the library.

More on that next week...

2014-07-03

pyVmomi and the road to Python3 - part 2

Short week this week, so a short update ...

I've spent most of my time with my head in the debugger flipping between Python 2.7 and Python 3.4 interpreters. I've been observing the differences in behavior between the two versions and finding ways to make pyVmomi behave with both. Some of the more interesting work has had to do with how Python 3 handles class loading and namespaces. I could probably write a whole post on that alone.

In my current attempt, even with a good debugger I've been forced to resort to bear-skins-and-bone-knives development. The problem being that the critical fault occurs during interpreter load. That has to do with how pyVmomi builds its binding.

At runtime pyVmomi uses a LazyModule class to hold instances of class definitions it needs. This leads to some efficiencies in cases where very little of the actual API is being used. The API is enormous as you can see from the generated documents I posted so this efficiency makes sense. It does mean that the pyVmomi classes that you use in your own code are not python types as you normally experience them. The binding does produce a convincing illusion, however.
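
To give a feel for the lazy-loading idea, here is a minimal sketch. It is not pyVmomi's LazyModule; the `LazyNamespace` class and the shape of its `definitions` argument are assumptions of mine, chosen only to show classes being built on first use and cached afterwards.

```python
class LazyNamespace(object):
    """Builds each class the first time it is asked for, then caches it."""

    def __init__(self, definitions):
        # definitions maps a type name to a dict of class attributes.
        self._definitions = definitions
        self._loaded = {}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._loaded:
            try:
                body = self._definitions[name]
            except KeyError:
                raise AttributeError(name)
            # Create the class on demand from its stored description.
            self._loaded[name] = type(name, (object,), dict(body))
        return self._loaded[name]


vim = LazyNamespace({
    "VirtualMachine": {"kind": "vm"},
    "Folder": {"kind": "folder"},
})
vm_class = vim.VirtualMachine
print(vm_class.kind, sorted(vim._loaded))
```

Only `VirtualMachine` has been built at this point; `Folder` costs nothing until somebody asks for it. The price is that a broken definition only fails when it is first touched, which is part of what makes this kind of code hard to debug.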

I've bumped back the next release milestone from July 15th to July 31st to give us time to validate the changes for Python 3 support since they will be slightly more invasive than I had hoped for.  I want the next post to pypi to include that Python 3 support patch if possible.

Lessons learned from porting pyVmomi to Python 3 will be generally valuable beyond just pyVmomi but, I'm still in the thick of it. So far progress is encouraging but there is still a way to go before declaring the new version stable.

More on that next week...

2014-06-27

pyVmomi and the road to Python3 - part 1

This week in pyVmomi development I've been working on Python 3.x support. In my opinion this is the most important move to make in pyVmomi development. While Python 2.7 will have a long life ahead of it, with support continuing through 2020, most state of practice projects will be moving to Python 3 well ahead of that.

As a library, our target audience is the developer working on new products right now so tying the project to Python 2 only means cutting out a majority of our potential adopters. For the general health and longevity of the project we need to make the move.

That means the first, most invasive, and most volatile change to be made to pyVmomi will be to change the library just enough to allow it to operate on both Python 2 and Python 3. And I'm already in the process of making these changes while also writing a test suite for the library.

In the process of writing these changes I've found that there's more to do than initially thought. Python 3 migration was one of the blockers that initially made me pause when I was evaluating the use of pyVmomi in OpenStack. I went from very optimistic, to the realization that there would be some work to do. Now in my current investigation, I've determined that trivial python 2 to 3 automation may not get us all the way there.

I'll be spending the next week or so investigating how to best make this move. And I'll keep you posted on how this is progressing.  I'll be looking for the simplest, cheapest, and best fit solution for the problem. That doesn't mean I'm looking for the most elegant solution but just the most effective.

Of primary concern to the pyVmomi SDK is maintaining compatibility with the existing tools and toolchains that already use it. That may mean occasionally making decisions that don't look Pythonic. The solution as a whole is more important than any one technical element. That's what we keep in mind when we design solutions like pyVmomi. How do we deliver something that doesn't just work in its beauty, but also in its practicality?

Here's a nice video to underscore this kind of thinking. It's a thoughtful analysis of why Betamax lost to VHS and it underscores the role of ecosystem. The total ecosystem a product lives within is important and it's really those factors that determine which solution is proven.



On other fronts, the pyvmomi-community-samples project is starting to take shape. This project has our lowest bar to entry. If you were just getting started and wanted to contribute that's the best place for you to go. If you want to write something but don't have an idea what to write, there is a significant backlog on the project ready for takers. Some marked with the 'low hanging fruit' tag are likely suitable for beginners.

If you are contributing to pyvmomi-community-samples then you are making a contribution to an Apache 2 Licensed project. That means you get to keep a by-line on your file if you like, but it really belongs to the community after you submit it and it's under that nice Apache 2 License. In my experience, code you post can end up in the darnedest places so you might as well post to a project that will respect your authorship.

If you are a bit more advanced and want to extend pyVmomi then I've set up pyvmomi-tools. The project's purpose is to act as an incubator while we iron out problems like the Python 2 to 3 issue. I've blogged about the project here. You'll need to be a more advanced developer to really help out with that project, and I've been hearing from people interested in talking about which way to go with extending pyVmomi beyond its core vSphere API. There's lots to do there and we'll need to set up some time for interested parties to discuss new feature sets to include.

If you are an IRC user you can hang out with me on freenode on #pyvmomi and #pyvmomi-dev or as always I'm on twitter.

We're also working on a regression testing suite for pyVmomi targeted for public use by teams interested in doing their own integration testing. This is intended to be a test suite for building a public facing CI for pyVmomi contributors, but it will also be a way for private cloud vendors to take and adapt the project and create their own in-house IaaS stress testing services.

More on that another time...

2014-06-25

pyvmomi-tools and alarm ack & reset

There are going to be use cases like "Ack & Reset vCenter Alarm implementing hidden API method". These are going to be relatively common and yet reasonably outside the definition of an official API binding. To handle this I've created pyvmomi-tools as a project to distribute these kinds of additions to the official bindings.

We'll record feature requests and after a number of these get implemented we'll provide an official release that you can pull down with `pip`. The pyvmomi-tools project will be freer to explore techniques for working with vSphere that might not be officially supportable or might break between releases.

I will want to come up with a system of warnings to let you know when you are using an API that might not survive between vSphere releases or isn't officially released and therefore isn't covered by the very generous backwards compatibility guarantees that cover the rest of vSphere.

But... that's another topic...

2014-06-20

Documentation is live

This week I've been dealing with documentation for pyVmomi. I decided to go with the Google Python Documentation standard for this current set of docs because the Google standard is very human readable and renders reasonably well in GitHub. The docs are live on GitHub now.

However, when it comes to the most useful documentation standard for inline documentation the Sphinx standard has too many benefits to ignore. In particular if you want code completion using tools like Eclipse or IntelliJ you need to have Sphinx docs and those have to be on concrete static classes.

So, for actual inline code comments I want to enforce Sphinx's standard with type hinting. These kinds of hints are very valuable for static code analysis tools and you simply can't get that with any other standard at present.
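
For reference, a Sphinx-style docstring with type hints looks roughly like this. The `power_on` function is a hypothetical helper I made up for the example; it is not part of pyVmomi.

```python
def power_on(vm_name, timeout=30):
    """Request power-on for a virtual machine (hypothetical helper).

    :param vm_name: inventory name of the virtual machine
    :type vm_name: str
    :param timeout: seconds to wait before giving up
    :type timeout: int
    :returns: a short status message
    :rtype: str
    """
    return "power-on requested for %s (timeout %ds)" % (vm_name, timeout)
```

The `:type:` and `:rtype:` fields are the part that tooling can read to work out what a call accepts and returns.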

My next set of tasks will be to build up the testing infrastructure around pyVmomi. Once that is done we'll be able to leave behind the timid and slow progress stage we've been stuck in. This will have to happen sooner rather than later. The project will officially "launch" at VMworld coming up in August and we'll need to be ready for the additional attention that will bring.

Part of this week's documentation effort has been trying to find ways to present developer documentation and developer guides to make sure that new developers don't get mired down too badly. I've been years off the speakers circuit now and I think it shows. My presentation is a bit choppy but I hope you'll find it useful.



This video that I've prepared walks through the code contribution process from fork to pull request. It assumes you have previously set up a GitHub account. This is the first time I've done one of these in a good long while and I was surprised by how badly the text blurred. I may update this later.

When you submit a pull request it should only list the commits that you wrote. If it doesn't, then something went wrong! You'll have to start over by branching from master and cherry-picking your commits. Git has a learning curve but it's worth learning and it's worth spending time with to get things down properly.

If you have a good contribution and you're really stuck, I've already pulled and cherry-picked some key submissions from other people into my local git repos. I try to preserve the code authorship marks on the patches so that you can find your committer credits on the source repository later.

Now that we know the basics of how to contribute, I'll take some time and work on precisely what we're going to work on and when we'll release it.

More on that next week...

2014-06-13

on the topic of python documentation

I was going to talk about task management and how the pyVmomi API works with that. But, this week was eaten up by documentation work.

If you're not aware, there are a number of Python documentation styles out there. In particular the Sphinx style documentation is the most popular in the Python community at large. The problem with Sphinx documentation style is that it's rather dense to read. I plan on hosting these final products on the GitHub project site, so that becomes part of the conversation as well.

Because GitHub does such a good job with markdown files, as part of my documentation work I prepared this markdown version of the documentation. This is a very nice looking set of docs that we can generate procedurally. It's relatively regular and it will be easy for people to edit and maintain. But markdown isn't anywhere near anyone's idea of a documentation standard for Python.

As a matter of human readability I'm very fond of the Google Python Documentation style. It has the benefit of being regular yet very human readable. The problem is it doesn't get you where you need to go 100% of the time. Many tools around Python are built assuming Sphinx style markups.
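
To show what I mean by human readable, here is the same kind of information written in the Google style. Again, `power_off` is a hypothetical helper for illustration only and not part of pyVmomi.

```python
def power_off(vm_name, force=False):
    """Request power-off for a virtual machine (hypothetical helper).

    Args:
        vm_name (str): Inventory name of the virtual machine.
        force (bool): Skip the guest shutdown and cut power directly.

    Returns:
        str: A short status message.
    """
    mode = "hard" if force else "soft"
    return "%s power-off requested for %s" % (mode, vm_name)
```

It reads like an outline rather than markup, which is why it holds up well when viewed as plain text on GitHub.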

So... what to do? Plainly I've not decided. I will at minimum need the Sphinx version of these documents. With the Sphinx version we should be able to enable IDEs to do code completion and other nifty tricks. But there's a fair bit of work in taking the vSphere HTML documentation and turning it into Sphinx documentation. I'm busy with a tool to finish off conversions to either format.

I've spent the week examining various ways to document the project and how to link these up in people's IDEs. I've also been looking at ways to walk people through using the library and I've been busy cataloging new enhancement directions as we move along.

The documentation work makes me very optimistic about switching to static Python classes for a future version of pyVmomi and it might even yield some interesting ways to do a new Java binding using Groovy as a dynamic language base (because Groovy and Python share a lot of language features.)

I've had to drop off here at week's end with about 80% of the solution in hand for the documents.

More next week...