pyVmomi and the road to Python3 - part 1

This week in pyVmomi development I've been working on Python 3.x support. In my opinion this is the most important move to make in pyVmomi development. While Python 2.7 has a long life ahead of it, with support continuing through 2020, most state-of-practice projects will be moving to Python 3 well ahead of that.

As a library, our target audience is the developer working on new products right now, so tying the project to Python 2 alone means cutting out a majority of our potential adopters. For the general health and longevity of the project we need to make the move.

That means the first, most invasive, and most volatile change to be made to pyVmomi will be to change the library just enough to allow it to operate on both Python 2 and Python 3. I'm already in the process of making these changes while also writing a test suite for the library.
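To give a feel for what "just enough" can look like, here is a minimal sketch of a compatibility shim. It is not pyVmomi's actual code; the names `PY3`, `text_type`, `binary_type`, and `ensure_text` are hypothetical and chosen only for illustration.

```python
import sys

# True when running on any Python 3 interpreter.
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
    binary_type = bytes
else:
    # The name "unicode" exists only on Python 2.
    text_type = unicode  # noqa: F821
    binary_type = str


def ensure_text(value, encoding="utf-8"):
    """Decode byte strings to text; pass any other value through unchanged."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value
```

The idea is to keep version checks in one small module so the rest of the library can import `text_type` or `ensure_text` without caring which interpreter is running.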

In the process of writing these changes I've found that there's more to do than I initially thought. Python 3 migration was one of the blockers that initially made me pause when I was evaluating the use of pyVmomi in OpenStack. I went from very optimistic to the realization that there would be some work to do. Now, in my current investigation, I've determined that trivial Python 2 to 3 automation may not get us all the way there.

I'll be spending the next week or so investigating how best to make this move, and I'll keep you posted on how it progresses. I'll be looking for the simplest, cheapest, and best-fit solution for the problem. That doesn't mean I'm looking for the most elegant solution, just the most effective.

Of primary concern to the pyVmomi SDK is maintaining compatibility with the existing tools and toolchains that already use it. That may mean occasionally making decisions that don't look Pythonic. The solution as a whole is more important than any one technical element. That's what we keep in mind when we design solutions like pyVmomi. How do we deliver something that works not just in its beauty, but also in its practicality?

Here's a nice video to underscore this kind of thinking. It's a thoughtful analysis of why Betamax lost to VHS, and it highlights the role of ecosystem. The total ecosystem a product lives within is important, and it's really those factors that determine which solution prevails.

On other fronts, the pyvmomi-community-samples project is starting to take shape. This project has our lowest bar to entry. If you are just getting started and want to contribute, that's the best place for you to go. If you want to write something but don't have an idea what to write, there is a significant backlog on the project ready for takers. Items marked with the 'low hanging fruit' tag are likely suitable for beginners.

If you are contributing to pyvmomi-community-samples then you are making a contribution to an Apache 2 Licensed project. That means you get to keep a by-line on your file if you like, but it really belongs to the community after you submit it and it's under that nice Apache 2 License. In my experience, code you post can end up in the darnedest places so you might as well post to a project that will respect your authorship.

If you are a bit more advanced and want to extend pyVmomi, then I've set up pyvmomi-tools. The project's purpose is to act as an incubator while we iron out problems like the Python 2 to 3 issue. I've blogged about the project here. You'll need to be a more advanced developer to really help out with that project, and I've been hearing from people interested in talking about which way to go with extending pyVmomi beyond its core vSphere API. There's lots to do there, and we'll need to set up some time for interested parties to discuss which new feature sets to include.

If you are an IRC user, you can hang out with me on freenode in #pyvmomi and #pyvmomi-dev or, as always, I'm on Twitter.

We're also working on a regression testing suite for pyVmomi, targeted for public use by teams interested in doing their own integration testing. This is intended to be a test suite for building a public-facing CI for pyVmomi contributors, but it will also be a way for private cloud vendors to take and adapt the project and create their own in-house IaaS stress testing services.

More on that another time...


pyvmomi-tools and alarm ack & reset

There are going to be use cases like "Ack & Reset vCenter Alarm implementing hidden API method". These are going to be relatively common and yet reasonably outside the definition of an official API binding. To handle this I've created pyvmomi-tools as a project to distribute these kinds of additions to the official bindings.

We'll record feature requests and after a number of these get implemented we'll provide an official release that you can pull down with `pip`. The pyvmomi-tools project will be freer to explore techniques for working with vSphere that might not be officially supportable or might break between releases.

I will want to come up with a system of warnings to let you know when you are using an API that might not survive between vSphere releases, or that isn't officially released and therefore isn't covered by the very generous backwards compatibility guarantees that the rest of vSphere enjoys.
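One way such a warning system could work is a decorator built on Python's standard `warnings` module. This is only a sketch of the idea, not a design that pyvmomi-tools has committed to; `UnsupportedApiWarning`, `unsupported_api`, and `acknowledge_alarm` are hypothetical names, and the function body is a placeholder rather than a real vSphere call.

```python
import functools
import warnings


class UnsupportedApiWarning(UserWarning):
    """Hypothetical category for calls outside the official API."""


def unsupported_api(reason):
    """Mark a function as relying on an unofficial or unstable API."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller's line.
            warnings.warn(
                "%s uses an unsupported API: %s" % (func.__name__, reason),
                category=UnsupportedApiWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@unsupported_api("hidden method, may change between vSphere releases")
def acknowledge_alarm(alarm_id):
    # Placeholder body: a real helper would talk to the server here.
    return "acknowledged %s" % alarm_id
```

Because the warning has its own category, callers who accept the risk could silence it with `warnings.simplefilter("ignore", UnsupportedApiWarning)` while everyone else still sees it.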

But... that's another topic...


Documentation is live

This week I've been dealing with documentation for pyVmomi. I decided to go with the Google Python Documentation standard for this current set of docs because the Google standard is very human readable and renders reasonably well in GitHub. The docs are live on GitHub now.

However, when it comes to the most useful standard for inline documentation, the Sphinx standard has too many benefits to ignore. In particular, if you want code completion in tools like Eclipse or IntelliJ, you need Sphinx docs, and those have to be on concrete static classes.

So, for actual inline code comments I want to enforce Sphinx's standard with type hinting. These kinds of hints are very valuable for static code analysis tools and you simply can't get that with any other standard at present.
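For readers who haven't compared the two styles, here is a small illustration of the same docstring written both ways. The functions `find_vm_google` and `find_vm_sphinx` are hypothetical stubs invented for this comparison; they are not part of pyVmomi.

```python
def find_vm_google(name, datacenter=None):
    """Look up a virtual machine by name (Google style).

    Args:
        name (str): Display name of the virtual machine.
        datacenter (str): Optional datacenter to search in.

    Returns:
        dict: A record describing the machine, or None if not found.
    """
    return None  # stub: only the docstring matters here


def find_vm_sphinx(name, datacenter=None):
    """Look up a virtual machine by name (Sphinx style).

    :param name: Display name of the virtual machine.
    :type name: str
    :param datacenter: Optional datacenter to search in.
    :type datacenter: str
    :returns: A record describing the machine, or None if not found.
    :rtype: dict
    """
    return None  # stub: only the docstring matters here
```

The Google form reads more like prose, while the `:type:` and `:rtype:` fields in the Sphinx form are the kind of type hints that static analysis tools look for.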

My next set of tasks will be to build up the testing infrastructure around pyVmomi. Once that is done we'll be able to leave behind the timid and slow progress stage we've been stuck in. This will have to happen sooner rather than later. The project will officially "launch" at VMworld coming up in August, and we'll need to be ready for the additional attention that will bring.

Part of this week's documentation effort has been trying to find ways to present developer documentation and developer guides so that new developers don't get too badly mired down. I've been off the speaking circuit for years now and I think it shows. My presentation is a bit choppy but I hope you'll find it useful.

This video that I've prepared walks through the code contribution process from fork to pull request. It assumes you have already set up a GitHub account. This is the first time I've done one of these in a good long while and I was surprised by how badly the text blurred. I may update this later.

When you submit a pull request, it should list only the commits that you wrote. If it doesn't, then something went wrong! You'll have to start over by branching from master and cherry-picking your commits. Git has a learning curve, but it's worth learning and it's worth spending the time to get things down properly.

If you have a good contribution and you're really stuck, I've already pulled and cherry-picked some key submissions from other people into my local git repos. I try to preserve the code authorship marks on the patches so that you can find your committer credits on the source repository later.

Now that we know the basics of how to contribute, I'll take some time to work out precisely what we're going to work on and when we'll release it.

More on that next week...


On the topic of Python documentation

I was going to talk about task management and how the pyVmomi API works with that. But, this week was eaten up by documentation work.

If you're not aware, there are a number of Python documentation styles out there. In particular, the Sphinx documentation style is the most popular in the Python community at large. The problem with the Sphinx style is that it's rather dense to read. I plan on hosting the final products on the GitHub project site, so that becomes part of the conversation as well.

Because GitHub does such a good job with markdown files, as part of my documentation work I prepared this markdown version of the documentation. This is a very nice looking set of docs that we can generate procedurally. It's relatively regular and it will be easy for people to edit and maintain. But markdown isn't anywhere near anyone's idea of a documentation standard for Python.

As a matter of human readability I'm very fond of the Google Python Documentation style. It has the benefit of being regular yet very human readable. The problem is it doesn't get you where you need to go 100% of the time. Many tools around Python are built assuming Sphinx style markups.

So... what to do? Plainly, I've not decided. I will at minimum need the Sphinx version of these documents. With the Sphinx version we should be able to enable IDEs to do code completion and other nifty tricks. But there's a fair bit of work in taking the vSphere HTML documentation and turning it into Sphinx documentation. I'm busy with a tool to finish off conversions to either format.

I've spent the week examining various ways to document the project and how to link the results up in people's IDEs. I've also been looking at ways to walk people through using the library, and I've been busy cataloging new enhancement directions as we move along.

The documentation work makes me very optimistic about switching to static Python classes for a future version of pyVmomi. It might even yield some interesting ways to do a new Java binding using Groovy as a dynamic language base (because Groovy and Python share a lot of language features).

I've had to drop off here at week's end with about 80% of the solution in hand for the documents.

More next week...


pyvmomi-tools: providing library extensions and tools outside of the pyvmomi release cycle

Last week I mentioned how some sample requests on pyvmomi-community-samples were bringing up a few interesting topics. These were issues I've seen elsewhere and desperately wanted to fix.

I've found that programmers new to the vSphere API have a lot of trouble dealing with the Tasks, PropertyCollector, PropertyFilter, and Views constructs. These are particular sticking points that I would like to make easier for programmers who don't necessarily want to deal with those constructs straight off. It is also beneficial to certain projects if the code for handling these kinds of issues is kept independent of the project itself.

Why not just add these functions directly to pyVmomi?
Well ... just how can I do that? The pyVmomi library dynamically generates and loads its class definitions for vim.* and vmodl.* namespaces. I'm currently playing with ways to make this happen statically but that is going to be a big change and I don't want to tackle it until we have more tooling support around the library.
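To see why dynamically generated classes resist hand-added functions and static tooling, consider this toy illustration. It is not pyVmomi's actual loader, only a sketch of the general technique using Python's built-in `type()`; `build_namespace` and `DEFINITIONS` are hypothetical names, and the two definitions are heavily simplified stand-ins.

```python
def build_namespace(definitions):
    """Create classes at runtime from (name, base name, properties) tuples."""
    classes = {}
    for name, base_name, properties in definitions:
        # Fall back to object when no base is named or known.
        base = classes.get(base_name, object)
        attrs = dict((prop, None) for prop in properties)
        classes[name] = type(name, (base,), attrs)
    return classes


# Hypothetical definitions; the real vim.* types are far richer.
DEFINITIONS = [
    ("ManagedEntity", None, ["name"]),
    ("VirtualMachine", "ManagedEntity", ["runtime", "config"]),
]

vim = build_namespace(DEFINITIONS)
vm = vim["VirtualMachine"]()
```

No `class VirtualMachine` statement appears anywhere in the source, so there is no obvious place to add a helper method or a docstring, and an IDE scanning the files has nothing to complete against.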

In general, the tactic of developing pyvmomi-tools independently of pyvmomi means we get more latitude in the tools' development. If we identify certain tools as promotion-worthy, some might get promoted up into pyvmomi itself and others might be broken out into supporting libraries. Smaller projects are easier to maintain, and having a large number of small projects means it's easier to delegate work and assure that an individual module is bug free.

However, a large number of small modules is hard to use when you are a developer. It's much easier to grab a single library. So, to handle that problem when we get there, pyvmomi-tools should morph into a top-level project in time. When you include pyvmomi-tools in your requirements you would be getting a number of smaller libraries focused on any number of VMware APIs, but you would get to interact with a cohesive whole.

That's the vision at any rate. You can compare what using pyVmomi with and without tools is like by looking at my latest samples on power_cycling virtual machines.

The big things to notice? Look at finding a virtual machine, waiting for a task, and responding to something that can cause a task to hang. Naturally it's possible to do all this well without pyvmomi-tools, but I hope you'll agree it's nice to have a well-conceived helper tool to simplify some of this work.
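As a rough idea of what a task-waiting helper involves, here is a minimal polling sketch. It is not the pyvmomi-tools implementation. It assumes only a task-like object whose `info` exposes `state`, `result`, and `error`, as the vSphere TaskInfo type does, and it assumes that `str()` of the state yields names such as "success" and "error". `wait_for_task` and `TaskFailed` are hypothetical names.

```python
import time


class TaskFailed(Exception):
    """Raised when a task finishes in the error state."""


def wait_for_task(task, poll_interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll task.info.state until the task succeeds, fails, or times out."""
    waited = 0.0
    while True:
        # Assumption: the state stringifies to "queued", "running",
        # "success", or "error".
        state = str(task.info.state)
        if state == "success":
            return task.info.result
        if state == "error":
            raise TaskFailed(task.info.error)
        if waited >= timeout:
            raise RuntimeError(
                "task still %s after %.0f seconds" % (state, timeout))
        sleep(poll_interval)
        waited += poll_interval
```

The timeout is the piece that guards against a task that hangs, and passing `sleep` as a parameter lets a test substitute a no-op so it doesn't actually wait.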

Building pyvmomi-tools this week, I actually ended up throwing out a lot of lib work I believed would be necessary based on my work with other vSphere bindings... pyVmomi's internals have some pleasantly surprising side-effects and benefits that I'd love to dissect.

More on that next week...