How to use Conway's Law for good and not evil

If you've been paying attention to our github repository you'll have noticed some noise around the following API additions to pyVmomi...
These are all presently scheduled for the next major release of pyVmomi which is currently set for December of 2014 and will be called 5.5.0-2014.2 barring any major issues.

I'll be working with the various feature teams inside VMware to figure out how best to Open Source their existing API bindings. More importantly, I'll be work with these teams to find a way that allows the effect described in Conway's Law to function beneficially on this set of projects.

In other words, pyVmomi isn't one large project with one BDFL directing the entire API. The reality is that the APIs potentially exposed by pyVmomi are in fact the result of multiple collaborating teams. Each of these teams will tend to be their own unique snowflake.

Couple this with the fact that we will also have the opportunity to leverage contributions from interested third parties, vendors, and partners and you can see that the previous monolithic structure might not survive the strain. I'll be taking some vital time to implement at least one or more extension strategies for the library and have them evaluated. This will probably cost 3 or more weeks worth of time but it will be worth the pain because these are long-lived projects with minimum life-spans on the order of 3 or more years.

As part of this research I've come back time and again to a talk by Doug Hellmann titled Dynamic Code Patterns.

Not to diminish Doug's work at all, but his Stevedore project is an implementation of the Extensibility Pattern which is itself part of the previous generation's Software Design Patterns movement. The design patterns technique came under understandable criticism for not really contributing much to programming and more than occasionally over complicating things. It's these concerns about over complication that are going to slow me down a bit. After all, making things simple is not easy and my goal is to make things as simple as possible and no simpler.

The Stevedore project does offer something to a library like pyVmomi. It is a tool to create extensions. The term 'plugin' is not really accurate when we're talking about modules like SMS. The SMS module is neither a driver, nor is it a hook, but it is an extension. These are new classes and methods that provide new capabilities from the primary facade of the pyVmomi library itself. You can call these plugins but not in the normal sense.

There is also danger in approaching the problem of extending into these new API as a plugin. It quite possibly introduces the problem of turning pyVmomi into a framework and such a transformation would be wholly inappropriate. In particular, I am considering the use case for pyVmomi as a helper library to deliver a driver for SaltStack. What happens to the outer project if pyVmomi itself becomes a framework and does this make those consumer projects unnecessarily more complex?

We want extensibility but not at the cost of adding the need to be aware of more details. We want the library to appear as a unit even though it is actually composed of multiple sub-units. That's something that requires some subtlety. It's not something to bother with if we don't need it so... why do I think we need it?

Over the next 2 weeks I'll be experimenting with SMS support provided in a couple of different ways in the hopes of finding a sustainable, flexible, and robust mechanism to use Conway's law as a means of enforcing a structure of collaboration between teams. In particular there's a set of existing organizational and structural relationships that I feel the current design violates..

In particular, at VMware, I do not personally manage, dictate, or control what the structure or design a given ... feature interface (or sub-API) looks like. It's impractical to think I could. My role is to act as a bridge between VMware and the developer communities interested in building software on-top of the VMware platforms.

Some details about the practical nature of how modules like SMS, SPBM, and EAM for use in pyVmomi are produced directly informs our library design considerations. For example...

  • Feature teams (and their management)
    • design, maintain, and release their own API
    • dictate what API are official, unofficial, and deprecated
    • have their own schedules, priorities, and deadlines
    • are fundamentally disinterested in low-level concerns
Knowing this, any design I create for pyVmomi that dictates that a feature team must talk to me first and then convince me to do things a certain way are very likely to fail. Why? Well, let's take the opposite approach as a thought experiment and see what happens.

If I were to decide that I was going to tightly control pyVmomi and force feature teams to first get approval for API from me before I clicked my big "approved" button and there by uploading their API to github that would mean I could conceivably derail their priorities or deadlines. 

What do you think would happen? I could cause a vital product or feature to slip it's release; I could force an unnecessary conflict between managers; I could find my project usurped by something the feature team felt could disentangle themselves from my meddling in their affairs. Any or all these and other scenarios could happen. In short, I would create an unnecessary friction and unintentional power-struggle as people tried to focus on issues wholly unrelated to something as trivial as providing API to people like yourself.

So, if I want to instead create a successful project in this environment I have to engineer my software so that it can function with the natural (and proper) social boundaries already present. The structure of the code itself will also influence how social interactions occur around the code base. And finally, the modularity of the code will allow me to potentially delegate power, authority, and autonomy to other people.

That's not probably something you're used to a programmer talking candidly about is it? Perhaps I'm an outlier, but I don't particularly want to structure my code to force my hand. In fact, I want quite the opposite. I want my code to empower you. That means my relationship with my software is completely inverted from what Conway was describing... I'm instead structuring the code to foster the social structure I hope to see.

So what are our design concerns?
  • For feature creators adding to pyVmomi we should,
    • leverage existing library features
    • hide low-level concerns
    • allow independent ownership
    • simplify the process of creating a binding as much as possible
  • For integrating developers working with pyVmomi in their separate projects we should,
    • present a single unified "surface" for them to use
    • hide accidental complexity but expose essential complexity
    • follow a rule of least surprise
I admittedly may think of more as I work through the problems but this captures what we're after in a nutshell. This library in the world of Conway's Law represents a bridge between multiple parties and it's structure will end up reflecting how these groups relate. The pyVmomi software sitting between these groups will be least painful to work with for everyone if it can be molded to it present reality.

After we tackle Open Sourcing these three new API I'll have a better picture of what that reality is. And, that's what large-scale software development is really about, not as much computers and code so much as people, relationships, and communication. Our tools affect our lives and better tools make better lives.

More on this another time...


Developer Community Engagement

If you've not been following along at home, pyVmomi was run in a different manner from most VMware Open Source projects. It's been a bit of a social experiment. The last two weeks since our release, I've been working on distilling lessons learned from the past five months of the project.

I did not plan on also looking into rbVmomi ... but ... at the same time just after VMworld a certain blog post started making the rounds on social media. It's clearly an opportunity to examine what we're doing at VMware around OpenSource development projects.

rbVmomi is a more typical VMware fling project. These start as a developer driven POC project and these are developed on a best-effort basis. The rbVmomi project has closed 6 issues and closed 10 pull requests during its entire lifetime as a VMware project.

pyVmomi has benefited from having my attention full time since April/May of 2014. The total number of issues closed to date is 59 with a total number of 70 merges. These differences in numbers shouldn't be surprising, that's to be expected when you go from free time development to full time development. My personal stats have become quite impressive due to the full-time activity on GitHub.

That's all nice for me, but, what does that really mean for the library? What does that mean for developer use, experience, and over-all for VMware? It's a matter of audience and SDK adoption.

Over on stackoverflow, you can see that in its entire life-span rbVmomi has had 9 questions asked as of this writing. That indicates 9 times people who are likely to seek help on stackoverflow have sought help and those 9 times only 4 of those questions got answers.

Taking a look at the same search for pyVmomi yields 24 questions over a much shorter life-span. 

13 of these questions have been answered and 19 of times people voted on the questions where as with rbVmomi no one voted. If you take a closer look, 17 of those questions occur after my full time commitment to the project. In the shorter time-frame my public commitment and effort to the library has helped increase developer engagement with the library improve by an order of magnitude.

The next question is... is this effort worth it? And how do we determine that

I'm open to suggestions. What else should I be looking at?


This week: Retrospectives

This was a short week, so I would like to take a moment and reflect on where we've been since May.

This week I've been getting ready for a creating a set of new spikes around a few of our unanswered sample requests. Out of these will come my development plan is to create new features to include in pyvmomi-tools. By the way, if you're new to the projects or haven't been following along, part of the reason for breaking out pyvmomi from pyvmomi-tools is to allow us to develop different release cycles and development standards and styles. It would allow a quicker-turn around on some items and also allow some experimentation in pyvmomi-tools that we might not want to risk in the core pyvmomi library.

This is all part of a social experiment I'm conducting trying to help VMware at large find the best way to engage with the open source communities. Earlier in the week someone called out this negative reaction in the rbVmomi community. Ironically, rbVmomi is much older, more robust, and feature complete than pyVmomi but the community reception to pyVmomi has been much warmer in recent weeks.

I'm getting reports of a number of python projects shifting from other vSphere API bindings to pyVmomi as the confidence in the library grows. I've been getting almost nothing but positive feedback from users of the library and I think we're well on our way to becoming the best way for integrators to work with vSphere. And, we're reaching that goal by encouraging better developer practice.

From some IRC chats earlier in the week I found out that at least one shop has traded from vijava to pyVmomi running in jython. I've not tested or verified jython for use with pyVmomi but I'm encouraged to hear the work is progressing well. If anyone else is attempting this kind of work I would appreciate hearing and/or reading about the experience.

I also find this language switch curious because vijava is actually quite well done. As a side note I am doubly curious because of my own past involvement in alternative languages for the JVM. I've not attempted jython with Java hybrid projects before and I'm curious as to how and why these fit together.

 I'll be presenting these stories to VMware teams to help build the case for this style of community engagement and I plan on having a series of discussions around rbVmomi as well. As I've mentioned multiple times in this blog, merely building pyVmomi up is only one of the objectives toward my much larger goal of helping change the way we write software for the cloud.

Part of that larger goal is testing, process, and engagement.

More on this next time...


pyVmomi v5.5.0-2014.1.1 - bug fix - is available from pypi.

This week was VMworld and we still managed to release a quick turn-around bug-fix release for pyVmomi. Up on pypi right now is the v5.5.0-2014.1.1  release. Part of the changes made involved improving our release process and incorporating feedback from RPM and DEB package maintainers.

If you're still using the December release, here's a unified change list between 5.5.0 and 5.5.0_2014.1.1:

  • Simplifies test-requirements.txt
  • Introduces support for tox
  • Changes version number scheme
  • Fixes README to be pypi compatible
  • Improved sdist and bdist packaging support
  • Changes to package information based on The Python Packaging Authority instructions
  • Fix bug that produces traceback when running tests for the first time

Far and away, working on the pyVmomi has been one of the most professionally satisfying projects of my career. That's chiefly because of how immediate the interaction and feedback from users of pyVmomi has been. In most other projects feedback from users of your code base has to navigate its way back to you over time, instead on pyVmomi the feedback has been immediate.

Part of this is due to the ease of public interaction that tools like github, freenode, pypi, and twitter (with ifttt automations) offers. I plan on evangelizing this working-style to other teams within VMware and your positive feedback can only help. For a developer it's a very rewarding way to work and hopefully for a customer it's also very satisfying.

Next week, I'll be working on our 2014.2 milestone and setting up to make a go at some more community samples. It should be noted that while pyVmomi itself is able to run just fine on Python 3, many of the samples are written assuming Python 2.7 ... this is fine since the two projects are decoupled.

The reason we keep the community samples repository separate from the core pyvmomi project is to allow for each project to exercise its own standards. The samples project is very loose while pyVmomi is run extremely tightly. That's a function of where on the stack each project lives. In general the deeper down the stack you go, the more rigid the project needs to run to ensure quality and stability for all the projects built on top of it.

More on the community project next week...


pyVmomi: preparing for 5.5.0-2014.1.1 our first bug fix release.

I've been heads down this week, so a short update this time.

Last week, we released pyVmomi 5.5.0_2014.1 to pypi. This week we started work on preparing RPM and Debian packages which will allow pyVmomi to be included with Linux distributions important for OpenStack work.

Having released 2014.1 I opened a milestone for 2014.1.1 as a bug fix release and identified a few smaller quick turn around tasks. We have one known bug and a few minor improvements and changes that we'll have to get in place. (Being right up against VMworld has slowed down a few tasks, but we're still moving rapidly.)

Notably the 5.5.0-2014.1.1 release will:

  • Fix a bug with sending date-time stamps to the server
  • Tweak the version number standard to fit better with pypi and RPM.
  • Standardize packaging for the pyVmomi library
  • Improve release documentation and processes for versions.
I got a bit side-tracked this week as I investigated more of the pyVmomi internals and noted that the XML documents are non-deterministically assembled. This means from one run to the next the SOAP message coming out of pyVmomi can be different in certain trivial ways.

I noted that...

  • The order of xmlns="*" attributes inside tags can occur in random order
  • The amount of whitespace in documents can vary in some situations
This meant that naive string-based comparisons done in vcrpy's built-in body request matcher can't actually compare XML document contents for logical consistency. I wrote the start of an XML based comparison system but after burning more than three days on the problem I have to stop and wonder if this is worth the effort and if it wouldn't be simpler to just figure out why pyVmomi creates random ordered XML and then fix that instead.

In the next week-or-so, we should be able to deliver a quick-turn around bug fix release with the improvements I've listed here. We're doing well for this now that we're getting more attention from last week's release. The 2014.2 is tentatively scheduled for November 15th this year.

More on these topics next week...


pyVmomi version 5.5.0-2014.1 is available.

Earlier today with help from other VMware employees and #python IRC users, I pushed the buttons necessary to release pyVmomi version 5.5.0-2014.1 to the general public. This is the first open source community built and released version of the vSphere Management API library since VMware open sourced the project in December.

This release featured contributions from 13 contributors, 10 of which were completely unaffiliated with VMware. With the exception of myself the most active contributors are not at all affiliated with VMware.

Special Thanks to...

Thanks as well to William Lam for championing our project as well.

This release marks an important step for pyVmomi, most notably the ability to use pyVmomi with vcrpy means we can do much more in-depth and thorough bug reporting for 3rd party developers. It also means that the focus of development can move from infrastructure to feature parity.

More on that next time...


On the topic of Goals and Objectives for pyVmomi

As long as I'm working on the pyVmomi project, its goal will be to become the single best way to talk to vSphere technologies. This is a broad and hard to measure thing, and with luck as VMware technologies evolve, it's also a moving target. Once we arrive at one location there will be a new version to sail toward. In order to steer accurately, we must set a course by some kind of metaphorical guiding star, and this is what objectives are meant to be.

While the goal is the destination we are sailing our metaphorical ship toward, our near-term objectives are the metaphorical stars we choose to help guide us toward that goal. Just as sailing a ship may require course corrections, which implies choosing different stars; so too might managing a project require course corrections which implies choosing different near term objectives. When we course-correct ourselves, our projects, our teams, or our careers, we are changing the objectives much as a ship navigator might choose a different guiding star at different times during a ships journey toward a distant destination.

In common parlance, the terms goal and objective are virtually synonymous but it makes sense to make a distinction between them when we're talking about conducting the daily business of building software. Being "the best" is kind of hard to measure, it implies quality, adaptiveness, and nimbleness in the library as well as a breadth of uses for the library. This requires us to choose some guiding stars with wisdom.

How do you design a library for these attributes? How do you pick the correct guiding stars?

A little philosophy

A little philosophy is necessary to guide how we go about setting our objectives to achieve our goals. Just a little philosophy is necessary, too much and we spend our days navel gazing, too little and we flail about wildly wondering why we work so hard and accomplish so little. Regular periods of slack followed by pressure are best for this. Creativity can only come when you are free to explore but productivity only really truly solidifies itself under threat of deadline.

So what are some philosophies I could hold in this case?

A few contrary points of view

I might think that audience size is the best measure of project success. That might mean I would have less priority on other aspects of the software and more on things that make the software appeal to a wide audience. In other words, a bug's priority only matters as a measure of the number of new users it can net.

Examples of projects guided by this principle to the exclusion of others might include things like PHP. The design choices in PHP were early on driven by getting broad adoption and rapidly securing mind-share. Very little time was spent thinking about the long-term impact of design decisions. The result is PHP is the most successful programming language (in its niche) of all time and one of the most hated.

I might choose to believe that 'feature count' was a better measure for project success. If I believed that cranking out code was the way to 'get stuff done', then I would probably believe that anything that got in your way of 'getting stuff done' was a waste of time. That would probably mean I would be after the largest volume of code in the shortest amount of time. More code means more function right?

The problem with this is feature creep. If you want to keep a light nimble software project that can respond quickly to changes in its environment a small modular project is best. You keep the project itself small or suffer the consequences. There's usually an 80/20 rule for features and following it can result in faster release cycles.

After years of working on software systems big and small, with tiny audiences and accidental audiences of millions, I've come to believe a few things.

In Generalities

 I feel that the competition between Xbox versus Playstation in the marketplace is a good case-study of these philosophical disagreements in action. The results have mostly played out in the marketplace for us all to study. If we take a lesson from this history we might be able to improve our own state of practice.

In 2012 it was hard to tell if there was a marketplace winner in the video game console market. The three top video game consoles of the time period had traded top position on various metrics repeatedly but by Q2 2014 there is a clear winner in market sales (only time will tell if this is a permanent state of affairs).

Sony had always invested in a complete engineering package for its consoles and frequently talked about 10 year plans with its ecosystem. Ironically, this same business strategy had failed them before. When it came to BetaMax versus VHS this same strategy of 'technical superiority' did not pay off, however, and that's a cautionary tale. The entire engineering process matters.

When building a system you have to take into account all the pertinent forces shaping its market share. These include multiple product audiences, shareholders, and customers, as well as multiple engineering considerations about how the product functions. Not the least of which includes the process by which you create the product itself.

Engineering Objectives

Audience size matters, feature count matters, and perceived quality matters. Each affects the other and each affects the total impact of your software project. Minmaxing on only one dimension or another doesn't necessarily equate to a lasting victory. So we need to find ways to incorporate all these elements into our daily choices. That's what setting objectives is all about.

Over the years, I've been thoroughly impressed at how products generated by very bad engineering can sometimes capture and dominate markets when very good engineering fails. I believe the problem comes from improperly balancing objectives. A single dimension of engineering and design has been maximized at the expense of balancing concerns. It's far too easy to make an easy to measure metric, set it as an objective, and steer the metaphorical ship by a star that has nothing to do with the goal. Such engineering produces something that is arguably beautiful yet broken.

Broken Strategy

For example, a typical strategy used to solve quality issues in software systems is to increase test coverage. Coverage is an easy number to measure. It makes nice charts and gives a wonderful feeling of progress to developers. It's also a trap.

Merely increasing code coverage does not universally improve the code in all its dimensions. In fact, improperly applied test coverage can create tightly coupled systems that are worse suited. This is perhaps the starkest lesson you can learn about successfully reaching a 100% code-coverage goal: you can end up with more technical debt not less. (I could call out certain open source projects here but I won't for brevity's sake.)

If no metric can measure this concept of tight coupling to balance the code coverage metric, then, merely measuring code coverage percentages pushes the software design in the wrong direction. Your team will optimize for coverage at the expense of other attributes. One of those attributes can actually be code quality (in that fixing a simple bug can take an inordinately long time) and flexibility (in the sense that your code can lose the ability to respond to new requirements).

I have come to believe Test Driven Development just as code coverage can also become a trap. Improperly applied, it similarly optimizes systems for unit tests which may not reflect the real design forces that the unit of code is under. These circumstances the code developed can end up very far from the intended destination just as high code coverage numbers can degrade actual quality of a software system.

Actively Compensating

Agile methodologies were intended as a tool to compensate for this dis-joint between the steering stars of objectives and the actual destination. The ability to course correct is vital. That means one set of objectives are perfect for a certain season while the same objectives might be completely wrong for another season.

To effectively use these tools (agile or otherwise) you can't fly by instrument. You need to get a feel for the real market and engineering forces at play in building your software product. These are things that require a sense of taste and refined aesthetics. You don't get these from a text book, you get them from experience.

My experience has taught me that you actually don't want to write more code you actually want to write less. You want to accomplish as much as possible while writing as little code as necessary without falling into code golf. That means that the most effective programmer may have some of the worst numbers on your leader board. Negative lines of code might be more productive than positive, fewer commits may be more profound than more. The mark of good engineering is doing a great deal with very little and that's what we strive for in software engineering.

From Philosophy to Concrete Objective

In the case of pyVmomi, we have no open sourced tests that shipped with version 5.5.0 as released from VMware's core API team. (Note: there are tests but they are fenced off from public contributors and this is a problem when it comes to getting quality contributions from the general population.) With no unit tests available it is almost impossible for a contributor to independently determine if their change positively or negatively impacts the project's goals. Some over-steer in the area of code coverage would be forgivable.

I also want to avoid solidifying library internals around dynamic package and class creation as well as internal aspects of the SOAP parser and its details. This puts me in an awkward position because the simplest most naive way to fence off units and begin busting-gut on test coverage would also force the tests to tightly couple onto the classes currently defined in what long-time pyVmomi developers refer to as the pyVmomi infrastructure.

Separation of Concerns

The fact that there is even a term 'pyVmomi infrastructure' means that there is an aspect of the library that some people need to talk about separately from some other aspect of the library. That indicates a conflation of separate concerns. This particular point in itself would be a lovely talking point for a whole different article on how software engineering becomes social engineering after a certain point. To become a highly used, trusted, and distributed library; pyVmomi should disambiguate these concerns. But, I digress.

Application of Philosophy as Strategy

I spent no less than three weeks developing the test strategy for pyVmomi that will allow us to test without boxing in the library itself. The strategy leans heavily on fixture based testing and specifically on the vcrpy library. In our case, the nuance is that our fixture needs to setup a fake socket with all the correct information in it to simulate an interaction with vCenter and/or ESXi without requiring mocked, stubbed, or simulated versions of each.

If we avoid testing directly design elements (things like the XML parser itself), and we avoid testing in isolation concerns that are deemed infrastructure versus not-infrastructure, then we are left with testing the API "surface" as exposed by pyVmomi. The unit tests call on the actual symbols we want to expose to developers and these are the API surfaces as I call them. The outermost exposed interface intended for end consumption.

The shape of these fixture-based tests are virtually identical to targeted samples of the API pyVmomi is binding. Given a large enough volume of use-cases these unit tests with fixtures might eventually encompass a body of official samples. Existing as tests means that these samples will also validate the fitness of any changes against known uses of the library.

This strategy effectively retro-fits tests onto the library without locking in design decisions that may not have had very much thought. It frees us to build use-cases and eventually fearlessly refactor the library since the tests will not tightly couple to implementation specifics and instead the tests couple to interface symbols.

Objectives Accomplished

We want pyVmomi to continue to exist long enough that it can accomplish its goal of being the best library for working with vSphere. To survive, we need the library to have a lifespan beyond Python 2. We need the library to allow contributors to objectively measure the quality and fitness of their own contributions so it attracts enough developers to evolve and spread toward its goal.

So far we've accomplished the following objectives in the up-coming release due to come out in mere days:
  • Python 3 support gives the pyVmomi library time to live and flexibility to grow
  • Fixture based tests give users and contributors confidence to develop while also...
    • avoiding design detail lock-in
    • hiding irrelevant library infrastructure details
    • providing official samples based on actual API units that will not break as the API changes
  • we also established contribution standards

Objectives to Accomplish

While we want pyVmomi community samples to evolve unrestricted and rapidly, it is also the source for requirements for the library. The samples project is separate so that it can welcome all comers with a low barrier to entry. But, it is very important as it will feed the main pyvmomi project in a very vital way. The samples become the requirements for pyVmomi.

The samples and the pyvmomi unit tests need not have a 1-to-1 relationship between sample script and test, but each sample should have a set of corollary unit tests with fixtures that give basic examples and tests for the use case illustrated in the parent sample. That means one sample might inspire a large number of unit tests.

These are some of the high level objectives to reach going forward on pyVmomi:
  • remain highly reliable and worthy of trust
    • cover all major vSphere API use cases in unit tests with fixtures
    • squash all high priority bugs rapidly which implies releasing fixes rapidly
  • reach feature parity with all competing and cooperating API bindings
To reach these new objectives we'll need to do some leg work to find way-points along the way. We may change some of our finer targeted objectives later, but these objectives help us reach the goal of being so good nobody working in this space can ignore us, being so good we deserve the title the best.

More on that in a future post...