Code Samples and the argument for WET versus DRY

In my new role at VMware I've been tasked with writing new code samples for the various APIs that support the ecosystem around VMware's product line. This has been a very different experience for me.

I practice the principle of Don't Repeat Yourself (DRY) in most of the code I've produced during my career. In the last few years I've adopted Test Driven Development (TDD) as well as other techniques that make sense. In creating code samples most of these things still apply but DRY sort of doesn't.

The purpose of code samples is to highlight the use of the API, and the target user is a programmer who is in a hurry. A methodical and slow programmer wouldn't necessarily even need samples, since they would read the documentation and work out via TDD how to use the API themselves. So my work with samples is for the developer who won't necessarily even stop to turn their attention to the documents.

That means there is a need to Wholly Express our Terms (WET). In other words, if I make my code too DRY, there is an opportunity for the learning programmer to miss vital steps. The solution is to choose what to repeat very carefully. This creates code repetition I would not want in a production system, but it should not completely brush aside code reuse.

The trick is to package steps up into reusable units that are easy to compose but are still expressed in the end product. Here's a nonsense, non-NDA-violating example:

    class Frobnicator extends Thingicator {
        public void frobnicate() {
            Something s = someService.doThing0();
            Otherthing o = someOtherService.doThing1(s);
            Anotherthing a = someService.doThing2(s, o);
            // ... use 'a' to finish the algorithm ...
        }
    }

Hopefully we get to demonstrate, for the end user, the salient parts of an algorithm that leans on services. If I have to repeat the lines for how to get 's' and 'o' over and over, that is not as bad as repeating the contents of doThing0 and doThing1 over and over. The argument can still be made that by being WET (which also stands for Write Everything Twice) we are being instructional.
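To make that trade-off concrete, here is a minimal sketch in plain Java. Every name in it (Session, Catalog, login, fetchCatalog) is invented for illustration and is not any real VMware API: the same setup is written once as an over-DRY helper and once as WET sample code.

```java
import java.util.List;

public class SampleStyles {

    // Stand-ins for the API under demonstration; both are invented.
    record Session(String user) {}
    record Catalog(List<String> items) {}

    static Session login(String user) { return new Session(user); }
    static Catalog fetchCatalog(Session s) { return new Catalog(List.of("a", "b")); }

    // Too DRY for a sample: a helper hides both setup steps, so the
    // reader never learns how a Session or Catalog is really obtained.
    static Catalog setUpEverything() {
        return fetchCatalog(login("demo"));
    }

    // WET sample: the setup lines repeat in every snippet, so a reader
    // who copies only this method still sees every vital step.
    static List<String> listItemsSample() {
        Session session = login("demo");          // step 1: authenticate
        Catalog catalog = fetchCatalog(session);  // step 2: load the catalog
        return catalog.items();
    }

    public static void main(String[] args) {
        System.out.println(listItemsSample()); // prints [a, b]
    }
}
```

The helper is the version I would want in production; the repeated setup lines are the version that teaches.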

Conversely, we could argue that DRY code should still be used here since any good coder should be able to trace back the compositions and decompose what happened. The trick with balancing WET and DRY in a code sample is to recall that we are not writing a framework or a library or even a prototype. The purpose of a code sample is to document.

So I'm learning where to balance WET versus DRY in this game of writing code samples. I tend to prefer erring on the side of too DRY rather than too WET, since that leaves me fewer things to think about in the future.

Incidentally, one of the things I would normally avoid is inheritance, but in the case of code samples I feel it plays very well, since I can say "this algorithm I'm showing you is a variation of ..." which is probably very instructive. In another system I might have used functional programming paradigms instead. Remember, I'm trying to keep a low bar in these samples and avoid the temptation to teach both the API and functional programming at the same time. The goal is not to introduce too many hurdles.

Do you disagree with me? Should I just use all the best practices I know and force my sample reader to climb to my level? Am I condescending by making my code samples WET assuming too low a bar for the new developer? Your feedback is welcome.


Grails Plugin Testing Strategies

I've been maintaining Grails plugins for years now, both on the wild-wild web and as private, enterprise-only plugins. In this talk at GR8Conf I shared stories from that history, the problems I encountered, and how I solved them. Grails 2.x offers us so much in the way of testing and mocking that we can now get rid of a lot of the crazy things I used to do to test my plugins.

Ultimately, however, when you do things at the persistence layer you need a full integration test against the database, since there's no real substitute for a real database. So my advice is to avoid at all costs dipping down any lower in the architecture than you absolutely have to. Good plugin testing starts with good plugin design... design to test.

And if you do manage to do something naughty (as I often seem to want to do) then you should make sure you have lots of safety net underneath you with a rich testing environment all around your code.



Groovy Integration Patterns or The Grobnicating Frobnicator

I just gave the Groovy Integration Patterns talk at GR8Conf.us, otherwise known as the Grobnicating Frobnicator talk, in which I take us on a journey through techniques for introducing Groovy into enterprisey environments. Along the way I take a really brief foray into the "Planet of the Monkey Patches," where we discuss some interesting interactions you can do with Groovy embedded in a mostly-Java application. Then we move into the world of multi-tenant Java applications that can leverage Groovy and Groovy DSLs to swap out functionality in configuration or dynamically at run-time.

The code is over at https://github.com/hartsock/groovy-integration-patterns 


Unix Philosophy meets Java

In all the literature on computing there's one quote that I find the most profound and illuminating. In three sentences it successfully encodes what I intuit must be a fundamental law of system design. It is the first of my guiding lights when I design anything.
From Wikipedia:
Doug McIlroy, the inventor of Unix pipes and one of the founders of the Unix tradition, summarized the philosophy as follows:[1]
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
And, if you take this philosophy and apply it to Java... you get Separation of Concerns, Dependency Injection, and Interfaces with simple Data Transport Objects.
In Java, you would make one class do one job. When you need two jobs done, we know inheritance eventually makes a mess, so we use composition. The best way to handle that composition is to make a generic interface with multiple implementations and inject the correct implementation when needed. Then we can bind together loosely coupled services to compose the right application, or plumb together a new application later.

The Dependency Injection model in Spring mirrors the pipes in Unix. Classes are supposed to do one job. Classes are ideally designed to be composed together to do a job. This part requires a little stretching: the use of interfaces with separate implementations that are wired after compile time mirrors the use of standard input and standard output.
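As a sketch of that mirror in plain Java (no actual Spring here; TextStage, the stage classes, and Pipeline are all invented for illustration), each "program" does one job on a text stream, and the composition is injected rather than hard-wired:

```java
import java.util.Locale;

public class PipeStyleWiring {

    // The "universal interface": each stage reads text, writes text.
    interface TextStage {
        String process(String input);
    }

    static class UpperCaseStage implements TextStage {
        public String process(String input) {
            return input.toUpperCase(Locale.ROOT);
        }
    }

    static class TrimStage implements TextStage {
        public String process(String input) {
            return input.trim();
        }
    }

    // The "pipe": a class that does one job, composed from injected stages.
    static class Pipeline implements TextStage {
        private final TextStage first;
        private final TextStage second;

        // Dependencies are injected, not constructed internally, so
        // either stage can be swapped without touching this class.
        Pipeline(TextStage first, TextStage second) {
            this.first = first;
            this.second = second;
        }

        public String process(String input) {
            return second.process(first.process(input));
        }
    }

    public static void main(String[] args) {
        // Analogous to: echo "  hello  " | trim | toupper
        TextStage pipeline = new Pipeline(new TrimStage(), new UpperCaseStage());
        System.out.println(pipeline.process("  hello  ")); // prints HELLO
    }
}
```

A DI container like Spring does this same wiring from configuration instead of in main, the way the shell wires programs with |.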

Extend this analogy to the network and it is easy to see that the web itself is this core Unix philosophy writ large. Programs are connected through standard protocols and work together to produce the intended result. Often these protocols are nothing but the standard text streams that undergird the Unix pipe. In one sense, the entire dot-com boom and the subsequent internet-related advances are all built on these three simple ideas that Doug McIlroy so clearly illustrates.

History suggests that if you observe these rules, your long-term prospects are much more favorable than those who do not. Consider the internal design of various systems and what happened to them over time: for example, Apple pre and post the OS X shift, or Windows pre and post the VMS infusion.

Now consider the design of Android, the OS. Few other operating systems have so completely embraced this design philosophy and been so completely designed around the web. The Intent and the subsequent OS design are highly instructive.

Now, what will history prove about this new intersection of the unix philosophy and Java? Time will tell.

EDIT: I added a link to the Android Intent Java documentation. I felt it would be clearer that I meant the Intent Java interface and not merely the English word "intent" ... this was intended as an aid to casual readers.


there's engineering and then there's social engineering

Twitter is on the face of it such a simple application that it can be built in 40 minutes in Grails.

For a consummate software engineer with state-of-the-art tools, building a Twitter clone in 40 minutes that can scale... might even be easy today, given a proper set of cloud-enabled tools. But even if it is an exact working copy of Twitter and can scale to ten times the size of the current Twitter... it's NOT Twitter and will never be Twitter.

Twitter is a brand now. It is more than merely a functional set of software performing a simple function. It is a user experience driven not by technology but by human interaction. That experience in itself constitutes a wholly separate kind of real engineering.

In fact, if Twitter were to completely rewrite their application on a whole new technology stack, the net effect on their brand (providing only nominal interruption of services) would probably be zero. That's because, aside from being a brand and an application, Twitter is about doing something very simple (and rather boring to systems-level folks) with your computer and with existing internet technologies.

So why aren't there a bajillion Twitter-like applications? Lord knows at one point people tried. Well, there are; just none of them are mentioned on CNN or the Colbert Report. There was a time when Twitter was novel, and now it's not. But that doesn't matter anymore, because we all know about Twitter.

It's just like Windows. Or Google. Or Kleenex. And all the interesting socio-economic issues that go with the success of a brand like Kleenex, and the problems Kleenex has keeping people off its damn lawn. See, you can't call your tissue paper Kleenex, because only we can call it that, even if what you make is basically the same thing.

You may not care if a tissue paper is of equal quality to Kleenex. You may choose to still buy Kleenex because you remember warm and fuzzy things about that Kleenex box. You know what to expect from a Kleenex box. You like Kleenex. So you buy it.

Whenever you consume something with an intent beyond the mere act of consumption then anything that distracts you from your end goal is a waste of attention. In this day and age we can't afford to pay much attention so we had better conserve it. Your customer service representative is there to try and keep you happy and consuming.

The same is true for technologists just as it is for any other demographic. When a technologist becomes comfortable with their favorite tool X, no other tool can be as good as X, even if it is better. That's because technologists catch brand-loyalty viruses too.

So I'm going to ask you... are you sure you like to use the tools in your toolbox because they are the best for the job or are you just buying Kleenex because that's what you buy? Is it worth thinking about?


Beyond Finite State Automatons

Has someone moved computing beyond the Finite State Automaton in recent years? I'm unaware of any system of computation in widespread use that is not a Finite State Automaton. If someone has one, please let me know. I've heard of some researchers in Tel Aviv working on a neural net.

Considering that all we have to work with, even in the clouds, are FSAs, it seems that we will always be working with conceptual derivatives of the Turing processing tape. So as much as I would like to see something other than a souped-up text editor for creating code, I have to say that while some feel an IDE is not enough... I'm afraid it will basically have to do when it comes to what we call "programming," until we escape the confines of the FSA.
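To keep the abstraction honest, here is what the toy itself looks like: a hypothetical two-state automaton, sketched in Java purely for illustration, that accepts binary strings containing an even number of '1's.

```java
public class EvenOnesFsa {

    enum State { EVEN, ODD }

    // The entire transition table: '1' flips the state, '0' keeps it.
    static State step(State s, char c) {
        if (c == '1') {
            return (s == State.EVEN) ? State.ODD : State.EVEN;
        }
        return s;
    }

    // Run the machine from the start state over the input tape.
    static boolean accepts(String input) {
        State s = State.EVEN; // start state
        for (char c : input.toCharArray()) {
            s = step(s, c);
        }
        return s == State.EVEN; // EVEN is the accepting state
    }

    public static void main(String[] args) {
        System.out.println(accepts("1010")); // prints true
        System.out.println(accepts("111"));  // prints false
    }
}
```

Everything we run, however elaborate, is in principle a (vastly larger) table of states and transitions like this one.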

Just as an FSA is the core definition of behavior for a computer, DNA is the core definition of a cell's behavior. To be sure, that's not all that's going on. There's a whole mess of inputs and environmental factors too. Where do we see similarities among FSAs, DNA, and epigenetics? The cloud.

When working with emergent systems, however, the external forces may produce unforeseen effects. These, however, remain in the space of discrete systems. If they did breach into the world of continuous systems, this would pose a problem on the order of the incompleteness theorems. We would have found the bridge to the mystic. Our limited discrete systems would be able to cross the cosmic gulf into the divine space of continuous systems.

As of yet, we can't even prove that physical reality can represent irrational numbers. We know they exist. We know they are real... we just can't represent them with atoms, particles, or waves. We slice finer and finer and we hit a generalizable uncertainty principle. So... what does that mean for my programming language?

Basically, it means that every representation of information contained within our universe (as best we know it) is lossy and discrete. We can't ever create a representation of information that is precise as a datum. Instead, we can create FSAs, or algorithms, that can compute a particular unit of data to arbitrary precision.
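A small illustration of that last point, using Java's BigDecimal (this Newton's-method sketch and its names are mine, not from any particular system): we never store sqrt(2) itself, we store a procedure that approximates it to any requested precision.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class ArbitraryPrecision {

    // Approximate the square root of n to the given number of significant digits.
    static BigDecimal sqrt(BigDecimal n, int digits) {
        MathContext mc = new MathContext(digits + 5); // a few guard digits
        BigDecimal x = new BigDecimal(Math.sqrt(n.doubleValue()), mc);
        BigDecimal two = BigDecimal.valueOf(2);
        for (int i = 0; i < 50; i++) {
            // Newton iteration: x <- (x + n/x) / 2
            x = x.add(n.divide(x, mc)).divide(two, mc);
        }
        return x.round(new MathContext(digits));
    }

    public static void main(String[] args) {
        // Ask for more digits and you get more digits; the irrational
        // number is never represented, only approximated on demand.
        System.out.println(sqrt(BigDecimal.valueOf(2), 30));
    }
}
```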

Now it is true that we have no sufficient model of computation to properly represent concurrent computing in a fully realized cloud environment. In systems programming discussions today we have to consider the effects of millions of independent FSA acting on a network. The cloud provides us an illusion of locality for things that are not actually local. The fabric of the network and the true concurrency of the cloud exacerbates issues of consistency, atomicity, and durability of systems in the cloud.

In a nutshell, that's why we have big moves toward NoSQL and non-relational data stores. These solutions seek to eliminate centralized state. They compartmentalize the state issue inherent in a big ACID-compliant data store.

If you don't believe me... have you ever tried to debug a neural net? How about a Lisp program? How about a fully distributed system? These are non-trivial problems to debug. In this case I have to agree... an IDE is not enough.

In all of this, I continue to wrestle with one of the original problems my first research projects touched on... the emergent system. Somehow we have only discrete matter, and yet it breaches into the world of the divine. Simulated annealing (SA) breaks out of local minima and maxima. The evolutionary algorithm (EA) gives rise to the mystical. And these feats are accomplished by interacting FSAs: discrete systems described by strings of symbols, yet not fully understood through those strings alone.

Yet this is all we have... and all the evidence we have says that all the universe itself has are these tiny Planck lengths filled with properties, each potentially a cosmic symbol in a universal Finite State Machine.

So I ask... has anyone found something other than this kind of a toy to do computation on? Will anyone ever find such a thing? When will we know all the digits in Pi?


The answer is 5

Many, many years ago I got the chance to work with a highly seasoned developer. This guy had helped write tools like make. Yes, kids... he worked on an early version of make. So as he came up to speed on our new system I left him be, figuring who the heck was I to direct his work at all? I was just some schlub who had started at the company earlier.

This seasoned developer got under the gun on his first project at our little company, and we had a monthly UAT cycle. In the process this company used, two weeks out you would freeze new development and then turn the whole system over to test. New development would start on a new branch, as we practiced a "clean trunk, dirty branch" style of version management. (Today I prefer dirty trunk, clean branch, but I digress.)

UAT went badly, and as was the policy in this company, the responsible developer had to turn around and fix the bugs found by the human testers (and turn them around rapidly) so the whole human-driven testing cycle could start all over again. Some cycles went smoothly. Some cycles were hell, as the release date never moved and people would have to work longer and longer hours as deadlines loomed closer.

As was typical, a client might hold off on their contractual obligation to test. They might wait until the day before delivery to test. We had no automated testing at this little company; in those days such things were not common. There was no clause that let us move the deadline. That meant if that client reported something that prevented the project from shipping, we would have to work through the night to fix the bug and then sit attentively waiting for the client to test again and approve.

So when this seasoned paragon of coders came upon just such a situation what did he do? He fixed the bug in nothing flat and got everything through UAT late that night. He was a hero.

The UAT scripts run that night were all written to test for "5", and so there was a bug in the UAT process. We did not retest *everything* after a failure; we only retested the failed things. We did not retest everything before we shipped... we only tested the failures. But can you blame the testers? They were tired. They had been up late for days testing. They had to repeat the same boring actions over and over and over... can you blame them for beginning to shortcut tests?

I looked over the changes, and perhaps it was because I was a fresh pair of eyes... but I swear... I could see it right away. Somehow, the code had entered a tautological loop that I don't remember exactly; suffice it to say... the only answer it could ever return was "5". But because of the process, I couldn't stop the deploy, and I couldn't slip-stream the changes, because I had to wait for the next cycle of UAT.

We would end up in an eight-month-long, non-stop fire-fight due to that one mistake. When the fires finally died down we would have a staff party and celebrate our victory. Yet we would do little about our mistake other than remember it and make jokes about "5" being the right answer to everything.

To this day, if you ask me a question and I'm tired, I may respond with a whimsical "5"... and all my coworkers from that nameless and thankless job (you know who you are) will chuckle along with me. We all learned a valuable lesson none of us will ever forget, because it is burned into our foreheads with the brand "5" as its name.

So every time I think "eh, I'll skip the tests," I remember what it is like to have no automated testing at all. I remember what it's like to not be able to quickly tell that you've just made a mistake that means your method only returns 5. I remember eight months forged in hell that nearly crashed that company, and I run the tests again anyway.


Clojure User's Group @ Google NYC [Feb, 2011]

This post was sitting in my drafts bin for a while but I decided to finish it off and post it today. My apologies if it is a bit dated and not up on current Clojure syntax.

I was at Clojure User's Group, NYC back in 2011...

And Stuart Halloway flashed this code on the screen:

(describe String (named "last"))

He complained that while it appears simple, it is in fact not simple, and showed this code instead (which does the same job):

    (->> (reflect java.lang.String)
         :members
         (filter #(.startsWith (str (:name %)) "last"))
         print-table)

He then claimed the new code was simpler. I had to admit I just didn't see it. How is the second listing simpler? The idea is to describe some data in table format, right? "describe" seems pretty simple to me.

Over beer that night, Stuart illuminated my foggy understanding. The second listing is simpler because it introduces fewer new constructs to the language. It is also simpler because the components are smaller, looser in coupling, and can be reused later without some special facility or dependency on printable output. The key to the reuse is that "print-table" now does tabular output on any incoming data, as opposed to hiding that nugget of functionality deep inside the "describe" machinery.

We went on to discuss the issues with ORM and how mapping data adds a layer of accidental complexity. Being a good disciple of the Java/Groovy camp, I believe in hiding complexity behind clever facades like Domain Specific Languages, frameworks, and APIs. This stands in stark contrast to the world view of a good disciple of Lisp.

What makes Clojure interesting is the deliberate collision between these two world views. I normally would not be forced to dredge up my decade-old exposure to Lisp and place it directly in front of what I've learned from Java over the last decade. The Clojure folks force us to do that, and you can blame this momentary encounter with Stuart for my talk on Functional Programming in Java and for the thought experiment that is lambda-cache, which is an attempt to do caching (via ehCache) using a lambda construct in Plain Old Java Objects.
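As a rough illustration of the idea only (this is NOT the actual lambda-cache code, and a plain ConcurrentHashMap stands in here for a real cache like ehCache), caching can itself be expressed as a function that wraps another function:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

public class LambdaCacheSketch {

    // Wrap any Function in a memoizing version of itself. The expensive
    // computation is passed in as a lambda; the wrapper decides whether
    // to invoke it or return a previously computed value.
    static <K, V> Function<K, V> cached(Function<K, V> expensive) {
        ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
        return key -> cache.computeIfAbsent(key, expensive);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> slowSquare = cached(n -> {
            System.out.println("computing " + n); // printed once per key
            return n * n;
        });
        System.out.println(slowSquare.apply(7)); // computes, prints 49
        System.out.println(slowSquare.apply(7)); // cache hit, prints 49
    }
}
```

The caching concern and the computation stay decoupled, which is exactly the kind of separation the Lisp camp argues for.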

What is interesting is that as we explore Functional Programming on the JVM, and as Project Lambda matures, we are finally going to explore how best to reconcile these differing views of simple. That is because simple is anything but simple. In its purest form, simple is really an analogy for beauty when it comes to computer science or mathematics. The value of beauty is debatable. How to achieve beauty is at least in part subjective.

What is most surprising is that there are objective measures of beauty in many fields. For example, humans find adherence to the golden ratio in proportion to be beautiful. There may in fact be a measure of beauty that can be computed mechanically. Similarly, is there a measure for beauty when it comes to computing? Ostensibly, simple is the opposite of complex, and computational complexity theory is still a dark, murky thing. Do we need to shine some light into the corners of complexity theory before we can measure simplicity?



I had a great time at POSSCON 2012, met up with Steve Graham (a fellow Triangle resident), got to talk to Jon 'Maddog' Hall and a whole host of Open Source luminaries. I even learned what a Palmetto is. My talk was An Introduction to Grails, wherein I live-code while explaining each line. The source code is up on GitHub now if you want it. The slides are naturally available below. POSSCON is definitely a conference you should check out.


Functional Programming in Java @ TriJUG

Last night's slides presented at TriJUG.

Code is available on github. Special thanks to Muhammad Ashraf for his contributions and peer review.



F.lux is a free-as-in-beer application that alters the color of your screen. It can help people who have problems sleeping after heavy computer use. At minimum it makes your laptop feel cozier at night.

I don't normally write endorsements for applications in this space, but I had to make mention of a bit of software written by David Santiago. The idea behind F.lux is that your laptop is designed to look good during the day; at night the color balance changes around you, and your laptop's color should too.

I recently moved to a new MacBook Pro, and its crisp display does look quite good during the day. I had found myself using the laptop late at night, wanting to get a little something done. Somehow, these late-night coding adventures would end up stretching later and later. Recently, I looked up and noticed it was 4:30 am and I wasn't tired.

I had been told about F.lux by another MacBook Pro owner and so I installed it that early morning. The software immediately altered my screen color to a pleasant amber tint. Within five minutes I was sleepy.

I had not realized what a huge effect staring at the blue tinted screen was having on my sleep cycle. I wonder how many other professional computing people are suffering from a similar problem and are completely oblivious to it.

If you suspect you might be one of the people who, like me, are very affected by local lighting, you should definitely take this for a spin. If David Santiago had a tip jar I'd definitely drop a few bucks in there. He deserves a beer for saving my sleep cycle.

Thanks David.


Google+ ate my blog.

I am simply not blogging anymore. The content I would put here is ending up in Google+. I'm not sure why that is. Using Google+ seems more targeted and lighter weight than using a blog.

A blog feels heavier, more official, and more permanent. Google+ feels less permanent and lighter. That's just the impression I have. I know logically that both media are potentially longer-lasting than concrete and more public than a billboard plastered across the moon.

The experience of blogging involves "going to" the blog, whereas the Google+ experience is both reading and writing at the same time. The user experience is fluid. The content can be the same size, shape, and form for the most part, but the feeling of disengaging and "going off to write" is much more psychologically present with a blog. It feels like "something I have to do" as opposed to something that just happens while I'm doing another thing.

It would be nice if there were a way for my Google+ posts to flow into my blog with the addition of a tag. Maybe someone could write that and become famous for it.

Okay, enough of that... we have work to do. Now get to it!