As I'm working out SOAP + Language du jour problems I continue to be impressed with the utility embedded in the JVM.
Rarely do you get to develop software in a completely greenfield project. Most of us have to integrate our software with other systems and that often means dealing with SOAP. I asked about what people are doing with Groovy + SOAP over at Google+ and had a few side conversations with folks.
It turns out that when forced to deal with the abject insanity that is SOAP, most Groovy users drop back and pick up those Java tools like JAX-WS. It's tools like these that keep developers coming back to the JVM and that Java to Groovy integration story is still one of the best stories going for Groovy.
In part, you have to wonder if ... in order to understand SOAP's madness you don't need a little madness to get the job done. This insanity might have to continue to exist in practical long-lived projects for quite some time. Keeping that madness bottled up in some pre-baked libraries is honestly the best thing you can do for your own continued sanity and the sanity of your own projects.
2013-01-10
2012-09-14
Code Samples and the argument for WET versus DRY
In my new role at VMware I've been tasked with writing new code samples for the various APIs that have been created to support the ecosystem around VMware's product line. This has been a very different experience for me.
I practice the principle of Don't Repeat Yourself (DRY) in most of the code I've produced during my career. In the last few years I've adopted Test Driven Development (TDD) as well as other techniques that make sense. In creating code samples most of these things still apply but DRY sort of doesn't.
The purpose of code samples is to highlight the use of the API and the target user is a programmer that is in a hurry. A methodical and slow programmer wouldn't necessarily even need samples since they would read documentation and work out via TDD the necessary information for how to use the API themselves. So my work with samples is for the developer who won't even necessarily stop to turn their attention to documents.
That means there is a need to Wholly Express our Terms (WET). In other words, if I happen to make my code too DRY there is opportunity for the learning programmer to miss vital steps. The solution to this is to choose things to repeat very carefully. This deliberately creates code repetition I would not want in a production system, but it should not completely brush aside code reuse.
The trick is to package steps up into reusable units that are easy to compose but are still expressed in an end product. In a nonsense, non-NDA-violating example:
class Frobnicator extends Thingicator {
    public void frobnicate() {
        Something s = someService.doThing0();
        Otherthing o = someOtherService.doThing1(s);
        Anotherthing a = someService.doThing2(s, o);
        someBinding.doTheThing(s, o, a);
    }
}
Hopefully we get to demonstrate some salient parts of an algorithm leaning on services to the end user. If I have to repeat the lines for how to get 's' and 'o' over and over, this is not so bad as repeating the contents of doThing0 and doThing1 over and over. The argument can still be made that by being WET (which also means Write Everything Twice) we are being instructional.
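To make that trade-off concrete, here is a rough sketch of two samples that repeat the setup lines for 's' and 'o' on purpose while the bodies of doThing0 and doThing1 stay packaged in services. Everything in it is hypothetical: the service bodies, the use of String in place of Something, Otherthing and Anotherthing, the doAnotherThing method, and the sample class names are all invented for illustration and are not any real API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical services: stand-ins for whatever API a sample documents.
class SomeService {
    String doThing0() { return "s"; }
    String doThing2(String s, String o) { return s + "+" + o; }
}

class SomeOtherService {
    String doThing1(String s) { return "o(" + s + ")"; }
}

// Hypothetical binding that records each call so the steps can be inspected.
class SomeBinding {
    final List<String> calls = new ArrayList<>();
    void doTheThing(String s, String o, String a) { calls.add(s + "," + o + "," + a); }
    void doAnotherThing(String s, String o) { calls.add(s + "," + o); }
}

// Sample 1: every step the reader must perform is spelled out in order.
class FrobnicateSample {
    static void run(SomeService someService, SomeOtherService someOtherService, SomeBinding someBinding) {
        String s = someService.doThing0();
        String o = someOtherService.doThing1(s);
        String a = someService.doThing2(s, o);
        someBinding.doTheThing(s, o, a);
    }
}

// Sample 2: the two setup lines are repeated on purpose (WET), so a reader
// who only opens this sample still sees how 's' and 'o' are obtained.
// The service internals are NOT repeated; that part stays DRY.
class GrobnicateSample {
    static void run(SomeService someService, SomeOtherService someOtherService, SomeBinding someBinding) {
        String s = someService.doThing0();
        String o = someOtherService.doThing1(s);
        someBinding.doAnotherThing(s, o);
    }
}
```

A fully DRY version would pull the first two lines into a shared helper, which is shorter but hides from the hurried reader exactly the steps the sample is supposed to teach.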
Conversely, we could argue that DRY code should still be used here since any good coder should be able to trace back the compositions and decompose what happened. The trick with balancing WET and DRY in a code sample is to recall that we are not writing a framework or a library or even a prototype. The purpose of a code sample is to document.
So I'm learning where to balance WET versus DRY in this game of writing code samples. I tend to prefer making the mistake of making things too DRY as opposed to too WET since this means I will have fewer things to think about in the future.
Incidentally, one of the things I would normally avoid is inheritance, but I feel in the case of code samples it plays very well since I can say "this algorithm I'm showing you is a variation of ... " which is probably very instructive. In another system I might have used functional programming paradigms instead. Remember, I'm trying to keep a low bar in these samples and avoid the temptation to teach both the API and Functional Programming at the same time. The goal is not to introduce too many hurdles.
Do you disagree with me? Should I just use all the best practices I know and force my sample reader to climb to my level? Am I condescending by making my code samples WET assuming too low a bar for the new developer? Your feedback is welcome.
2012-07-31
Grails Plugin Testing Strategies
I've been maintaining Grails plugins for years now both on the wild-wild web and in private enterprise-only plugins. In this talk at GR8Conf I shared stories from that history, the problems I encountered and how I solved them. Grails 2.x offers us so much in the way of testing and mocking we can now get rid of a lot of the crazy things I used to do to test my plugins.
Ultimately, however, when you do things at the persistence layer you need to do a full integration test with the database since there's no real way to substitute for a real database. So my advice is to avoid at all costs dipping down any lower in the architecture than you absolutely have to. Good plugin testing starts with good plugin design... design to test.
And if you do manage to do something naughty (as I often seem to want to do) then you should make sure you have lots of safety net underneath you with a rich testing environment all around your code.
for code shown, see also:
- https://github.com/hartsock/grails-audit-logging-plugin/tree/v100_beta
- https://github.com/hartsock/grails-qrcode
2012-07-30
Groovy Integration Patterns or The Grobnicating Frobnicator
I just gave the Groovy Integration Patterns talk at GR8Conf.us, otherwise known as the Grobnicating Frobnicator talk. In it I take us on a journey through techniques for introducing Groovy into Enterprisey environments. Along the way I take a really brief foray into the "Planet of the Monkey Patches" where we discuss some interesting interactions you can do with Groovy embedded in a mostly Java application. Then we move into the world of multi-tenant Java applications that can leverage Groovy and Groovy DSLs to swap out function in configuration or dynamically at run-time.
The code is over at https://github.com/hartsock/groovy-integration-patterns
2012-06-06
Unix Philosophy meets Java
In all the literature on computing there's one quote that I find the most profound and illuminating. In three sentences it successfully encodes what I intuit must be a fundamental law of system design. It is the first of my guiding lights when I design anything.
From Wikipedia:
Doug McIlroy, the inventor of Unix pipes and one of the founders of the Unix tradition, summarized the philosophy as follows:[1]
"This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."
And, if you take this philosophy and apply it to Java... you get Separation of Concerns, Dependency Injection, and Interfaces with simple Data Transport Objects.
In Java, you would make one class do one job. When you need two jobs done, we know inheritance eventually makes a mess so we do composition. The best way to handle that composition is to make a generic interface that has multiple implementations and inject the correct implementation when needed. Then we can bind together loosely coupled services to compose the right application or plumb together a new application later.
The Dependency Injection model in Spring mirrors the pipes in Unix. Classes are supposed to do one job. Classes are ideally designed to be composed together to do a bigger job. This part requires a little stretching: the use of interfaces with separate implementations that are wired after compile time is a mirror of using standard input and standard output.
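As a rough illustration of the idea, here is a minimal sketch in plain Java with no Spring involved: one small interface, two single-job implementations, a composite that chains them like stages in a pipe, and a client that only ever sees the interface. All of the names (LineTransformer, Trimmer, UpperCaser, Pipeline, Greeter) are made up for this example, and the wiring a container would normally do is done by hand through constructors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// The "universal interface": a line of text in, a line of text out.
interface LineTransformer {
    String transform(String line);
}

// One class, one job: strip surrounding whitespace.
class Trimmer implements LineTransformer {
    public String transform(String line) { return line.trim(); }
}

// One class, one job: upper-case the line.
class UpperCaser implements LineTransformer {
    public String transform(String line) { return line.toUpperCase(Locale.ROOT); }
}

// Composition instead of inheritance: runs the injected stages in order,
// feeding each stage's output to the next, much like `a | b` in a shell.
class Pipeline implements LineTransformer {
    private final List<LineTransformer> stages;

    Pipeline(List<LineTransformer> stages) {
        this.stages = new ArrayList<>(stages); // defensive copy
    }

    public String transform(String line) {
        String result = line;
        for (LineTransformer stage : stages) {
            result = stage.transform(result);
        }
        return result;
    }
}

// The client depends only on the interface; which implementation it gets
// is decided by whoever constructs it (by hand here, by a container in Spring).
class Greeter {
    private final LineTransformer transformer;

    Greeter(LineTransformer transformer) {
        this.transformer = transformer;
    }

    String greet(String name) {
        return transformer.transform("hello " + name);
    }
}
```

Swapping in a different implementation, or a longer Pipeline, requires no change to Greeter, which is the loose coupling the Unix analogy is pointing at.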
Extend this analogy to the network and it is easy to see that the web itself is an extension of this core Unix philosophy writ large. Programs are connected through standard protocols and work together to produce the intended result. Often these protocols are nothing but the standard text streams that undergird the Unix pipe. In one sense the entire dot-com boom and subsequent internet-related advances are all built on these three simple ideas that Doug McIlroy so clearly illustrates.
History proves that if you do observe these rules your long-term prospects are much more favorable than those who do not. Consider the internal design of the systems and what happened to them over time. For example Apple pre and post OS X shift or Windows pre and post VAX infusion.
Now consider the design of Android the OS. Few other OS have so completely embraced this design philosophy and been so completely designed around the web. The intent and subsequent OS design is highly instructive.
Now, what will history prove about this new intersection of the unix philosophy and Java? Time will tell.
EDIT: I added a link to the Android intent java documentation. I felt it would be clearer that I meant the intent java interface not merely the english word "intent" ... this was intended as an aid to casual readers.
2012-05-04
there's engineering and then there's social engineering
Twitter is on the face of it such a simple application that it can be built in 40 minutes in Grails.
For a consummate software engineer with state of the art tools, building a Twitter clone in 40 minutes that can scale... might even be easy today given a proper set of cloud enabled tools. But, even if it is an exact working copy of Twitter and can scale to ten times the size of the current Twitter... it's NOT Twitter and will never be Twitter.
Twitter is a brand now. It is more than merely a functional set of software performing a simple function. It is a user experience driven not by technology but by human interaction. That experience in itself constitutes a wholly separate kind of real engineering.
In fact, if Twitter were to completely rewrite their application in a whole new technology stack, the net effect (providing only nominal interruption of services) would probably be zero on their brand. That's because aside from being a brand and an application Twitter is about doing something very simple (and rather boring to systems level folks) with your computer and with existing internet technologies.
So why aren't there a bajillion Twitter-like applications? Lord knows at one point people tried. Well there are, just none of them are mentioned on CNN and the Colbert Report. There was a time when Twitter was novel and now it's not. But, that doesn't matter anymore because we all know about Twitter.
It's just like Windows. Or Google. Or Kleenex. And all the interesting socio-economic issues that go with the success of a brand like Kleenex and the problems Kleenex has keeping people off its damn lawn. See, you can't call your tissue paper Kleenex because only we can call it that even if what you make basically is in fact the same thing.
You may not care if a tissue paper is of equal quality to Kleenex. You may choose to still buy Kleenex because you remember warm and fuzzy things about that Kleenex box. You know what to expect from a Kleenex box. You like Kleenex. So you buy it.
Whenever you consume something with an intent beyond the mere act of consumption then anything that distracts you from your end goal is a waste of attention. In this day and age we can't afford to pay much attention so we had better conserve it. Your customer service representative is there to try and keep you happy and consuming.
The same is true for technologists just as it is for any other demographic. When a technologist becomes comfortable with his favorite tool X no other tool can be as good as X even if it is better. That's because technologists get brand loyalty viruses too.
So I'm going to ask you... are you sure you like to use the tools in your toolbox because they are the best for the job or are you just buying Kleenex because that's what you buy? Is it worth thinking about?
2012-05-01
Beyond Finite State Automatons
Has someone moved computing beyond the Finite State Automaton in recent years? I'm unaware of anyone providing a system of computation in widespread use that is not a Finite State Automaton. If someone has, please let me know. I've heard of some researchers in Tel Aviv working on a neural net.
Considering that all we have to work with, even in the clouds, is FSA, it seems that we will always be working with conceptual derivatives of the Turing processing tape. So as much as I would like to see something other than a souped-up text editor for creating code, I have to say that while some feel that an IDE is not enough ... I'm afraid it will basically have to do when it comes to what we call "programming" until we escape the confines of the FSA.
Just as an FSA is the core definition of behavior for a computer, DNA is the core definition of a cell's behavior. To be sure it's not all that's going on. There's a whole mess of inputs and environmental factors too. Where do we see similarities between FSA, DNA, and epigenetics? The cloud.
When working with emergent systems, however, the external forces may produce unforeseen effects. These, however, remain in the space of discrete systems. If they did breach into the world of continuous systems this would pose a problem on the order of the incompleteness theorems. We would have found the bridge to the mystic. Our limited discrete systems would be able to cross the cosmic gulf into the divine space of continuous systems.
As of yet, we can't even prove that physical reality can represent irrational numbers. We know they exist. We know they are real... we just can't represent them with atoms, particles, or waves. We slice finer and finer and we hit a generalizable uncertainty principle. So... what does that mean for my programming language?
Basically, it means that all representations of information contained within our universe (as best as we know it) are lossy and discrete. We can't ever create a representation of information that is precise as a datum. Instead we can create FSA or algorithms that can compute to arbitrary precision what a particular unit of data is.
Now it is true that we have no sufficient model of computation to properly represent concurrent computing in a fully realized cloud environment. In systems programming discussions today we have to consider the effects of millions of independent FSA acting on a network. The cloud provides us an illusion of locality for things that are not actually local. The fabric of the network and the true concurrency of the cloud exacerbate issues of consistency, atomicity, and durability of systems in the cloud.
In a nutshell that's why we have big moves in NoSQL and in non-relational data stores. These solutions seek to eliminate centralized state. They compartmentalize the state issue inherent in a big ACID compliant data store.
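As a small illustration of "compute to arbitrary precision", here is a sketch that produces as many decimal digits of the square root of 2 as you ask for, using only integer arithmetic. The class and method names are invented for this example, and Newton's method on big integers is just one convenient technique among many. No finite run ever yields the whole number; it only yields a longer discrete prefix.

```java
import java.math.BigInteger;

class SqrtTwoDigits {

    // Returns floor(sqrt(2) * 10^digits) as a decimal string.
    // The result is "1" followed by the first `digits` decimals of sqrt(2).
    static String sqrtTwo(int digits) {
        // sqrt(2 * 10^(2*digits)) == sqrt(2) * 10^digits
        BigInteger scaled = BigInteger.valueOf(2).multiply(BigInteger.TEN.pow(2 * digits));
        return isqrt(scaled).toString();
    }

    // Integer square root by Newton's method: the largest x with x*x <= n.
    static BigInteger isqrt(BigInteger n) {
        if (n.signum() < 0) {
            throw new IllegalArgumentException("negative input");
        }
        if (n.signum() == 0) {
            return BigInteger.ZERO;
        }
        // Start from a power of two that is guaranteed to be >= sqrt(n).
        BigInteger x = BigInteger.ONE.shiftLeft((n.bitLength() + 1) / 2);
        while (true) {
            // Newton step: average x and n/x, rounding down.
            BigInteger y = x.add(n.divide(x)).shiftRight(1);
            if (y.compareTo(x) >= 0) {
                return x; // the sequence stopped decreasing, so x is the floor
            }
            x = y;
        }
    }
}
```

Asking for more digits costs more time and memory but never changes the digits already produced, which is the sense in which the algorithm, rather than any stored value, is the precise representation.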
If you don't believe me... have you ever tried to debug a neural net? How about a lisp program? How about a fully distributed system? These are non-trivial problems to debug. In this case I have to agree... an IDE is not enough.
In all of this, I continue to wrestle with one of the original problems my first research projects touched on... the emergent system. Somehow we only have discrete matter and yet it breaches into the world of the divine. The SA breaks local minima and maxima. The EA gives rise to the mystical. And these are accomplished by interacting FSA, which are discrete systems described by strings of symbols yet not fully understood by just these.
Yet this is all we have... and all the evidence we have at this point is that all the universe itself has are these small Planck lengths filled with properties, each potentially a cosmic symbol in a universal Finite State Machine.
So I ask... has anyone found something other than this kind of a toy to do computation on? Will anyone ever find such a thing? When will we know all the digits in Pi?