Recently, ex-Cassatter (“Cassattian”? “Cassattite”?) turned Ciscoer (uh… Ciscoan?), always-blogger James Urquhart called attention to both the need for cloud computing taxonomies/ontologies and a couple of proposals for them. “That would be nice”, I thought, because taxonomies and ontologies can bring a lot of clarity and precision to an otherwise murky, poorly-specified picture. After taking a look at the proposals and discussion, I wondered if we were talking about the same things.
Taxonomies are frameworks for understanding the often-hierarchical organization of related “things”. Probably the most famous and familiar taxonomy is the Linnaean taxonomy of life we all learned in school (you remember: kingdom, phylum, yadda, yadda, yadda, species, sub-species), but almost everything we have and do can be, and often is, organized into taxonomies. With so many claimants touting cloud products and services, potential customers of those products and services (which include nearly everyone producing or consuming IT products and services) could probably use a comprehensive classification system to help put things in their proper context in the cloud ecosystem.
Ontologies offer deeper context than taxonomies. Where a taxonomy provides a classification system, a way of distinguishing and categorizing domain members, an ontology adds formal specification of relationships and interactions between members and classes that can be used to draw inferences beyond mere categorization. Taxonomies often are expressed as trees or tables because they are, in essence, a sorting of members into smaller and smaller groups by successive application of more and more specialized criteria. Ontologies may be expressed as more generally connected graphs or even in one of several formal specification languages. For instance, the World Wide Web Consortium’s Semantic Web project has defined the Web Ontology Language, OWL, as part of an ambitious effort to enable computers to use and “understand” (i.e., reason about) the web. A true cloud ontology could enhance interoperability by specifying roles, relationships, and interactions between cloud domain members so completely and formally as to constitute (or at least facilitate creation of) APIs. Cool, but much more difficult to achieve than a sort into taxonomic groups.
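To make the tree-versus-graph distinction concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the class names, the `runs_on` relationship, and the assumption that `runs_on` is transitive are mine, not part of any proposal discussed here. The taxonomy half can only tell you where something sits; the ontology half can infer a relationship nobody wrote down.

```python
from collections import defaultdict

# Taxonomy: every class has exactly one parent, so the structure is a tree.
# (Hypothetical classes, for illustration only.)
TAXONOMY = {
    "IaaS": "cloud service",
    "PaaS": "cloud service",
    "SaaS": "cloud service",
    "storage service": "IaaS",
}

def lineage(name):
    """Walk parent links from a class up to the root of the tree."""
    path = [name]
    while path[-1] in TAXONOMY:
        path.append(TAXONOMY[path[-1]])
    return path

# Ontology: typed (subject, predicate, object) relationships form a general graph.
# "runs_on" is a hypothetical relationship, assumed here to be transitive.
TRIPLES = {
    ("SaaS", "runs_on", "PaaS"),
    ("PaaS", "runs_on", "IaaS"),
}

def transitive_closure(triples, predicate):
    """Return every (subject, object) pair reachable by following `predicate`."""
    edges = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            edges[subject].add(obj)
    inferred = set()
    for start in list(edges):
        stack = list(edges[start])
        while stack:
            node = stack.pop()
            if (start, node) not in inferred:
                inferred.add((start, node))
                stack.extend(edges.get(node, ()))
    return inferred

if __name__ == "__main__":
    print(lineage("storage service"))
    print(sorted(transitive_closure(TRIPLES, "runs_on")))
```

The pair `("SaaS", "IaaS")` shows up in the closure even though it was never stated. That is inference beyond categorization, in a toy form that a formal language like OWL does far more rigorously.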
We already have one generally accepted taxonomy in cloud-space, the SaaS/PaaS/IaaS, or “SPI” taxonomy. It’s simple and clear, but it’s also informal and fails to account for many dimensions and elements of cloud-dom, including such “nuances” as delineating private or internal clouds from the external variety, providing places for components like service “governors”, and criteria for sorting into sub-categories of I, P, and S. In fact, there are a lot more “aaS” categories out there. David Linthicum convincingly lists 10 here (wisely, he doesn’t use the words “taxonomy” or “ontology”); his aaS “framework” includes storage, database, information, process, application, platform, integration, security, management, and testing, which William Vambenepe reorders and wickedly labels “SADIST-PIMP”. Of course, many of these unmapped aspects of the cloud domain are also controversial and/or rapidly evolving. That makes figuring out whether they fit in a general cloud taxonomy (and if so, how) all the more important.
So, do the new proposed candidates for a cloud taxonomy/ontology help? Sadly, not really.
The Youseff/Butrico/Da Silva paper, somewhat over-titled “Toward a Unified Ontology of Cloud Computing”, with authors from UCSB and IBM, falls far short of SADIST-PIMP as an enlightening taxonomy, much less blazing a trail toward an applicable ontology. In addition to the subdivision of IaaS into “computation resources” (itself recursively labeled IaaS), “storage”, and “communications” (aren’t storage and communications also infrastructure?), the proposal adds two additional, somewhat discordant, layers to the standard SPI taxonomy: “kernel” and “HaaS” (hardware as a service). Kernel refers to any and all software management of the underlying hardware, including hypervisors and operating systems, but the authors focus most on grid middleware, like Globus, as representative of this layer. HaaS is epitomized, according to the authors, by hardware leases containing SLA terms, but the reference (a CNET article written from IBM’s press release) describes the complete outsourcing of Morgan Stanley’s IT to IBM. The article calls it utility computing, and it may be (Morgan Stanley’s apps and infrastructure were all moving to centralized IBM data centers), but it doesn’t look much like HaaS. Overall, the paper offers little new and useful classification structure and the embellishments actually detract from the clarity of SPI.
Chris Hoff, inspired by SADIST-PIMP and the Youseff/Butrico/Da Silva paper as reported by John Willis, bravely hosted something of a community effort at a “cloud taxonomy and ontology”, seeded by his own “mashup” of the predecessor material but withholding most explanations that might have clarified some otherwise very professional-looking illustrations. Like the Youseff/Butrico/Da Silva paper, the net result is something of an embellishment of SPI, but perhaps a bit more useful as it also contains a representation of a cloud delivery stack (though not necessarily a correct or complete one). Again, despite the title, it doesn’t really constitute a system of classification, so it’s hard to claim it’s a useful taxonomy (beyond the embedded SPI taxonomy aspect), and (probably inherently, as it is only an illustration and not a specification) it is not an ontology. I might find more constructive criticism to offer, but the dearth of description and discussion of what it really means (beyond the blog’s comments, which were apparently truncated by TypePad) makes the diagram something of a Rorschach test. Anyone discussing it may be revealing more about themselves than about what the concepts suggested by the diagram might actually mean.
I can’t blame the authors for not producing more useful tools. If he were alive, I’m certain Douglas Adams would describe the length and breadth of cloud computing as “big; vastly, hugely, mind-bogglingly big”. Like real clouds (the water-based atmospheric phenomena), it’s not a static thing, and there is tremendous variety as well as lots of confusing things that appear cloud-like, but arguably may just be smoke. I think a genuinely useful cloud taxonomy gets built by first enumerating the fundamental classification principles and defining distinguishing attributes (something I’ll take a whack at in a future post; in the meantime, what do you think that list might include?), then by parsing the membership of the cloud domain, adjusting and adding principles and attributes as necessary, and as we learn and the cloud evolves. Taxonomy isn’t a static thing, either. Ontology, in my opinion, will just have to wait. The general, formal specification of the many manifestations of clouds and their components (and, by association, definition of generic cloud APIs) is premature. Things need to condense more before ontology will be a productive exercise.
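For what it’s worth, the build-it-from-principles approach could be sketched something like this in Python. The principles (`deployment`, `service_layer`), the offerings, and their attribute values are all hypothetical placeholders, not a proposed list; the point is only the mechanics of sorting members by successive criteria and noticing where the criteria run out.

```python
from collections import defaultdict

# Hypothetical classification principles, applied in order from general to specific.
PRINCIPLES = ["deployment", "service_layer"]

# Hypothetical domain members, each described by its distinguishing attributes.
OFFERINGS = {
    "offering-a": {"deployment": "external", "service_layer": "IaaS"},
    "offering-b": {"deployment": "external", "service_layer": "SaaS"},
    "offering-c": {"deployment": "internal", "service_layer": "IaaS"},
    "offering-d": {"deployment": "external"},  # no service_layer defined yet
}

def classify(offerings, principles):
    """Group members by the path of attribute values the principles assign them."""
    groups = defaultdict(list)
    for name, attrs in sorted(offerings.items()):
        path = tuple(attrs.get(p, "unclassified") for p in principles)
        groups[path].append(name)
    return dict(groups)

if __name__ == "__main__":
    for path, members in sorted(classify(OFFERINGS, PRINCIPLES).items()):
        print(" / ".join(path), "->", members)
```

Anything that lands in an “unclassified” bucket is a hint that an attribute or a principle is missing, which is exactly the adjust-as-you-go loop described above.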
I do think some blame (a mild chastisement) is owed to anyone participating in the cloud taxonomy conversation who is not exercising appropriately high levels of skepticism and insisting on well-defined and valid standards in their frameworks. Taxonomies are thought-shaping tools and bad tools make for bad thinking. One commenter on one of the many blogs echoing/amplifying the taxonomy conversation remarked that some of the diagrams were mere “marketecture” and others warned against special interests warping the framework to suit their own ends. We should all be such critical thinkers.