David Rokita - VP Technology Operations, Hexagrid Computing

Say No to SOPA/PIPA

The topic of piracy remains near and dear to my heart.  Of course, I am speaking of the digital version of Blackbeard patrolling the high seas of the Internet to pillage defenseless multi-national corporations.  As a software provider, my interests in this topic are clear.  My team spends an inordinate amount of time building safeguards into our software to ensure a reasonable degree of protection from these rum-swilling savages.  We use encryption keys on top of call-backs on top of secure registration on top of audits to protect our software.  My life is my intellectual property.  Nothing is more important.

The Stop Online Piracy Act (SOPA) and the PROTECT IP Act (PIPA) aim to strike a blow to Internet piracy by fundamentally changing the way that both the Internet and our legal system work.  In general terms, these acts work to shift the burden of proof from the accuser to the accused.  As an intended side effect, they also shift the financial obligation of policing the misappropriation of content to the Internet service providers and web sites.

SOPA defines an entity called a ‘rogue site’ as any site that doesn’t operate under US law.  It further empowers the US Attorney General to obtain a court order without due process to essentially eliminate an ‘offending site’ from the Internet.

A service provider shall take technically feasible and reasonable measures designed to prevent access by its subscribers located within the United States to the foreign infringing site (or portion thereof) that is subject to the order…Such actions shall be taken as expeditiously as possible, but in any case within five days after being served with a copy of the order, or within such time as the court may order.

PIPA takes a similar position but focuses more on domain name service providers and search engines.

These bold new powers should alarm any American.  They run contrary to basic freedoms and principles; never mind the fact that there are already laws in place to handle such things.  Don’t believe me?  File-sharing site Megaupload was brought down by the FBI on January 18th, 2012 with no help from SOPA and PIPA.  Those of us that have been in the cloud computing market have been fighting tooth and nail against fear of government.  Fear of the law and fear of government hijacking of cloud data are among the largest objections to adopting cloud.  What’s worse, representatives in Congress present these bills as a means of protecting US jobs and innovation.  Here is Senator Roy Blunt’s take:

Intellectual property industries employ more than 19 million people, making it an integral part of our economy.  Rogue websites dedicated to the sale and distribution of counterfeit goods and pirated content are a direct threat to these jobs and to entrepreneurs growing and building legitimate businesses online.

(no link as this automated response was sent directly to my email).

Protecting innovation?  The cloud computing market is innovating right now.  It represents the single biggest technology opportunity since the dawn of the Internet.  In just the last year, the US government plunged a dagger into the heart of innovation by orchestrating a shutdown of WikiLeaks on Amazon.  Now, the US government is holding millions of legal files hostage after shutting down Megaupload on the allegation of harboring illegal files.  More disturbing, these reckless actions display a fundamental misunderstanding of the Internet and the massive collateral damage that can occur through misguided policies and knee-jerk reactions.  Congress, its business interests and its medieval bills are the biggest threat to innovation.

The charade that Congress is doing this for me and my protection really pushes this over the edge.  Roy Blunt’s (and others’) automated response to me postured that intellectual property piracy is a bad thing.  If you don’t agree with intellectual property piracy, then you must agree with SOPA and PIPA.  You are either for us or against us.  This is a rhetorical fallacy; a false dilemma.  Intellectual property piracy is a problem.  We already have laws in place to protect intellectual property and I am reminded of them every time I put a DVD into the player.  Draconian laws are dangerous.  SOPA and PIPA are not necessary.  Most people reject real piracy (the Captain Hook kind), but most people do not support carpet bombing the coast of Africa to prevent it.

Breaking news as of today states that SOPA and PIPA have been postponed indefinitely.  Maybe I’ll shut up before I get detained indefinitely.

David Rokita - VP Technology Operations, Hexagrid Computing

Watch for Icebergs!

Of all the inflated expectations, the most dangerous is the misconception that the cloud never goes down.  Although never claimed by any cloud provider or foundational technology, this belief permeates preconceptions about cloud computing.  In the movie Titanic, passenger Joseph Ismay incredulously says, “But this ship can’t sink!”  A stoic Thomas Andrews replies, “She’s made of iron, sir!  I assure you, she can… and she will. It is a mathematical certainty.”  The beauty of that exchange arises from the self-evident clarity of Andrews’s response.  The truth stares the passengers in the face like a 10 ton heavy thing, yet it lies outside their ability to perceive; obscured by hopes and dreams of what they wish the truth to be.  Imagine the passengers’ horror upon learning that there were only enough life rafts for fewer than 50% of them.  It may be a bit of hyperbole to compare tragedy and loss of life to loss of data and loss of compute power, but the absurd consistency that underlies both stands valid.  Clouds are built with computers.  It is a certainty that they will go down.  The deleterious tendency to abandon sound application design in favor of faith in an ethereal, never-failing cloud is a recipe for disaster.

Much to the chagrin of the cloud computing media, Larry Ellison lobbed some missiles across the bow of cloud computing by explaining (to paraphrase) the quantum duality of cloud computing; that it both doesn’t exist and is pervasive.  This monologue may be the most important commentary on cloud computing since the media coined the term.  Larry explains that everything that we do today is cloud computing.  Cloud computing is not some new technology, but rather an evolution in the application of technologies that we already use.  The concept of server (mainframe) virtualization pre-dates many IT professionals’ careers.  Modern server technology simply needed to evolve to a point where more resources existed on a single chassis than any single workload could use.  The concept of accessing these workloads across the Internet falls way short of ‘One giant leap for mankind’.  The laws of physics remain intact, icebergs exist and systems still go down.

Thanks to both the unbelievable success and the frustrating missteps of Amazon, plenty of high-profile source material exists.  Let’s start with server maintenance.  The concept shocks people, but yes, major cloud providers may require reboots of server infrastructure from time to time.  After all, the end product is simply a virtual machine running on top of a managed hypervisor.  Why wouldn’t it need to be rebooted to pick up patches, changes or upgrades?  Lydia Leong, author of cloudpundit.com, discusses massive reboots of Amazon EC2 infrastructure.  Many make the mistake of comparing the cloud to their homegrown virtualization farm.  I use KVM all of the time.  It is quite simple for me to migrate a workload with no perceptible outage time.  While it is migrated, I simply patch, change or upgrade the original host and move the workload back when finished.  The problem is that automating and managing these migration schemes is complex (not to mention extremely sensitive to error).  Complexity does not scale.  Sure, two servers running VMware may avoid reboots via this mechanism for a while.  It also costs a lot more, requires subject matter experts and isn’t delivered as a service.

Unplanned outages happen.  In April, Amazon Web Services suffered a serious outage that affected an entire region of their EC2, RDS and EBS services.  Like most outages, this situation arose from the unknown.  When designing systems, engineers labor over the obvious points of failure and harden the total system against those weaknesses.  Engineers can never plan for the unknown.  If you are not familiar with the formal Amazon explanation, I recommend reading it to make sense of the next few sentences.  What becomes clear after reading the Amazon explanation is that at no point did Amazon consider the interconnect between availability zones to be a threat to service across availability zones.  The control plane that handles API calls across availability zones became saturated and customers’ ability to interact with infrastructure in other availability zones was impacted.  Cloud pundits love to point this out as a failure in the Amazon design; yet, nobody complains that they can control all of their infrastructure from a single dashboard.

Clouds will go down.  This is a simple immutable fact.  That being said, companies need to understand how their partners’ clouds work and engineer their applications accordingly.  Unfortunately, as the Amazon outage demonstrated, even that approach falls short of ensuring success.  Clouds are designed to get people out of the business of datacenters.  In that endeavor, they succeed.  Their ability to reduce risk becomes a selling point when competing in the cloud marketplace.  Risk lurks in the most unlikely of places.  When computers are involved, risk can never be reduced to zero.  Don’t get caught without a lifeboat.

David Rokita - VP Technology Operations, Hexagrid Computing

The Evil Triangle

Back in October I posted a brief commentary on the state of cloud and its current alignment on the Gartner Hype Cycle.  At the time, the general thought was that the industry had moved past the Peak of Inflated Expectations and into the downward spiral to the Trough of Disillusionment.  Although at arm’s length, the Trough of Disillusionment is not yet upon us.  With that in mind, I would like to bring to the forefront some of my favorite expectations brought back from this peak.  On the Peak of Inflated Expectations, white-board engineering and “whitepapers” reign supreme and the view is remarkable.

Simple misunderstandings of cloud computing infect the paradigm of cloud consumers, pundits and commentators.  This episode of The Peak of Inflated Expectations involves a simple yet pervasive rule that governs everything that we do in technology.  This rule states that systems that are affordable, fault-tolerant and high-performance all at once simply do not exist.  It is entirely possible to construct a system that incorporates any two of these factors, but never all three.  Escaping the Evil Triangle is no more possible than building the impossible Penrose Triangle.  In the minds of the average cloud computing consumer and much to the surprise of the Evil Triangle, cloud computing supposedly vanquished this enemy of progress once and for all.  Taking some liberties with Mark Twain’s quote, reports of [Evil Triangle's] death are greatly exaggerated.

I am introducing this series of blogs with the Evil Triangle because it lies at the root of nearly all faulty expectations of cloud.  When confronted by the Evil Triangle, most experienced IT engineers and managers readily accept its limitations.  When obscured by clouds, the Evil Triangle creates chaos.  As you know, “The greatest trick [Evil Triangle] ever pulled was convincing the world he didn’t exist.”  Amazon’s cloud services provide a perfect place to illustrate the power of the Evil Triangle.  When it comes to general purpose Infrastructure as a Service, few would dispute that Amazon is a pioneer if not the world leader in offering cloud services.  For this example we will look at various service offerings and the way in which each negotiates the Evil Triangle.

Amazon S3 Storage: S3 storage, Simple Storage Service, is the central unifying fabric that connects the entire Amazon cloud services offering.  It offers a massively scalable object store for storing images, backups and general purpose unstructured data.  From a total cost of ownership perspective, at $0.14 per gig (used) per month for the standard replication and redundancy, S3 storage is extremely cost effective.  “Designed to provide 99.999999999% durability and 99.99% availability of objects over a given year”, S3 offers an unprecedented level of fault-tolerance for the price.  “Perfect”, shouts the overly excited IT admin, “I’ll host all of my VM images there so I never have to worry about data loss”.  “Not so fast”, replies the Evil Triangle, “S3 storage is API driven and will never support the I/O that is required.”  S3 storage successfully fulfills the requirement for affordability and fault-tolerance, but at the cost provided it is unable to meet the demands of runtime operation (for reasons including but not limited to performance).  It must be mentioned that data-out charges for transfers could adversely affect this price.
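To put the quoted percentages in concrete terms, here is a rough sketch in Python.  It assumes that the availability figure can be read as a fraction of a 365-day year and that the durability figure can be read as an annual per-object loss rate; the function names are illustrative and not part of any Amazon API.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Minutes per year a service may be unavailable at the given availability."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * MINUTES_PER_YEAR


def expected_objects_lost_per_year(durability_percent: str, objects: int) -> Decimal:
    """Expected object losses per year, treating durability as an annual per-object rate."""
    loss_rate = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_rate * objects


# Percentages are passed as strings so Decimal keeps all eleven nines exactly.
print(downtime_minutes_per_year("99.99"), "minutes of allowed downtime per year")
print(expected_objects_lost_per_year("99.999999999", 10_000), "objects lost per year")
```

Under these assumptions, 99.99% availability still leaves a little under an hour of permitted downtime each year, which is the Evil Triangle’s point: even the “fault-tolerant” corner is a number, not a guarantee.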

Amazon EBS Storage: Amazon Elastic Block Store offers a low-latency block level storage option for facilitating higher I/O workloads.  It may be successfully used for hosting transactional database workloads.  It seems to meet the performance requirement.  What about fault tolerance?  Amazon explains that EBS volumes are replicated within an availability zone.  This means that within a closed system the EBS volume can sustain single failures.  It goes further to suggest that customers concerned about durability have “the ability to create point-in-time consistent snapshots of your volumes that are then stored in Amazon S3”, ensuring that the data is persistent.  At $0.10 per gig, EBS reigns supreme as the vanquisher of the Evil Triangle.  “Wait”, bellows the Evil Triangle as it pulls out its slide rule.  When the requirements for bandwidth and S3 storage (to meet the durability recommendations of Amazon) are factored in, the picture is not so clear.  To use Amazon’s example:

As an example, a medium sized website database might be 100 GB in size and expect to average 100 I/Os per second over the course of a month. This would translate to $10 per month in storage costs (100 GB x $0.10/month), and approximately $26 per month in request costs (~2.6 million seconds/month x 100 I/O per second * $0.10 per million I/O).

Now let’s augment this estimate with S3 snapshots.  Assuming that we keep a single daily snapshot to be replicated across S3 availability zones, add $14.00 per month ($10 + $26 + $14) for a grand total of $50 for 100GB ($0.50 per gig).  It makes for fault tolerant storage provided a 24-hour recovery point is acceptable.  It is not cheap.  “Check and mate”, exclaims the Evil Triangle.
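The arithmetic is easy to check.  The sketch below adds up the three line items using the rates quoted in the text; it assumes a 30-day month (Amazon’s example rounds the seconds to ~2.6 million) and that the $14.00 snapshot figure corresponds to one full 100 GB copy at the S3 rate of $0.14 per gig.  The function and parameter names are illustrative only.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000; Amazon's example rounds to ~2.6 million


def ebs_monthly_cost(size_gb, avg_iops,
                     storage_rate=0.10,         # $ per GB-month of EBS, from the text
                     io_rate_per_million=0.10,  # $ per million I/O requests, from the text
                     snapshot_rate=0.14):       # $ per GB-month of S3; one full snapshot assumed
    """Return the monthly cost breakdown in dollars for a single EBS volume."""
    storage = size_gb * storage_rate
    requests = SECONDS_PER_MONTH * avg_iops / 1_000_000 * io_rate_per_million
    snapshot = size_gb * snapshot_rate
    total = storage + requests + snapshot
    return {"storage": storage, "requests": requests, "snapshot": snapshot,
            "total": total, "per_gb": total / size_gb}


costs = ebs_monthly_cost(size_gb=100, avg_iops=100)
for item, dollars in costs.items():
    print(f"{item:>9}: ${dollars:,.2f}")
```

Changing `avg_iops` shows how quickly the request charges, rather than the headline storage rate, come to dominate the bill.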

Amazon EC2: Amazon rightfully does not market Elastic Compute Cloud as a strict storage option, but since storage represents an inextricable part of compute, it will be considered here.  Not to speculate too much on the physical design of EC2, but by all accounts its instance storage appears to exist on a locally available disk array using some form of software/hardware RAID.  It is easily the highest performance option of all of the Amazon offerings.  Being shared infrastructure, it has been documented to suffer at the hands of competing workloads.  Overall one can expect a higher performance profile from EC2 than from any other Amazon storage offering.  Using an Extra Large Linux server as an example, the price is $0.68 per hour.  Multiplied out ($0.68 x 24 hours x 30 days), the price equals $489.60 per month.  This offering comes with 1,690 GB of storage.  Using brute force, this comes to ~$0.29 per GB/month.  This in no way considers the value provided by the 8 EC2 compute units (4 virtual cores with 2 EC2 compute units each) and 15GB of RAM.  EC2 presents the most cost effective way to store data.  As the Evil Triangle pulls the plug on the EC2 server it asks, “How do you like me now?”.  There is no concept of perpetual storage, durability or fault-tolerance in the EC2 offering.
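The brute-force figure can be reproduced in a few lines.  This is only a sketch of the calculation above, using the 30-day month from the text and deliberately attributing the whole instance price to storage; the function name is illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # the 30-day month used in the text


def ec2_storage_cost(hourly_rate, local_storage_gb):
    """Return (monthly instance cost, cost per GB-month) if the whole price is charged to storage."""
    monthly = hourly_rate * HOURS_PER_MONTH
    return monthly, monthly / local_storage_gb


monthly, per_gb = ec2_storage_cost(hourly_rate=0.68, local_storage_gb=1690)
print(f"${monthly:.2f} per month, ${per_gb:.2f} per GB per month")
```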

A casual observer might suggest that this article intends to point out the weaknesses of the Amazon offering.  Quite to the contrary, this article intends to expatiate on the sheer brilliance of the Amazon offering.  These examples illustrate a company that understands that the Evil Triangle lurks behind every corner of their datacenters.  Rather than fear the Evil Triangle, Amazon embraced it.  By doing that, they built a comprehensive approach that solves a myriad of problems using different solutions.  At no point in time does Amazon claim that any single solution is perfect for all problems.  Expectations must be level-set against the Evil Triangle.  These may be the most important lessons as cloud computing journeys through the Trough of Disillusionment on its way to the Plateau of Productivity.

David Rokita - VP Technology Operations, Hexagrid Computing

Gartner's hype curve

The Gartner Hype Cycle

When working with Hexagrid customers, my role is to work at the highest levels of a customer’s technology organization to level set expectations and goals concerning the adoption of cloud computing.  What I often uncover in these conversations are expectations that, although possible, may not be realistic.  In fact, I find that the single biggest inhibitor to achieving success through cloud computing comes from expecting too much.

Following the Gartner Hype Cycle means crossing the Peak of Inflated Expectations.  At no time in the past has the Peak of Inflated Expectations towered so high.  At the Peak of Inflated Expectations, outages never occur again, software upgrades happen automatically and there is one tool to rule them all.  In this utopian place, lower CAPEX means zero CAPEX and disaster recovery just happens.  The Peak of Inflated Expectations invariably gives way to the Trough of Disillusionment.  Here the laws of physics still apply, failures occur and sound application design still serves a purpose.

According to Gartner’s analysis, the market is moving just beyond the Peak of Inflated Expectations.  The deafening roar of a hyperactive media coupled with under-equipped/over-hyped vendor products makes the slide to the Trough of Disillusionment an ominous one.  Smart companies will start measuring their assumptions.  If they do, they can avoid becoming an unfortunate example of over-hyped expectations.


Tara Kinney, Director of Communications

After years of cloud innovation, public versus private remains the focal point of enterprise cloud evaluations, despite the hybrid cloud alternative, which would likely accelerate cloud adoption.

As reported by Neal Weinberg of Network World, Daryl Plummer advised enterprises at Gartner’s IT Symposium last week to consider public cloud services first and turn to private clouds only if the public cloud fails to meet their needs.  He goes on to mention that the inability to get desired SLAs, issues with regulatory compliance, concerns about disaster recovery, and the realization that cloud might not save money are reasons not to take the cloud route.  Couple this with his warning about the complexities of cost analysis and the potential need for a cloud broker?  Sounds complicated.  Isn’t there an easier way yet?

These are all concerns primarily associated with public cloud, which is why I’m surprised by Daryl’s “public cloud first” comment when a hybrid private cloud may mitigate these concerns and retain valuable public cloud support services.  Hexagrid, like other cloud enabling companies, offers service providers a way to deliver hosted private clouds as an alternative to the traditional public cloud.  Hosted private clouds provide dedicated hardware, secured to meet enterprise specifications, and can be bundled with support services.

According to a PricewaterhouseCoopers (PWC) study, “growing demand for Infrastructure-as-a-Service in the enterprise market is creating opportunities for a new breed of cloud-based IT infrastructure providers to deliver innovative private cloud services.”  The study also estimates that by 2014, the typical IT infrastructure for today’s customers of outsourced IT services will rely on cloud computing for more than a third of their IT resource base.  At Hexagrid, we agree that both service providers and enterprises benefit from accelerated adoption of hybrid private cloud models to achieve the advantages of Infrastructure as a Service.

Gartner expects the $4.2 billion Infrastructure as a Service industry to grow by over 48% in the next 5 years, which parallels the PWC study findings that “77 percent of respondents already have a plan and 64 percent said some type of cloud, including private and public, would be the best way to manage IT infrastructure in three years.”  Based on my understanding of the concerns and challenges about cloud, it seems that the third option of hybrid clouds might best meet many market needs, mitigate many enterprise concerns, and open doors for increasing service provider profitability.

David Rokita - VP Technology Operations, Hexagrid Computing

Scalability remains a hot topic when discussing cloud computing, yet the basis of the conversation is often misguided.  Many believe that the issues of scalability originate from limitations in technology.  In reality, inability to scale is almost always a factor of cost.  When we really understand Amazon’s cloud computing model, we see that Amazon started with a commodity technology and then removed all of the complexity of traditional enterprise IT to create the ‘illusion’ of infinite scalability.  Through simplicity and commodity, Amazon reduced cost and therefore achieved scale.  This makes for a valuable lesson to those that have aspirations in the cloud.

With the fiscal challenge to scalability in mind, consider what the usual suspects (HP, IBM, VMware, Oracle) all have in common.  Enterprise software is bad for enterprise customers and vendors will not change for obvious reasons.  For all the money blown on AIX, Solaris, HP-UX and Windows, these products fail to chart year after year on the TOP500 supercomputer list.  Only 9% of these systems are non-Linux (45 of 500) and the top 10 all run Linux.  Some of this is likely attributable to technology, but more than anything it is a factor of cost.  Think about it: the majority of these projects are funded university projects or other pure technology-for-the-sake-of-technology endeavors.  Budgets are tight and there are no retries for overshooting your mark.  By using a completely free (or manageable cost) platform, they have removed a significant bottleneck to scale.  When applying this lesson to cloud computing, one must question enterprise software’s viability in a truly scalable cloud.  This is the same enterprise software that failed to solve any real problems other than the ones it caused in the first place.  Of the biggest cloud players (Amazon, Rackspace, GoGrid), none are built on these products.  My hunch is that it is purely due to cost considerations.  If the cloud doesn’t scale financially, it doesn’t scale technically.  Remember, cloud computing is a technology that, if done right, should become less expensive (per unit of measurement) as it grows.  Amazon’s $40 per gig of RAM is a tough number to hit using enterprise class software.

Those of us that are building clouds right now will be judged not against what was done yesterday, but rather against what competitors are doing today.  It is human nature to marvel at the money saved with VMware as compared to old-school IT discipline.  5 years ago it was about how much money I could save by ditching Windows in favor of Linux.  Then it was how much money I would save ditching WebLogic in favor of JBoss.  Likewise, there will be a day when VMware is no longer the savior of the enterprise, but the enemy.  When that day comes, Linux/KVM will be there.

*** The author is aware that GoGrid, Amazon and Rackspace use Xen, but there is reason to believe that the winds of change are blowing in KVM’s favor.  KVM was not a viable production option when these services were first created.

Get Ripped!

David Rokita - VP Technology Operations, Hexagrid Computing

Several months ago, Information Week highlighted the term “Innovation Atrophy”.  Innovation Atrophy is the grand result of too many years of trying to save money through IT.  The author, Chris Murphy, explains that year-over-year squeezing of ROI out of IT creates a drain on an organization’s ability to innovate.  Whether through fear, contempt or plain old boredom, this “beat down” takes a toll.  It comes in many forms: “we are a [blank] company, not a technology company”; “buy for parity and build for competitive advantage”; “no technology for the sake of technology”.  At times these phrases reflect a steadfast commitment to maintaining the course.  At other times they are simply short-sighted.  The key to success is to know when to check the bet or raise the stakes.

Cloud computing (IaaS) offers one of those rare opportunities to move the needle back towards innovation.  In the past, innovation posed a double-headed threat.  If the initiative became successful, mobilizing the human and financial resources to meet the demand posed a challenge.  If unsuccessful, invested assets were incapacitated or lost.  Businesses respond to this risk with analysis paralysis.  The stakes are so high that no business is willing to take the risk required to innovate.  The physicist Joseph F. Engelberger claimed that innovation only requires three things: a recognized need, competent people with relevant technology and financial support.  Unfortunately, the nature of business requires that financial support be placed before the others.

In a perfect world, businesses freely create ideas and throw them against the wall to ‘see if they stick’.  Those that stick would receive financial backing and grow accordingly.  With cloud computing (IaaS), the opportunity to throw ideas against the proverbial wall isn’t free, but it is considerably more cost effective.  It becomes simple to build virtual infrastructure one day, and tear it down the next.  The organization incurs cost only for the time that the infrastructure is up.  Parallel environments can be spun up when needed and turned down when not in use.  Shelved initiatives simply sit in the bullpen while resources are freed for the next big thing.  Fear and uncertainty will still remain, but it just may be enough for an organization to start flexing those innovation muscles again.

David Rokita - VP Technology Operations, Hexagrid Computing

In addition to working in technology, I dabble in real estate.  Over the course of a few years, I have bought, sold and held onto several rental properties.  One time I purchased a condo unit from a frustrated owner that was having trouble covering the spread between the rent he could collect and the mortgage, taxes and association fees.  He was right: given the downward trending economy, the rent he could collect barely covered the expenses and overall the property was operating at a bit of a loss.  I bought the property, and then proceeded to spend $700 on a sofa, chair, dining room table, bed and dresser.  I advertised the property as fully furnished at a 50% increase in rent and lived happily ever after.  I love marveling at my own ingenuity, but this was hardly genius.  People pay more when they get more, and understanding this is key when navigating changing markets.

A common myth among data-center facilities providers states that the new cloud economy will drive cloud providers to their data-centers in droves.  They need only to sit back and wait for the deluge of customers to flock to their data-centers.  To a certain extent, the logic is sound.  Cloud providers will bring business into their data-centers, but a very important characteristic of cloud will prevent this from having the intended effect.  The plan to sit back, fat and happy, selling picks and shovels to the prospectors of the new gold rush is a foolhardy strategy.  Unless these data-center providers adjust their business model, they may just find themselves irrelevant.  Smart data-centers will get out of the business of power, ping and pipe and transition to selling CPU, memory, network and disk.

The key to understanding this conundrum lies in the very same reason that end-users are evaluating cloud computing models.  Ultimately, these customers want clouds because they maximize efficiency.  This efficiency permeates through power consumption, space consumption and complexity.  Of these, data-center providers enjoy the biggest margins on the floorspace.  If their customers are building their clouds right, floorspace consumption will dissipate radically (as well as power consumption).  With the density of servers and the advent of extended memory management, it becomes possible to pack a 1.5 terabyte RAM cloud computing behemoth into 7U in a rack.  This space allocation even includes all required switching.

The last time I had this conversation with a data-center provider, it was explained to me that I didn’t understand.  These guys have been fighting Moore’s law for years, and $1250 is a good price for a rack if you can get it.  Moore’s law or not, server efficiency rarely topped 25%.  Cloud computing at scale easily achieves 80% utilization, if not more.  I realize that I am performing surgery with a meat cleaver here, but let’s just take the best-case scenario described above: the aforementioned 1.5 terabyte cloud, at an assumed Amazon equivalent price of ~$37.80 per gig of RAM per month for open source Linux servers.  Since the setup takes 7U, the conservative view says that we can fit 2 clouds in one rack.

(1500 GB * .80 * $37.80 * 2) = $90,000 per rack per month (approximately)

Ok, again, this was brute force.  For a variety of reasons, this is an extreme example.  Examination of a more practical example still yields an extremely compelling story.  The names have been changed to protect the innocent, but the following represents the base configuration of a cloud built by a Hexagrid partner.  This particular partner opted to use more traditional hardware as opposed to a more consolidated blade configuration.  In this configuration, 3 compute nodes with 96 gigabytes of RAM were dedicated to running VMs.  With included storage and all of the redundancy, 2 of this installation will fit in a rack.

(288 GB * .80 * $37.80 * 2) = $17,418 per rack per month (approximately)
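Both back-of-the-envelope figures follow the same formula, so they can be checked with a small sketch.  The function name is illustrative, and the comparison at the end assumes that the $1250 rack price mentioned earlier is a monthly figure, which the conversation did not spell out.

```python
def rack_revenue(ram_gb, utilization, price_per_gb_ram, clouds_per_rack):
    """Monthly revenue for one rack when selling RAM-priced cloud capacity."""
    return ram_gb * utilization * price_per_gb_ram * clouds_per_rack


RACK_RENT = 1250  # from the text; assumed here to be dollars per rack per month

# Dense 1.5 TB build versus the partner build of 3 compute nodes with 96 GB each.
dense = rack_revenue(ram_gb=1500, utilization=0.80, price_per_gb_ram=37.80, clouds_per_rack=2)
practical = rack_revenue(ram_gb=3 * 96, utilization=0.80, price_per_gb_ram=37.80, clouds_per_rack=2)

for label, revenue in (("dense", dense), ("practical", practical)):
    print(f"{label:>9}: ${revenue:,.2f} per rack per month, "
          f"{revenue / RACK_RENT:.1f}x the rack rent")
```

Even the practical build comes out at roughly an order of magnitude above the rack rent under these assumptions, before hardware, software and staffing costs are taken out.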

This makes for an extremely compelling financial case, considering that IaaS services from other providers like Amazon and Rackspace are threatening to take the business, and the clients that you do win will take less space this year than they did last year.

Yes, changing the data-center business model is a big scary thing.  Fortunately, the automation tools, portals and practices have already been worked out.  Sure, the workforce may need a bit of restructuring and the sales team will require some training.  You might even need to hug a hardware vendor.  You can do this and you can be very successful.  It beats selling wagon wheels in the age of the internal combustion engine.

Follow DaveRokita on Twitter

Follow Hexagrid on Twitter

Josh Restivo

Approximately six years ago, I testified as an expert in court for the very first time (it was a civil pre-trial motion – the perfect place to cut your teeth as an expert witness).  In answering a question for opposing counsel, I delivered a response that surprised even me and which has continued to guide me to this day. I was enumerating all of the sundry measures that forensic computer examiners take to ensure that they do not alter or lose the data with which they are working. In my best techie-to-English translation, I prattled on about duplicate cryptographically-assured hard drive copies, hardware-based isolation of hard drive write commands from the host system, etc.  After hearing all of this, opposing counsel asked a painfully simple yet profound question. “Even with all of that process and all of your expertise,” he paused, setting up for the kill, “isn’t it still possible for someone like yourself to lose or alter the data that you’re working with?”

When delivering answers in court, you must provide a concise answer to the question as asked even if you know that, in a sane world, it should be wrapped in a thick blanket of context and caveats. The myriad implications of the answer I was about to give raced through my mind. My first reaction was to deliver a technical dismissal of the question based on the unlikeliness of such a failure. But his question was too simple, too austere. Anything other than an equally austere retort would have been taken as evasive and suspect. In some fraction of a microsecond, reason seized control of my vocal cords and facial muscles and, before I really had time to consider it, I’d uttered my response: “Sure. Lightning strikes.”

In the end, we won that pretrial motion and opposing counsel, who’d kept me on the stand for an hour-and-a-half, became a client of mine shortly thereafter. Many technical people are heavily inclined to believe that, given a sufficient degree of redundancy at all levels, the systems that they build are impervious to failure. I was once one of these people. Unfortunately, lightning does strike. It strikes in random places, at random times, and does an unpredictable amount of damage. Similarly, both hardware and process can fail in totally unpredictable ways regardless of the amount of failure-condition logic surrounding them.

Herein lies the heart of the problem. As engineers, we plan for predictable failures. If our organization is well-oiled, we plan for lots of predictable failures. We intricately configure logical protocols to respond to physical failures, and we trigger physical devices in the event of logical failures. We route packets around failed routers and switches, often way around because of some overlooked interdependency. We have backup generators in the event of a power failure. Since the power fails so rarely in North America, though, we sometimes find that the generators refuse to start when needed or that they are out of gas. So we add more processes and more management and oversight procedures, we bolt on more monitoring gadgets (each of which must be redundant, of course), and what happens? Someone forgets to check the oil in the generators, they seize while in use, and the din of human folly beats with renewed vigor.

Anything that is difficult to configure initially can be exponentially more difficult to troubleshoot in stressful situations. I’ve witnessed too many outages related to redundant hardware implementations where, after an inordinate amount of time was spent troubleshooting bizarre behavior between two devices, the call was made to pull the plug on one of the systems in order to restore service.  Of course, during the next maintenance window, the engineers and support staff would sort out the mess and set everything back up ‘correctly’ again. That is to say, they put it right back in the state it was in prior to the last mess. It should come as no surprise to anyone when, during the inevitable next outage, someone has to go pull the plug once more to stabilize everything.

At this point, I must make it clear that I’m not advocating for elimination of redundancy in systems design. It has a place. The whole of the computer science field, however, needs to think long and hard about the many adverse implications of its use. Quantitative risk assessment is often undertaken in larger environments to determine where redundant systems are needed. This exercise has value. I’m not aware, though, of any organization which has performed a proper risk assessment focused on the fully quantified complexities introduced in their effort to achieve resiliency. I’ve only been witness to the unfortunate results.
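One way to start putting numbers on that tradeoff is a toy availability model. The sketch below is not a proper risk assessment, and every figure in it is hypothetical. It assumes that the two units in a redundant pair fail independently, so the pair is down only when both are down at once, and that the failover machinery itself causes an outage for some fraction of the time.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes(availability):
    """Expected downtime per year, in minutes, for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def redundant_pair(single, failover_unavailability):
    """Availability of two independent units behind failover logic.

    single: availability of one unit on its own (hypothetical input)
    failover_unavailability: fraction of time the failover layer itself
        takes the service down (hypothetical input)
    """
    both_down = (1.0 - single) ** 2
    return (1.0 - both_down) * (1.0 - failover_unavailability)


single = 0.999                          # one plain, non-redundant unit
ideal = redundant_pair(single, 0.0)     # failover logic that never misbehaves
messy = redundant_pair(single, 0.002)   # failover logic that sometimes does

for label, a in [("single unit", single),
                 ("ideal redundant pair", ideal),
                 ("pair with flaky failover", messy)]:
    print(f"{label:26s} {a:.6f}  ~{downtime_minutes(a):,.1f} min/year")
```

With these made-up inputs, the pair with misbehaving failover logic ends up less available than the single unit it was meant to protect. The point is not the specific numbers but that the complexity term belongs in the assessment at all.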

The misguided trumpets of marketing have been sounding loudly in the cloud industry as of late: five-nines uptime, ultra-secure data centers powered from divergent power sources with six generators on-site and a secret bag of magic dust that can power the whole place for months, if need be. Accordingly, we’re beginning to see tremendous interest in need-blind redundancy of all systems within the cloud. Cloud, when done correctly, brings with it inherent resiliency which should suffice for all but the most mind-bendingly critical applications. Given that, I think it’s high time to stop shoveling out cash, along with the good graces of our families, to those vendors and solution providers who are perpetuating FUD and who, in the end, have arguably contributed far more to system downtime and late-night stress than the occasional failed power supply.

Before you design your cloud environment or before you buy into one that has been designed for you, take a long hard look at the added complexity of the fault-tolerance features that have been factored in. If they took hours or even days to configure properly when installed in a stable pre-production environment, expect it to take no less time to troubleshoot the fickle monsters when they’re misbehaving while passing live data.

Follow JoshRestivo on Twitter

Josh Restivo

A recent encounter with the General Manager of a rather large and well-known book-purveyor-turned-public-cloud-giant reaffirmed the validity of a common but oft-ignored design principle. In my brief interaction with this gentleman, he misunderstood a statement that I’d made about our (Hexagrid’s) VxDatacenter product. Specifically, when asked whether we had our own hypervisor, I replied, “Yes”. His admittedly understandable interpretation was that we’d coded our own hypervisor. He promptly turned around and walked away with a flippant, “Good luck with that!”

Since we’re competitors, there was really no point in chasing him down to explain what I’d meant. Some cloud solutions are crude bring-your-own-hypervisor management layers which leave much of the low-level optimization and configuration to the client. Our product includes the hypervisor (KVM) in order to streamline installation and configuration while providing a more comprehensive approach to cloud delivery. So, for purposes of casual conversation, our product does have its own hypervisor – you can’t bring yours. Of course, we didn’t code KVM but it’s a fundamental component of our product.

Awkward exchange aside, I got to thinking about this man’s overall response. Though it wasn’t entirely professional, in one trite parting remark the GM of an undeniably successful company reinforced a principle that’s so easy to forget when developing something as complex as a cloud solution: K.I.S.S. (not the rock band, the old adage about keeping it simple).

This, of course, applies to ‘reinvent the wheel’ efforts such as writing your own hypervisor. It’s also highly relevant, in a broader sense, within the cloud industry. Those who are just getting into cloud continue to arrive with expectations of a panacea – built-in advanced security features, host-level backup tools, OS-level incremental backup capability, highly configurable fully-integrated multi-model billing, WYSIWYG web portal editing, and a native remote-monitoring platform that ties in seamlessly with their existing monitoring solution.

Having had conversations with individuals who’ve expressed dismay at the omission of one of these “must-have” features in our product and in light of my recent encounter, I’m reminded of a timely development over at everyone’s favorite Redmond-based software company. Approximately 13,500 years ago (in Internet time) that company began its quest to become everything to everyone by promoting its graphics-driven user interface as the answer to all problems.

On multiple occasions, they attempted to declare the simple old command-line a relic of computing history. Efforts were made to limit its use and to remove it entirely. As computing systems became increasingly complex, however, the combination of buttons, icons, multi-level nested trees, beeps, bells and split-frame windows became ever-more labyrinthine. In 2011, this same software company that had, along with its many followers, spent over 20 years decrying the primitiveness of the command-line announced that its newest version of software would be able to run without a graphical interface. Why? Ease of management. Or, in other words, simplicity.

They’d spent quite a lot of time and effort reinventing the user-interface ‘wheel’ and discovered, in the end, that their solution was not universally applicable. Had they kept it simple from the beginning and continued to grow the power and flexibility of their command-line interface, I believe that the OS landscape would look considerably different today.

To parallel this with the still formative cloud industry, the winning solutions will be those that maintain focus on core principles – efficiency, manageability, scalability – while delivering simple, open and effective avenues for customers to integrate purpose-built monitoring, security, backup and billing systems that work well for their organizations. Cloud solutions must not devolve into an everything-for-everyone endeavor; they must simply enable their owners to realize efficiency while otherwise going on about their business.
