Manageability by Design – A Definition

November 20, 2009 by Ravi

My friend Iggy Fernandez of Database Specialists and I coined the term “Manageable by Design (MBD)” in the cloud computing context. We define MBD as follows:

“An IT System is Manageable By Design (MBD) if it uses Standards, Instrumentation, Interfaces, Automation, Autonomics, and Documentation to facilitate the activities and purposes of IT Service Management (ITSM).”

As stated above, we have identified six objective criteria by which an IT System becomes manageable:

  • Standards
  • Instrumentation
  • Interfaces
  • Automation
  • Autonomics
  • Documentation

Depending upon the complexity of the IT System, some or all of the above may be required to make it manageable. An IT System can also be augmented to make it MBD by the vendor or by a third party. In a cloud computing environment, IT Systems are deployed in an ecosystem which is in a state of continuous flux. This dynamic nature coupled with security concerns makes it imperative that the cloud-deployed IT Systems are manageable.

Standards

Whether they are deployed traditionally or in a cloud, IT Systems need to interact with many other IT Systems for operational purposes. These include operating systems, backup systems, monitoring systems, security sentinels, discovery tools, performance analysis and tuning systems, and so on. If standards are created for each of these interactions, the total cost of ownership of IT infrastructure can be brought down substantially. Manageable IT Systems need to implement Instrumentation to provide access to internal parameters, metrics and diagnostics. They need to implement well defined and commonly accepted Interfaces so that they can be operated upon in a uniform manner. Vendors or third parties need to provide standardized Automation tools that surround these IT Systems so that labor costs in operation are reduced. Finally, Autonomics should be incorporated to reduce downtime and the need for expert human intervention. Standards will lay the foundation for the implementation of these disciplines.

As an example of such a standard, consider the Linux RPM Package Manager utility [1]. This utility defines a standard for packaging software components for installations and upgrades. Applications packaged using the RPM format can be installed, upgraded, and uninstalled using the same RPM utility. This allows a system administrator to automate installations and upgrades, and otherwise maintain the IT Systems more efficiently.
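To make the point about a uniform packaging standard concrete, here is a minimal sketch in Python. It only builds the rpm command lines for each action and does not run them. The function names and the package file names are hypothetical; the flags (-i, -U, -e, -q) are the usual rpm options for install, upgrade, erase and query.

```python
from typing import List

# Maps a maintenance action to the corresponding rpm command-line flag.
_RPM_FLAGS = {
    "install": "-i",
    "upgrade": "-U",
    "uninstall": "-e",
    "query": "-q",
}


def rpm_command(action: str, target: str) -> List[str]:
    """Build the argument list for one rpm action (nothing is executed here)."""
    if action not in _RPM_FLAGS:
        raise ValueError(f"unsupported action: {action}")
    return ["rpm", _RPM_FLAGS[action], target]


def upgrade_plan(package_files: List[str]) -> List[List[str]]:
    """Return one upgrade command per package file, in the order given."""
    return [rpm_command("upgrade", path) for path in package_files]


if __name__ == "__main__":
    # Hypothetical package files, used only for illustration.
    for command in upgrade_plan(["app-1.1.rpm", "lib-2.0.rpm"]):
        print(" ".join(command))
```

Because every package follows the same format, the same few lines cover any application, which is what makes this kind of automation cheap.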

Instrumentation

Instrumentation refers to an inherent or extended ability of an IT System to monitor and report its internal parameters, metrics and diagnostics. IT Systems are described by several parameters and by their architecture, which are stored in configuration profiles or in the memory of the processes. IT Systems also contain transient state information regarding the transactions they are processing. In addition, IT Systems generate diagnostic information in the form of error logs and trace files. Instrumentation captures or measures this information and uses Interfaces to deliver it to the consumers. As an example of great Instrumentation, consider Oracle Corporation. Oracle has implemented comprehensive Instrumentation within its flagship RDBMS product that can be used for operational activities.
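As a rough illustration of the three kinds of information described above (parameters, transient metrics, and diagnostics), consider the following Python sketch. The class name Instrumented and its methods are assumptions made for this example and do not correspond to any particular product.

```python
from collections import deque
from typing import Any, Deque, Dict


class Instrumented:
    """Hypothetical component that exposes parameters, metrics and diagnostics."""

    def __init__(self, parameters: Dict[str, Any], max_log_entries: int = 100):
        self.parameters = dict(parameters)   # configuration profile
        self.metrics: Dict[str, int] = {}    # transient counters
        # Bounded error log: the oldest entries are dropped first.
        self.diagnostics: Deque[str] = deque(maxlen=max_log_entries)

    def count(self, name: str, amount: int = 1) -> None:
        """Increment a named metric."""
        self.metrics[name] = self.metrics.get(name, 0) + amount

    def log_error(self, message: str) -> None:
        """Record a diagnostic message and bump the error counter."""
        self.diagnostics.append(message)
        self.count("errors")

    def snapshot(self) -> Dict[str, Any]:
        """Return a copy of everything a monitoring consumer might ask for."""
        return {
            "parameters": dict(self.parameters),
            "metrics": dict(self.metrics),
            "diagnostics": list(self.diagnostics),
        }


if __name__ == "__main__":
    service = Instrumented({"max_connections": 50})
    service.count("transactions", 3)
    service.log_error("connection refused")
    print(service.snapshot())
```

A monitoring tool would call snapshot() through whatever Interface the system provides, rather than reading log files or process memory directly.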

Interfaces

An Interface is “…the place at which independent and often unrelated systems meet and act on or communicate with each other” [2].

In the IT domain there are many examples of standard interfaces. For example, USB, FireWire, and CompactFlash are well known standard hardware interfaces. These interfaces provide a channel for two devices to communicate with each other. However, in the software arena, interfaces are not standardized and they are usually limited to implementing the exchange of data. Standardized Interfaces should provide configuration information, internal states, error logs and trace files, and control functions for startup, shutdown, clone, backup, install, uninstall, upgrade, and patch activities. Interfaces will make automation possible and reduce the need for a team of highly trained professionals for routine operational tasks.
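One way to picture a standardized management Interface is as an abstract contract that every IT System implements. The Python sketch below is only an illustration: ManagementInterface, InMemoryService and restart_all are hypothetical names, and only a few of the control functions listed above are shown.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ManagementInterface(ABC):
    """Hypothetical uniform contract for operating any IT System."""

    @abstractmethod
    def configuration(self) -> Dict[str, str]:
        """Return the current configuration parameters."""

    @abstractmethod
    def state(self) -> str:
        """Return the current internal state, for example 'running'."""

    @abstractmethod
    def startup(self) -> None:
        """Start the system."""

    @abstractmethod
    def shutdown(self) -> None:
        """Stop the system."""


class InMemoryService(ManagementInterface):
    """Toy implementation used only to exercise the contract."""

    def __init__(self, name: str):
        self._name = name
        self._state = "stopped"

    def configuration(self) -> Dict[str, str]:
        return {"name": self._name}

    def state(self) -> str:
        return self._state

    def startup(self) -> None:
        self._state = "running"

    def shutdown(self) -> None:
        self._state = "stopped"


def restart_all(systems: List[ManagementInterface]) -> List[str]:
    """Restart every system the same way, whatever its implementation."""
    for system in systems:
        system.shutdown()
        system.startup()
    return [system.state() for system in systems]


if __name__ == "__main__":
    print(restart_all([InMemoryService("db"), InMemoryService("web")]))
```

The operator code in restart_all never needs to know what kind of system it is dealing with, which is the property that makes routine tasks automatable.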

Automation

Automation in the IT System context refers to the technique of making a system operate without human intervention. According to some estimates, labor costs now exceed the cost of IT Systems by an order of magnitude or more. IT personnel spend a lot of time installing, patching, cloning, and troubleshooting. Given proper Interfaces and Instrumentation, many of these tasks can be automated.

Automation is usually not a part of the IT System itself, but a collection of scripts, processes and jobs. Automation leverages existing Interfaces and Instrumentation provided by an IT System. It has the potential to substantially reduce the labor cost in an enterprise application deployment. Some of the common automation tasks are: installation, upgrades, patching, startup and shutdown routines, backups, cloning, etc.
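Since Automation is typically a collection of scripts and jobs that sit outside the IT System, a small runbook runner is a reasonable way to picture it. The following Python sketch is an assumption-laden example: run_runbook and the step names are hypothetical, and each step simply reports success or failure.

```python
from typing import Callable, List, Tuple

# A step is a name plus a callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_runbook(steps: List[Step]) -> List[Tuple[str, str]]:
    """Run steps in order; after the first failure, skip the remaining steps."""
    results: List[Tuple[str, str]] = []
    failed = False
    for name, action in steps:
        if failed:
            results.append((name, "skipped"))
            continue
        try:
            ok = bool(action())
        except Exception:
            # An exception in a step counts as a failure of that step.
            ok = False
        results.append((name, "ok" if ok else "failed"))
        failed = not ok
    return results


if __name__ == "__main__":
    # Hypothetical patching routine; real steps would call the system's Interfaces.
    runbook = [
        ("backup", lambda: True),
        ("shutdown", lambda: True),
        ("apply patch", lambda: False),
        ("startup", lambda: True),
    ]
    for name, status in run_runbook(runbook):
        print(f"{name}: {status}")
```

Stopping at the first failure is a deliberate choice here: it leaves the system in a known state for a human to inspect instead of continuing with later steps that assume the earlier ones succeeded.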


Autonomics

IBM defines autonomic computing in this manner:

“An approach to self-managed computing systems with a minimum of human interference. The term derives from the body’s autonomic nervous system, which controls key functions without conscious awareness or involvement” [3]

Cloud computing requires that the deployed IT Systems be demand elastic. Configurations change frequently in such an environment. It is impossible for support personnel to keep track of all the configuration changes, provisioning, and deprovisioning that happen in a cloud. Incident and Problem Management will be even more difficult. Therefore such systems will have to be self aware and be able to perform their tasks without frequent human intervention. IT Systems which have incorporated autonomic computing are introspective, self reconfiguring, continually optimizing, self healing, self protecting, adapting, standards compliant, and demand elastic in nature. Autonomics could be implemented within the IT System itself or implemented externally using the Interfaces and Instrumentation provided by it.
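The demand elastic behavior described above can be sketched as a simple control loop that reads a utilization metric and adjusts capacity without human intervention. This Python example is only a sketch: the thresholds, the instance limits, and the function names are assumptions chosen for illustration.

```python
from typing import Iterable, List


def desired_instances(current: int, utilization: float,
                      low: float = 0.3, high: float = 0.7,
                      minimum: int = 1, maximum: int = 10) -> int:
    """Add one instance when busy, remove one when idle, within fixed limits."""
    if utilization > high:
        target = current + 1
    elif utilization < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


def autonomic_loop(readings: Iterable[float], start: int = 1) -> List[int]:
    """Apply the scaling rule to a series of utilization readings."""
    instances = start
    history: List[int] = []
    for utilization in readings:
        instances = desired_instances(instances, utilization)
        history.append(instances)
    return history


if __name__ == "__main__":
    # Hypothetical utilization readings, as fractions of capacity.
    print(autonomic_loop([0.9, 0.9, 0.5, 0.1, 0.1, 0.1]))
```

The readings would come from the system's Instrumentation, and the scaling action would go through its Interfaces, which is why those two criteria are prerequisites for Autonomics.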

Documentation

Documentation is the most obvious of the six, and all IT vendors do provide some documentation along with their products. However, there is no standard for documentation and the quality is not always top notch. The documentation should be context sensitive, indexed and cross referenced. Documentation should also be accessible from the internet. An IT System should also provide context sensitive help, drawn from the same documentation, when appropriate.

References:

1. The Story of RPM by Matt Frye in Redhat Magazine, January 3rd, 2009 (http://magazine.redhat.com/2007/02/08/the-story-of-rpm/)

2. Merriam Webster Online (http://www.merriam-webster.com/dictionary/Interface)

3. Definition of autonomic computing at IBM Research website (http://www.research.ibm.com/autonomic/overview/faqs.html#1)

Cloud Camp in Phoenix

September 11, 2009 by Ravi

We are organizing a Cloud Camp in Phoenix! It is a free, all-day event open to all cloud enthusiasts, vendors, IT consumers, etc. Grab your seat at the following link:

http://www.cloudcamp.com/?page_id=1128

Ravi

A tragedy in the cloud

June 13, 2009 by Ravi

According to a news report, UK-based hosting service company VAServ was the target of a hacking attack and, as a result, lost data for 100,000 web sites. This is a huge blow to the hosting services industry, especially to those who provide cheap services based on virtualization.

It is not yet clear whether the attack was a result of carelessness on the part of VAServ or a vulnerability in HyperVM, a product from a company called Lxlabs. According to the Lxlabs website, “HyperVM is a multi-platform, multi-tiered, multi-server, multi-virtualization web based application that will allow you to create and manage different Virtual Machines each based on different technologies across machines and platforms.”

What’s truly tragic is that the Lxlabs founder, K. T. Ligesh, 32, committed suicide on the 8th of June. As I said earlier, it is not yet clear whether the loss of data at VAServ was due to a HyperVM vulnerability or to serious security breaches at VAServ. Someone boasted about the exploit at VAServ and claimed it was through simple sniffing and password guessing, and not through HyperVM. If true, it just goes to show how terrible cybercrime can be.

From such incidents it becomes clear why enterprises will remain wary of public clouds. Earlier I blogged about public vs private clouds. There is a market for self service clouds like the one offered by VAServ, but for anything more than a small mom and pop operation, it is clearly not enough. A full service (either internal or hosted) private cloud is the only solution. We are reaching a turning point where vendors are beginning to offer Cloud services, and it is only a matter of time before they offer to convert the entire hosted IT services of their clients to private Clouds.

Manageability in Cloud Computing

June 8, 2009 by Ravi

There have been many attempts to define and characterize Cloud Computing recently. NIST (National Institute of Standards and Technology) leads with a draft.

There have since been some follow-up articles in the blogosphere here and here. And this one appeared before the NIST draft.

What is interesting is that the NIST draft provided the following definition of Cloud Computing:

“Cloud computing is a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is comprised of five key characteristics, three delivery models, and four deployment models. (emphasis mine)”

And then went on to list the five characteristics:

  • On-demand self-service
  • Ubiquitous network access
  • Location independent resource pooling
  • Rapid elasticity
  • Pay per use

But they missed out the most important part, in my opinion. I have highlighted manageability in the definition above. Managing the applications, or IT Service Support, is one of the most expensive factors in the total cost of ownership (TCO). The software license costs and the cost of the infrastructure to support it form only a part of the TCO. During the lifetime of an enterprise software application, the majority of the costs are incurred in maintaining and supporting it.

Therefore any Cloud Computing Environment (CCE) should consider the manageability of the applications deployed. If the deployed applications are not manageable by design, the CCE will not be able to manage them autonomically, which dramatically increases the cost of support. Stating it another way, applications being developed for the Cloud should include manageability as part of the design rather than as an afterthought.

Change Management in the Cloud

May 28, 2009 by Ravi

Change Management is an essential process of any IT department. Change Management ensures that only authorized and carefully considered Changes are implemented. There are planned Changes and there are unplanned or emergency Changes and there is a process to handle both.

Typically, RFCs (Requests For Change) are raised when one of the following happens:

  1. There is a Problem that needs resolution – RFC raised by Problem Management
  2. There is a vendor supplied patch or upgrade – RFC raised by Operation/Infrastructure team
  3. There is a change in Architecture to address growing needs – RFC raised by Capacity Management
  4. There is an emergency which requires a quick fix – Emergency Change raised by Problem or Incident Management

In a Cloud Computing Environment, the requirements are very similar, except in (3) above. Due to Automated Provisioning and Virtualization, the Cloud’s promise is rapid elasticity. To ensure that requests for new resources are attended to in minutes or hours instead of weeks or months, all the ITIL processes need to suitably modify their functioning. In the case of Change Management, this is what needs to happen:

1. All provisioning activities follow an established and approved business workflow. In addition, they are completely automated.

2. Configuration Management is automatically updated to reflect the Changes.

3. Even regular Changes need to be applied to the Images used by Automated Provisioning.

4. Change Management must keep in mind that the Cloud architecture is dynamic by definition, so yesterday’s snapshot may not be good enough for tomorrow’s Change.
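The four points above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Cloud class, its field names, and the workflow and instance names are all hypothetical. It shows provisioning gated by an approved workflow, a configuration database (CMDB) updated automatically, Changes applied to the provisioning image, and a check for instances built from an older image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Cloud:
    """Toy model of automated provisioning with built-in change tracking."""

    approved_workflows: Set[str]
    image_version: str = "v1"
    cmdb: Dict[str, str] = field(default_factory=dict)    # instance -> image version
    change_log: List[str] = field(default_factory=list)

    def provision(self, workflow: str, instance: str) -> None:
        """Provision only through an approved workflow and record the result."""
        if workflow not in self.approved_workflows:
            raise PermissionError(f"workflow not approved: {workflow}")
        self.cmdb[instance] = self.image_version
        self.change_log.append(
            f"provisioned {instance} from image {self.image_version}")

    def update_image(self, new_version: str) -> None:
        """Apply a regular Change to the image used for future provisioning."""
        self.image_version = new_version
        self.change_log.append(f"image updated to {new_version}")

    def stale_instances(self) -> List[str]:
        """Instances built from an older image than the current one."""
        return sorted(name for name, version in self.cmdb.items()
                      if version != self.image_version)


if __name__ == "__main__":
    cloud = Cloud(approved_workflows={"standard-web"})
    cloud.provision("standard-web", "web-1")
    cloud.update_image("v2")
    cloud.provision("standard-web", "web-2")
    print(cloud.cmdb)
    print(cloud.stale_instances())
```

The stale_instances check reflects the last point: because the environment keeps changing, a Change has to be planned against the current state of the CMDB rather than against an earlier snapshot.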

Public clouds v/s private clouds

May 27, 2009 by Ravi

One of the major objections to cloud computing has been that it is not secure enough. There is some truth to it, and it is not an easy matter to secure an entire enterprise in a public cloud. Given all the apprehensions about security, privacy, and the legislation involved, it is safe to say that the deployment of public cloud computing in large enterprises remains a distant dream.

Having said that, I believe the public cloud can greatly benefit individuals. The cost of ownership of a computer today is unnecessarily high for all the well known reasons. Having to pay only for the computing power, software licenses, storage and networking bandwidth that I actually use is a very compelling proposition. I think over a period of time people will begin to realize the value in cloud computing, just as they did when utility companies began to deliver electricity to homes. There are some concerns that cloud computing could lead to a loss of freedom to choose, but I think those can be managed by proper legislation and also by developing open cloud standards and a bill of rights (http://wiki.cloudcommunity.org/wiki/Cloud_Computing_Manifesto).

Private clouds can benefit large enterprises which invest in enormous computing power, network bandwidth and storage. Companies like IBM are developing tools and technology to make it happen. Private clouds will address the security and privacy issues as well as the risk of a cloud hosting company going under. Given a large enough enterprise, private cloud computing can be as cost effective as public cloud computing.

Entities that can benefit from cloud computing:

Large enterprises
Defense organizations
Government agencies
NGOs

Pay as you go

May 27, 2009 by Ravi

I read a blog post by Dave Malcolm of Surgient on the characteristics of cloud computing. As cloud computing still remains nebulous, this kind of clarity helps everyone understand it a little better. He talks about five characteristics, which I list here:

Characteristic 1: Dynamic computing infrastructure
Characteristic 2: IT service-centric approach
Characteristic 3: Self-service based usage model
Characteristic 4: Minimally or self-managed platform
Characteristic 5: Consumption-based billing

I was particularly struck by Consumption-based billing. What a great idea! When was the last time you paid for a generator installed by your utility? When was the last time you paid for the cable laid by your cable television company? And yet we continue to pay for the CPUs, the hard disks, the network interfaces. Not to mention all the junk that the Microsofts, the Ciscos, the Intels and the rest of them want to put on your PC. If you have ever looked at the services running, you will notice that most of them are never used. Most of the computing power we purchase is never used.

Imagine if you only needed to pay for what you use. Imagine a world where you could plug in a simple device and begin to use the IT service just as you would electricity or a telephone service. You only pay for the storage, processing and network bandwidth you use. In addition, unlike electricity or cable, you have many competing companies to choose from. This service will be available wherever you are, not just at home.

As an extension, you only pay for software when you use it. ALL computing services will be metered on a pay-as-you-go basis rather than a license per copy with fat yearly support fees. If you are using open source products, there is no need to pay for them ever!
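To show what metered, pay-as-you-go billing amounts to in practice, here is a small Python sketch. The rates and usage figures are made up purely for illustration, and the names RATES and monthly_bill are hypothetical.

```python
from decimal import Decimal
from typing import Dict

# Hypothetical unit prices, invented for this example.
RATES: Dict[str, Decimal] = {
    "cpu_hours": Decimal("0.10"),          # per CPU hour
    "storage_gb_months": Decimal("0.15"),  # per GB stored for a month
    "network_gb": Decimal("0.05"),         # per GB transferred
}


def monthly_bill(usage: Dict[str, float]) -> Decimal:
    """Charge only for what was metered: quantity times unit price."""
    total = sum(
        (Decimal(str(quantity)) * RATES[item] for item, quantity in usage.items()),
        Decimal("0"),
    )
    # Round to cents.
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # 40 CPU hours, 20 GB stored, 10 GB transferred.
    usage = {"cpu_hours": 40, "storage_gb_months": 20, "network_gb": 10}
    print(monthly_bill(usage))
```

With these made-up rates the bill is 4.00 + 3.00 + 0.50 = 7.50, and a month with no usage costs nothing, which is the contrast with a per-copy license plus yearly support fees.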

I know this will greatly upset the establishment, such as Microsoft and Oracle. So be it. For too long they have ruled the IT world with outsized profits. Monopolies rule enterprise and desktop software. This cannot go on forever. The open source community has matured sufficiently now that we can do a lot of computing without buying anything from Microsoft or Oracle.