Exam essay: Virtualization, cloud computing and open source
January 10, 2009
In this essay, I discuss three technologies and their relation to IT strategy: virtualization, cloud computing and open source.
Three technological forces which are currently impacting organizations are virtualization, cloud computing, and open source. Discuss what they are, their impact on business goals and IT strategy.
In this essay, I discuss three factors in IT that are currently impacting, or will soon impact, business goals and IT strategy: virtualization, cloud computing and open source.
Virtualization
Virtualization is a technology which allows one physical server to run many virtual servers. By virtual server, I mean a software container that contains a full operating system and set of software, just like a traditional server would. Each virtual server “believes” that it is running on physical hardware, and thus any software you would normally run on real physical machines will work without modification in a virtual server. Furthermore, each virtual server appears to the outside world – to the clients which connect to it – to be a traditional physical machine with dedicated hardware. The operating system of the physical server provides hardware interfaces to the virtual servers by simulating them in software: CPUs, hard drives, network interfaces, displays, CD-ROM drives, etc. The physical server translates I/O on these virtual devices into I/O on the corresponding physical devices where necessary. The resources allocated to a virtual machine (disk, memory, network interfaces) can be changed nearly on demand; most operating systems merely need to be rebooted in order to see the new resources. The number of virtual machines an individual virtualization server can run is limited simply by how much memory, disk space and network bandwidth the virtualization server has available, and by how much of each resource every virtual machine consumes. A very basic virtualization server could host eight to twenty virtualized machines.
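To make the limiting-factor argument concrete, here is a minimal sketch in Python. The names Resources and max_virtual_machines and all of the figures are my own illustrative assumptions, not measurements of any real virtualization product. The sketch also assumes identical virtual machines and ignores the overhead of the virtualization layer itself.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """A bundle of resources; all figures used below are hypothetical."""
    memory_gb: float
    disk_gb: float
    network_mbps: float


def max_virtual_machines(host: Resources, per_vm: Resources) -> int:
    """Return how many identical VMs fit on the host.

    The scarcest resource sets the limit, so we take the minimum of the
    per-resource quotients.
    """
    return int(min(
        host.memory_gb // per_vm.memory_gb,
        host.disk_gb // per_vm.disk_gb,
        host.network_mbps // per_vm.network_mbps,
    ))


# Hypothetical host and per-VM allocation, chosen only for illustration.
host = Resources(memory_gb=32, disk_gb=2000, network_mbps=1000)
per_vm = Resources(memory_gb=2, disk_gb=100, network_mbps=50)

print(max_virtual_machines(host, per_vm))  # 16: memory runs out first
```

With these assumed numbers, disk and network would each allow twenty machines, but memory allows only sixteen, so memory is the resource to add first.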
Furthermore, most implementations of virtualization technology allow a cluster of physical virtualization servers to seamlessly exchange running virtual machines.
Virtualization is a paradigmatic shift in server resource provisioning for several reasons. First, once an organization has built a cluster of virtualization servers, new servers can be provisioned without an additional capital outlay, without having machines shipped, unpacked, tested, racked and provisioned with networking, and with substantially less power consumption. Thus the barrier to entry for new services and servers, and the time it takes to go from identifying a need to having a working system, can be dramatically reduced. Operating costs for datacenters are also dramatically reduced. Secondly, the on-demand provisioning characteristic of virtualization lowers the risk of implementation – if a virtualized server doesn’t work out, we simply delete it. If we didn’t allocate enough RAM or disk to it, we can add more; if we started with too much RAM or disk, we can reduce it. There is no sunk cost (aside from staff time), and no depreciating, aging physical hardware to dispose of. Thirdly, once a cluster of virtualization servers has been built, organizations can experience improved uptime because virtual servers can be moved from physical server to physical server without impact to the users; this lowers the cost of hardware maintenance and outages.
One drawback of virtualization is that it can be quite expensive to set up a virtualization cluster. Virtualization machines should be very powerful, with a large amount of computational resources and RAM. They should be SAN-backed to enable quick recovery from hardware faults and to enable seamless moving of virtual servers. Managing virtualization servers and virtualized machines requires a new, somewhat different and deeper set of skills than managing individual physical servers, because the environment that virtual servers run in is technically much more complicated than the one physical servers run in. Thus we can have fault conditions and problems that we’ve never experienced before. Furthermore, not every service can be virtualized: services which need direct hardware access or very high resource allocation are not good candidates for virtualization. High performance database servers and graphics render farms are good examples of this.
Cloud computing
Cloud computing is closely related to virtualization technology these days, because virtualization has enabled the rapid growth in cloud computing offerings. Cloud computing is an outsourcing option for many different kinds of technologies: applications, infrastructure, computing, and development platforms. The term “cloud” is borrowed from its common usage in describing the Internet, and for the same reasons: clouds appear to the user to be homogeneous single point entities, but are complex and may be widely distributed geographically and be heterogeneous behind the scenes. Cloud computing has several characteristics that it shares with traditional managed hosting:
- A client utilizing a cloud computing vendor trades up front capital outlay for hardware and software with a use-based or monthly fee.
- The client provisions, manages, and accesses the resources they use in the cloud via the Internet.
Cloud computing has several characteristics which differentiate it from traditional managed hosting outsourcing:
- It is device and location independent: we should be able to use any device (desktop, wireless device) to access our cloud, and should be able to get to it from anywhere in the world.
- It allows on-demand resource provision, much like virtualization does. We can add, change or remove resources as needed and on the fly.
- In some cases (cloud storage, cloud compute clusters), it offers utility computing: fees charged per byte or per unit of time instead of a flat monthly fee.
- It has a flat performance characteristic – we can expect it to generally behave the same over time.
- Resources in the cloud are geographically distributed, and thus we should see increased reliability over managed hosting.
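The difference between utility pricing and a flat monthly fee can be sketched with a little arithmetic. The Python below is only an illustration: the fee, the hourly rate and the function names are hypothetical values I made up, not any vendor's actual pricing.

```python
def utility_cost(hours_used: float, rate_per_hour: float) -> float:
    """Cost under utility pricing: pay only for the hours actually used."""
    return hours_used * rate_per_hour


def break_even_hours(monthly_fee: float, rate_per_hour: float) -> float:
    """Hours per month above which the flat monthly fee becomes cheaper."""
    return monthly_fee / rate_per_hour


MONTHLY_FEE = 70.0    # hypothetical flat fee for a managed server
RATE_PER_HOUR = 0.25  # hypothetical utility rate for a comparable instance

print(f"break-even at {break_even_hours(MONTHLY_FEE, RATE_PER_HOUR):.0f} hours per month")

# Compare the two pricing models for light, moderate and full-time use.
for hours in (40, 200, 720):
    cost = utility_cost(hours, RATE_PER_HOUR)
    cheaper = "utility" if cost < MONTHLY_FEE else "monthly"
    print(f"{hours:>3} h: utility ${cost:.2f} vs monthly ${MONTHLY_FEE:.2f} -> {cheaper}")
```

Under these assumed numbers, intermittent workloads favor utility pricing, while a server that runs all month is cheaper on the flat fee.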
Many different kinds of resources are currently being offered as clouds: server provisioning (www.slicehost.com), applications (Google Apps), compute clustering/scientific computing (Amazon’s EC2), storage (Amazon’s S3), web platforms (Ruby on Rails, Python Django, Java, and PHP web development frameworks are all available as clouds from various vendors). Virtualization on the vendor end is a major technology in providing cloud computing offerings.
In terms of using clouds to deploy services, everything that I said about the benefits of virtualization to the organization is true for cloud computing as well: a lower barrier to entry for new services, lower risk of implementation, and lower operating costs. Additionally, the organization forgoes the capital outlay and operating costs for the virtualization cluster, and does not need the staff skillsets to run it. Clouds offer other advantages as well, since one can get infrastructure (storage and database), applications (e-mail, document suites) and intermittent-use computing (Amazon EC2 on-the-fly scientific parallel computing) from them. The usual caveats regarding outsourcing apply: you probably shouldn’t outsource a core competency/service; carefully select your vendor (finances, past performance); be careful in writing your SLA; understand that it can be difficult to bring services back inside once you’ve outsourced them. Although clouds should provide better service reliability than most firms can provide internally, nearly every major cloud vendor has had significant service outages.
Open source
Open source software is user designed, implemented and maintained software whose intellectual property (code) is released into the commons and distributed via the Internet for all to use and benefit from. Open source software projects typically do not have a vendor that supports them. Instead they are designed, implemented and maintained by a self-organizing group of volunteer, uncompensated, independent developers and users, and are typically started by one or a handful of developers to solve a personal need. Support comes from volunteers from the community surrounding the software. A firm does not buy open source software.
The philosophy of open source development is embodied by Linus’s Law, a saying coined by Eric S. Raymond and named after Linus Torvalds, creator and maintainer of the Linux kernel: “given enough eyeballs, all bugs are shallow.” The quality of much successful open source software (the Apache webserver, the Firefox and Mozilla browsers, the WordPress blogging platform, and the Postfix and Sendmail SMTP servers are only a handful of the many success stories) shows at least that the saying is on to something. Open source software differs substantially from commercial software, where a firm creates software to fill a perceived need of their market, and then sells rights to use the software they’ve created while retaining intellectual property rights.
The decision of a firm to use open source software is an outsourcing decision, much as purchasing and developing commercial software is, and as such, both open source software and commercial software incur the same kinds of considerations that any outsourcing arrangement might have, plus many other common considerations: integration and customization costs, for example. I will therefore consider the impact of open source software on firm and IT strategy as it contrasts with commercial software. The main differences are in cost savings, support (troubleshooting, documentation, bug fixes and feature requests), project lifecycle, community dynamics, design and focus, project types, and legal issues.
Open source software is free to use, and this is very attractive to firms, because the budget that would be tied up in software licensing fees can be used for other purposes. On the other hand, when a firm uses open source software, it doesn’t have a contractual arrangement with the developers. Open source development communities are not beholden to the firms who use their products, and are under no obligation to fulfill a firm’s needs. There are many implications of this lack of a formal relationship. Open source software is supported by volunteers (although there are also some companies that offer support for open source software packages as a value added service), and as such it can sometimes be difficult to find sufficient documentation and training (coding is far sexier than writing documentation). Further, if you submit a development request – bug or feature – it may or may not be fulfilled. However, if the firm’s IT staff is skilled, the fact that the code for open source projects is freely available enables the staff to make their own bug fixes and enhancements, and gives them examples to learn from. Finally, open source projects may die due to lack of interest, or may “fork” (split into two projects).
Open source development communities tend to focus on certain genres of software: internet services (www, ftp, blogging, etc.), operating systems (Linux, *BSD), systems support, development support (Eclipse), and some desktop applications (OpenOffice, GIMP). Large enterprise software (data warehousing and OLAP, ERP, CRM and SCM tools) has received far less attention so far, and as such the impact of open source software on the firm will be felt in only a subset of all the areas of IT that a firm might have interest in.
There are also legal issues that come with open source, especially for software developers. Much open source software is distributed under software licenses (the GPL, for example) that prohibit it from being used in other products without causing the intellectual property of those products to be released to the commons as well. Although this has not been challenged in court (to my knowledge), software development firms who wish to profit from software they’ve built that incorporates open source code should be careful, as this incurs potential risk.

My self-critique:
Once again, this took way too long: 1.5 hours this time. This time, my problem was not that I couldn’t remember the topics; it was that I understand them all too well, so having to cut down on details while retaining clarity is going to be a problem for me.
During the section on open source, I was very aware that I was over time already, and that affected my thinking. Really, now that I think about it, there are sections in there that I don’t need, and I should also present OSS as opportunities vs risks. I also didn’t really give an indication of what kind of company should use which technology (strategic grid). I used an implied SWOT (strengths, weaknesses, opportunities and threats) in the open source section, but my answers need to be organized better.
Open source (beyond usual opportunities/risks of outsourcing):
Cloud computing:
Virtualization
What I also didn’t include, and which I have a really hard time remembering, is “future research”. All I can say is what I think should be looked at:
How does virtualization affect IT rollout in an org? Do services and systems get developed faster? What is the impact of more systems (virtual and physical) on required staff, skillsets and other IT resources? Which services virtualize well, which don’t, and what characteristics do they have?
For cloud computing, what kinds of business models do storage and service clouds allow for? What models can tell us which kinds of services should go to clouds and which shouldn’t?
Open source is easy: its impact on business models, how and why it works, and what motivates people to participate.