IaaS, or Infrastructure as a Service, tries to provide a way to quickly and easily provision VMs. It can be implemented in a private data center or rented in a public cloud. It allows you to treat hardware as a pool from which you just take the resources you need. In its simplest form, it is as simple as putting in a few spec numbers and your SSH key, and clicking create. In fact, this is what most public cloud and VM vendors look like. While they may offer more advanced features or services (load balancing and other network functions, backups, scaling, serverless etc…), their main selling point is that it takes very little knowledge and some money to do stuff that 20 years ago took a team of experts to operate.
For reference, as recently as 2005, if you wanted a dual-core server, you got a server with 2 physical CPUs. If you needed more per-server performance, you spent a fortune on a shiny motherboard with 6 or 8 CPU slots. Nowadays, with multicore being the norm, colocation with both “virtual” and “true” CPU cores has become much more economical.
How infrastructure looks at different scales
For a startup, it’s easy to start with some inexpensive VM, or with a server somewhere in the room. Heck, even a small-to-medium company with large internal IT demands can survive with an under-the-staircase, one-rack server room. But once you start growing, things get complicated. When you pass one room’s worth of server racks, server management becomes difficult, at least until you reach the size where your scale or growth can justify purpose-built halls and whole datacenters.
The same happens with people: the more hardware or software you add, the more there is to manage, and the more people you need working full time just to keep everything operational. Keeping the status quo can then become quite costly. And this is the main appeal of the cloud. Providers are not trying to be cheaper than the Linux neckbeard who lives in his company’s basement datacenter. They are offering computing scale, at a linear or decreasing price, that you can flexibly change.
You pay one price for not caring about the location (property is expensive, especially if you need a whole datacenter), installation, hosting, electricity or labor. In the long term, you might save money by doing it yourself. But with a rented cloud, your costs rise with your needs, not with your long-term capital investments. The former is great for bootstrapping your service, the latter for a startup with an expected capital injection.
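To make the rent-versus-own trade-off a bit more tangible, here is a rough break-even sketch. Everything in it is an assumption for illustration: the function name `break_even_months` is made up, the prices are invented rather than quoted from any provider, and it ignores hardware refresh cycles, financing, and staffing changes.

```python
import math
from typing import Optional


def break_even_months(upfront: float, own_monthly: float,
                      cloud_monthly: float) -> Optional[int]:
    """Months after which owning hardware becomes cheaper than renting.

    Returns None when renting is not more expensive per month,
    meaning the upfront investment is never recovered.
    """
    monthly_saving = cloud_monthly - own_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront / monthly_saving)


# Hypothetical numbers, purely for illustration.
months = break_even_months(upfront=60_000, own_monthly=1_500, cloud_monthly=4_000)
print(f"Owning pays off after {months} months")
```

With these made-up inputs the investment is recovered after 24 months. If your planning horizon or expected growth is shorter than that, renting is likely the safer bet.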
What is Infrastructure as a service all about
You might think: “Wait, why are you selling me Cloud right now? Aren’t you supposed to be anti buzz and exposing the other side of this kind of calculation?” Well, yes and no!
IaaS means that for your stakeholders (admins, DevOps, programmers, team leads…), infrastructure is just another service. They use Discord for memes, YouTube for cat videos during long compile times, Gmail for emails, GitHub for code, and your IaaS platform for “servers”. It is a service in the sense that they don’t know where it’s located and have no idea how it runs or what it runs on. They have some vague ideas about fast storage and what an adequate number of CPUs or RAMs or IPs is. But they are blissfully ignorant of all the details, or of the proper way to refer to those resources.
They don’t care, because they shouldn’t. Pay people to care for the stuff, but everybody else should seamlessly just use the product of that care. This is not “Treat your sysadmins like the trash they are”. It’s more of an “Everything that is not a problem in their code is a distraction to the programmer”. And while this blissful seamless cloud is something they try to sell you, it’s also something you can implement yourself, provided that you have, or plan to have, the need to provision at least a couple of VMs/servers per month.
IaaS is not the same as having 100% of infrastructure behind a Web UI
It takes some knowledge and a bit of time. Your end goal is not to compete against AWS or Google Cloud. Your end goal is to trivialize your most common system needs.
For example, say your team is mostly building web apps that are backed by a database for a few hundred users. For their normal testing and production purposes, such workloads will usually take 2-4 cores, 4-16GB of RAM, and 40-120GB of storage. Start by scripting or templating the creation of such instances. The next step would be to either set up a base OS image or a script to bootstrap it. Now, provide a way for users to edit and remove instances from a Web UI. Congratulations, you now have a basic IaaS Cloud!
Of course, if you’re upgrading your legacy systems to an IaaS-inspired architecture, it will take more time. Keep in mind that not everything has to be migrated right now. It often happens that a few use cases cover the majority of all running instances. Give end-users the path of least resistance. A few clicks are much easier than contacting operations and waiting for them to spin up an instance. This saves time for everybody involved. It also leaves more time for your operations to work on improving current systems and tackling more custom cases.
Just keep in mind that IaaS is a concept and not an end goal. It’s something that has to be taken as far as it makes sense for your organization. A similar but slightly more focused concept than IaaS is IaC, or Infrastructure as Code. IaC strives to document and write down your whole infrastructure. It can be a nice complementary step towards implementing IaaS inside your organization.
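As a minimal sketch of what “templating the creation of such instances” could look like, the snippet below defines a few size presets within the ranges mentioned above and turns a user’s choice into a provisioning request. The preset names, the `build_request` helper, the `base-os` image name, and the shape of the request are all hypothetical; in practice the resulting dictionary would be handed to whatever hypervisor or provisioning tool you actually use.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InstanceSpec:
    cores: int
    ram_gb: int
    disk_gb: int


# Hypothetical presets covering the typical web app + database workload.
PRESETS = {
    "small": InstanceSpec(cores=2, ram_gb=4, disk_gb=40),
    "medium": InstanceSpec(cores=4, ram_gb=8, disk_gb=80),
    "large": InstanceSpec(cores=4, ram_gb=16, disk_gb=120),
}


def build_request(name: str, preset: str, ssh_key: str,
                  image: str = "base-os") -> dict:
    """Validate user input and expand a preset into a full request."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset {preset!r}, choose from {sorted(PRESETS)}")
    if not name or not ssh_key:
        raise ValueError("name and ssh_key are required")
    return {"name": name, "image": image, "ssh_key": ssh_key,
            **asdict(PRESETS[preset])}


# Placeholder key; a real request would carry the user's public key.
request = build_request("webapp-test-01", "medium", "ssh-ed25519 AAAA... user@example")
print(request)
```

The point of the presets is that end-users pick a name and a size instead of negotiating individual numbers, which is what makes the common case trivial.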
IaaS – I(aaS)as a Silver Bullet
IaaS is one of those concepts that look similar but play out differently when implemented on-premise versus rented from a cloud provider. Every organization and use case has different needs, and on-premise and cloud provider offerings satisfy them in different ways. We will therefore take a quick look at both cases and, for each, list a few benefits and disadvantages that IaaS can bring to your organization.
On-premise IaaS means you either implement or buy a solution. With said solution, you then manage your own hardware in your own datacenters. Alternatively, this may include renting bare-metal servers, VMs, or colocation, depending on your needs.
- Decoupling of resources and teams that consume them – Your end-users (programmers, team leads, departments) don’t actually care where their workload is running as long as it works and has the resources it needs. Using virtualized workloads (or small servers), you can easily migrate workloads to different servers as new ones appear or die. This enables you to automatically migrate workloads if the host goes down. And this helps improve overall utilization.
- Utilization of old hardware – Assuming you are not space-constrained, try to treat all old hardware as just numbers in a resource pool. By doing so, you limit new hardware purchases to services where current speed and features are not sufficient.
- Automated/instantaneous provisioning – By completely automating the process, you end up with something that anyone can do. This saves time, allows faster scaling, and frees both sides to actually work on what matters to their project.
- Change of mentality – For most organizations, moving to IaaS may involve a significant change of mentality. For both private and public options, it will involve changing both communication channels and individual responsibilities. A shift to IaaS can cause significant resistance among your staff. This must be taken into account when assessing whether IaaS is the right way for your organization.
- Won’t fix the overall issue with provisioning – IaaS assumes your teams treat the Infrastructure as a waiter in a restaurant. If your team doesn’t know what they want to consume, the service cannot deliver what they want. If you experience issues planning capacity or understanding your current needs, monitoring and analyzing the current systems should come before you decide on IaaS.
- Time to build & cost to maintain – It’s not a trivial change for most organizations, and it will require a dedicated staff of experts to build and maintain. Also, your IaaS has to keep evolving in order to keep pace with the changing needs of the organization.
- Requires scale to evaluate and make sense – This is an investment that increasingly makes more sense as your IT needs grow. Not every company has a direct correlation between company growth and IT needs growth. For smaller companies, and companies not highly IT-oriented it might happen that IaaS can only bring marginal improvement to current systems and needs. Also, the expected future change of IT needs must be taken into account when making said decision.
- Reliability – Unreliable services will keep being unreliable until you put effort into making them reliable. While moving to IaaS can be an excuse to fix them, you still have to fix them. Your implementation will require adjustment and “breaking in” until you get the confidence to state that it’s as reliable as, or more reliable than, the previous solution.
Public Cloud IaaS
Most providers renting servers nowadays abstract their hardware behind IaaS, PaaS, or SaaS solutions. In cases where points are similar or identical to the on-premise solution, they will be repeated but not elaborated upon.
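The “resource pool” and “migrate workloads if the host goes down” ideas from the list above can be sketched in a few lines. This is a toy model under stated assumptions: the `Host` class, the `place` and `evacuate` helpers, and the host names are hypothetical, and placement is plain first-fit on cores and RAM. Real schedulers also weigh storage, network, affinity rules, and live-migration constraints.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Host:
    name: str
    cores: int
    ram_gb: int
    # VM name -> (cores, ram_gb) reserved on this host
    vms: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def free(self) -> Tuple[int, int]:
        used_cores = sum(c for c, _ in self.vms.values())
        used_ram = sum(r for _, r in self.vms.values())
        return self.cores - used_cores, self.ram_gb - used_ram


def place(hosts: List[Host], vm: str, cores: int, ram_gb: int) -> Optional[Host]:
    """First-fit: put the VM on the first host with enough free resources."""
    for host in hosts:
        free_cores, free_ram = host.free()
        if free_cores >= cores and free_ram >= ram_gb:
            host.vms[vm] = (cores, ram_gb)
            return host
    return None


def evacuate(hosts: List[Host], failed: Host) -> List[str]:
    """Re-place every VM of a failed host; return the ones that did not fit."""
    survivors = [h for h in hosts if h is not failed]
    stranded = []
    for vm, (cores, ram_gb) in list(failed.vms.items()):
        del failed.vms[vm]
        if place(survivors, vm, cores, ram_gb) is None:
            stranded.append(vm)
    return stranded


# Old and new hardware are just numbers in the same pool.
pool = [Host("old-host", cores=8, ram_gb=32), Host("new-host", cores=32, ram_gb=128)]
place(pool, "web-1", 4, 8)
place(pool, "db-1", 4, 16)
place(pool, "ci-1", 4, 8)   # old-host is out of cores, so this lands on new-host
print(evacuate(pool, pool[0]))  # VMs that could not be moved off old-host
```

In this run the list of stranded VMs is empty, because the newer host has enough headroom to absorb everything from the old one. If the pool had no spare capacity, the failed host’s VMs would come back as stranded, which is the planning signal you want before hardware actually dies.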
- Decoupling of resources and teams that consume them – same as for on-premise
- Automated/instantaneous provisioning – same as for on-premise
- Speed and cost of scaling – The automation usually associated with on-premise IaaS will in most cases speed up scaling. But it’s laughable compared to the scaling speed a public cloud can deliver. If you can afford the price, you can rent a literal datacenter’s worth of capacity with just a few dozen clicks. Large cloud rentals are near-instantaneous, while in real life you have to build datacenters and fill them with servers.
- Time to action – Exhibit A: this website. During the provisioning process, the longest step was picking a meaningful cloud VM name.
- Less expertise required – I am among the few who enjoy tinkering with hardware, deadlifting 4U servers full of drives, and whispering dark spells into a black mirror (aka the terminal, as viewed by common muggles). Many others find such environments scary and stressful. Heck, I even know programmers who refuse to use the terminal except for compiling purposes.
- For various reasons, system administration is one of the more difficult IT areas in which to go from newbie to expert. This makes finding experts difficult. And then there is the ever-present learning curve for every little bit of custom setup your company ever made.
- It may well happen that you will have difficulties finding, training, and retaining personnel.
- Won’t fix the overall issue with provisioning – same as for on-premise
- Requires scale to make sense – same as for on-premise
- Cost – Cloud requires little or no upfront investment, but it has a continuous cost. Physical hardware tends to be treated as an investment, which means that, at least for accounting, on-premise and cloud are very different. While physical purchases usually go through stringent policies, spinning up cloud VMs can be much harder to control and can quickly eat up any potential savings. Worse, it may end up behind a strict approval process and become slower and harder than your previous solution. These are just some short examples of how the cost of Cloud, at least to your finance department, can end up being much different than the simple price quoted on the website.
- Vendor lock-in – While this is also true for on-premise, it can be very challenging to change public cloud providers. This may mean that you expose yourself to additional risk, as a failure or policy change on the provider’s side can affect your IT services.
- Change of mentality – Again, your teams will have to adjust to the new workflow. Also, cloud offerings might impose some limitations on what can be done on their IaaS platform. This might require rework on existing projects and impose slightly different constraints on future ones.
- Time to build & cost to maintain – Instantly spinning up VMs doesn’t mean that you’ll be able to move everything overnight. And it won’t save you from having experts. While you don’t need traditional sysadmins, you will need Cloud Administrators. These are a breed of their own. Don’t assume that your old sysadmins are able or willing to take over operating such a drastically different environment.
- Reliability – While major outages are rare, they do happen. Also, remember to check both the advertised SLA and user reports on whether the promised number of nines is actually delivered.
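To put the “number of nines” into perspective, the short sketch below converts an availability percentage into the downtime it still allows per year, and turns observed downtime back into a percentage you can compare against the SLA. The function names are made up for this example, and it assumes a 365-day year with no exclusions, whereas real SLAs often exclude scheduled maintenance and measure per month.

```python
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent: float) -> float:
    """Downtime per year that is still within the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


def measured_availability(downtime_hours: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Availability in percent, given the downtime observed over a period."""
    return (1 - downtime_hours / period_hours) * 100


for sla in (99.0, 99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(sla)
    print(f"{sla}% -> {hours:.2f} h/year ({hours * 60:.1f} min)")
```

Three nines (99.9%) still leaves room for roughly 8.76 hours of downtime per year, while four nines cuts that to under an hour. A single outage of a few hours reported by users is therefore enough to tell you which of the two a provider delivered in practice.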