This post is the first in a three-part series on the challenges enterprises face in managing the cost of their cloud infrastructure. It covers the major challenges and makes some key recommendations. Subsequent parts propose a comprehensive cost management framework and take a deep dive into some of these recommendations.
Cloud adoption is no longer an “if” but rather a “what, when and how.” More and more enterprises are asking the questions, “What (to move to the cloud)?” “When (to move it)?” and “How (to choose the right architecture and services)?”
As enterprises move more and more workloads to the cloud, the first pain our customers feel is the sting of cost overruns. So what has happened? The budgets were planned. Some initial sizing was done. Yet almost immediately after a migration, cost is the first factor that starts causing headaches for IT managers. In this blog, we talk about some of the pitfalls and lay out a comprehensive framework for managing the costs of cloud workloads.
From a cost perspective, there are four phases of a typical lifecycle that a workload goes through:
Let’s start with the key questions that should be asked during each of these phases:
When starting their cloud adoption journey, enterprises sometimes do not consider the above questions, and they fail to put a cost management framework in place. This usually results in a situation commonly called “cloud sprawl”: the enterprise has lost visibility into, and control of, its cloud landscape and costs. These situations lead to (often substantial) cost overruns. Some of the common reasons are listed below.
This is a key challenge. To utilize the full benefits of the speed and agility that cloud provides, modern IT usually provides a common services framework, wherein the business teams are allowed to manage the cloud resources for their applications themselves. While this is the recommended practice, cost ownership often falls through the cracks. We’ve seen customer situations where IT creates accounts and projects for business teams to use, and then hands them over to the business teams (but still owns the costing and billing).
What results from this arrangement is that the business teams get free rein to create resources, which they do, often well outside their allocated budgets. They are frequently unaware of the mounting spend, and often unconcerned by it, since they are not the ones footing the bill.
This is usually made worse by the fact that IT does not have strong cost reporting mechanisms to bring visibility into the who and what of the budget overruns.
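To make the “who and what” concrete, here is a minimal sketch of a per-team cost report. Everything in it is an assumption for illustration: the line items, team names, budgets, and the `UNOWNED` bucket are hypothetical, and a real report would read from your cloud provider's billing export rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical billing line items: (owning team tag, resource id, monthly cost in USD).
LINE_ITEMS = [
    ("marketing", "vm-001", 420.0),
    ("marketing", "db-001", 610.0),
    ("analytics", "vm-101", 1350.0),
    ("analytics", "vm-102", 900.0),
    (None, "vm-999", 275.0),  # untagged resource: no identifiable owner
]

# Hypothetical monthly budgets per team, in USD.
BUDGETS = {"marketing": 1200.0, "analytics": 2000.0}


def spend_by_team(line_items):
    """Sum cost per owning team; untagged resources are grouped under 'UNOWNED'."""
    totals = defaultdict(float)
    for team, _resource_id, cost in line_items:
        totals[team or "UNOWNED"] += cost
    return dict(totals)


def overruns(totals, budgets):
    """Return {team: amount over budget}. Teams without a budget count as budget 0."""
    result = {}
    for team, spent in totals.items():
        budget = budgets.get(team, 0.0)
        if spent > budget:
            result[team] = spent - budget
    return result


totals = spend_by_team(LINE_ITEMS)
report = overruns(totals, BUDGETS)
for team, excess in sorted(report.items()):
    print(f"{team}: over budget by ${excess:,.2f}")
```

Treating unowned spend as an overrun in its own right is a deliberate choice in this sketch: it surfaces exactly the resources for which nobody feels responsible.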
Doing an initial cloud total cost of ownership (TCO) analysis is essential to arrive at a budget for your cloud landscape. When this is not done, stakeholders have no visibility into what their infrastructure is going to cost. Cost savings are often one of the biggest reasons for cloud adoption, but skipping this exercise results in bill shock for the enterprise and often takes the momentum out of the adoption effort.
Even when enterprises do a TCO exercise, they often do the TCO for the final production landscape. They sometimes miss taking into account the migration plan, DevOps processes, and Go-Live dates (and also do not sufficiently size for them). This causes situations where costs skyrocket even before the application is fully migrated. Dev/Test environments tend to severely bloat up and eat into the overall budget.
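The gap between a production-only TCO and what is actually spent during a migration can be shown with simple arithmetic. The sketch below is illustrative only: the monthly figures, the go-live month, and the one-time migration cost are made-up numbers, and the model assumes flat monthly costs, which real landscapes rarely have.

```python
def monthly_cost(month, prod_monthly, nonprod_monthly, go_live_month):
    """Cost for one month (1-indexed) of the plan.

    Dev/Test (non-production) runs from month 1; production is billed
    from the go-live month onward.
    """
    cost = nonprod_monthly
    if month >= go_live_month:
        cost += prod_monthly
    return cost


def first_year_tco(prod_monthly, nonprod_monthly, go_live_month, one_time_migration):
    """First-year cost including Dev/Test, pre-go-live months, and migration effort."""
    running = sum(
        monthly_cost(m, prod_monthly, nonprod_monthly, go_live_month)
        for m in range(1, 13)
    )
    return running + one_time_migration


# Hypothetical inputs, in USD.
PROD = 10_000.0       # steady-state production landscape per month
NONPROD = 4_000.0     # Dev/Test environments per month
GO_LIVE = 7           # production goes live in month 7
MIGRATION = 15_000.0  # one-time migration tooling and effort

full = first_year_tco(PROD, NONPROD, GO_LIVE, MIGRATION)
prod_only = PROD * (12 - GO_LIVE + 1)     # what a production-only TCO would budget
pre_go_live = NONPROD * (GO_LIVE - 1)     # spent before any production workload runs

print(f"Production-only estimate: ${prod_only:,.0f}")
print(f"Full first-year estimate: ${full:,.0f}")
print(f"Spent before go-live:     ${pre_go_live:,.0f}")
```

With these assumed numbers, the production-only view budgets $60,000 for the year, while the fuller view comes to $123,000, of which $24,000 is spent on Dev/Test before the application is live.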
Even when enterprises have done initial sizing and defined cost ownership, having day-to-day visibility into the costs is important. Because it’s very easy to create resources in the cloud (within minutes), waste becomes a concern. Resources may be created for temporary use but never shut down. We have also seen situations where hackers have obtained access to customers’ cloud accounts and created hundreds of servers. The problems caused by a lack of visibility are summarized below:
Building the correct cost governance is a key pillar of the overall cloud governance framework. Problems occur when some of the following governance structures are not put into place:
Even when governance models are defined, for large landscapes, enforcing governance manually comes close to not enforcing it at all (for example, imagine tagging 1,000 VMs manually). When tools and automation strategies are not used and applied across the entire cloud landscape, IT teams always play catch-up and endure a lot of manual work to keep the landscape in shape.
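As an illustration of why automation matters here, a tag compliance check over a whole inventory takes only a few lines once the inventory is available as data. This is a sketch under stated assumptions: the required tag names, the resource IDs, and the inventory dictionary are all hypothetical, and in practice the inventory would come from your cloud provider's resource listing rather than a literal.

```python
# Hypothetical tagging policy: every resource must carry these tags.
REQUIRED_TAGS = ("owner", "cost-center", "environment")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tags that are absent or empty on one resource."""
    return [tag for tag in required if not resource_tags.get(tag)]


def compliance_report(inventory, required=REQUIRED_TAGS):
    """Map each non-compliant resource id to the list of tags it is missing."""
    report = {}
    for resource_id, tags in inventory.items():
        missing = missing_tags(tags, required)
        if missing:
            report[resource_id] = missing
    return report


# Hypothetical inventory: resource id -> tags currently applied.
INVENTORY = {
    "vm-001": {"owner": "marketing", "cost-center": "cc-12", "environment": "prod"},
    "vm-002": {"owner": "analytics", "environment": "dev"},
    "vm-003": {},
    "vm-004": {"owner": "", "cost-center": "cc-40", "environment": "test"},
}

violations = compliance_report(INVENTORY)
for resource_id, missing in sorted(violations.items()):
    print(f"{resource_id}: missing {', '.join(missing)}")
```

The same loop works unchanged whether the inventory holds four VMs or 1,000, which is the point: the cost of checking no longer grows with the size of the landscape.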
Similarly, when cost management and remediation tools are not used, manual compliance, cost reporting, and optimization become simply untenable and are often abandoned.
Public clouds are evolving fast. They already provide innovative features like autoscaling that are not available in on-premises environments. In addition, they provide innovative costing models and multiple discount options.
Lastly, they come up with new managed services that not only allow the customer to pay for just what they use, but also lift the management overhead for these services. Enterprises miss out on these benefits when:
Based on our experiences with customer landscapes and cloud best practices, we have come up with an approach that can help enterprises control and optimize costs effectively.
While the cost management framework covers a lot of ground in the following sections, here are some of the key recommendations that enterprises can get started with immediately: