The cost management lifecycle for an enterprise landscape closely follows the path from migration to operations. The diagram below highlights the various parts of the framework.
Figure 1: Cloud Cost Management Framework
Cost control should be considered across the application lifecycle — from the initial planning, to day-to-day operations, to periodic architecture optimization. Many enterprises do this through a well-defined underlying governance framework that is optimized using various automation techniques. In this blog, we provide our experience-based recommendations on how to execute cloud governance, initial planning and sizing, and operational visibility and forecasting for enterprise cloud infrastructures.
I. Cloud Governance
Cloud Resource Ownership
- Transform provisioning practices for the cloud through existing cloud management platforms or cloud-enabled CMDBs like ServiceNow.
- IT teams can provide the enterprise governance and management infrastructure and practices, while individual LOBs are responsible for managing their application infrastructure as per enterprise best practices.
- App teams should be responsible for the cost ownership of resources within projects (i.e., you provision it, you pay for it).
- App teams should be responsible for tagging/labelling all created resources.
- App teams should be responsible for the clean-up of unused resources.
Cloud Resource Provisioning
- Define clear access control policies (i.e., who can provision what resources).
- Build standard enterprise reference architectures and templates for provisioning resources (this should be the Enterprise Architecture's responsibility).
- Automate the provisioning of reference architectures and templates.
- Use a cloud-based configuration management tool where appropriate (check if existing configuration management databases provide cloud support).
Tagging and Labeling
- Use tags for resource management and labels for resource identification, grouping, searches, and billing.
- Define a list of labels and tags to be applied.
GlobalLogic recommends the following tags (as reference):
- Identification/Classification Tags
- BU/Cost Center
- Application
- Owner-email – application owner/group
- Environment – Prod/Dev/Test/QA/Perf
- Environment-Name – Prod1a, Dev4 etc.
- Chargeback/Showback ID
- created-by - User who created the resource
- role = <db, appserver, proxy, etc.> - Classify by application role within a project
- Operations/Automation Tags
- schedule-* - Used to drive instance scheduling
- can-delete = <true/false>
- Can be added by app teams once resources are ready to be removed.
- Can also be added by automation scripts, after untagged resources have been reported and no action taken.
- Subsequently, a delete script reads this label and cleans up the resource.
- image-type - App type for baseline images, e.g. Apache, Cassandra etc.
- image-version - Version ID of the baseline image for a given app.
- Reservation-expiry – Used to alert and renew reservations
Other tags can be added as per the business need.
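As a minimal sketch of how the tagging rules above could be checked automatically, the Python below reports missing identification tags and applies the can-delete label only to resources that were already reported and left unfixed. The tag key spellings, the `REQUIRED_TAGS` set, and both function names are illustrative assumptions, not part of any cloud provider's API.

```python
# Hypothetical required-tag set, based on the identification tags listed above.
# The exact key spellings are an assumption; adapt them to your own convention.
REQUIRED_TAGS = {"cost-center", "application", "owner-email", "environment", "created-by"}


def missing_tags(resource_tags):
    """Return the required tag keys that are absent or empty, sorted for stable reports."""
    present = {key.lower() for key, value in resource_tags.items() if value}
    return sorted(REQUIRED_TAGS - present)


def mark_for_cleanup(resources, already_reported):
    """Set can-delete = "true" on non-conformant resources that were reported earlier.

    `resources` maps a resource ID to its tag dictionary.
    `already_reported` is the set of IDs flagged in a previous report with no action taken.
    Returns the list of resource IDs that were marked.
    """
    marked = []
    for resource_id, tags in resources.items():
        if resource_id in already_reported and missing_tags(tags):
            tags["can-delete"] = "true"
            marked.append(resource_id)
    return marked
```

A separate delete script would then act only on resources that carry can-delete = true, which keeps reporting and deletion as two distinct, auditable steps.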
Inventory Management
- Build or use a lightweight inventory management system to:
- Track current cloud sprawl
- Report data on current inventory, new resources, projected cost for new resources, etc.
- Find gaps between what was planned and what exists in the cloud
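The gap analysis described above can be sketched as a simple set comparison. This example assumes that both the plan and the discovered inventory are available as dictionaries mapping a resource ID to a projected monthly cost; the function name and data shape are hypothetical.

```python
def inventory_gaps(planned, actual):
    """Compare planned resources with what actually exists in the cloud.

    Both arguments map a resource ID to its projected monthly cost.
    """
    planned_ids, actual_ids = set(planned), set(actual)
    unplanned = actual_ids - planned_ids   # exists in the cloud but was never planned
    missing = planned_ids - actual_ids     # planned but not provisioned
    return {
        "unplanned": sorted(unplanned),
        "missing": sorted(missing),
        # Projected monthly cost of the resources nobody planned for.
        "unplanned_monthly_cost": sum(actual[resource_id] for resource_id in unplanned),
    }
```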
II. Initial Planning (Sizing and Provisioning)
TCO and Budgeting
- Budget against the maximum CPU/RAM, but base the initial sizing on observed utilization such as CPU and memory usage (especially for dev/test).
- For dev/test, be sure to consider the uptime hours (i.e., 9x5 as opposed to 24x7) for TCO calculations.
- Execute instance right-sizing based on performance characteristics.
- Use on-premise monitoring data to arrive at a more accurate initial cloud sizing.
- For new migrations, enforce budgets from Day 1.
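To illustrate why uptime hours matter for dev/test TCO, here is a small worked calculation. The hourly rate in the usage note is a made-up figure for illustration only, and the average of 52/12 weeks per month is a simplifying assumption.

```python
ALWAYS_ON_HOURS_PER_WEEK = 24 * 7   # 168 hours for a 24x7 environment
WEEKS_PER_MONTH = 52 / 12           # average number of weeks in a month


def weekly_hours(hours_per_day, days_per_week):
    """Hours per week an environment is running, e.g. 9x5 gives 45."""
    return hours_per_day * days_per_week


def uptime_fraction(hours_per_day, days_per_week):
    """Share of always-on hours that a scheduled environment actually runs."""
    return weekly_hours(hours_per_day, days_per_week) / ALWAYS_ON_HOURS_PER_WEEK


def monthly_cost(hourly_rate, hours_per_day=24, days_per_week=7):
    """Estimated monthly compute cost for a given uptime schedule."""
    return hourly_rate * weekly_hours(hours_per_day, days_per_week) * WEEKS_PER_MONTH
```

A 9x5 schedule runs 45 of the 168 hours in a week, roughly 27% of always-on uptime. At a hypothetical rate of 0.10 per hour, `monthly_cost(0.10)` comes to about 72.80 for 24x7, while `monthly_cost(0.10, 9, 5)` comes to about 19.50.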
Service Catalogs and Provisioning
- Create IAM policies so that teams only create services that are needed by the app in that project.
- Build IT-certified base images and templates for reference architectures.
- Publish and enable self-provisioning through tools like ServiceNow.
- Integrate with approval processes.
- Complement provisioning policies with proactive reporting and automated resource clean-up to build awareness and discipline while controlling costs.
III. Operational Visibility and Forecasting
Reporting Approach
Daily Reporting with cost, utilization, non-conformant resources:
- Automatically send daily reports directly to stakeholders with key data points.
- Obtain intelligence by analyzing individual resource level data points and environment-level correlations.
- Recommendations should be generated based on analytics; data points include:
- No or low CPU, memory, or disk utilization, or utilization only during limited times (e.g., office hours for dev/test)
- No or low network traffic
- No login on VM
- Long VM uptime with no activity
- For cloud services, use cloud-provided metrics
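The data points above can be combined into a simple rule-based check. The sketch below assumes per-instance metrics have already been collected into a dictionary; the field names and all thresholds are hypothetical and should be tuned per environment.

```python
def cleanup_signals(metrics, cpu_threshold=5.0, network_threshold=1_000_000, login_days=14):
    """Return the list of idle signals that an instance currently exhibits.

    `metrics` is a dictionary with the (assumed) keys avg_cpu_percent,
    network_bytes_per_day, and days_since_last_login.
    """
    signals = []
    if metrics["avg_cpu_percent"] < cpu_threshold:
        signals.append("low-cpu")
    if metrics["network_bytes_per_day"] < network_threshold:
        signals.append("low-network")
    if metrics["days_since_last_login"] >= login_days:
        signals.append("no-recent-login")
    return signals


def is_cleanup_candidate(metrics, min_signals=3):
    """Recommend an instance for review only when several signals agree."""
    return len(cleanup_signals(metrics)) >= min_signals
```

Requiring several signals to agree reduces false positives, for example a batch server that shows low CPU most of the day but still carries regular network traffic.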
Reporting and Automation Architecture
The following diagram describes the reporting and automation architecture for a cloud landscape:
Figure 2: Reporting and Automation Architecture
Data Points to Report
- Cost (filtered by app/environment)
- Daily, MTD, and projected monthly spend
- Budgeted vs actual, and overrun projection
- Alerts on any change in usage pattern and/or budget overruns
- Utilization
- Show unused resources + age + wasted cost:
- Unattached disks
- Orphaned snapshots
- Unallocated IPs
- Unused or unaccessed storage (recommend moving to an archive tier such as Amazon S3 Glacier or Google Cloud Storage Coldline)
- Show underutilized resources
- Show individual instances
- Show environments that have predominantly no utilization (e.g., dev27 is not being used)
- Inventory
- Current inventory
- New resources created + corresponding cost
- New projected monthly spend based on new resources
- Conformance
- List of resources without tags and labels
- List of resources not conforming to naming conventions
- List of instances based on older versions of baseline images
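As one way to derive the projected monthly spend and overrun figures from month-to-date data, the sketch below uses a simple linear run-rate. That projection method is our assumption for illustration; real spend is often uneven across the month, so treat the result as a rough estimate.

```python
import calendar
from datetime import date


def projected_monthly_spend(mtd_spend, today):
    """Extrapolate month-to-date spend linearly to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return mtd_spend / today.day * days_in_month


def budget_report(mtd_spend, budget, today):
    """Summarize MTD spend, projected spend, and projected overrun against a budget."""
    projected = projected_monthly_spend(mtd_spend, today)
    return {
        "mtd": mtd_spend,
        "projected": projected,
        "projected_overrun": max(0.0, projected - budget),
    }
```

For example, 1,000 spent by 10 June projects to 3,000 for the 30-day month, which is a projected overrun of 500 against a budget of 2,500.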
Recommendations
- Rightsizing + corresponding cost savings
- Reservation planning/committed use recommendations + corresponding cost savings
- Potential savings of 24% to 57%
- Instance scheduling + corresponding cost savings
- Spot/pre-emptible instance recommendations + corresponding cost savings
- Potential savings of 60% to 80%
- Instance/environment cleanup candidates (based on consistent low/no usage)
- Instance/environment cleanup candidates (based on non-conformance)
- Reserved/committed instance renewal alerts (for instances with approaching expiry dates)
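Renewal alerts can be driven by the Reservation-expiry tag recommended earlier. The sketch below assumes the tag is stored under the key `reservation-expiry` with an ISO 8601 date value (YYYY-MM-DD) and that a 30-day alert window is appropriate; the key spelling, the date format, and the window are all assumptions.

```python
from datetime import date


def expiring_reservations(resources, today, window_days=30):
    """Return (resource_id, days_left) pairs for reservations expiring within the window.

    `resources` maps a resource ID to its tag dictionary. Resources without the
    tag are skipped, as are reservations that have already expired.
    """
    alerts = []
    for resource_id, tags in resources.items():
        raw_expiry = tags.get("reservation-expiry")
        if not raw_expiry:
            continue
        days_left = (date.fromisoformat(raw_expiry) - today).days
        if 0 <= days_left <= window_days:
            alerts.append((resource_id, days_left))
    # Most urgent renewals first.
    return sorted(alerts, key=lambda item: item[1])
```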
Conclusion
Using the above best practices, enterprises can create an effective governance framework that proactively manages costs across the entire cloud infrastructure lifecycle. In the final installment of this blog series, we will provide recommendations for cost optimization and automation, including some popular tools currently in the market.