The rise of platform engineering correlates directly with a trend that our customers have seen for years: infrastructure is becoming too complex for traditional delivery models to handle at scale.
For modern engineering and development teams, much of the CI/CD pipeline has long been automated. The infrastructure and services that support the various stages of the pipeline are often automated through open source Infrastructure as Code (IaC) tools and container tooling such as Kubernetes and Helm.
Over the years, infrastructure demands for the CI/CD pipeline have become increasingly complex. Today’s developers and testers often need environments in which multiple cloud and containerized services work together.
Meeting this demand is creating operational bottlenecks for both DevOps and development teams. Provisioning cloud or hybrid infrastructure across multiple IaC configurations and Kubernetes assets can be challenging, time-consuming, and error-prone.
As a result, DevOps productivity diminishes and the developer experience suffers. The redundant manual work and back-and-forth between these teams often lead to frustration, burnout, and shadow IT.
In this article, we’ll define platform engineering and walk through recommendations that have been helpful for the DevOps and development teams that have worked with Quali over the years.
What is Platform Engineering?
Platform engineering focuses on creating and maintaining a unified internal developer platform (hence the name) to manage infrastructure across the entire organization.
A platform team prioritizes automation, self-service, and standardization to improve developer productivity. Essentially, this means evolving beyond the traditional ticket-submission process for infrastructure to a method in which all infrastructure is delivered and managed continuously.
An effective platform can also improve overall infrastructure management and cloud governance.
Engineering teams often leverage the infrastructure defined via IaC and Kubernetes within their CI/CD tools. While this approach can improve velocity, it sacrifices standardization. The Git repositories defining the IaC and Kubernetes resources lack the visibility, orchestration, and management functionality needed to ensure all configurations are up to date and configured correctly.
As a result, the teams that rely on this approach struggle to orchestrate and maintain environments across multiple IaC resources or tools.
Engineering a unified platform can establish an important balance for both DevOps and developer organizations—velocity and centralization.
Many platform engineering teams use internal developer portals to accelerate velocity. An internal developer portal delivers the front-end developer experience for accessing the resources needed to do development work, including the ability to provision infrastructure, launch environments, or perform day-2 actions on those environments via self-service.
Most DevOps people recoil at the very word “centralization” out of concern for all that tends to come with it. Traditionally, centralized processes mean unnecessary oversight and delays. Since cloud infrastructure promises speed from the decentralization of infrastructure, any attempt to “centralize” that infrastructure has been seen as a step backward.
An effective platform engineering approach centralizes the delivery and management of complex infrastructure while decentralizing access to infrastructure for those who need it.
Benefits of Platform Engineering for Multi-Cloud and Hybrid Infrastructure
Provisioning multi-cloud and hybrid infrastructure can be complex and time-consuming for DevOps and IT teams.
Because the technologies used to define public cloud, private cloud, and containerized infrastructure were not designed to interoperate, provisioning multiple components as a single environment requires extensive coding and testing to reconcile the differences between the tools and validate that the environment will operate as needed.
The developers and other staff who need multi-cloud and hybrid infrastructure are forced to wait for it before they can carry out the day-to-day work that relies on it. Poorly configured environments can hold back testing and development work if they fail to produce the conditions needed for those stages. Redundant manual orchestration also creates a risk of misconfigurations, which can diminish the quality of the environment, drive up unnecessary cloud costs, and create security risks.
In turn, this can impact the overall efficiency of DevOps and IT teams, resulting in missed deadlines and poor delivery.
Platform engineering mitigates the provisioning challenges by making infrastructure scalable. An effective platform defines complete environments to support use cases, regardless of how the infrastructure is defined or delivered.
Once defined, that environment can be deployed and updated as frequently as needed—eliminating the redundant manual orchestration that slows down delivery and creates risk of misconfigurations.
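To make the "define once, deploy many times" idea concrete, here is a minimal Python sketch. It is not how any particular platform works; the `Component` and `Blueprint` classes, the `deployment_plan` function, and the repository paths are all hypothetical names chosen for illustration. The point is only that an environment is described once as data, and every deployment is derived from that same description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Component:
    """One piece of an environment (hypothetical structure)."""
    name: str
    tool: str    # whichever tool defines it, e.g. "terraform" or "helm"
    source: str  # hypothetical location of the definition


@dataclass(frozen=True)
class Blueprint:
    """A complete environment, defined once and reused for every deployment."""
    name: str
    components: Tuple[Component, ...]


def deployment_plan(blueprint: Blueprint, owner: str) -> List[str]:
    """Derive the ordered deployment steps for one owner from the blueprint."""
    return [
        f"{owner}: deploy {c.name} via {c.tool} from {c.source}"
        for c in blueprint.components
    ]


# Illustrative blueprint mixing two differently defined components.
qa_env = Blueprint(
    name="qa-env",
    components=(
        Component("network", "terraform", "repo/network"),
        Component("app", "helm", "repo/app-chart"),
    ),
)

for step in deployment_plan(qa_env, "alice"):
    print(step)
```

Because the plan is generated from the blueprint rather than assembled by hand, a second developer who requests the same environment gets the same steps in the same order.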
That scalability, in turn, enables self-service. If developers no longer need the DevOps team to orchestrate the environment every time they need it, they can simply launch it themselves. The ability to democratize access to infrastructure, while setting role-based access controls and protecting account credentials and keys, is critical to striking that balance between velocity and governance.
Once your teams are orchestrating and deploying all infrastructure from a single unified platform, you can monitor activity and set rules at the platform level. For example, setting a rule to prohibit a specific size of cloud instance will deny any attempt to launch instances of that size, thereby preventing activity that drives up cloud costs. Another common example is automating deployment and teardown based on a daily schedule to ensure that all infrastructure is available when needed and shut down when it’s not.
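Both examples can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real platform API: the `LaunchRequest` class, the `PROHIBITED_SIZES` set, the instance size names, and the 08:00 to 18:00 window are all invented for the example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class LaunchRequest:
    """A request to launch an instance (hypothetical structure)."""
    user: str
    instance_size: str


# Hypothetical size names; a real rule would use your cloud provider's names.
PROHIBITED_SIZES = frozenset({"xlarge-16"})


def is_launch_allowed(request: LaunchRequest,
                      prohibited: frozenset = PROHIBITED_SIZES) -> bool:
    """Deny any launch whose instance size is on the prohibited list."""
    return request.instance_size not in prohibited


def should_be_running(now: time,
                      start: time = time(8, 0),
                      stop: time = time(18, 0)) -> bool:
    """Return True while `now` falls inside the daily availability window.

    The window is assumed not to cross midnight.
    """
    return start <= now < stop


print(is_launch_allowed(LaunchRequest("alice", "small-2")))    # True
print(is_launch_allowed(LaunchRequest("alice", "xlarge-16")))  # False
print(should_be_running(time(9, 30)))                          # True
print(should_be_running(time(20, 0)))                          # False
```

The value of putting these checks at the platform level is that they apply to every launch, regardless of which tool or team defined the underlying infrastructure.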
Through this approach, you can:
- Accelerate developer velocity with self-service access to infrastructure pre-configured for their specific use case
- Make your DevOps teams happier and more productive by reducing the redundant manual work and cognitive load required to manage infrastructure across platforms and tools
- Track and forecast cloud costs in real time based on resource configurations and frequency of deployments via your platform
- Optimize costs by enforcing configuration standards and automating the infrastructure lifecycle deployed via the platform
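As a rough illustration of the cost forecasting point above, the arithmetic can be as simple as multiplying an environment's hourly rate by how long each deployment runs and how often it is deployed. The sketch below assumes exactly that simplified model; the `EnvironmentUsage` class, the function name, and all of the figures are hypothetical and do not reflect real pricing.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EnvironmentUsage:
    """Expected usage of one environment (hypothetical figures)."""
    name: str
    hourly_rate: float           # cost per hour, derived from its configuration
    hours_per_deployment: float  # how long each deployment stays up
    deployments_per_month: int   # how often it is launched


def forecast_monthly_cost(usages: Iterable[EnvironmentUsage]) -> float:
    """Sum rate x duration x frequency across all environments."""
    return sum(
        u.hourly_rate * u.hours_per_deployment * u.deployments_per_month
        for u in usages
    )


usages = [
    EnvironmentUsage("qa-env", 0.5, 8, 20),     # 0.5 * 8 * 20 = 80.0
    EnvironmentUsage("perf-env", 1.25, 4, 10),  # 1.25 * 4 * 10 = 50.0
]
print(forecast_monthly_cost(usages))  # 130.0
```

A real forecast would need to account for factors this model ignores, such as storage, data transfer, and pricing tiers.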
Best Practices for Platform Engineering
To ensure a successful platform engineering strategy that balances developer experience, governance, and efficiency, keep these key considerations in mind:
- Establish a clear strategy and vision for your platform, including defining roles and responsibilities of DevOps and IT teams and identifying key metrics to track success
- Understand the tools and resources your teams use currently to ensure that your platform encompasses all infrastructure that your teams will need
- Develop your cloud standards by identifying unnecessary cloud activity that is likely to drive up costs or create security risks
- Measure your teams' current productivity, such as the time spent provisioning and troubleshooting environments, to understand how automation will help them
- Evaluate security and governance implications, including defining roles for access permissions, account credentials, and security keys for the cloud platforms your teams use
- Incorporate collaboration and communication tools into your platform so DevOps, IT, and development teams can share knowledge on infrastructure and activity easily