Cesar Rodriguez, VP of Engineering, StackGen

September 13, 2024

How to make Infrastructure as Code secure by default

Infrastructure as Code (IaC) has become a widely adopted practice in modern DevOps, automating the management and provisioning of technology infrastructure through machine-readable definition files.


What can we do to make IaC secure by default?

Security workflows for IaC

First, let’s consider that the security workflows for IaC usually comprise multiple steps and practices.

IaC code is stored in version control systems, such as Git, with changes tracked and reviewed before merging, which helps improve consistency. Security policies and configuration checks are often automated and integrated into CI/CD pipelines to ensure each commit or pull request is validated against security policies before deployment.
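As a rough illustration of what such an automated check might look like, here is a minimal Python sketch of a policy gate a pipeline could run against planned resources. The resource fields, the policy names, and the evaluate function are all hypothetical and not taken from any particular scanner or IaC tool.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical, simplified resource shape; real IaC tools use richer schemas.
Resource = Dict[str, Any]


@dataclass(frozen=True)
class Policy:
    name: str
    applies_to: str                     # resource type the policy covers
    check: Callable[[Resource], bool]   # returns True when the resource complies


# Illustrative policies only; an organization would define its own.
POLICIES = [
    Policy("bucket-encrypted", "storage_bucket",
           lambda r: r.get("encrypted") is True),
    Policy("bucket-not-public", "storage_bucket",
           lambda r: r.get("public_access") is False),
    Policy("db-no-open-ingress", "database",
           lambda r: "0.0.0.0/0" not in r.get("allowed_cidrs", [])),
]


def evaluate(resources: List[Resource]) -> List[str]:
    """Return one 'resource: policy' entry per violated policy."""
    violations = []
    for res in resources:
        for policy in POLICIES:
            if policy.applies_to == res.get("type") and not policy.check(res):
                violations.append(f"{res.get('name')}: {policy.name}")
    return violations


# Resources as they might appear in a proposed change (made-up data).
PLANNED = [
    {"type": "storage_bucket", "name": "logs",
     "encrypted": True, "public_access": True},
    {"type": "database", "name": "orders-db",
     "allowed_cidrs": ["0.0.0.0/0"]},
    {"type": "storage_bucket", "name": "assets",
     "encrypted": True, "public_access": False},
]
```

A pipeline step could call evaluate(PLANNED) and fail the build whenever the returned list is not empty, which is the "validated before deployment" behavior described above.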

Among other security best practices (such as the principle of least privilege, threat modeling and detection, runtime monitoring, and auditing and logging), organizations must also ensure that new infrastructure is provisioned from updated, secure specifications rather than by modifying existing resources or reusing outdated IaC templates. All these workflows and automations have played a vital role in enabling organizations to more easily deploy consistent infrastructure in modern environments.

Security flaws are intrinsic to IaC

Unfortunately, converting security policies into code in IaC involves several potential issues, primarily caused by human error. When security teams or developers manually translate security policies into IaC code, there’s a significant risk of mistakes or misinterpretations, which could then be broadly propagated across multiple environments.

Beyond the potential for human error, this process is labor-intensive and slows down development and deployment. These manual conversions may also negate some of the efficiency gains that IaC promises to deliver.

Another significant challenge is that security policies inevitably evolve, and manually updating the corresponding IaC introduces more potential for human error. If the IaC templates aren’t all updated to match, some of the infrastructure may be operating under outdated security standards.
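One way to picture the problem is a set of templates that each record the policy version they were written against. The Python sketch below flags the ones that have fallen behind. The version tuples, the template names, and the stale_templates helper are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int]  # (major, minor), a hypothetical versioning scheme


def stale_templates(templates: Dict[str, Version], current: Version) -> List[str]:
    """Return the names of templates written against an older policy version."""
    return sorted(name for name, version in templates.items() if version < current)


# Made-up inventory: template name -> policy version it was last updated for.
TEMPLATES = {
    "network": (2, 3),
    "database": (2, 1),
    "storage": (1, 9),
}
CURRENT_POLICY: Version = (2, 3)
```

Here stale_templates(TEMPLATES, CURRENT_POLICY) would report the database and storage templates, each of which someone would then have to update by hand.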

Plus, as infrastructure grows more complex, manually managing and implementing security policies becomes increasingly challenging and error-prone because there are more potential points of failure. More complex applications also often have multiple interdependent components that can be hard to manage without creating unintentional conflicts. And all of this relies on your team having a deep understanding of both security policies and IaC coding.

Scanning IaC

Scanning IaC templates before deployment is undeniably important; it’s an effective way to identify potential security issues early in the development process. It can help prevent security breaches and ensure that your cloud infrastructure aligns with security best practices. If you have IaC scanning tools integrated into your CI/CD pipelines, you can also run automated scans with each code commit or pull request, catching errors early.

Post-deployment scans are important because they assess the infrastructure in its operational environment, which may surface issues that weren’t identified in dev and test environments. These scans may also identify unexpected dependencies or conflicts between resources.

Any manual fixes you make to address these problems will also require you to update your existing IaC templates; otherwise, any apps using those templates will be deployed with the same issues baked in. And while identifying these issues in production environments is important to overall security, it can also increase your costs and require your team to apply manual fixes to both the application and the IaC.
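To make the template problem concrete, the following Python sketch compares what a template declares with what is actually running after a manual fix. The find_drift function, the ABSENT marker, and the setting names are hypothetical, and reading the live state from a cloud provider is deliberately left out.

```python
from typing import Any, Dict, List, Tuple

ABSENT = "<absent>"  # marker for a setting that one side does not define


def find_drift(declared: Dict[str, Any],
               live: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Return (setting, declared_value, live_value) for every mismatch."""
    drift = []
    for key in sorted(set(declared) | set(live)):
        declared_value = declared.get(key, ABSENT)
        live_value = live.get(key, ABSENT)
        if declared_value != live_value:
            drift.append((key, declared_value, live_value))
    return drift


# The template still carries the insecure settings (made-up data)...
TEMPLATE = {"encrypted": False, "tls_min_version": "1.0", "public_access": False}
# ...while production was fixed by hand after a post-deployment scan.
PRODUCTION = {"encrypted": True, "tls_min_version": "1.2",
              "public_access": False, "logging": True}
```

Every entry find_drift(TEMPLATE, PRODUCTION) returns is a setting the next deployment from this template would get wrong unless the template is updated as well.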

Automation may miss the mark

Some tools offer automated remediation features to minimize the need for manual fixes to IaC by applying security patches automatically. Unfortunately, automating remediation can create a different set of problems. For example, automated remediation tools may fall short because they:

Operate based on predefined rules and algorithms, which may not fully account for the unique context of each application or environment, leading to changes that break the applications or cause other issues.
Apply overly restrictive fixes without detecting whether they cause issues. For example, reducing privileges for a Kubernetes pod that requires escalated privileges could result in a non-functioning application.
Fail to account for discrepancies introduced by configuration drift, leading to further drift and potential application instability.
Can’t differentiate between critical and non-critical issues, resulting in unnecessary or even harmful changes that could impact service availability and integrity.
Address symptoms rather than root causes, causing recurring issues because the underlying problem remains unresolved.
Struggle to apply fixes for complex scenarios that require nuanced decision-making; complex issues often require human intervention.
Introduce new vulnerabilities if a fix opens new attack vectors or weakens existing security measures.

Automated remediation sounds ideal but introduces myriad unintended consequences that could impact the reliability and usability of your applications.
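The privileged-pod example from the list above can be sketched in Python as follows. The pod dictionaries, the requires_privileged flag, and both functions are hypothetical simplifications, not the Kubernetes API or any remediation tool's behavior.

```python
import copy
from typing import Any, Dict, Tuple

Pod = Dict[str, Any]  # simplified, hypothetical pod description


def blind_remediate(pod: Pod) -> Pod:
    """Context-blind rule: always strip privileged mode."""
    fixed = copy.deepcopy(pod)
    fixed["security_context"]["privileged"] = False
    return fixed


def guarded_remediate(pod: Pod) -> Tuple[Pod, str]:
    """Apply the fix only when the workload has not declared a need for privileges."""
    if not pod["security_context"].get("privileged", False):
        return pod, "compliant"
    if pod.get("requires_privileged", False):
        # Leave the pod untouched and hand the decision to a person.
        return pod, "needs-review"
    return blind_remediate(pod), "remediated"


# Made-up workloads: one genuinely needs escalated privileges, one does not.
NETWORK_AGENT = {"name": "network-agent", "requires_privileged": True,
                 "security_context": {"privileged": True}}
WEB = {"name": "web", "security_context": {"privileged": True}}
```

The blind rule would switch off privileged mode for the network agent and likely break it, while the guarded version only changes the web pod. The guard works only because someone recorded the application's needs somewhere the tool can read them, which is the context that rule-based remediation usually lacks.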

Make the application the source of security

With all these security considerations in mind, how do we make infrastructure secure by default when there are so many manual steps along the way?

One suggestion is to look at the application as the source of truth for infrastructure. What does that look like?

When a developer writes an application, they are continuously making choices. For example: what database does the application need to access? What other resources does it need to connect to in order to work? Each decision requires infrastructure to support it, and all of that infrastructure must be available to the developer. On top of that, the infrastructure a developer has access to needs to align with the relevant security standards based on the decisions already made.

By using the application as the source of truth, you can:

Eliminate the need for a developer to know all the security policies required for an application
Ensure that the right infrastructure is available to support it
Remove the need for developers to remember to code all the requirements
Save security teams time by eliminating the need to audit each line of IaC to ensure that it aligns with security and compliance policies
Ensure that the infrastructure configuration aligns directly with the application’s requirements
Minimize the risk of mismatches between the application and infrastructure

This approach improves efficiency and automation, streamlining your deployment by standardizing on what an application requires. It also makes it much simpler to enforce security and compliance policies and guarantee that least privilege access controls are in place.
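A minimal Python sketch of the idea, under stated assumptions: the application declares what it needs, and a generator turns each need into a resource with secure settings and narrowly scoped access already applied. The SECURE_DEFAULTS table, the resource kinds, and derive_infrastructure are invented for illustration and do not describe any specific product.

```python
from typing import Any, Dict, List

# Hypothetical secure baselines, maintained once by the security team.
SECURE_DEFAULTS: Dict[str, Dict[str, Any]] = {
    "postgres": {"encrypted_at_rest": True, "public_access": False,
                 "tls_required": True},
    "object_storage": {"encrypted_at_rest": True, "public_access": False,
                       "versioning": True},
}


def derive_infrastructure(app_name: str,
                          needs: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Turn an application's declared needs into resource definitions."""
    resources = []
    for need in needs:
        kind = need["kind"]
        if kind not in SECURE_DEFAULTS:
            raise ValueError(f"no secure baseline defined for {kind!r}")
        resource = {"name": f"{app_name}-{need['name']}", "kind": kind,
                    **SECURE_DEFAULTS[kind]}
        # Least privilege: grant only this app, and only the access it declared
        # (read-only when nothing is declared).
        resource["grants"] = [{"principal": app_name,
                               "access": need.get("access", "read")}]
        resources.append(resource)
    return resources


# What a developer might declare for a made-up "checkout" service.
CHECKOUT_NEEDS = [
    {"kind": "postgres", "name": "orders", "access": "read-write"},
    {"kind": "object_storage", "name": "receipts"},
]
```

Calling derive_infrastructure("checkout", CHECKOUT_NEEDS) yields an encrypted, non-public database and bucket without the developer writing any of those settings. When a policy changes, only the SECURE_DEFAULTS table needs updating, not each template.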

IaC secure by default is possible

While IaC solves many challenges of application deployment, it still relies on people to manually convert security policies into IaC. But if you can abstract IaC away using tools that generate the infrastructure from the application code itself, you can make IaC secure by default.

By incorporating the context of the application into the infrastructure, organizations can stop worrying about vulnerabilities and misconfigurations and focus instead on developing and delivering applications.
