Terrafir: Enhancing Terraform Deployments with Policy Enforcement

Terrafir is a tool designed to enhance teams' understanding of their Terraform deployments. It evaluates Terraform state files against predefined policies, providing a detailed report on deployment compliance. This functionality is particularly beneficial for teams managing multiple deployments across various environments, enabling them to ensure compliance with their set policies.

As an admirer of Nic Jackson's work, I initially explored Sentinel, a language aligned with the principles of Policy as Code. However, I was disappointed to find that it is limited to Terraform Cloud. My search led me to Open Policy Agent (OPA), an open-source engine that supports declarative policies written in Rego. OPA evaluates these policies against an input source, so business logic can be built on the assessment results. OPA can be containerized and configured to pull policies from one or more bundle servers, enabling a highly available setup for policy assessment.

---
title: OPA Setup
---
flowchart TD;
    B[Terrafir API]
    B --Authorizes Request-->C[AuthService]
    C -.Allow Plan Assessment.->E
    C <--Assesses Plan -->E[PlanService]
    E <--Sent Plan to Policy Engine -->F["OPA API Service - ClusterIP"]
    F <-->G[OPA Container]
    F <-->H[OPA Container]
    F <-->I[OPA Container]
    G <---->J["NGINX Bundle Server Service - ClusterIP"]
    H <--Policy Bundle is pulled and stored on the OPA Container-->J
    I <-->J
    J <--> M[Bundle Server]
    J <--> N[Bundle Server]
    J <--> O[Bundle Server]
    C --Processes Assessment and Returns Result to Container-->B
    K[HPA] -.HPA Monitoring Resource Limits.-F
    L[HPA] -.HPA Monitoring Resource Limits.-J
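
For the PlanService-to-OPA hop in the diagram, a request could look roughly like the Python sketch below. It relies on OPA's Data API, where a package is queried with `POST /v1/data/<package path>` and the document to assess is sent under an `input` key. The host name `opa.internal`, the function names, and the assumption that the policies read the plan from `input` are illustrative and not taken from Terrafir itself.

```python
import json
import urllib.request

# Hypothetical in-cluster address of the OPA service; 8181 is OPA's default port.
OPA_URL = "http://opa.internal:8181"


def build_opa_request(package_path, document):
    """Build a Data API request that asks OPA to evaluate one package."""
    # Package "terraform_security.ecr" maps to /v1/data/terraform_security/ecr.
    url = f"{OPA_URL}/v1/data/{package_path.replace('.', '/')}"
    body = json.dumps({"input": document}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate(package_path, document, timeout=5):
    """Send the request and return the "result" field of OPA's response, if any."""
    request = build_opa_request(package_path, document)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp).get("result")
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running OPA instance.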

Rego, the language used in OPA, resembles Go in its syntax and package structure. Policies written in Rego identify where a system, or an input, deviates from the expected state. Rego works with structured document models such as JSON, and because Terraform state files are stored as JSON, OPA is a suitable choice for evaluating them.
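
To make the shape of that JSON concrete, here is a small Python sketch that walks a trimmed state document. The sample follows the version 4 state layout (`resources`, then `instances`, then `attributes`), but the resource names and values are invented for illustration, and `resources_of_type` is a hypothetical helper.

```python
import json

# Hypothetical, heavily trimmed Terraform state fragment (illustrative values only).
SAMPLE_STATE = json.loads("""
{
  "version": 4,
  "resources": [
    {
      "type": "aws_ecr_repository",
      "name": "app_repo",
      "instances": [
        {
          "attributes": {
            "name": "app",
            "image_scanning_configuration": [{"scan_on_push": false}]
          }
        }
      ]
    },
    {
      "type": "aws_s3_bucket",
      "name": "logs",
      "instances": [{"attributes": {"bucket": "example-logs"}}]
    }
  ]
}
""")


def resources_of_type(state, resource_type):
    """Return the resources in a state document that match the given type."""
    return [r for r in state.get("resources", []) if r.get("type") == resource_type]


ecr_repos = resources_of_type(SAMPLE_STATE, "aws_ecr_repository")
print([r["name"] for r in ecr_repos])  # prints ['app_repo']
```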


Writing Policies

Terrafir's policy structure revolves around the deny assessment block. Every check referenced by this block is evaluated, and if any of them returns a decision indicating true, the deny block evaluates to true. In that sense, the deny block combines its checks like a logical OR: a single failing check is enough to flag the deployment.

If the policy returns true, it indicates a deviation from the expected state. For example, the policy below checks if the image scanning configuration is set to scan on push for Elastic Container Registry (ECR) resources. If an ECR is provisioned without scanning containers for vulnerabilities when added to the registry, the resource is flagged.

package terraform_security.ecr

# Needed for the `some ... in` syntax on OPA versions before 1.0.
import future.keywords.in

deny {
    deny__scan_on_push
}


# METADATA
# title: ECR Image Scan on Push
# description: |
#   ECR image scan on push is not enabled, which could leave security vulnerabilities undetected.
# custom:
#  severity: MEDIUM
#  remediation: |
#    Change scan on push for the ECR to true
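deny__scan_on_push[decision] {
    annotation := rego.metadata.rule()
    # `resources` and `handle_decision` are not shown in this excerpt.
    some resource in resources
    # Bind each instance once so the reported name belongs to the instance that failed the check.
    some instance in resource.instances
    instance.attributes.image_scanning_configuration[_].scan_on_push != true
    error := sprintf("Scan on push is not implemented for: '%v'", [instance.attributes.name])
    decision := handle_decision(annotation, error)
}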
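
For readers new to Rego, the intent of the rule above can be mirrored in Python. This is only a sketch of the logic, not how Terrafir or OPA evaluates it. `build_decision` is a hypothetical stand-in for `handle_decision`, which the post does not show, and the decision fields are assumed.

```python
# Mirrors the METADATA block of the rule; field layout is an assumption.
ANNOTATION = {
    "title": "ECR Image Scan on Push",
    "custom": {
        "severity": "MEDIUM",
        "remediation": "Change scan on push for the ECR to true",
    },
}


def build_decision(annotation, error):
    """Hypothetical stand-in for handle_decision: merge metadata with the error."""
    custom = annotation.get("custom", {})
    return {
        "title": annotation.get("title"),
        "severity": custom.get("severity"),
        "remediation": custom.get("remediation"),
        "error": error,
    }


def deny_scan_on_push(resources):
    """Collect one decision per scanning config whose scan_on_push is not true."""
    decisions = []
    for resource in resources:
        for instance in resource.get("instances", []):
            attrs = instance.get("attributes", {})
            for config in attrs.get("image_scanning_configuration", []):
                # Like the Rego comparison, a missing key produces no decision.
                if "scan_on_push" in config and config["scan_on_push"] is not True:
                    error = f"Scan on push is not implemented for: '{attrs.get('name')}'"
                    decisions.append(build_decision(ANNOTATION, error))
    return decisions


def deny(resources):
    """True when at least one check produced a decision."""
    checks = [deny_scan_on_push]
    return any(check(resources) for check in checks)
```

Adding another check means appending one more function to `checks`; any single check that yields a decision is enough for `deny` to return true.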

To assess this policy locally, without the complexity of OPA and a bundle server, you can use the opa eval command:

opa eval -b ./policies -d ./input.json 'data.terraform_security.ecr'

This command loads the policies from the policies directory as a bundle (-b) and loads input.json as data (-d) for the policies to assess. It evaluates the rules in the terraform_security.ecr package and prints their results as JSON.

Annotations can be added to provide helpful information and reasoning behind the denial of a particular policy. Each of the more than 100 policies that Terrafir assesses carries a description, a severity, and a remediation recommendation.
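
When scripting around `opa eval`, its JSON output can be parsed to pull out the annotated decisions. The Python sketch below does this for a sample output string. The `result`, `expressions`, `value` nesting is the shape OPA prints, while the fields inside each decision (`title`, `severity`, `error`) are assumptions about what `handle_decision` returns, and the sample itself is made up.

```python
import json

# Hypothetical output for the query data.terraform_security.ecr; the fields inside
# each decision are assumed, not taken from Terrafir.
SAMPLE_OUTPUT = """
{
  "result": [
    {
      "expressions": [
        {
          "value": {
            "deny": true,
            "deny__scan_on_push": [
              {
                "title": "ECR Image Scan on Push",
                "severity": "MEDIUM",
                "error": "Scan on push is not implemented for: 'app'"
              }
            ]
          },
          "text": "data.terraform_security.ecr"
        }
      ]
    }
  ]
}
"""


def decisions_from_eval(raw_output, rule_name):
    """Return the decisions produced by one rule, or [] if the query was undefined."""
    results = json.loads(raw_output).get("result", [])
    if not results:
        return []
    package_value = results[0]["expressions"][0]["value"]
    return package_value.get(rule_name, [])


for decision in decisions_from_eval(SAMPLE_OUTPUT, "deny__scan_on_push"):
    print(f"[{decision['severity']}] {decision['title']}: {decision['error']}")
```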

This functionality enables the creation of powerful systems that allow for logic around policy evaluation, deployment flagging, and team notifications based on compliance status. It provides a comprehensive approach to managing infrastructure as code, ensuring that resources are provisioned and configured according to best practices and security standards.
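
As one illustration of such logic, the Python sketch below flags a deployment when any decision meets a severity threshold and builds a short notification line. The severity scale beyond MEDIUM, the function names, and the message format are all assumptions made for the example.

```python
# Assumed severity scale; the post only shows MEDIUM.
SEVERITY_ORDER = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def should_flag(decisions, threshold="MEDIUM"):
    """Flag a deployment if any decision is at or above the threshold severity."""
    limit = SEVERITY_ORDER[threshold]
    return any(SEVERITY_ORDER.get(d.get("severity"), 0) >= limit for d in decisions)


def summarize(deployment, decisions):
    """Build a short notification message with a count per severity."""
    counts = {}
    for d in decisions:
        severity = d.get("severity", "UNKNOWN")
        counts[severity] = counts.get(severity, 0) + 1
    parts = [f"{n} {severity}" for severity, n in sorted(counts.items())]
    return f"{deployment}: " + (", ".join(parts) if parts else "compliant")


decisions = [{"severity": "MEDIUM"}, {"severity": "LOW"}, {"severity": "MEDIUM"}]
if should_flag(decisions):
    print(summarize("staging", decisions))  # prints "staging: 1 LOW, 2 MEDIUM"
```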