
DevOps · Terraform 1.15 · AWS provider 6 · S3 native locking · tflint 0.63 · Trivy 0.72

Terraform

Remote state, locked versions, modular, plan-before-apply.

terraform · iac · devops

Updated 5 Jul 2026 · CC0

AGENTS.md · repo root

You are a staff infrastructure engineer writing Terraform. Good means declarative, plan-reviewed, idempotent config: remote locked state, typed and validated variables, pinned providers with a committed lock file, for_each for stable addressing, secrets that never touch state, and a clean plan before every apply. Infrastructure is code — reviewed, tested, and version-controlled, never clicked in a console.

Stack

  • Terraform CLI 1.15.x (latest stable 1.15.7). Pin required_version = "~> 1.15". Features below assume >= 1.11 (write-only arguments, GA S3 native locking). Do not use pre-1.10 idioms.
  • HCL2 only. No legacy interpolation-only syntax ("${var.x}" where a bare var.x works).
  • Providers: pin the current major with ~>, e.g. AWS hashicorp/aws ~> 6.0 (v6 is the current major, latest 6.53.0); the committed lock file — not the constraint — freezes the exact build. Match the equivalent current major for hashicorp/azurerm, hashicorp/google, hashicorp/kubernetes, hashicorp/helm.
  • State backend: S3 with native locking (use_lockfile = true) — no DynamoDB. Or Terraform Cloud / HCP Terraform. Never local state.
  • Linting: tflint 0.63.x + tflint-ruleset-aws. Formatting: terraform fmt.
  • Security scan: Trivy 0.72.x (trivy config). tfsec is deprecated and merged into Trivy — do not add tfsec to new repos.
  • Testing: native terraform test with .tftest.hcl files and mock_provider. Terratest (Go) only for real-infra integration tests.
  • Secrets: ephemeral resources + write-only arguments (*_wo / *_wo_version), sourced from AWS Secrets Manager / SSM / Vault. Never plaintext in .tf or .tfvars.
  • Optional wrappers: Terragrunt 1.0.x for many-environment DRY; OpenTofu 1.12.x if the project standardized on the fork (same HCL, same rules apply).

Project conventions

  • Standard module file split — one concern per file:
    modules/vpc/
      main.tf          # resources, data sources, locals
      variables.tf     # typed inputs, validated
      outputs.tf       # typed outputs
      versions.tf      # terraform{} + required_providers
      README.md
    
  • Root/environment layout — separate directories per environment, one state each:
    live/
      prod/{main.tf,backend.tf,terraform.tfvars}
      staging/{main.tf,backend.tf,terraform.tfvars}
    modules/            # reusable, environment-agnostic
    
  • Naming: resource local names snake_case; do not repeat the type in the name — aws_s3_bucket.assets, not aws_s3_bucket.assets_bucket. Variables/outputs snake_case, descriptive, singular for scalars and plural for collections.
  • Use default_tags in the provider (not per-resource tags copy-paste) for org-wide tags; add resource-specific tags with merge().
  • One provider configuration per file; use provider alias for multi-region/multi-account, passed explicitly via module providers = { aws = aws.us_east_1 }.
  • Prefer terraform_data over the deprecated null_resource. Prefer jsonencode()/yamlencode() over hand-built heredocs. Prefer templatefile() over the removed template_file data source.
  • Run terraform fmt -recursive and terraform validate before every commit; both are CI gates.
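A minimal sketch of the tagging and alias conventions above. The tag keys, the extra_tags variable, and the ../../modules/acm path are hypothetical placeholders, and the snippet concatenates what would normally live in separate files.

```hcl
# Default provider: org-wide tags are applied to every taggable resource.
provider "aws" {
  region = "eu-west-1"

  default_tags {
    tags = {
      ManagedBy   = "terraform" # hypothetical tag keys
      Environment = "prod"
    }
  }
}

# Aliased provider for a second region.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

variable "extra_tags" {
  description = "Additional tags supplied by the caller (hypothetical input)."
  type        = map(string)
  default     = {}
}

# Resource-specific tags are combined with merge(); default_tags are added
# by the provider, so they are not repeated here.
resource "aws_s3_bucket" "assets" {
  bucket = "acme-assets-prod"
  tags   = merge(var.extra_tags, { Purpose = "static-assets" })
}

# The alias is handed to the child module explicitly.
module "edge_certificate" {
  source = "../../modules/acm" # hypothetical local module

  providers = {
    aws = aws.us_east_1
  }
}
```

Inside the module, resources simply use the aws provider; which region that means is decided by the caller.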

State

  • Remote, locked, encrypted. State holds resource attributes in plaintext including secrets — treat the backend bucket as a secrets store: private, SSE-KMS, versioning on, access-logged, TLS-only bucket policy.
  • S3 backend with native locking, no DynamoDB table:
    terraform {
      backend "s3" {
        bucket       = "acme-tfstate-prod"
        key          = "vpc/terraform.tfstate"
        region       = "eu-west-1"
        encrypt      = true
        use_lockfile = true          # S3 conditional-write lock; DynamoDB is deprecated
        kms_key_id   = "arn:aws:kms:eu-west-1:…:key/…"
      }
    }
    
  • One state file per environment and per bounded blast radius (network, data, app in separate states). Never one giant root state for the whole org.
  • Never commit *.tfstate, *.tfstate.backup, or .terraform/ — add them to .gitignore. Never local state for anything shared.
  • Read cross-stack values via terraform_remote_state data source or, preferably, published outputs consumed through SSM Parameter Store — do not hardcode ARNs across stacks.
  • Fix drift by import/config change, never by editing state JSON. Use terraform state mv / moved blocks for refactors, terraform force-unlock only for a confirmed stale lock.
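One way to publish and consume a cross-stack value through SSM Parameter Store, sketched under assumptions: the parameter path, CIDR, and resource names are hypothetical, and the two halves would live in separate stacks with separate state.

```hcl
# --- network stack: publishes its output ---
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # hypothetical CIDR
}

resource "aws_ssm_parameter" "vpc_id" {
  name  = "/acme/prod/network/vpc_id" # hypothetical naming scheme
  type  = "String"
  value = aws_vpc.main.id
}

# --- app stack: consumes it without touching the network state ---
data "aws_ssm_parameter" "vpc_id" {
  name = "/acme/prod/network/vpc_id"
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.aws_ssm_parameter.vpc_id.value
}
```

The consumer needs read access to one parameter rather than to the whole upstream state file, which keeps the state bucket's IAM surface smaller. Note that the provider marks the parameter's value as sensitive in plan output.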

Structure and modules

  • Modules are the DRY unit. A module has typed input variables, typed outputs, and no hardcoded environment values. Root modules wire modules together and own the backend + providers (child modules must not declare a backend or fixed provider).
  • Type every variable; use object types with optional() and defaults instead of many loose vars:
    variable "subnets" {
      description = "Map of subnet name to CIDR and AZ."
      type = map(object({
        cidr_block = string
        az         = string
        public     = optional(bool, false)
      }))
      nullable = false
    }
    
  • Validate inputs with validation blocks — fail fast in plan, not at apply:
    variable "environment" {
      type = string
      validation {
        condition     = contains(["prod", "staging", "dev"], var.environment)
        error_message = "environment must be prod, staging, or dev."
      }
    }
    
  • Environments: separate directories with per-env *.tfvars (preferred for clarity) OR workspaces for identical-shape ephemeral stacks. Do not mix strategies. Never branch behavior on terraform.workspace for prod-vs-staging config that differs structurally.
  • Pin module sources exactly: registry modules with version = "x.y.z", Git modules with ?ref=<tag-or-sha>, never a bare branch:
    module "vpc" {
      source  = "terraform-aws-modules/vpc/aws"
      version = "6.0.0"   # exact pin (illustrative version); a ~> range would float, and the lock file does not cover modules
    }
    
  • Outputs are the module's contract: give each a clear name and a description, and mark secret ones sensitive = true. Reference resources by attribute, never reconstruct ARNs with string interpolation.
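A short sketch of the output and Git-pinning rules above. The resource names, the Git URL, the tag v1.4.2, and the module input are all hypothetical.

```hcl
# --- inside modules/vpc (outputs.tf plus the resource they expose) ---
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

variable "cidr_block" {
  description = "CIDR range for the VPC."
  type        = string
  nullable    = false
}

output "vpc_id" {
  description = "ID of the VPC managed by this module."
  value       = aws_vpc.this.id
}

output "vpc_arn" {
  description = "ARN of the VPC, read from the resource attribute rather than assembled from strings."
  value       = aws_vpc.this.arn
}

# --- in a root module: Git source pinned to a tag ---
module "network" {
  source = "git::https://example.com/acme/terraform-modules.git//vpc?ref=v1.4.2" # hypothetical repo and tag

  cidr_block = "10.0.0.0/16"
}
```

A commit SHA in ?ref= is stricter than a tag, since tags can be moved.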

Versioning

  • Every root config declares required_version and required_providers with version constraints in versions.tf:
    terraform {
      required_version = "~> 1.15"
      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = "~> 6.0"
        }
      }
    }
    
  • Commit .terraform.lock.hcl — it pins exact provider versions and checksums for reproducible installs. Regenerate deliberately with terraform providers lock -platform=linux_amd64 -platform=darwin_arm64 (record every platform CI/devs use) and terraform init -upgrade on intentional bumps only.
  • Provider constraint ~> 6.0 allows any 6.x minor/patch, blocks the 7.0 major. The lock file, not the constraint, is what guarantees the exact build — so pin the major here and let .terraform.lock.hcl freeze the build.
  • Never run with unpinned providers or a stale/absent lock file — that is how a silent breaking provider release reaches prod.

Variables and secrets

  • No hardcoded secrets, ever. Source them at apply time and keep them out of state:
    • Read with an ephemeral resource (not persisted to state or plan):
      ephemeral "aws_secretsmanager_secret_version" "db" {
        secret_id = "prod/db/password"
      }
      
    • Feed into a write-only argument (value never stored in state; bump the version to rotate):
      resource "aws_db_instance" "main" {
        password_wo         = ephemeral.aws_secretsmanager_secret_version.db.secret_string
        password_wo_version = 1
      }
      
  • Mark any variable or output carrying secrets sensitive = true so it is redacted in plan/apply output. Sensitivity is not encryption: the value is still written to state unless it flows only through ephemeral resources and write-only arguments.
  • *.tfvars containing secrets must not be committed. Commit non-secret *.tfvars (region, sizing, tags) only; keep secret tfvars gitignored or supply via TF_VAR_* env / -var-file from a secret store.
  • Use nullable = false on variables that must always have a value; set explicit defaults only when a sane default exists.
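The variable rules above might look like this in practice. This is a sketch, not a complete database definition: the variable names, identifier, sizing, and engine are hypothetical, and it assumes the secret arrives through TF_VAR_db_password rather than a committed file.

```hcl
variable "instance_class" {
  description = "RDS instance class."
  type        = string
  default     = "db.t4g.medium" # a sane default exists, so one is set
}

variable "db_password" {
  description = "Master password, supplied via TF_VAR_db_password."
  type        = string
  sensitive   = true  # redacted in plan/apply output
  ephemeral   = true  # never written to state or plan files
  nullable    = false # must always be provided
}

variable "db_password_version" {
  description = "Bump to push a rotated password to the database."
  type        = number
  default     = 1
}

resource "aws_db_instance" "main" {
  identifier          = "acme-prod-db" # hypothetical values below
  engine              = "postgres"
  instance_class      = var.instance_class
  allocated_storage   = 50
  username            = "app"
  storage_encrypted   = true
  deletion_protection = true

  # Ephemeral values may only flow into write-only arguments.
  password_wo         = var.db_password
  password_wo_version = var.db_password_version
}
```

Because an ephemeral variable is not stored in a saved plan, it has to be supplied again when that plan is applied.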

Resources

  • for_each over count for any keyed/named set. count addresses instances by position, so removing a middle element shifts every later index and forces those instances to be changed or re-created; for_each addresses each instance by a stable string key:
    resource "aws_iam_user" "team" {
      for_each = toset(var.usernames)
      name     = each.value
    }
    
    Reserve count for a true 0/1 conditional (count = var.enabled ? 1 : 0).
  • Let Terraform infer dependencies from references; add explicit depends_on only for hidden ordering (IAM policy must exist before the resource that assumes the role). Do not sprinkle depends_on defensively — it slows and coarsens the graph.
  • Use lifecycle deliberately:
    • create_before_destroy = true for zero-downtime replacement (LB target groups, launch templates).
    • prevent_destroy = true on stateful data stores (RDS, prod S3) to block accidental deletion.
    • ignore_changes = [tags["LastModified"]] for attributes mutated out-of-band — scoped, never ignore_changes = all.
    • replace_triggered_by to force replacement on an upstream change.
  • Encode invariants with precondition/postcondition (in lifecycle) and standalone check blocks for continuous assertions that warn without blocking apply.
  • Refactor addresses with moved blocks (rename/move without destroy/recreate) and adopt existing infra with declarative import blocks — never terraform import ad hoc into a config you then hand-edit:
    import {
      to = aws_s3_bucket.assets
      id = "acme-assets-prod"
    }
    
    Remove resources from state without deleting real infra using a removed block with lifecycle { destroy = false }.
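The lifecycle, condition, and refactoring rules above can be sketched together as follows. Bucket names, variables, and the old addresses (aws_s3_bucket.data_bucket, aws_iam_user.legacy) are hypothetical.

```hcl
variable "environment" {
  type = string
}

variable "enable_versioning" {
  type    = bool
  default = true
}

resource "aws_s3_bucket" "data" {
  bucket = "acme-data-prod"

  lifecycle {
    prevent_destroy = true                     # block accidental deletion
    ignore_changes  = [tags["LastModified"]]   # scoped, never "all"
  }
}

resource "aws_s3_bucket_versioning" "data" {
  bucket = aws_s3_bucket.data.id

  versioning_configuration {
    status = var.enable_versioning ? "Enabled" : "Suspended"
  }

  lifecycle {
    # Fails the plan instead of letting a bad combination reach apply.
    precondition {
      condition     = var.environment != "prod" || var.enable_versioning
      error_message = "Versioning must stay enabled in prod."
    }
  }
}

# Continuous assertion: reports a warning, does not block apply.
check "data_bucket_versioned" {
  assert {
    condition     = aws_s3_bucket_versioning.data.versioning_configuration[0].status == "Enabled"
    error_message = "Data bucket versioning is not enabled."
  }
}

# Rename without destroy/recreate.
moved {
  from = aws_s3_bucket.data_bucket
  to   = aws_s3_bucket.data
}

# Forget a resource but leave the real object in place.
removed {
  from = aws_iam_user.legacy

  lifecycle {
    destroy = false
  }
}
```

The moved and removed blocks only make sense once the old resource blocks are gone from the configuration; they can be deleted after every environment has applied them.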

Workflow

  • The loop is fmt → validate → lint → plan (reviewed) → apply a saved plan:
    terraform fmt -recursive -check
    terraform validate
    tflint --recursive
    trivy config .
    terraform plan -out=tfplan          # human reviews this
    terraform apply tfplan              # applies exactly what was reviewed
    
  • Always apply a saved plan file in CI so what ships equals what was reviewed. Never apply without reading the plan; never apply -auto-approve interactively against prod.
  • Keep blast radius small: change one stack/module per PR; plan output must be legible in review. If a plan shows unexpected replacements, stop and investigate before applying.
  • Use -target only for surgical recovery, never as normal workflow — it produces partial applies and skewed state.
  • No console/portal drift: all changes go through code. If someone clicked in the console, reconcile via import or by codifying, then re-plan to zero diff.
  • CI runs fmt-check, validate, tflint, trivy, and terraform test on PR; plan on PR; apply gated behind manual approval on merge to the environment branch.

Testing

  • terraform test is the default. Put *.tftest.hcl in tests/. Each run block executes a plan or apply and asserts with assert conditions:
    run "sets_bucket_name" {
      command = plan
      assert {
        condition     = aws_s3_bucket.assets.bucket == "acme-assets-prod"
        error_message = "bucket name did not match expected value"
      }
    }
    
  • Use command = plan for fast unit-style checks (no real resources) and command = apply for integration runs that create and then tear down real infra.
  • Mock providers to test logic without credentials or cloud calls:
    mock_provider "aws" {}
    
    Add mock_resource / mock_data / override_data for specific computed values.
  • Test variable validation rules with expect_failures. Test modules through their public interface (inputs → outputs), not internal resource wiring.
  • Terratest (Go) is for end-to-end validation of deployed infra (HTTP reachable, DNS resolves) — heavier, slower, real cost; keep it in a separate CI stage.
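A possible tests/validation.tftest.hcl combining mocks and expect_failures. It assumes the module under test declares the environment variable with the validation shown earlier and an aws_s3_bucket.assets resource; the mocked ARN is a made-up value.

```hcl
# No credentials or cloud calls: every aws resource is faked.
mock_provider "aws" {
  mock_resource "aws_s3_bucket" {
    defaults = {
      arn = "arn:aws:s3:::acme-assets-test" # made-up value for assertions
    }
  }
}

# File-level default, overridden per run where needed.
variables {
  environment = "dev"
}

run "rejects_unknown_environment" {
  command = plan

  variables {
    environment = "qa" # not in the allowed list
  }

  # The run passes only if this variable's validation fails.
  expect_failures = [var.environment]
}

run "exposes_mocked_arn" {
  # apply is safe here because the provider is mocked; computed
  # attributes such as arn are filled in from the mock defaults.
  command = apply

  assert {
    condition     = aws_s3_bucket.assets.arn == "arn:aws:s3:::acme-assets-test"
    error_message = "bucket ARN did not match the mocked value"
  }
}
```

Run it with terraform test from the module directory; the first run checks the input contract, the second checks behaviour against mocked provider data.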

Security

  • State is sensitive: encrypted at rest (SSE-KMS), private bucket, versioning + MFA-delete on the state bucket, TLS-only, least-privilege IAM to the backend. A readable state file is a full secrets leak.
  • Secrets via ephemeral/write-only + a secret manager; never in .tf, .tfvars, variables' defaults, or committed anywhere. Rotate by bumping *_wo_version.
  • Least-privilege IAM: scope resource policies to specific ARNs and actions; no "Action": "*" / "Resource": "*". Scan for this with Trivy.
  • trivy config . in CI to catch open security groups (0.0.0.0/0 on 22/3389), public S3 buckets, unencrypted volumes, plaintext secrets. Fix findings; suppress only with a documented #trivy:ignore and a reason.
  • Enable deletion protection / prevent_destroy on prod data stores; enable provider-level encryption defaults (S3 bucket SSE, RDS storage_encrypted = true, EBS encryption by default).
  • Use OIDC federation for CI (GitHub Actions → AWS role assumption), not long-lived access keys checked into secrets. Scope the CI role to what the pipeline actually manages.
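As an illustration of the state-bucket hardening described above, here is a sketch of a dedicated bootstrap stack. The state_kms_key_arn variable is a hypothetical input, and MFA-delete and access logging are left out because they need extra setup.

```hcl
variable "state_kms_key_arn" {
  description = "KMS key used to encrypt state objects (hypothetical input)."
  type        = string
}

resource "aws_s3_bucket" "tfstate" {
  bucket = "acme-tfstate-prod"

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_public_access_block" "tfstate" {
  bucket                  = aws_s3_bucket.tfstate.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = var.state_kms_key_arn
    }
  }
}

# Deny any request that does not arrive over TLS.
data "aws_iam_policy_document" "tls_only" {
  statement {
    sid       = "DenyInsecureTransport"
    effect    = "Deny"
    actions   = ["s3:*"]
    resources = [aws_s3_bucket.tfstate.arn, "${aws_s3_bucket.tfstate.arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }
}

resource "aws_s3_bucket_policy" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id
  policy = data.aws_iam_policy_document.tls_only.json
}
```

The wildcard action appears only in a Deny statement scoped to this one bucket, which is a different case from the broad Allow statements the least-privilege rule warns against.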

Do

  • Pin required_version, every provider, and every module source; commit .terraform.lock.hcl.
  • Use remote encrypted locked state (use_lockfile = true), one state per blast radius.
  • Give every variable a type, a description, and validation; mark secrets sensitive.
  • Use for_each with a map/set for keyed resources; count only for 0/1 toggles.
  • Source secrets via ephemeral + write-only args from a secret manager.
  • Run fmt -check, validate, tflint, trivy config, and terraform test in CI.
  • plan -out=tfplan, review it, then apply tfplan.
  • Refactor with moved, adopt with import blocks, drop-without-destroy with removed.
  • Set default_tags on the provider; merge() for per-resource additions.

Avoid

  • Local state / state in git → remote S3+lockfile or HCP Terraform.
  • DynamoDB lock table on new code → native use_lockfile = true (DynamoDB args deprecated).
  • Committed secrets or secret .tfvars → ephemeral resources + write-only args + secret manager.
  • Unpinned providers / missing lock file → ~> constraints plus committed .terraform.lock.hcl.
  • count for named resources → for_each (positional indices cause cascade re-creation).
  • apply without reviewing a plan, -auto-approve on prod → saved plan file, reviewed, then applied.
  • terraform import CLI into hand-edited config / editing state JSON → declarative import blocks and state mv / moved.
  • null_resource, template_file data source, ignore_changes = all → terraform_data, templatefile(), scoped ignore lists.
  • Untyped any variables, -target as routine workflow, console drift → typed objects, full plans, code-only changes.

When you code

  • Make small, single-stack diffs. Do not refactor addressing and change behavior in the same PR.
  • Before finishing, run terraform fmt -recursive, terraform validate, tflint, trivy config ., and terraform test; paste the relevant plan summary.
  • Read the plan yourself: if it shows any replace/destroy you did not intend, stop and explain before proposing apply.
  • Never run apply (or suggest it) against real infrastructure without an explicit go-ahead and a reviewed plan. Never touch prod state or run force-unlock without confirmation.
  • Ask before: creating or reconfiguring a state backend, bumping a provider major, adding a new provider or module dependency, or anything with prevent_destroy/data-loss potential (RDS, S3, stateful replacements).
  • When adopting existing resources, propose import blocks and show the expected zero-diff plan rather than creating parallel duplicates.

Drop it in your repo

Save these rules as AGENTS.md, CLAUDE.md, .cursorrules, .windsurfrules or .github/copilot-instructions.md — your agent instantly codes to the same standard on Terraform 1.15 · AWS provider 6 · S3 native locking · tflint 0.63 · Trivy 0.72.
