Create a GKE Cluster (Workload Identity enabled) using Terraform

In this blog, I will show how Terraform can be used to create a Google Kubernetes Engine (GKE) cluster.

Goal

Create a GKE Cluster which has Workload Identity feature enabled using Terraform.

Prerequisites

This post assumes the following:

  1. We already have a GCP project and a GCS bucket (which we will use to store the Terraform state file).
  2. Google Kubernetes Engine API is enabled in the GCP Project.
  3. Google Cloud SDK (gcloud), kubectl, and Terraform are set up on your workstation. If not, refer to my previous blogs - Getting started with Terraform and Getting started with Google Cloud SDK.

Source Code

Full code is available on GitHub.
Note: This is not a production-ready codebase.

Create a GKE Cluster

  1. Create a directory for the Terraform project.
mkdir ~/terraform-gke-example
cd ~/terraform-gke-example
  2. Create a Terraform configuration file that defines the GKE cluster and the Google provider.
vi gke.tf

This file has the following content:

# Specify the GCP Provider
provider "google-beta" {
  project = var.project_id
  region  = var.region
  version = "~> 3.10"
  alias   = "gb3"
}

# Create a GKE cluster
resource "google_container_cluster" "my_k8s_cluster" {
  provider           = google-beta.gb3
  name               = "my-k8s-cluster"
  location           = var.region
  initial_node_count = 1

  # Setting an empty username/password disables basic authentication
  master_auth {
    username = ""
    password = ""
  }

  # Enable Workload Identity
  workload_identity_config {
    identity_namespace = "${var.project_id}.svc.id.goog"
  }

  node_config {
    machine_type = var.machine_type
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    metadata = {
      "disable-legacy-endpoints" = "true"
    }

    workload_metadata_config {
      node_metadata = "GKE_METADATA_SERVER"
    }

    labels = { # Update: Replace with desired labels
      "environment" = "test"
      "team"        = "devops"
    }
  }
}

Note: The workload_identity_config and workload_metadata_config blocks together enable Workload Identity.
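Enabling Workload Identity on the cluster is only half of the setup: a Kubernetes service account still has to be allowed to impersonate a Google service account. A minimal sketch of that binding in Terraform is shown below; the account name my-app-sa, the Kubernetes service account my-ksa, and the default namespace are illustrative assumptions, not part of this walkthrough.

```hcl
# Hypothetical follow-up: create a Google service account and allow the
# Kubernetes service account "my-ksa" in the "default" namespace to
# impersonate it through Workload Identity.
resource "google_service_account" "app_sa" {
  provider     = google-beta.gb3
  account_id   = "my-app-sa"
  display_name = "Workload Identity demo SA"
}

resource "google_service_account_iam_member" "wi_binding" {
  provider           = google-beta.gb3
  service_account_id = google_service_account.app_sa.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[default/my-ksa]"
}
```

On the Kubernetes side, the service account would also need the iam.gke.io/gcp-service-account annotation pointing at the Google service account's email.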

  3. Define Terraform variables.
vi variables.tf

This file looks like:

variable "project_id" {
  description = "Google Project ID."
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "europe-west2"
}

variable "machine_type" {
  description = "Google VM Instance type."
  type        = string
}

Note: We are defining a default value for region. This means if a value is not supplied for this variable, Terraform will use europe-west2 as its value.
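If you want a different region without editing variables.tf, the default can be overridden when running Terraform; the region name below is purely illustrative.

```shell
# Hypothetical override of the region default on the command line:
terraform plan -var="region=us-central1"
```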

  4. Create a tfvars file.
vi terraform.tfvars
project_id   = ""                     # Put GCP Project ID.
machine_type = "n1-standard-1"        # Put the desired VM Instance type.
  5. Configure the remote state backend.
vi backend.tf
terraform {
  backend "gcs" {
    bucket = "my-tfstate-bucket"    # GCS bucket name to store terraform tfstate
    prefix = "gke-cluster"          # Update to the desired prefix. The prefix should be unique for each Terraform project that shares the same remote state bucket.
  }
}
  6. The file structure now looks like this:
cntekio:~ terraform-gke-example$ tree
.
├── backend.tf
├── gke.tf
├── terraform.tfvars
└── variables.tf

0 directories, 4 files
  7. Now, initialize the Terraform project.
terraform init

Output

Initializing the backend...

Successfully configured the backend "gcs"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "google-beta" (terraform-providers/google-beta) 3.10.0...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Note: terraform init downloads all the required providers and plugins.

  8. Let’s look at the execution plan.
terraform plan --out 1.plan

Output

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_container_cluster.my_k8s_cluster will be created
  + resource "google_container_cluster" "my_k8s_cluster" {
      + additional_zones            = (known after apply)
      + cluster_ipv4_cidr           = (known after apply)
      + default_max_pods_per_node   = (known after apply)
      + enable_binary_authorization = false
      + enable_intranode_visibility = false
      + enable_kubernetes_alpha     = false
      + enable_legacy_abac          = false
      + enable_shielded_nodes       = false
      + enable_tpu                  = false
      + endpoint                    = (known after apply)
      + id                          = (known after apply)
      + initial_node_count          = 1
      + instance_group_urls         = (known after apply)
      + label_fingerprint           = (known after apply)
      + location                    = "europe-west2"
      + logging_service             = "logging.googleapis.com/kubernetes"
      + master_version              = (known after apply)
      + monitoring_service          = "monitoring.googleapis.com/kubernetes"
      + name                        = "my-k8s-cluster"
      + network                     = "default"
      + node_locations              = (known after apply)
      + node_version                = (known after apply)
      + operation                   = (known after apply)
      + project                     = (known after apply)
      + region                      = (known after apply)
      + services_ipv4_cidr          = (known after apply)
      + subnetwork                  = (known after apply)
      + tpu_ipv4_cidr_block         = (known after apply)
      + zone                        = (known after apply)

      + addons_config {
          + cloudrun_config {
              + disabled = (known after apply)
            }

          + horizontal_pod_autoscaling {
              + disabled = (known after apply)
            }

          + http_load_balancing {
              + disabled = (known after apply)
            }

          + istio_config {
              + auth     = (known after apply)
              + disabled = (known after apply)
            }

          + kubernetes_dashboard {
              + disabled = (known after apply)
            }

          + network_policy_config {
              + disabled = (known after apply)
            }
        }

      + authenticator_groups_config {
          + security_group = (known after apply)
        }

      + cluster_autoscaling {
          + autoscaling_profile = (known after apply)
          + enabled             = (known after apply)

          + auto_provisioning_defaults {
              + oauth_scopes    = (known after apply)
              + service_account = (known after apply)
            }

          + resource_limits {
              + maximum       = (known after apply)
              + minimum       = (known after apply)
              + resource_type = (known after apply)
            }
        }

      + database_encryption {
          + key_name = (known after apply)
          + state    = (known after apply)
        }

      + master_auth {
          + client_certificate     = (known after apply)
          + client_key             = (sensitive value)
          + cluster_ca_certificate = (known after apply)

          + client_certificate_config {
              + issue_client_certificate = (known after apply)
            }
        }

      + network_policy {
          + enabled  = (known after apply)
          + provider = (known after apply)
        }

      + node_config {
          + disk_size_gb      = (known after apply)
          + disk_type         = (known after apply)
          + guest_accelerator = (known after apply)
          + image_type        = (known after apply)
          + labels            = {
              + "environment" = "test"
              + "team"        = "devops"
            }
          + local_ssd_count   = (known after apply)
          + machine_type      = "n1-standard-1"
          + metadata          = {
              + "disable-legacy-endpoints" = "true"
            }
          + oauth_scopes      = [
              + "https://www.googleapis.com/auth/logging.write",
              + "https://www.googleapis.com/auth/monitoring",
            ]
          + preemptible       = false
          + service_account   = (known after apply)
          + taint             = (known after apply)

          + shielded_instance_config {
              + enable_integrity_monitoring = (known after apply)
              + enable_secure_boot          = (known after apply)
            }

          + workload_metadata_config {
              + node_metadata = "GKE_METADATA_SERVER"
            }
        }

      + node_pool {
          + initial_node_count  = (known after apply)
          + instance_group_urls = (known after apply)
          + max_pods_per_node   = (known after apply)
          + name                = (known after apply)
          + name_prefix         = (known after apply)
          + node_count          = (known after apply)
          + node_locations      = (known after apply)
          + version             = (known after apply)

          + autoscaling {
              + max_node_count = (known after apply)
              + min_node_count = (known after apply)
            }

          + management {
              + auto_repair  = (known after apply)
              + auto_upgrade = (known after apply)
            }

          + node_config {
              + boot_disk_kms_key = (known after apply)
              + disk_size_gb      = (known after apply)
              + disk_type         = (known after apply)
              + guest_accelerator = (known after apply)
              + image_type        = (known after apply)
              + labels            = (known after apply)
              + local_ssd_count   = (known after apply)
              + machine_type      = (known after apply)
              + metadata          = (known after apply)
              + min_cpu_platform  = (known after apply)
              + oauth_scopes      = (known after apply)
              + preemptible       = (known after apply)
              + service_account   = (known after apply)
              + tags              = (known after apply)
              + taint             = (known after apply)

              + sandbox_config {
                  + sandbox_type = (known after apply)
                }

              + shielded_instance_config {
                  + enable_integrity_monitoring = (known after apply)
                  + enable_secure_boot          = (known after apply)
                }

              + workload_metadata_config {
                  + node_metadata = (known after apply)
                }
            }

          + upgrade_settings {
              + max_surge       = (known after apply)
              + max_unavailable = (known after apply)
            }
        }

      + release_channel {
          + channel = (known after apply)
        }

      + workload_identity_config {
          + identity_namespace = "workshop-demo-adqw12.svc.id.goog"
        }
    }
  Plan: 1 to add, 0 to change, 0 to destroy.
------------------------------------------------------------------------
This plan was saved to: 1.plan

To perform exactly these actions, run the following command to apply:
    terraform apply "1.plan"

Note: The terraform plan output shows that this code is going to create a GKE cluster. The --out 1.plan flag tells Terraform to save the plan to a file.

  9. The execution plan looks good, so let’s move ahead and apply it.
terraform apply 1.plan

Output

google_container_cluster.my_k8s_cluster: Creating...
google_container_cluster.my_k8s_cluster: Still creating... [10s elapsed]
google_container_cluster.my_k8s_cluster: Still creating... [1m10s elapsed]
google_container_cluster.my_k8s_cluster: Still creating... [2m10s elapsed]
google_container_cluster.my_k8s_cluster: Still creating... [3m20s elapsed]
google_container_cluster.my_k8s_cluster: Creation complete after 3m23s [id=projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Note: Creating a GKE cluster can take 4 to 7 minutes.

  10. Let’s connect to the GKE cluster.
gcloud container clusters get-credentials my-k8s-cluster --region europe-west2 --project workshop-demo-adqw12

Output

Fetching cluster endpoint and auth data.
kubeconfig entry generated for my-k8s-cluster.
  11. Now, run the kubectl version command to check the server version.
kubectl version

Output

Client Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.10-dispatcher", GitCommit:"f5757a1dee5a89cc5e29cd7159076648bf21a02b", GitTreeState:"clean", BuildDate:"2020-02-06T03:31:35Z", GoVersion:"go1.12.12b4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.10-gke.17", GitCommit:"bdceba0734835c6cb1acbd1c447caf17d8613b44", GitTreeState:"clean", BuildDate:"2020-01-17T23:10:13Z", GoVersion:"go1.12.12b4", Compiler:"gc", Platform:"linux/amd64"}
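As an optional sanity check of Workload Identity (this step is my own addition, not part of the original flow), you can start a throwaway pod running the Cloud SDK image and ask which credentials it sees; the pod name wi-test is arbitrary.

```shell
# Launch a temporary pod and show the identity it receives from the
# GKE metadata server, then delete the pod on exit (--rm).
kubectl run wi-test --rm -it --image=google/cloud-sdk:slim -- gcloud auth list
```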
  12. Now, we can move to the Google Cloud Console to view the GKE cluster’s properties.

  13. Once we no longer need this infrastructure, we can clean it up to reduce costs.

terraform destroy

Output

  google_container_cluster.my_k8s_cluster: Refreshing state... [id=projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster]

  An execution plan has been generated and is shown below.
  Resource actions are indicated with the following symbols:
    - destroy

  Terraform will perform the following actions:

    # google_container_cluster.my_k8s_cluster will be destroyed
    - resource "google_container_cluster" "my_k8s_cluster" {
        - additional_zones            = [] -> null
        - cluster_ipv4_cidr           = "10.0.0.0/14" -> null
        - default_max_pods_per_node   = 110 -> null
        - enable_binary_authorization = false -> null
        - enable_intranode_visibility = false -> null
        - enable_kubernetes_alpha     = false -> null
        - enable_legacy_abac          = false -> null
        - enable_shielded_nodes       = false -> null
        - enable_tpu                  = false -> null
        - endpoint                    = "34.89.124.93" -> null
        - id                          = "projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster" -> null
        - initial_node_count          = 1 -> null
        - instance_group_urls         = [
            - "https://www.googleapis.com/compute/beta/projects/workshop-demo-adqw12/zones/europe-west2-c/instanceGroups/gke-my-k8s-cluster-default-pool-7dc4a362-grp",
            - "https://www.googleapis.com/compute/beta/projects/workshop-demo-adqw12/zones/europe-west2-b/instanceGroups/gke-my-k8s-cluster-default-pool-0538b086-grp",
            - "https://www.googleapis.com/compute/beta/projects/workshop-demo-adqw12/zones/europe-west2-a/instanceGroups/gke-my-k8s-cluster-default-pool-858715ad-grp",
          ] -> null
        - label_fingerprint           = "a9dc16a7" -> null
        - location                    = "europe-west2" -> null
        - logging_service             = "logging.googleapis.com/kubernetes" -> null
        - master_version              = "1.14.10-gke.17" -> null
        - monitoring_service          = "monitoring.googleapis.com/kubernetes" -> null
        - name                        = "my-k8s-cluster" -> null
        - network                     = "projects/workshop-demo-adqw12/global/networks/default" -> null
        - node_locations              = [
            - "europe-west2-a",
            - "europe-west2-b",
            - "europe-west2-c",
          ] -> null
        - node_version                = "1.14.10-gke.17" -> null
        - project                     = "workshop-demo-adqw12" -> null
        - resource_labels             = {} -> null
        - services_ipv4_cidr          = "10.3.240.0/20" -> null
        - subnetwork                  = "projects/workshop-demo-adqw12/regions/europe-west2/subnetworks/default" -> null

        - addons_config {

            - network_policy_config {
                - disabled = true -> null
              }
          }

        - cluster_autoscaling {
            - autoscaling_profile = "BALANCED" -> null
            - enabled             = false -> null
          }

        - database_encryption {
            - state = "DECRYPTED" -> null
          }

        - master_auth {
            - cluster_ca_certificate = "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURERENDQWZTZ0F3SUJBZ0lSQVBmNzJzWRGZhd25uQUYvZz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K" -> null

            - client_certificate_config {
                - issue_client_certificate = false -> null
              }
          }

        - network_policy {
            - enabled  = false -> null
            - provider = "PROVIDER_UNSPECIFIED" -> null
          }

        - node_config {
            - disk_size_gb      = 100 -> null
            - disk_type         = "pd-standard" -> null
            - guest_accelerator = [] -> null
            - image_type        = "COS" -> null
            - labels            = {
                - "environment" = "test"
                - "team"        = "devops"
              } -> null
            - local_ssd_count   = 0 -> null
            - machine_type      = "n1-standard-1" -> null
            - metadata          = {
                - "disable-legacy-endpoints" = "true"
              } -> null
            - oauth_scopes      = [
                - "https://www.googleapis.com/auth/logging.write",
                - "https://www.googleapis.com/auth/monitoring",
              ] -> null
            - preemptible       = false -> null
            - service_account   = "default" -> null
            - tags              = [] -> null
            - taint             = [] -> null

            - shielded_instance_config {
                - enable_integrity_monitoring = true -> null
                - enable_secure_boot          = false -> null
              }

            - workload_metadata_config {
                - node_metadata = "GKE_METADATA_SERVER" -> null
              }
          }

        - node_pool {
            - initial_node_count  = 1 -> null
            - instance_group_urls = [
                - "https://www.googleapis.com/compute/v1/projects/workshop-demo-adqw12/zones/europe-west2-c/instanceGroupManagers/gke-my-k8s-cluster-default-pool-7dc4a362-grp",
                - "https://www.googleapis.com/compute/v1/projects/workshop-demo-adqw12/zones/europe-west2-b/instanceGroupManagers/gke-my-k8s-cluster-default-pool-0538b086-grp",
                - "https://www.googleapis.com/compute/v1/projects/workshop-demo-adqw12/zones/europe-west2-a/instanceGroupManagers/gke-my-k8s-cluster-default-pool-858715ad-grp",
              ] -> null
            - max_pods_per_node   = 0 -> null
            - name                = "default-pool" -> null
            - node_count          = 1 -> null
            - node_locations      = [
                - "europe-west2-a",
                - "europe-west2-b",
                - "europe-west2-c",
              ] -> null
            - version             = "1.14.10-gke.17" -> null

            - management {
                - auto_repair  = false -> null
                - auto_upgrade = true -> null
              }

            - node_config {
                - disk_size_gb      = 100 -> null
                - disk_type         = "pd-standard" -> null
                - guest_accelerator = [] -> null
                - image_type        = "COS" -> null
                - labels            = {
                    - "environment" = "test"
                    - "team"        = "devops"
                  } -> null
                - local_ssd_count   = 0 -> null
                - machine_type      = "n1-standard-1" -> null
                - metadata          = {
                    - "disable-legacy-endpoints" = "true"
                  } -> null
                - oauth_scopes      = [
                    - "https://www.googleapis.com/auth/logging.write",
                    - "https://www.googleapis.com/auth/monitoring",
                  ] -> null
                - preemptible       = false -> null
                - service_account   = "default" -> null
                - tags              = [] -> null
                - taint             = [] -> null

                - shielded_instance_config {
                    - enable_integrity_monitoring = true -> null
                    - enable_secure_boot          = false -> null
                  }

                - workload_metadata_config {
                    - node_metadata = "GKE_METADATA_SERVER" -> null
                  }
              }
          }

        - release_channel {
            - channel = "UNSPECIFIED" -> null
          }

        - workload_identity_config {
            - identity_namespace = "workshop-demo-adqw12.svc.id.goog" -> null
          }
      }

  Plan: 0 to add, 0 to change, 1 to destroy.

  Do you really want to destroy all resources?
    Terraform will destroy all your managed infrastructure, as shown above.
    There is no undo. Only 'yes' will be accepted to confirm.

    Enter a value: yes

  google_container_cluster.my_k8s_cluster: Destroying... [id=projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster]
  google_container_cluster.my_k8s_cluster: Still destroying... [id=projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster, 10s elapsed]
  google_container_cluster.my_k8s_cluster: Still destroying... [id=projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster, 1m10s elapsed]
  google_container_cluster.my_k8s_cluster: Still destroying... [id=projects/workshop-demo-adqw12/locations/europe-west2/clusters/my-k8s-cluster, 2m40s elapsed]
  google_container_cluster.my_k8s_cluster: Destruction complete after 2m41s

  Destroy complete! Resources: 1 destroyed.

Hope this blog helps you build a GKE cluster using Terraform.

If you have feedback or questions, please reach out to me on LinkedIn or Twitter.