Debugging and Troubleshooting Terraform

Infrastructure as Code (IaC) can be complex, and when you are building a complex network infrastructure, things will not always work as expected. When that happens, your ability to debug and troubleshoot is what gets you unstuck. This chapter is a guide to debugging and troubleshooting Terraform, an open-source infrastructure as code tool that provides a consistent CLI workflow to manage hundreds of cloud services.

Debugging in Terraform

Terraform provides detailed logs that can help you in debugging issues. By default, Terraform does not generate these logs. To enable the detailed logs, you need to set the TF_LOG environment variable. It accepts string values to represent the log level. The levels are TRACE, DEBUG, INFO, WARN and ERROR, with TRACE being the most verbose and ERROR being the least.

Here is an example of how to set the TF_LOG in your shell:

$ export TF_LOG=TRACE

TRACE enables full debugging output. At this level the logs can include sensitive variables, resource attributes, and the details of each step, so treat log output carefully and avoid sharing it unredacted. To turn logging off again, unset the variable with unset TF_LOG.
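If you only want verbose logs for a single run, you can scope the variable to one command instead of exporting it for the whole session. The sketch below illustrates that shell behavior; `sh -c` is a stand-in for a command such as `terraform plan`, used here only so the example runs on a machine without Terraform installed.

```shell
# Start from a clean slate so the example is predictable.
unset TF_LOG

# Prefixing a command with VAR=value sets the variable for that command only.
# `sh -c` is a stand-in for `terraform plan` (an assumption made for portability).
scoped=$(TF_LOG=DEBUG sh -c 'echo "TF_LOG=${TF_LOG:-unset}"')

# The surrounding shell session is unaffected.
after=$(sh -c 'echo "TF_LOG=${TF_LOG:-unset}"')

echo "$scoped"   # TF_LOG=DEBUG
echo "$after"    # TF_LOG=unset
```

In practice the same pattern would look like `TF_LOG=DEBUG terraform plan`.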

Now that we have a grasp of how to enable detailed logging, let's take a simple example. Consider a Terraform configuration file named main.tf that sets up a VM named ironman-vm in the avengers-vpc resource group, in the East US Azure region.

provider "azurerm" {
  features {}
}

resource "azurerm_virtual_machine" "ironman_vm" {
  name                  = "ironman-vm"
  location              = "East US"
  resource_group_name   = "avengers-vpc"
  network_interface_ids = [azurerm_network_interface.ironman_nic.id]
  vm_size               = "Standard_A0"

  delete_os_disk_on_termination    = true
  delete_data_disks_on_termination = true

  os_profile {
    computer_name  = "ironman"
    admin_username = "ironman"
    admin_password = "******************"
  }

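  os_profile_windows_config {}

  # Note: azurerm_virtual_machine also requires a storage_os_disk block,
  # omitted here to keep the example short.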
}

If terraform plan or terraform apply fails on this configuration, you can get detailed logs by setting TF_LOG as shown before. The logs record what Terraform and the provider were doing at the point of failure, which often narrows down what went wrong.

In addition to setting the TF_LOG environment variable, you can specify where the logs should be written by setting TF_LOG_PATH. Terraform appends its logs to this file. Note that TF_LOG must still be set for any logging to be enabled; TF_LOG_PATH on its own does nothing.

$ export TF_LOG_PATH=./terraform.log

This is especially handy for long-running Terraform operations, as it allows us to review the logs later without worrying about losing the console output.
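Because Terraform appends to the file named in TF_LOG_PATH, a single log can grow across many runs. One way to keep runs separate is a small wrapper that sets both variables for one command and writes to a timestamped file. This is a minimal sketch; the function name run_with_tf_logs and the file naming scheme are hypothetical, not part of Terraform.

```shell
# Hypothetical helper: run any command with TRACE logging written to a given file.
# Usage: run_with_tf_logs <log-file> <command> [args...]
run_with_tf_logs() {
  log_file=$1
  shift
  # Both variables are set only for the command being run.
  TF_LOG=TRACE TF_LOG_PATH="$log_file" "$@"
}

# Timestamped file name so successive runs do not share one ever-growing log.
log_name="terraform-$(date +%Y%m%d-%H%M%S).log"
echo "$log_name"
```

A typical call would then be `run_with_tf_logs "$log_name" terraform apply`.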

Troubleshooting Common Issues

Let's dig into scenarios that might cause issues in your scripts and how to tackle them.

Terraform Plan Fails

This usually happens when there is a discrepancy between the state file and the actual infrastructure, or the Terraform configuration has some issues.

Remember to check the error message in the logs for clues. The error is most likely related to a misconfigured variable, a forgotten dependency, or a mismatched resource attribute. In our case, a reference to a resource that is not declared anywhere in the configuration, such as azurerm_network_interface.ironman_nic, fails at plan time. Problems on the Azure side, such as a VM name that is already taken or a resource group "avengers-vpc" that does not exist yet, typically surface later, during terraform apply.
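A TRACE log can run to thousands of lines, so it helps to filter for the entries that matter first. The sketch below assumes each log line carries a bracketed level tag such as [ERROR] or [WARN]. The sample lines are made-up placeholders written to a temporary file so the example runs without Terraform; they are not real Terraform output.

```shell
# Create a tiny stand-in log. These lines are illustrative placeholders only.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2024-01-01T00:00:00.000Z [INFO]  placeholder: starting
2024-01-01T00:00:01.000Z [DEBUG] placeholder: building graph
2024-01-01T00:00:02.000Z [ERROR] placeholder: something failed
2024-01-01T00:00:03.000Z [WARN]  placeholder: retrying
EOF

# Keep only ERROR and WARN entries, prefixed with their line numbers.
matches=$(grep -n -E '\[(ERROR|WARN)\]' "$log_file")
match_count=$(grep -c -E '\[(ERROR|WARN)\]' "$log_file")

echo "$matches"
echo "matching lines: $match_count"

rm -f "$log_file"
```

Against a real log you would point the same grep at the file named in TF_LOG_PATH, then read the lines just before each match for context.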

Resources Fail to Destroy

Sometimes, when you run terraform destroy, some resources may fail to get destroyed. This is usually caused by dependencies between resources that were not specified in the configuration.

To tackle this situation, you can add the depends_on meta-argument to a resource to declare the dependency explicitly. For instance:

resource "azurerm_virtual_network" "avengers_vnet" {
  name                = "avengers-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = "West US"
  resource_group_name = "avengers"
}

resource "azurerm_subnet" "avengers_subnet" {
  name                 = "avengers-subnet"
  resource_group_name  = "avengers"
  virtual_network_name = azurerm_virtual_network.avengers_vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_network_interface" "ironman_nic" {
  name                = "ironman-nic"
  location            = "West US"
  resource_group_name = "avengers"

  ip_configuration {
    name                          = "ipconfig"
    subnet_id                     = azurerm_subnet.avengers_subnet.id
    private_ip_address_allocation = "Dynamic"
  }

  depends_on = [
    azurerm_virtual_network.avengers_vnet
  ]  
}

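Now, 'ironman_nic' will only be created after 'avengers_vnet' is created and, because Terraform destroys resources in reverse dependency order, it will be destroyed before 'avengers_vnet'. In this particular example the reference to the subnet's id already gives Terraform the same ordering implicitly; an explicit depends_on is only needed when a dependency is not visible through such references.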
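To check which dependencies Terraform has actually inferred, you can print the dependency graph with terraform graph, which emits it in DOT format. The wrapper below is a small sketch; the function name show_dependency_graph and the TERRAFORM_BIN variable are hypothetical conveniences, added so the call fails cleanly on a machine where Terraform is not installed.

```shell
# Hypothetical override point for the binary name; defaults to `terraform`.
TERRAFORM_BIN=${TERRAFORM_BIN:-terraform}

# Print the dependency graph for the configuration in the current directory.
show_dependency_graph() {
  if command -v "$TERRAFORM_BIN" >/dev/null 2>&1; then
    "$TERRAFORM_BIN" graph
  else
    echo "skipped: $TERRAFORM_BIN not found on PATH" >&2
    return 127
  fi
}
```

Run `show_dependency_graph` from the directory that contains main.tf and look for an edge between the resource that fails to destroy and the resource it relies on. If the edge is missing, that is where depends_on is needed.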

Errors after Upgrading Terraform

When upgrading Terraform, a configuration that was working fine previously might start throwing errors. This usually happens because a new release of Terraform, or of a provider, introduces breaking changes. Before upgrading, read the changelog and upgrade guide for the target version to find out whether any breaking changes affect you.

Debugging and troubleshooting in Terraform requires a calm and attentive mind. While the detailed logs that Terraform provides can seem overwhelming at the beginning, practice and experience will help you narrow down the information you need to analyze to pinpoint the exact issue. Happy Terraforming!
