Terraform “Reusable Modules”

Create reusable modules in Terraform to deploy resources in AWS, with an example of two teams using the same code to deploy separate AWS resources.

In the previous examples, I’ve shown more and more complicated deployments of AWS infrastructure and resources. The method used in previous examples works great if I am the only person to use the code. Just me, myself, and I deploying architectural wonders into the cloud. Well, that is just not reality, now is it?

The reality is that we create technical solutions by working together as a team. After all, a team builds a better solution when we collaborate and share our knowledge and expertise.

This exercise will show elements of continuous integration (CI) by creating reusable modules and continuous delivery (CD) by deploying those modules.

The code discussed in this example is posted in my GitHub repository.

What is discussed in this Post

Caveats & assumptions

Caveat: The VPC reusable module is simple, creating private and public subnets across only two availability zones in any region. It doesn’t accommodate choosing more than two availability zones, nor does it offer choices like enabling or disabling IPv6 or setting up a VPN gateway.

Caveat: The Docker website is also very simple. It is a Docker container running a website developed for this demonstration. You can substitute your own Docker container or any generic website container.

Assumption: To keep costs down, we are using the latest NAT instance AMI published by AWS. NAT Gateways are not covered by the free tier and incur costs for every hour they are left running, so I opt to use NAT instances instead of NAT gateways to save a bit of change.

Assumption: You are interested in the methods used to create modular code. This blog post discusses not just the how and the why of modular coding; I also try to explain the logic behind some of the requirements as I understand them.

Reusable modules

Simply put, a “module” is a folder. We put some Terraform code into that folder, and Terraform treats the contents of that folder as a module. The folders (Terraform modules) separate our code into logical groups.

Think of modules as putting the pieces together to make a complete solution.

In the chart above, we have an example of five different folders. Each folder represents a module, and each module contributes to a complete deployment solution.

We can have one developer publish the VPC and Security Groups modules, have another person develop and publish the Web Template, and yet another developer create the Auto Scaling Group (ASG) and Application Load Balancer (ALB) modules. Finally, a deployment team pulls the different modules from a published location and deploys them to production.


We’ll start by understanding the use of reusable modules (DRY code, for “don’t repeat yourself”). As we progress in writing Infrastructure as Code, we need to share code between teams and environments such as production, development, and Quality Assurance. This exercise will create a “reusable module” for a VPC and another “reusable module” that creates an EC2 instance serving as a web server.

We will then simulate a Development team using the reusable modules to deploy resources into a development environment. After deploying resources for the Development team, we will simulate a Quality Assurance (QA) team using the same modules (with different variables) to deploy a QA environment. The Development team and the QA team use the same modules (DRY code) to deploy different resources, in the same region or in different regions, from the same AWS account with different credentials or even from separate accounts.

The Terraform mechanism for reusable code is the module source argument.

The source argument in a module block tells Terraform where to find the source code for the desired child module. Terraform uses this during the module installation step of terraform init to download the source code to a directory on local disk so that it can be used by other Terraform commands.

HashiCorp Terraform documentation

In this exercise, we will place our Terraform code into a shared location and, per normal practice, refer to that shared location as the “source module.” We can create a “source module” in any folder. The folder can live on your local drive or in a source code management (SCM) system like GitHub, Artifactory, or Bitbucket. A “source module” can even be a network folder on your local area network (LAN) or wide area network (WAN), so long as you, the user, have permission to read and write to that network share.

I believe the best place for reusable code is a source code management system (SCM) like GitHub, BitBucket, GitLab, or Artifactory. At the time of this writing, my personal preference is to use GitHub.

We create a reference to a source module by putting a block like the following in our Terraform code (this is the module configuration block):
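For illustration, a minimal module block might look like this (the name "vpc" and both source paths are placeholders, not the exact values from my repository):

    module "vpc" {
      # Local folder form:
      source = "../../modules/vpc"

      # Or a Git-hosted form (placeholder address):
      # source = "github.com/your-account/your-repo//modules/vpc"
    }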

Remember that the module’s “name” can be any name you desire when declaring the module. It does not have to be the same or similar to the source code for the source module to work.

Why are we using S3 remote Terraform state and DynamoDB?

Let’s use an example of a three-tier application that is under development. The first tier is a front-end Web service for our customers. Another tier is the application layer that performs ERP1 services, and the third tier will hold the database (back-end services).

We have a developer (Developer-A) responsible for developing and releasing changes to our front-end web service. Another developer (Developer-B) is responsible for developing the ERP1 application service. Both developers have access to make changes in the development environment. Both developers can launch, create and destroy resources in the development environment.

Both developers perform most of their work offline and use the AWS cloud development environment on a limited basis. Developer A is ready to test his changes and runs terraform init and terraform apply to create the environment, so the development environment is now running and operational in AWS.

On the very same day, Developer B makes a major change to the ERP application server. Developer B wants to move the ERP server to a different subnet, so he modifies his version of a reusable module and executes the change with terraform init and terraform apply, moving the ERP server to a different subnet. Suddenly Developer A, working in the same environment, observes major errors on the front-end servers he has already deployed because Developer B moved the application servers; Developer B’s change impacted Developer A’s development test.

Developer B went into our reusable module after Developer A had already used the same module to launch the AWS resources, and Terraform happily made the changes, causing Developer A to see unexpected failures. If we keep the Terraform remote state in an AWS S3 bucket and use DynamoDB to lock that remote state, Developer B is prevented from executing changes to the shared AWS resources while Developer A holds the lock on the Terraform state. Developer B would then need to communicate and coordinate any necessary change with Developer A.

By putting a Lock on the S3 remote state, we can prevent team members from making a change that impacts AWS resources without coordination between members.

DynamoDB’s locking of the Terraform state doesn’t prevent us from changing our own resources; it simply prevents another team member from running Terraform against the same state at the same time and making unexpected changes to resources that are already deployed.


OK, let’s get started and set up the folders

Let’s create our folder structure before getting started. The first folder, named “Modules,” will hold the reusable modules; the second folder, named “Teams,” will be used by our team members; and the third folder, named “MGMT,” holds a few things that help us manage our Terraform state.

Reusable modules folder structure

You can place the “Modules” folder and the “Teams” folder anywhere. For example, you can put the “modules folder” and its content on a separate computer from the “Teams folder.”

For brevity, why don’t we keep it simple for now and place everything in a folder structure like the following:

reusable_modules_exercise
 └Modules
   └VPC
   └Docker_website
 └Teams
   └Quality_Assurance
   └Development
 └MGMT
   └S3_Bucket
     └create_S3_bucket
     └Dev_remote_state
     └QA_remote_state
     └Create_DynamoDB_Table
   └Bastion_Host

Creating the AWS S3 bucket, Terraform state file, and DynamoDB table

Before using an S3 bucket and a Terraform remote state file, we should create the bucket and the remote state file independently and, most importantly, create the DynamoDB table that locks the Terraform remote state before creating any AWS resources that use that remote state.

We will create one AWS S3 bucket, two Terraform state files (one for our Development team and one for our QA team), and one DynamoDB table that keeps the data for the locks placed on our Terraform remote state.

Creating the S3 bucket

Change directory to the folder ~/terraform/reusable_modules_exercise/mgmt/s3_bucket/create_s3_bucket

S3_bucket.tf
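The full file is in the repository; a sketch along these lines (the resource labels and the bucket name are placeholders) covers the pieces discussed below:

    provider "aws" {
      region = "us-west-1"                       # pick the region for the bucket
    }

    resource "aws_s3_bucket" "terraform_state" {
      bucket = "my-bucket-terraform-states"      # must be globally unique - change this

      # Protect the bucket from an accidental "terraform destroy"
      lifecycle {
        prevent_destroy = true
      }
    }

    # Server-side encryption with the Amazon S3-managed key (SSE-S3)
    resource "aws_s3_bucket_server_side_encryption_configuration" "encryption" {
      bucket = aws_s3_bucket.terraform_state.id

      rule {
        apply_server_side_encryption_by_default {
          sse_algorithm = "AES256"
        }
      }
    }

    # Keep a version history of every state file written to the bucket
    resource "aws_s3_bucket_versioning" "bucket_versioning" {
      bucket = aws_s3_bucket.terraform_state.id

      versioning_configuration {
        status = "Enabled"
      }
    }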

Reminder! Be sure to change the name of the bucket to a unique name of your choice.

After creating this file, perform terraform init, terraform validate, and terraform apply to create the S3 bucket.

A few things to note about our “S3_bucket.tf”. The line  lifecycle {prevent_destroy = true} prevents someone from accidentally deleting the S3 bucket.

 resource "aws_s3_bucket_server_side_encryption_configuration" This block of code enables server-side encryption. You certainly want to read up on your choices regarding encryption choices to use either “Amazon S3-managed keys (SSE-S3)” or “AWS key management service key (SSE-KMS).” I recommend reading the Terraform registry and Amazon Docs. As you can see, I’m letting AWS create and manage the key for our bucket by configuring the block with the choice of “sse_algorithm.” Amazon S3-Managed Keys (SSE-S3).

 resource "aws_s3_bucket_versioning" "bucket_versioning" This code block establishes if you want to use versioning in the S3 bucket. Versioning allows reverting to a previous version of Terraform state from a disaster recovery standpoint, it makes sense to use versioning. When teams use reusable modules without a DyanamoDB lock, you most definitely want to version your code with a source code management system like GitHub. Nothing wrong with enabling it by default. You might never need to revert to a previous version of Terraform remote state UNTIL you need it, and boy, you’ll wish you had versioning in place when that happens. Especially in a production deployment, maybe not so much in a development environment.

 resource "aws_s3_bucket_public_access_block" You might see some examples applying this resource setting via Terraform. Personally, I recommend skipping this optional block of code for an S3 bucket. By default, Public Access is denied for all S3 buckets unless you specifically allow public access (For example – turning an S3 bucket into a static website). I recommend leaving it out, AWS by default denies public access, which is perfect for a Terraform Remote State S3 bucket.

Creating the Remote state files

Remote state for the development team

Change directory to the folder ~/terraform/reusable_modules_exercise/mgmt/s3_bucket/Dev_remote_state

dev_remote_state.tf
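Roughly, this file contains an S3 backend block; the bucket name, key, and region shown here are placeholders (backend blocks only accept literal values):

    terraform {
      backend "s3" {
        bucket  = "my-bucket-terraform-states"       # your unique bucket name
        key     = "development/terraform.tfstate"    # unique state key for the dev team
        region  = "us-west-1"
        encrypt = true
        # Add this once the DynamoDB table (created below) exists:
        # dynamodb_table = "terraform-state-lock"
      }
    }

    provider "aws" {
      region = "us-west-1"
    }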

After creating this file, perform terraform init, terraform validate, and terraform apply to create the remote state file for our development team.

Don’t worry if Terraform says that nothing happened. If this is the first time executing this code, it does, in fact, create the “tfstate” file.

This code should live in its own folder, separate from the code that creates the S3 bucket, because the bucket must already exist before the “tfstate” file can be placed in it.

 

Remote state for the QA team

Change directory to the folder ~/terraform/reusable_modules_exercise/mgmt/s3_bucket/QA_remote_state

test_remote_state.tf
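This file mirrors the development version; under the same assumptions, only the state key changes:

    terraform {
      backend "s3" {
        bucket  = "my-bucket-terraform-states"           # same bucket as the dev team
        key     = "quality_assurance/terraform.tfstate"  # unique state key for the QA team
        region  = "us-west-1"
        encrypt = true
      }
    }

    provider "aws" {
      region = "us-west-1"
    }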

After creating this file, perform terraform init, terraform validate, and terraform apply to create the remote state file for our QA team.

Creating the DynamoDB table

Change directory to the folder ~/terraform/reusable_modules_exercise/mgmt/s3_bucket/Create_DynamoDB_Table

Create_DynamoDB_table.tf
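A sketch of the table resource; the table name "terraform-state-lock" is my placeholder, but the hash_key must be exactly "LockID":

    provider "aws" {
      region = "us-west-1"
    }

    # One table is enough for every team; Terraform writes one lock item per state file
    resource "aws_dynamodb_table" "terraform_state_lock" {
      name         = "terraform-state-lock"   # placeholder - match it in your backend blocks
      billing_mode = "PAY_PER_REQUEST"        # on-demand, so an idle lock table costs next to nothing
      hash_key     = "LockID"                 # Terraform requires exactly this attribute name

      attribute {
        name = "LockID"
        type = "S"
      }
    }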

The secret to creating this DynamoDB table is the “hash_key.” When Terraform is pointed at a DynamoDB table, it records a lock entry for each Terraform remote state in that table, using the hash_key (“LockID”) as the primary key. Yes, that’s right, we need only ONE DynamoDB table, and it can handle multiple Terraform states. We will use the DynamoDB table twice in this exercise: the Development team’s unique “tfstate” file gets its own entry, and our QA team’s unique “tfstate” file gets another. Terraform simply creates a new unique “LockID” for each Terraform state file.

Once again, I recommend separating this code into its own folder, apart from the code above, primarily because we need only one table for all of our teams. Development, Test, QA, and Production deployments can all use the same DynamoDB table because each has its own “tfstate” file and therefore a unique LockID in the table.

Code for Reusable Modules

Reusable modules are “child modules”: when we execute terraform init, Terraform installs the reusable modules for the calling configuration, which becomes the “parent” module. There is a relationship between the parent module and the child module. The parent module uses the following:
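Something like this module configuration block, where the name "dev_vpc", the local path, and the variable names are illustrative:

    module "dev_vpc" {
      # Where Terraform finds the child (reusable) module
      source = "../../modules/vpc"

      # Configuration (variance) passed into the child module
      region        = var.region
      environment   = "development"
      instance_type = var.instance_type
    }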

When we run terraform init, Terraform knows where to get the child module because of the source argument in the module configuration block (shown above). Terraform installs the reusable module from that source and configures it with the values assigned to its variables, as shown above.

The Terraform workflow is different from previous exercises. Here are a few pointers:

  • Remember that the module doing all the work is the reusable module, the “source (child) module,” which is installed into the current configuration.
  • The “parent” module calls the “child” (reusable) module and passes variables to it.
  • We are using Terraform remote state, BUT there is a big caveat as to how we use it in this scenario:
    • In the previous exercises, we used inputs and outputs with the Terraform remote state. Here, while we still use outputs, we are mainly using the remote state to lock our configuration rather than to pass inputs and outputs to and from the remote state file.

Code that creates our reusable modules

So now that we have created our S3 bucket, the Terraform state file, and a DynamoDB table, we are ready to set up some code as a reusable module.

Change directory to ~/terraform/reusable_modules_exercise/modules

Now let’s create our first reusable module, the VPC. We will start with the terraform remote state (S3 bucket) configuration. It is important to use variables for the bucket name and even more important to use a variable for the key name. Why, you might ask? Well, that’s a great question; let me explain. ☺︎

It is recommended that each team use a unique Terraform state file. Terraform writes the state data to a remote data store, which can then be shared between all members of a team. We want to separate our team environments because they usually have different requirements and processes, and each team usually needs its own control of release and configuration management. Therefore, each team will use a unique Terraform state file.

Me

We are going to use a lot of variables. Since we are using the same reusable code for different teams, we need a way to vary the configuration of the AWS resources to match each team’s requirements. Hence, we use variables so that each team can apply its own variance to an AWS resource.

Examples of variance

Size – A development team might use a “t2.micro” instance type for an AWS EC2 resource, while our Production team needs to assign a larger instance type such as “t3.large”.

Stage – We need to differentiate between development, QA, and production, so we’ll use a variable called “Stage” to create a tag identifying the team that owns each new resource. Other modules will take advantage of this tag by using a filter to identify which team manages a resource.

Region – Our teams might be in different regions, so we’ll enable deployments into different regions with the same code by setting a “Region” variable.

Variables are key! Defining what needs to be configured for the different teams is a very important element when planning the use of reusable code.

Using Variables in reusable modules

Let’s start with an understanding of the usage of variables.

  • Reusable modules may have variables declared and used only in the module.
  • Reusable modules also have variables that are declared in the parent module and passed into the reusable module. This is exactly how we create a variance when deploying a reusable module.
    • For example, the development team uses region us-west-1 and the QA team uses region us-east-1. To accomplish the variance, we declare the variable in the reusable module and in the parent module, and assign its value in the parent module’s configuration block.
      • reusable module declares – variable “region” {}
      • parent module also declares – variable “region” {}
      • parent module assigns a value to the variable in the module’s configuration block. See below:
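For example (the file paths are illustrative):

    # modules/vpc/variables.tf (reusable module)
    variable "region" {}

    # teams/development/variables.tf (parent module)
    variable "region" {}

    # teams/development/main.tf (parent module's configuration block)
    module "dev_vpc" {
      source = "../../modules/vpc"
      region = var.region        # or a literal value such as "us-west-1"
    }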

There is one more point about variables. When we want to prevent sensitive information from being published on GitHub, we move the assignment of a value into a private file such as “terraform.tfvars”.

In the module configuration block below, we would normally assign values to variables, in this case “bucket” with the value “my-bucket-terraform-states.” However, I don’t want the general public to know the name of my S3 bucket. Instead, I assign a variable in the configuration block and put the value in a file named “terraform.tfvars”. We also set up a special file called “.gitignore” to instruct Git to ignore “terraform.tfvars” when pushing to GitHub. The bucket name is therefore never published on GitHub and becomes a privately assigned value.
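A sketch of how that looks, using the module and variable names assumed in this post (the .gitignore entry would simply be *.tfvars):

    # teams/development/main.tf - assign a variable instead of the real value
    module "dev_vpc" {
      source        = "../../modules/vpc"
      bucket        = var.bucket            # the real bucket name stays out of GitHub
      instance_type = var.instance_type
    }

    # teams/development/terraform.tfvars - listed in .gitignore, never pushed to GitHub
    bucket = "my-bucket-terraform-states"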

For example, in the line of code (instance_type = var.instance_type) in the example above, we use a variable where we would normally assign a value.

With any module, even a simple variable like “instance_type” must be declared, assigned to a resource, and given a value.

But when using reusable modules, the variable declaration, the assignment to a resource, and the assignment of a value are spread across at least three, possibly four, different files.

The first rule is to declare the variable in both the parent and child modules. We then assign a value to the variable in a module configuration block in the parent module.

Type                            | Module                                      | File
Declare variable                | Parent module                               | teams/…/variables.tf
Declare variable                | Reusable module                             | modules/vpc/variables.tf
Assign variable to a resource   | Reusable module                             | modules/vpc/vpc.tf
Assign a value to the variable  | Parent module (module configuration block)  | teams/…/vpc.tf

To summarize:

The parent module and the child module must both declare a variable that is going to be configured in the parent module and assigned to a resource in the child (reusable) module:
variable "instance_type" {}

The child (reusable) module will assign a variable to a resource:
instance_type = var.instance_type

Normally, the parent module then assigns a value to the variable in the parent module configuration block:
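For example (an illustrative module block):

    module "dev_vpc" {
      source        = "../../modules/vpc"
      instance_type = "t2.micro"
    }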

But when it’s sensitive information, we skip the above step and assign the value in Terraform’s variable definitions file, “terraform.tfvars”.

Let’s pretend that “instance_type” is sensitive information and we do not want its value published to GitHub. Instead of assigning a value in the module’s configuration block as shown above, we pass the buck to “terraform.tfvars”: we once again assign a variable in the configuration block and assign the value in “terraform.tfvars”, as shown in the example below:
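Using the same illustrative module block:

    module "dev_vpc" {
      source        = "../../modules/vpc"
      instance_type = var.instance_type   # the actual value lives in terraform.tfvars
    }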

Then assign the value in the terraform.tfvars file:
instance_type = "t2.micro"

So let’s start with the first reusable module

The first reusable module will be an AWS Virtual Private Cloud (VPC).

First, we must decide what is configurable when creating the VPC. Different teams will want some control over the VPC configuration. So what would they want to configure (variance):

  • We want the S3 remote state bucket, State key, bucket region, and DynamoDB assignment to be configurable, as we want each team to manage their own VPC and the VPC Remote State
  • We need a tag to identify which team the VPC belongs to and a tag as to who takes ownership of the VPC
  • We want the region to be configurable by our teams
  • We want the NAT instance to have configurable sizing as per Team requirements
  • We might want the SSH inbound CIDR block to change because our teams might be in different regions and networks, so the SSH inbound CIDR block (I call it SSH_location) needs to be configurable by our teams
  • We probably want a different EC2 key pair per team, especially if they are in different regions. I’d go so far as to say that production should be managed from a different account, using different EC2 key pairs and unique IAM policies. So the EC2 key pair needs to be configurable in the reusable code.

Based on the discussion above, we must declare the following variables in the parent and child modules so that different teams can apply their own configuration (variance) to the reusable modules.

We will then assign a value to each variable in the parent module.

Remember: All folders are considered Modules in Terraform

So first, we create a “variables.tf” file in ALL reusable (child) modules:
~/terraform/reusable_modules_exercise/modules/vpc/variables.tf
~/terraform/reusable_modules_exercise/modules/Docker_website/variables.tf
and we’ll create the same variables file in ALL parent modules. We’ll create a variables file for the development team:
~/terraform/reusable_modules_exercise/teams/development/variables.tf
and the same file for the QA team:
~/terraform/reusable_modules_exercise/teams/quality_assurance/variables.tf

Variables that are declared and configured only in the reusable module

Note: In a future version, I might try my hand at doing the same as some of the more famous community VPC modules, which can create a subnet per AZ and/or let you stipulate how many subnets (say, two subnets vs. four subnets). For now, these choices are hard-coded into the VPC module.

Note 2: We want our own VPC code simply because we want to use NAT instances instead of NAT gateways, an option the popular community modules don’t offer.

VPC (reusable module)

Change directory to ~/terraform/reusable_modules_exercise/modules/vpc and include the following files: vpc.tf, variables.tf, security_groups.tf, and outputs.tf (documented below and included in my GitHub repository).

variables.tf (in the reusable module)
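The exact file is in the repository; a representative sketch using the variable names assumed throughout this post’s examples (the real names may differ):

    # modules/vpc/variables.tf - declared here and again in each parent module
    variable "region" {}          # region the team deploys into
    variable "environment" {}     # development, quality_assurance, production, ...
    variable "owner" {}           # tag identifying who owns the VPC
    variable "instance_type" {}   # sizing for the NAT instance(s)
    variable "ssh_location" {}    # CIDR block allowed to reach the instances
    variable "key_name" {}        # the team's EC2 key pair

    # Used only inside this reusable module
    variable "vpc_cidr" {
      default = "10.0.0.0/16"
    }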
Security_Groups.tf

The following code establishes the security groups for our VPC reusable module.

The security group for NAT instances allows HTTP and HTTPS only from the private subnets (thus allowing any instances in the private subnets to reach out to the internet for updates, patches, and download new software).

The security group for Docker Server allows HTTP and HTTPS from my Public IP address (ssh_location variable) and all traffic outbound to the internet. Allowing all traffic outbound to the internet is typical of a “Public Subnet.”

We are placing our Docker server in the public subnet, which is OK for this exercise. Technically, we don’t need the NAT instances or the private subnets because we only place one EC2 instance in one public subnet, but just for grins, I kept the private subnets.
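A sketch of security_groups.tf under the assumptions above; the resource labels, the "Tier" tags, and the exact rules are illustrative and may differ from the repository:

    # modules/vpc/security_groups.tf
    resource "aws_security_group" "nat_sg" {
      name        = "${var.environment}-nat-sg"
      description = "Allow web traffic out from the private subnets"
      vpc_id      = aws_vpc.vpc.id

      ingress {
        description = "HTTP from the private subnets"
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = aws_subnet.private[*].cidr_block
      }

      ingress {
        description = "HTTPS from the private subnets"
        from_port   = 443
        to_port     = 443
        protocol    = "tcp"
        cidr_blocks = aws_subnet.private[*].cidr_block
      }

      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }

      tags = { Environment = var.environment, Tier = "nat" }
    }

    resource "aws_security_group" "web_sg" {
      name        = "${var.environment}-web-sg"
      description = "Allow web traffic to the Docker website"
      vpc_id      = aws_vpc.vpc.id

      ingress {
        description = "HTTP from my location"
        from_port   = 80
        to_port     = 80
        protocol    = "tcp"
        cidr_blocks = [var.ssh_location]
      }

      ingress {
        description = "HTTPS from my location"
        from_port   = 443
        to_port     = 443
        protocol    = "tcp"
        cidr_blocks = [var.ssh_location]
      }

      egress {
        from_port   = 0
        to_port     = 0
        protocol    = "-1"
        cidr_blocks = ["0.0.0.0/0"]
      }

      tags = { Environment = var.environment, Tier = "web" }
    }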

vpc.tf (reusable module)
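A condensed sketch of what vpc.tf builds. The published module may differ, for example by launching one NAT instance per availability zone (which is how each team ends up with three EC2 instances); the NAT AMI filter and tag names are assumptions:

    # modules/vpc/vpc.tf
    data "aws_availability_zones" "available" {
      state = "available"
    }

    resource "aws_vpc" "vpc" {
      cidr_block           = var.vpc_cidr
      enable_dns_hostnames = true
      tags = { Name = "${var.environment}-vpc", Environment = var.environment, Owner = var.owner }
    }

    resource "aws_internet_gateway" "igw" {
      vpc_id = aws_vpc.vpc.id
    }

    # Two public and two private subnets, one of each per availability zone
    resource "aws_subnet" "public" {
      count                   = 2
      vpc_id                  = aws_vpc.vpc.id
      cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
      availability_zone       = data.aws_availability_zones.available.names[count.index]
      map_public_ip_on_launch = true
      tags = { Environment = var.environment, Tier = "public" }
    }

    resource "aws_subnet" "private" {
      count             = 2
      vpc_id            = aws_vpc.vpc.id
      cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 10)
      availability_zone = data.aws_availability_zones.available.names[count.index]
      tags = { Environment = var.environment, Tier = "private" }
    }

    # Public route table: default route to the internet gateway
    resource "aws_route_table" "public" {
      vpc_id = aws_vpc.vpc.id

      route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.igw.id
      }
    }

    resource "aws_route_table_association" "public" {
      count          = 2
      subnet_id      = aws_subnet.public[count.index].id
      route_table_id = aws_route_table.public.id
    }

    # NAT instance (cheaper than a NAT gateway for this exercise)
    data "aws_ami" "nat" {
      most_recent = true
      owners      = ["amazon"]

      filter {
        name   = "name"
        values = ["amzn-ami-vpc-nat-*"]
      }
    }

    resource "aws_instance" "nat" {
      ami                    = data.aws_ami.nat.id
      instance_type          = var.instance_type
      subnet_id              = aws_subnet.public[0].id
      vpc_security_group_ids = [aws_security_group.nat_sg.id]
      key_name               = var.key_name
      source_dest_check      = false   # required for a NAT instance
      tags = { Name = "${var.environment}-nat", Environment = var.environment }
    }

    # Private route table: default route through the NAT instance
    resource "aws_route_table" "private" {
      vpc_id = aws_vpc.vpc.id

      route {
        cidr_block           = "0.0.0.0/0"
        network_interface_id = aws_instance.nat.primary_network_interface_id
      }
    }

    resource "aws_route_table_association" "private" {
      count          = 2
      subnet_id      = aws_subnet.private[count.index].id
      route_table_id = aws_route_table.private.id
    }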
Outputs.tf
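A sketch of the outputs this module might expose; the names are illustrative and are reused in the parent module’s outputs later:

    # modules/vpc/outputs.tf
    output "vpc_id" {
      value = aws_vpc.vpc.id
    }

    output "public_subnet_ids" {
      value = aws_subnet.public[*].id
    }

    output "web_security_group_id" {
      value = aws_security_group.web_sg.id
    }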

Docker_website (reusable module)

Our teams will use this module to deploy an AWS EC2 instance with scripts to install Docker and launch one Docker container that I created and published publicly in Docker Hub.

There are several features to understand about this reusable module:

  • There is a dependency: the team’s VPC must already be deployed
  • The module first queries the AWS API for data about the team’s VPC
    • For instance, data “aws_vpcs” “vpc” gets data for all VPCs in the region
    • Our data query includes a filter so that it returns only the VPC whose environment tag matches the value set by the parent module. For instance, if the parent module sets var.environment = development, the query returns only the ID of the VPC created by our development team
  • You will notice similar queries to find the team’s public subnet and the team’s security group for the web server (see the docker.tf sketch below)

Change directory to ~/terraform/reusable_modules_exercise/modules/Docker_website and create the following files: docker.tf, variables.tf, bootstrap_docker_web.sh, outputs.tf

docker.tf
variables.tf
bootstrap_docker_web.sh
outputs.tf
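A sketch of docker.tf, the Docker_website outputs, and a stub of the bootstrap script. The tag names, the AMI filter, and the Docker image are placeholders rather than the exact repository contents; variables.tf for this module simply declares environment, instance_type, and key_name:

    # modules/Docker_website/docker.tf
    # Find the team's VPC by its Environment tag
    data "aws_vpcs" "vpc" {
      tags = {
        Environment = var.environment
      }
    }

    # Find the team's public subnets inside that VPC
    data "aws_subnets" "public" {
      filter {
        name   = "vpc-id"
        values = data.aws_vpcs.vpc.ids
      }

      tags = {
        Environment = var.environment
        Tier        = "public"
      }
    }

    # Find the team's web-server security group
    data "aws_security_group" "web" {
      vpc_id = tolist(data.aws_vpcs.vpc.ids)[0]
      tags   = { Environment = var.environment, Tier = "web" }
    }

    data "aws_ami" "amazon_linux" {
      most_recent = true
      owners      = ["amazon"]

      filter {
        name   = "name"
        values = ["amzn2-ami-hvm-*-x86_64-gp2"]
      }
    }

    resource "aws_instance" "docker_web" {
      ami                    = data.aws_ami.amazon_linux.id
      instance_type          = var.instance_type
      subnet_id              = tolist(data.aws_subnets.public.ids)[0]
      vpc_security_group_ids = [data.aws_security_group.web.id]
      key_name               = var.key_name
      user_data              = file("${path.module}/bootstrap_docker_web.sh")
      tags = { Name = "${var.environment}-docker-web", Environment = var.environment }
    }

    # modules/Docker_website/outputs.tf
    output "public_ip" {
      value = aws_instance.docker_web.public_ip
    }

    # modules/Docker_website/bootstrap_docker_web.sh (stub - substitute your own image)
    #   #!/bin/bash
    #   yum update -y
    #   amazon-linux-extras install docker -y
    #   systemctl enable --now docker
    #   docker run -d -p 80:80 your-dockerhub-user/your-website-image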

Creating code for the parent modules

Now comes the fun part. This code might appear similar to some community modules developed and published by different companies. Many community modules are complex because they try to solve every possible permutation someone might require of the module. For instance, many community VPC modules try to accommodate someone who may or may not require a VPN or a Direct Connect link to their VPC. Most published community VPC modules also let you choose how many availability zones to deploy subnets into.

The VPC module in this example, both the child module and the parent module, has simple requirements because my goal is to demonstrate how to create a module, nothing more than a simple demonstration. Simplicity is the easiest way to reach a broader audience, right?

I already have a more complex demonstration planned for my next blog post: a method for different teams (development, QA, etc.) to deploy an auto-scaled and load-balanced WordPress website using EFS for persistent storage, built from the same kind of reusable modules. Soon to be published.

So first, let’s look at the variables configuring the reusable module AWS resources specifically for each team’s requirement.

  • The development team requires its own S3 bucket and remote state file, so it will declare the necessary variables and assign values unique to the development team
  • The same applies to an EC2-key pair, EC2 instance type, in-bound SSH CIDR block (SSH-Location), etc.
  • Some of the variables will be assigned a value in the parent module’s configuration block
  • Some variables that hold sensitive information will be assigned a value in our “terraform.tfvars” file.

Let’s start with the Development team

Change directory to ~/terraform/reusable_modules_exercise/teams/development and add the following files.

variables.tf (development team)

The variables for our Development team

terraform.tfvars

Our Development team’s sensitive values are declared in the file “terraform.tfvars”. Teams can utilize the same S3 bucket for the Terraform remote state; it is the “state-key” that must be unique for each team.
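A sketch with placeholder values; substitute your own bucket, key pair, and IP address, and note that the exact variable set depends on the repository:

    # teams/development/terraform.tfvars
    bucket        = "my-bucket-terraform-states"     # shared S3 bucket for remote state
    state_key     = "development/terraform.tfstate"  # must be unique per team
    region        = "us-west-1"
    environment   = "development"
    owner         = "development-team"
    instance_type = "t2.micro"
    key_name      = "dev-keypair"
    ssh_location  = "203.0.113.25/32"                # your public IP address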

main.tf (parent module for development)

We are going to declare the VPC module and the Docker_website module. In this file (the parent module), we declare the source (path) of each child module and the configuration to apply to it (by giving values to its variables).

Note: The module configuration block named “Docker_web” below has the line depends_on = [module.dev_vpc]. When putting together different modules, first the VPC and then our Docker website, Terraform cannot easily determine the dependency, because the Docker_website module discovers the VPC through data-source queries rather than by referencing the VPC module’s outputs. Without the “depends_on,” Terraform will try to deploy both modules simultaneously, and without the VPC already in place, our Docker website will fail. This is easily fixed by the “depends_on” statement, which tells Terraform the VPC module must be completed before executing the “Docker_web” module.
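A sketch of the development team’s main.tf; the backend values must be literals, and the paths and variable names follow the earlier sketches rather than the exact repository contents:

    # teams/development/main.tf
    terraform {
      backend "s3" {
        # Backend values must be literals; these are placeholders
        bucket         = "my-bucket-terraform-states"
        key            = "development/terraform.tfstate"
        region         = "us-west-1"
        dynamodb_table = "terraform-state-lock"
        encrypt        = true
      }
    }

    provider "aws" {
      region = var.region
    }

    module "dev_vpc" {
      source = "../../modules/vpc"

      region        = var.region
      environment   = var.environment
      owner         = var.owner
      instance_type = var.instance_type
      key_name      = var.key_name
      ssh_location  = var.ssh_location
    }

    module "Docker_web" {
      source = "../../modules/Docker_website"

      environment   = var.environment
      instance_type = var.instance_type
      key_name      = var.key_name

      # The Docker_website module finds the VPC through data sources,
      # so Terraform must be told to build the VPC first
      depends_on = [module.dev_vpc]
    }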

Parent Module outputs

Yes, we have already declared outputs in the reusable module. But with reusable modules, if we want to see the outputs, we have to declare them in our parent module as well. Just like variables, outputs have to be declared in both the child and the parent modules.

outputs.tf
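A sketch, assuming the child-module output names used in the earlier sketches:

    # teams/development/outputs.tf
    output "vpc_id" {
      value = module.dev_vpc.vpc_id
    }

    output "docker_web_public_ip" {
      # assumes the Docker_website module exposes a public_ip output
      value = module.Docker_web.public_ip
    }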

Create Quality Assurance Parent Module

Change directory to ~/terraform/reusable_modules_exercise/teams/quality_assurance and add the following files: main.tf, variables.tf, terraform.tfvars, outputs.tf.

Variables for the quality assurance team

You might notice the “variables.tf” file for the QA team is exactly the same as the development team’s “variables.tf”. That is because both teams are calling the same reusable modules. The magic happens when we assign values to the variables.

variables.tf
terraform.tfvars

Again, this is where our QA team creates the variances required by their team. You’ll note that I give an example of our QA team using the us-west-2 region instead of us-west-1, which the development team uses. Also note that I have stipulated an instance type of “t2.micro” to demonstrate another variance between teams.
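A sketch with placeholder values, showing the region and instance-type variances:

    # teams/quality_assurance/terraform.tfvars
    bucket        = "my-bucket-terraform-states"           # same bucket as development
    state_key     = "quality_assurance/terraform.tfstate"  # unique key for the QA team
    region        = "us-west-2"                            # variance: different region
    environment   = "quality_assurance"
    owner         = "qa-team"
    instance_type = "t2.micro"                             # variance: instance sizing
    key_name      = "qa-keypair"
    ssh_location  = "203.0.113.25/32"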

main.tf (parent configuration module for QA team)
outputs.tf

Deployment

Be sure to update the “terraform.tfvars” file to your settings. The GitHub repository does not have these files, so you will have to create a file for the development team and another for the QA team.

Please change the directory to ~/terraform/reusable_modules_exercise/teams/development

Perform the following terraform actions:

  • terraform init
  • terraform validate
  • terraform apply

Once completed, Terraform will have deployed our reusable code into AWS in the region specified by the settings configured in the parent module.

Then change the directory to ~/terraform/reusable_modules_exercise/teams/quality_assurance

And perform the following actions:

  • terraform init
  • terraform validate
  • terraform apply

Once completed, Terraform will have deployed the reusable code for Quality Assurance. If you configured the Quality Assurance deployment with a different region, the same types of AWS resources are now installed in a different region using the same reusable code.


Once you have completed this exercise, feel free to remove all resources by issuing the following command in the terminal:

Change the directory to each team’s directory and perform the following destroy task. We don’t want to leave our EC2 instances running and forget about them.

AWS allows 750 free-tier EC2 hours per month. This exercise runs six EC2 instances (three for each team); left running, they will use up your allowance of free EC2 hours in about five days.

  • terraform destroy

This is not for production!

All public websites should have a web application firewall between the web server and its internet connection; this exercise doesn’t create one. So do not use this configuration for production.

Most cloud deployments should have monitoring in place to detect an event affecting any resource and alert someone to remediate it. This exercise does not include any monitoring.

It is a good idea to remove all resources when you have completed this exercise so as not to incur costs.

1 Enterprise resource planning (ERP) refers to a type of software that organizations use to manage day-to-day business activities such as accounting, procurement, project management, risk management and compliance, and supply chain operations.
