Reading the title, you must have a fair idea of what we're discussing in this note: provisioning an Amazon ElastiCache for Redis cluster with Terraform and GitHub Actions. I followed a few best practices while creating the cluster, such as enabling Multi-AZ, multiple nodes, logging, and encryption in transit and at rest. I have linked my GitHub repository with the Terraform and GitHub Actions pipeline code.
The Amazon ElastiCache service supports Redis and Memcached. A basic familiarity with the service will help you follow this note; if you're evaluating an in-memory caching solution for your application, check out the AWS Docs. Here, I focus only on how to provision an Amazon ElastiCache for Redis cluster.
If you want to learn how to provision an Amazon ElastiCache for Memcached cluster instead, head over to this note: create an Amazon ElastiCache for Memcached using Terraform and GitHub Actions.
The ElastiCache for Redis cluster lives inside an Amazon VPC; a subnet group (a collection of subnets), to be precise. The service also requires a parameter group that holds key-value pairs of properties applicable to the ElastiCache cluster. You can also choose the instance type of the cache nodes, the number of node groups, and the number of replicas per node group for the cluster. The cluster must be hosted across multiple Availability Zones to maintain high availability. The cache nodes accept connections on a specific port; hence, the attached security group requires an ingress rule on that port. You can also enable logging (slow and engine logs) and publish the logs to AWS CloudWatch Logs for retrieval and analysis. Finally, the ElastiCache cluster requires an AWS KMS key to encrypt the data at rest and an AUTH token to encrypt the data in transit. I know there are a lot of details, so let me explain them one by one, starting with the Amazon VPC.
Please note that there are two branches in the repository. The main branch includes the Terraform code to create the Amazon ElastiCache cluster plus additional resources to test access to the cluster from an Amazon EC2 instance in the same Amazon VPC. If you only want to learn about the resources needed to create the Amazon ElastiCache for Redis cluster, refer to the create-amazon-elasticache branch in the repository.
There are eight high-level steps to creating an Amazon ElastiCache for Redis cluster. Here's the link to my GitHub repository: amazon-elasticache-redis-tf, so you can follow along.
Step 1: Create an Amazon VPC and subnets to host the ElastiCache cluster
I created four subnets: three private subnets, one for each ElastiCache cluster node, and one public subnet. The ElastiCache cluster does not require the public subnet at all. (Note: Although I call it a public subnet, there is no route to 0.0.0.0/0 in its route table; hence, it is as good as a private subnet.) Please note that I'm referring to the create-amazon-elasticache branch in the repository.
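Here is a minimal sketch of what the VPC and private subnet resources could look like. The resource names, CIDR ranges, and Availability Zones are illustrative assumptions, not the exact values from my repository (the public subnet and route tables are omitted for brevity).

```hcl
locals {
  # Assumed Availability Zones; pick the ones available in your region.
  azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

resource "aws_vpc" "elasticache" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = { Name = "elasticache-vpc" }
}

# Three private subnets, one per Availability Zone, for the cache nodes.
resource "aws_subnet" "private" {
  count             = length(local.azs)
  vpc_id            = aws_vpc.elasticache.id
  cidr_block        = cidrsubnet(aws_vpc.elasticache.cidr_block, 8, count.index)
  availability_zone = local.azs[count.index]

  tags = { Name = "elasticache-private-${count.index}" }
}
```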
Step 2: Create a subnet group to host the ElastiCache cluster
The cache nodes in the ElastiCache cluster are spread evenly among the subnets you specify in the subnet group. The code below adds the private subnets to the subnet group.
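A sketch of the subnet group, assuming the private subnets are created with count as in the previous sketch:

```hcl
# Subnet group that tells ElastiCache which subnets may host cache nodes.
resource "aws_elasticache_subnet_group" "redis" {
  name       = "redis-subnet-group"
  subnet_ids = aws_subnet.private[*].id
}
```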
Step 3: Create a security group to access the ElastiCache cluster
A security group controls access to an ElastiCache cluster's cache nodes. A security group has two kinds of rules: ingress and egress. Since I chose port 6379 for the ElastiCache cluster (discussed later), I enabled ingress on that port in the attached security group, with the source set to the Amazon VPC's CIDR. For egress, I allowed all traffic to 0.0.0.0/0.
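A sketch of the security group under the same assumed resource names:

```hcl
resource "aws_security_group" "redis" {
  name        = "redis-sg"
  description = "Allow Redis traffic from within the VPC"
  vpc_id      = aws_vpc.elasticache.id

  ingress {
    description = "Redis port, restricted to the VPC CIDR"
    from_port   = 6379
    to_port     = 6379
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.elasticache.cidr_block]
  }

  egress {
    description = "Allow all outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```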
Step 4: Create an AWS KMS key to encrypt the cache data at rest
The AWS KMS key encrypts the cache data stored in the cache nodes.
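A sketch of the key; the deletion window and rotation settings are assumptions:

```hcl
resource "aws_kms_key" "elasticache" {
  description             = "Encrypts ElastiCache data at rest and the CloudWatch log groups"
  deletion_window_in_days = 7
  enable_key_rotation     = true
}
```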
Step 5: Add an AWS KMS key policy
The KMS key policy has two statements. The first is the default key policy, and the second allows the AWS CloudWatch Logs service to use the key to encrypt the log group data.
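One way to express those two statements is sketched below, using an aws_iam_policy_document data source and the aws_kms_key_policy resource; the statement IDs and the log-group ARN pattern in the condition are assumptions.

```hcl
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

data "aws_iam_policy_document" "kms" {
  # Statement 1: the default key policy, giving the account full control of the key.
  statement {
    sid       = "EnableIAMUserPermissions"
    actions   = ["kms:*"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }

  # Statement 2: allow the CloudWatch Logs service to use the key for log groups.
  statement {
    sid = "AllowCloudWatchLogs"
    actions = [
      "kms:Encrypt*",
      "kms:Decrypt*",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:Describe*",
    ]
    resources = ["*"]
    principals {
      type        = "Service"
      identifiers = ["logs.${data.aws_region.current.name}.amazonaws.com"]
    }
    condition {
      test     = "ArnLike"
      variable = "kms:EncryptionContext:aws:logs:arn"
      values   = ["arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:log-group:*"]
    }
  }
}

resource "aws_kms_key_policy" "elasticache" {
  key_id = aws_kms_key.elasticache.id
  policy = data.aws_iam_policy_document.kms.json
}
```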
Step 6: Create AWS CloudWatch Log groups
Amazon ElastiCache for Redis supports two types of logging via AWS CloudWatch Logs: slow logs and engine logs. For more information on log delivery, refer to this note: ElastiCache-log-delivery. Thanks to the AWS KMS key policy from the previous step, I could also encrypt these log groups.
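A sketch of the two log groups, encrypted with the KMS key from Step 4; the names and retention period are assumptions:

```hcl
resource "aws_cloudwatch_log_group" "redis_slow" {
  name              = "/elasticache/redis/slow-log"
  retention_in_days = 14
  kms_key_id        = aws_kms_key.elasticache.arn
}

resource "aws_cloudwatch_log_group" "redis_engine" {
  name              = "/elasticache/redis/engine-log"
  retention_in_days = 14
  kms_key_id        = aws_kms_key.elasticache.arn
}
```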
Step 7: Create an AWS Secrets Manager secret to store the AUTH_TOKEN to encrypt the cache data in transit
The auth_token is required to communicate with the ElastiCache cluster. I used the random_password resource from the random provider to create the token. Certain limitations exist on the length and the permissible non-alphanumeric characters of the auth_token, which you can read about at elasticache-auth-overview. Since this is a sensitive value, I stored it in an AWS Secrets Manager secret and directed Terraform to fetch the value from that secret while creating the ElastiCache cluster.
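A sketch of the token and the secret; the token length, the set of special characters, and the secret name are assumptions you should validate against the AUTH token constraints:

```hcl
# Generate an AUTH token. The length and allowed special characters below are
# assumptions; check the ElastiCache AUTH constraints (elasticache-auth-overview).
resource "random_password" "auth_token" {
  length           = 32
  special          = true
  override_special = "!&#$^<>-"
}

# Store the token in Secrets Manager, encrypted with the same KMS key.
resource "aws_secretsmanager_secret" "auth_token" {
  name       = "elasticache/redis/auth-token"
  kms_key_id = aws_kms_key.elasticache.arn
}

resource "aws_secretsmanager_secret_version" "auth_token" {
  secret_id     = aws_secretsmanager_secret.auth_token.id
  secret_string = random_password.auth_token.result
}
```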
Step 8: Create the Amazon ElastiCache for Redis cluster
Finally, the Amazon ElastiCache for Redis cluster is the last resource to create. The code block below specifies all the properties required to create it. For a list of all the properties, check Terraform AWS Registry: elasticache_replication_group. To learn about the supported node types, click here: Cache-Node-Types.
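As the full code lives in the repository, the block below is only a representative sketch of the replication group, written against recent (v5-style) AWS provider argument names; the identifier, engine version, node type, shard/replica counts, and parameter group name are assumptions to adapt to your requirements.

```hcl
# Read the AUTH token back from Secrets Manager so it is not hardcoded.
data "aws_secretsmanager_secret_version" "auth_token" {
  secret_id  = aws_secretsmanager_secret.auth_token.id
  depends_on = [aws_secretsmanager_secret_version.auth_token]
}

resource "aws_elasticache_replication_group" "redis" {
  replication_group_id = "redis-cluster"
  description          = "ElastiCache for Redis cluster"

  engine               = "redis"
  engine_version       = "7.0"
  node_type            = "cache.t3.micro"
  port                 = 6379
  parameter_group_name = "default.redis7.cluster.on" # cluster-mode-enabled parameter group

  # One shard (node group) with two replicas gives three cache nodes,
  # one per private subnet / Availability Zone.
  num_node_groups            = 1
  replicas_per_node_group    = 2
  multi_az_enabled           = true
  automatic_failover_enabled = true

  subnet_group_name  = aws_elasticache_subnet_group.redis.name
  security_group_ids = [aws_security_group.redis.id]

  # Encryption at rest (KMS) and in transit (TLS + AUTH token).
  at_rest_encryption_enabled = true
  kms_key_id                 = aws_kms_key.elasticache.arn
  transit_encryption_enabled = true
  auth_token                 = data.aws_secretsmanager_secret_version.auth_token.secret_string

  # Slow and engine logs delivered to the CloudWatch log groups created earlier.
  log_delivery_configuration {
    destination      = aws_cloudwatch_log_group.redis_slow.name
    destination_type = "cloudwatch-logs"
    log_format       = "json"
    log_type         = "slow-log"
  }

  log_delivery_configuration {
    destination      = aws_cloudwatch_log_group.redis_engine.name
    destination_type = "cloudwatch-logs"
    log_format       = "json"
    log_type         = "engine-log"
  }
}
```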
That is all the code required to create the ElastiCache cluster.
However, please note that there are a few additional steps if you want to follow best practices. They are not necessary to create the ElastiCache cluster, but they are needed to use it from an application.
Best practice 1: Store the ElastiCache endpoint and port number as parameters in the AWS Systems Manager Parameter Store.
Any application using the ElastiCache cluster connects to it via the endpoint and port number, so the application needs those values. Although you could embed the values in the application, an alternative is to store them as encrypted parameters in the Systems Manager Parameter Store and grant the application permission to fetch and decrypt them from there. That way, hardcoding the values is not required, and the application stays clear of confidential data.
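A sketch of the two parameters, stored as SecureString values encrypted with the same KMS key; the parameter names are assumptions. Because the sketch in Step 8 enables cluster mode, the configuration endpoint is the one stored.

```hcl
resource "aws_ssm_parameter" "redis_endpoint" {
  name   = "/elasticache/redis/endpoint"
  type   = "SecureString"
  key_id = aws_kms_key.elasticache.arn
  value  = aws_elasticache_replication_group.redis.configuration_endpoint_address
}

resource "aws_ssm_parameter" "redis_port" {
  name   = "/elasticache/redis/port"
  type   = "SecureString"
  key_id = aws_kms_key.elasticache.arn
  value  = tostring(aws_elasticache_replication_group.redis.port)
}
```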
Best practice 2: Create an IAM policy to access the ElastiCache cluster endpoint and port number from the Systems Manager Parameter Store.
After storing the endpoint and port number as Systems Manager parameters, I created an IAM policy that allows access to them. This policy must be attached to the IAM role that an application assumes; once it is, the application can read the ElastiCache endpoint and port number from the Parameter Store.
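A sketch of such a policy, using the assumed names from the earlier sketches:

```hcl
data "aws_iam_policy_document" "read_redis_parameters" {
  # Allow reading only the two ElastiCache parameters.
  statement {
    sid     = "ReadElastiCacheParameters"
    actions = ["ssm:GetParameter", "ssm:GetParameters"]
    resources = [
      aws_ssm_parameter.redis_endpoint.arn,
      aws_ssm_parameter.redis_port.arn,
    ]
  }

  # Allow decrypting them with the same KMS key used to encrypt them.
  statement {
    sid       = "DecryptParameters"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.elasticache.arn]
  }
}

resource "aws_iam_policy" "read_redis_parameters" {
  name   = "read-elasticache-redis-parameters"
  policy = data.aws_iam_policy_document.read_redis_parameters.json
}
```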
As you can see from the policy above, it is scoped to the two resources (the endpoint and port number parameters) and allows decryption using the same AWS KMS key used to encrypt them.
Best practice 3: Create an IAM policy to access the ElastiCache cluster auth_token from the AWS Secrets Manager secret.
An application requires access to the ElastiCache auth_token to communicate with the cache cluster. The auth_token is sensitive data, so it should not be stored inside an application. I followed best practices earlier (Step 7) and stored the auth_token in Secrets Manager as a secret. Then, I created an IAM policy to access that secret. Like the previous one, this IAM policy must also be attached to the IAM role that an application assumes to communicate with the ElastiCache cluster.
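A sketch of the policy, again using the assumed names from the earlier sketches:

```hcl
data "aws_iam_policy_document" "read_redis_auth_token" {
  # Allow reading only the AUTH token secret.
  statement {
    sid       = "ReadAuthTokenSecret"
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [aws_secretsmanager_secret.auth_token.arn]
  }

  # Allow decrypting it with the same KMS key used to encrypt it.
  statement {
    sid       = "DecryptAuthTokenSecret"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.elasticache.arn]
  }
}

resource "aws_iam_policy" "read_redis_auth_token" {
  name   = "read-elasticache-redis-auth-token"
  policy = data.aws_iam_policy_document.read_redis_auth_token.json
}
```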
This IAM policy, too, is restrictive (no wildcards) and allows access only to get the secret value and then decrypt that using the same AWS KMS Key used to encrypt it.
Finally, I also automated the provisioning of the AWS resources using GitHub Actions. If you want to learn how I managed the authentication process, head over to this note: securely-integrate-aws-credentials-with-github-actions-using-openid-connect. I also scanned my code for vulnerabilities using Bridgecrew Checkov, which generated a scan report; here is a note on how to enable Checkov with GitHub Actions. I'm also interested in the cost of these resources and use Infracost for that, which I have covered in detail in this note: estimate AWS resource cost. The vulnerabilities and the cost estimate are available as comments in the pull request.
Once I had added all the resources and their configurations to my Terraform code, I pushed the changes to my remote repository and created a pull request (open link). After I merged the pull request into the main branch, the GitHub Actions workflow ran terraform apply and provisioned the resources.
To verify, I navigated to ElastiCache → Redis clusters in the AWS Management Console and selected the ElastiCache cluster I created using the Terraform code. The image below lists some of the properties of the cluster.
Scrolling down, I saw the individual cache nodes, too.
That brings us to the end of this note. Let me know if you have any questions or suggestions. In my next note, I demonstrate how to connect to the ElastiCache for Redis cluster using Python from an Amazon EC2 instance in the same VPC.