A Lambda layer is a distribution mechanism for libraries, custom runtimes, or other dependencies required by AWS Lambda functions. Cloud engineers can manage and reuse these libraries and dependencies across multiple functions by packaging them into a layer. By the end of this note, you will learn how to create a Lambda layer for a Python library and share it with all the AWS accounts in an organization.

There are three primary benefits of using a Lambda layer:
1. Lambda layers promote code reuse, allowing applications to share libraries and dependencies across multiple functions to simplify management.
2. Layers enable version control, allowing cloud engineers to maintain and roll back to specific versions of dependencies as needed.
3. Layers encourage standardization of dependencies and ensure consistency across different functions, minimizing compatibility issues and improving reliability.

In this note, I’ll demonstrate how to create a Lambda layer using Docker, Terraform, and GitHub Actions.
But why use the Lambda layer?
The AWS Lambda service comes with a set of built-in libraries in its runtime environment. If you have used the service for Python, you might have encountered the `boto3`, `json`, and `logging` Python libraries. These are examples of built-in libraries; engineering teams are not required to package them alongside their Python files. So, while the AWS Lambda runtime contains a few libraries, many more may need to be included depending on the use case. The Lambda layer is the mechanism AWS offers to add libraries that aren’t included in the Lambda runtime environment, enabling error-free execution of Python functions.
A Lambda layer can consist of one or many libraries. In the Python runtime, `numpy`, `pandas`, and `requests` are some example libraries.
An alternative to using an AWS Lambda layer is to package the Python code together with its libraries and upload that package to AWS Lambda.
However, cloud engineering teams can effectively manage dependencies and share libraries across multiple AWS Lambda functions using Lambda layers. This modular approach allows the project teams to keep the AWS Lambda function code clean and lightweight, streamline the update process for shared libraries, and improve the overall maintainability of the serverless applications.
Platform mismatch is a common issue when developing Python applications for AWS Lambda that require a library unavailable in the Lambda runtime. For example, if you’re using Windows to install Python3 packages via `pip3`, the resulting binaries are tailored for the Windows environment. However, AWS Lambda runs on a Linux-based environment (specifically Amazon Linux), often leading to compatibility issues. Libraries that rely on compiled extensions or native dependencies can fail when deployed to AWS Lambda (either as a package with the Python code or as a Lambda layer) because they may not be compatible with the underlying Linux OS. To avoid incompatibility, it is advisable to create the deployment package (the Python libraries) in a Linux environment, such as Docker. This approach ensures that the libraries operate correctly in the serverless environment.
In my quest for a solution, I found the following article from CapitalOne: Creating Lambda Layers.
I recommend that you follow the steps articulated in that article. If you want to learn how I created an AWS Lambda layer following that guidance, please continue reading. The Terraform code is in my GitHub repository: kunduso/aws-lambda-layer-terraform. You must install Docker on your local machine and decide on the Python3 version for the AWS Lambda runtime. The AWS Lambda service currently supports Python3.9 to Python3.12 (support for Python 3.8 ends in October 2024).
1. Build the Docker Image

As suggested in the referenced article, the first step is to access the GitHub repository aws-lambda-base-images and clone it to your local machine. Having a copy (fork) of the GitHub repository might also be a good idea. Please also use this opportunity to go through the repository ReadMe. You will learn why this repository exists and how AWS plans to help the development community.
After cloning the GitHub repository, I selected the appropriate branch based on the Python runtime. In this note, I am demonstrating with Python3.10, and hence I checked out that branch: `git checkout python3.10`. Then, following the reference article’s suggestion, I commented out the last line in the `Dockerfile.python3.10` file.
Then, I built the image from the Dockerfile using the following command:

```shell
docker build -t aws-python3.10:local -f Dockerfile.python3.10 .
```

That created a Docker image called `aws-python3.10:local` in the Docker Desktop instance running on my laptop.
2. Create a directory for the Python libraries
Per the reference article, the next step is to create the Python directory to store the libraries. In this use case, I am referencing two GitHub repositories. Repository#1 is `aws-lambda-base-images`, where I executed the Docker commands and created the Docker image. Repository#2 is `aws-lambda-layer-terraform`, where I have the Terraform code and the Python3.10 libraries to create the Lambda layer.
After creating the Docker image from the `aws-lambda-base-images` repository, I accessed the `aws-lambda-layer-terraform` repository in another VS Code window and created the Python directory. The image below shows where I created the folder. Please note that the top-level folder (`layer3.10`) could have any name, but the folder inside it must be named `python`. That is an AWS requirement. For more information, please refer to packaging-layers-path.
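The required layout can be sketched as follows (the top-level name `layer3.10` is this note's choice; only the inner `python` name is mandated by AWS):

```shell
# Create the staging folder for the layer contents.
# AWS unpacks a Python layer zip into /opt, and the Python runtime adds
# /opt/python to sys.path, hence the mandatory "python" folder name.
mkdir -p layer3.10/python
ls -R layer3.10
```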
Per the reference article, the author created a `requirements.txt` file. I was using only one Python3 library, `psycopg2`, and hence skipped that step.
3. Create a Docker container
Then, I executed the following command from the root of my GitHub repository:

```shell
docker run --name customlayer --rm \
  --env HTTP_PROXY --env HTTPS_PROXY --env NO_PROXY \
  --mount type=bind,source="$(pwd)"/layer3.10,target=/var/task/lambdalayer \
  -it aws-python3.10:local bash
```

That provided me an interactive terminal to access the container’s shell. With this step, I recreated the runtime environment used by the AWS Lambda service.
4. Install the required Python3 libraries
After navigating to the `python` folder inside the container, I installed the library:

```shell
pip3 install --target=. --only-binary=:all: --upgrade psycopg2-binary
```
Then, after removing all the unnecessary files and folders (cache, docs), I reviewed the `python` folder to find the required libraries.
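The cleanup can be sketched as below (the `mkdir` line is a stand-in for the content that `pip3 install` would have produced, so the snippet is self-contained):

```shell
# Stand-in for installed library content.
mkdir -p layer3.10/python/psycopg2/__pycache__
# Remove bytecode caches; Python regenerates them at runtime if needed,
# and dropping them keeps the layer zip small.
find layer3.10/python -type d -name "__pycache__" -exec rm -rf {} +
```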
With the packages available, I exited the container’s interactive terminal by typing `exit`.
At this point, I no longer required the container for this use case. I also stopped following the reference article because I used Terraform to create the Lambda layer.
5. Terraform: Create a Lambda layer
The Terraform code creates a zip file with the `python` folder inside and then creates a Lambda layer. Although a layer can declare multiple compatible runtimes, I have noticed that, at least for the `psycopg2` library, a layer built with Python3.10 does not work with any other runtime. Hence, a 1:1 mapping between compatible runtimes and layers is cleaner.
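A minimal sketch of that Terraform code, with assumed resource and layer names:

```hcl
# Zip the staging folder; the "python" directory sits at the zip root.
data "archive_file" "lambda_layer" {
  type        = "zip"
  source_dir  = "${path.module}/layer3.10"
  output_path = "${path.module}/layer3.10.zip"
}

# Publish the zip as a layer version pinned to a single runtime.
resource "aws_lambda_layer_version" "psycopg2" {
  layer_name          = "psycopg2-python310" # assumed name
  filename            = data.archive_file.lambda_layer.output_path
  source_code_hash    = data.archive_file.lambda_layer.output_base64sha256
  compatible_runtimes = ["python3.10"]
}
```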
6. Terraform: Share the Lambda layer with the organization
The beauty of AWS Lambda layers is that they can be shared across different AWS accounts within the same AWS region. This feature promotes collaboration and reusability, allowing teams to utilize libraries and dependencies without duplication.
You will notice that I am using a variable for the `organization_id`. Since it is a sensitive value, it is stored as a GitHub Actions secret and passed to Terraform during the `plan` and `apply` steps in the pipeline. If you implement the solution for your organization, you can find the value in the AWS Console → AWS Organizations. The value matches the regular expression pattern `o-[a-z0-9]{10,32}`.
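Sharing the layer with the organization boils down to a single permission resource; a sketch, assuming the layer was created by a Terraform resource named `aws_lambda_layer_version.psycopg2` (a placeholder name):

```hcl
# Grant every account in the AWS Organization permission to use the layer.
resource "aws_lambda_layer_version_permission" "share_with_org" {
  layer_name      = aws_lambda_layer_version.psycopg2.layer_name
  version_number  = aws_lambda_layer_version.psycopg2.version
  principal       = "*"                 # any account ...
  organization_id = var.organization_id # ... within this organization
  action          = "lambda:GetLayerVersion"
  statement_id    = "share-with-organization"
}
```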
Terraform and GitHub Actions create the Lambda layer after the code is committed and pushed to a remote repository. I have a separate note on automating the provisioning process for AWS cloud resources using GitHub Actions.
If you want to learn how I managed the authentication process, head over to this note: securely-integrate-aws-credentials-with-github-actions-using-openid-connect. I also scanned my code for vulnerabilities using Bridgecrew Checkov, which generated a scan report. Here is a note on how to enable Checkov with GitHub Actions. I’m also interested in the cost of these resources and use Infracost for that, which I have covered in detail in this note: estimate AWS resource cost. The vulnerabilities and cost estimate are available in the pull request as comments.
After the Lambda layer and the layer permission were created, any AWS Lambda function in any account of the AWS Organization could reference them. However, there is a caveat to how the layer is referenced from the `aws_lambda_function` resource.
If you are referencing the Lambda layer in the same AWS account where it was created, you may look it up via a Terraform `data {}` block. However, if you are accessing it from another AWS account (a member of the AWS Organization), you must use the layer version’s complete ARN. Hence, the reference will look similar to the one below:
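A sketch of a consumer function in another member account; every value below is a placeholder:

```hcl
resource "aws_lambda_function" "consumer" {
  function_name = "example-consumer"           # placeholder
  role          = aws_iam_role.lambda_role.arn # placeholder role
  runtime       = "python3.10"
  handler       = "app.lambda_handler"
  filename      = "app.zip"

  # Cross-account references must use the full layer version ARN.
  layers = ["arn:aws:lambda:us-east-1:111111111111:layer:psycopg2-python310:1"]
}
```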
That brings us to the end of this note. Since you can access both GitHub code repositories, try provisioning the AWS cloud resources. Let me know if you have any questions or suggestions.