The python dependencies for an AWS lambda application have exceeded the 250 MB limit for AWS Lambdas. One of these dependencies is rasterio which depends on gdal. I'm attempting to build a docker image to get round the 250 MB limit and deploy our code to an AWS Lambda (using serverless.com).
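To see how far over the limit a dependency set is before deploying, the unpacked size can be measured locally. The following is a minimal sketch, not part of the original setup: the function names are hypothetical, and it assumes the 250 MB limit applies to the unzipped package size, counted as 250 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Assumption: the 250 MB limit is on the unzipped size, counted in MiB.
LAMBDA_UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinked files."""
    return sum(
        path.stat().st_size
        for path in Path(root).rglob("*")
        if path.is_file() and not path.is_symlink()
    )


def exceeds_lambda_limit(root, limit_bytes=LAMBDA_UNZIPPED_LIMIT_BYTES):
    """Return a (size_in_bytes, is_over_limit) tuple for the directory."""
    size = directory_size_bytes(root)
    return size, size > limit_bytes
```

Pointing exceeds_lambda_limit at the directory the dependencies were installed into (for example a site-packages folder) gives a quick check of whether a zip deployment could fit at all.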
Approach 1: pip install rasterio
Currently I have a Dockerfile with:
FROM public.ecr.aws/lambda/python:3.10
RUN pip install rasterio # Fails with error (see below)
WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
ERROR: A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
Approach 2: yum install gdal-devel
tl;dr: "No package gdal-devel available."
Approach 3: build gdal
tl;dr: a lot of dependencies. I'm nervous that those dependencies will have dependencies of their own that need to be built too.
Approach 4: yum install epel-release then gdal-devel
- Needs Fortran: yum -y install libgfortran worked, but it installed libgfortran.so.4.
- yum -y install gdal-devel is still erroring, e.g. "Error: Package: openblas-openmp-0.3.3-2.el7.aarch64 (epel) Requires: libgfortran.so.3(GFORTRAN_1.0)(64bit)"
- I'm not certain the problem is with having version 4 instead of version 3 of libgfortran, but I couldn't easily install libgfortran.so.3.
Approach 5: use aws/sam/build-python container
In the aws/sam/build-python container, python dependencies install without problem (rasterio installs without error, but then the lambda fails when it's run on AWS with the error "No module named 'rasterio._version'") when using the serverless.com serverless-python-requirements plugin, i.e. running serverless deploy with the following serverless.yml file:
service: aws-python-docker-demo
frameworkVersion: "3"
plugins:
  - serverless-python-requirements
custom:
  pythonRequirements:
    usePipenv: true
    layer: true
provider:
  name: aws
  runtime: python3.10
  deploymentBucket:
    blockPublicAccess: true
functions:
  hello:
    handler: src/main.lambda_handler
    layers:
      - !Ref PythonRequirementsLambdaLayer
- This serverless-python-requirements plugin seems to use a docker container, public.ecr.aws/sam/build-python3.10, to install the python dependencies and zip them up for the lambda (which then fails because the lambda's dependencies and code exceed the 250 MB size limit).
- Plan: understand how serverless-python-requirements:
  - installs python dependencies inside the public.ecr.aws/sam/build-python3.10 container
  - zips python dependencies (which will be > 250 MB)
  - copy that zip into the docker image for the AWS lambda.
  - … ?
I'm not sure if this is a good approach, and I'm sure there are better solutions. Any advice welcome.
** Update ** regarding new approach (no 6) and in response to @Rob's kind answer.
Approach 6: Try to use an old gdal/lambda docker image
Work in progress is here using https://hub.docker.com/r/remotepixel/amazonlinux-gdal/ . Next step: get this to work then iterate from there to:
- update gdal
- use latest lambda container
- use python 3.10 (as required for our application)
Currently planning to re-answer / update the answer to this StackOverflow question: https://stackoverflow.com/questions/36772111/how-can-i-install-a-recent-version-of-gdal-on-amazon-linux#comment135429542_44907360
Currently erroring with:
{
  "errorType": "Runtime.InvalidEntrypoint",
  "errorMessage": "RequestId: 2cda4291-3b02-4079-8d59-f1ab111f8dab Error: exec: \"main.lambda_handler\": executable file not found in $PATH"
}
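The message reads as if the container tried to execute the string main.lambda_handler as a program. That tends to happen when the image has no Lambda runtime client as its entrypoint to interpret the string as module.function. This is an assumption about this image, not something confirmed here. For reference, the sketch below shows the kind of main.py such a runtime would look up as main.lambda_handler. The response body and the import check are hypothetical, added only so the same handler also reports whether rasterio loads.

```python
# main.py (hypothetical contents; only the name lambda_handler comes from the error above)
import importlib
import json


def lambda_handler(event, context):
    """Return a small JSON response saying whether rasterio can be imported."""
    try:
        rasterio = importlib.import_module("rasterio")
        body = {"rasterio_version": getattr(rasterio, "__version__", "unknown")}
    except ImportError as exc:
        # ModuleNotFoundError (e.g. "No module named 'rasterio._version'")
        # is a subclass of ImportError, so it is reported here too.
        body = {"rasterio_version": None, "error": str(exc)}
    return {"statusCode": 200, "body": json.dumps(body)}
```

With a base image that is not one of AWS's Lambda images, AWS documents installing the awslambdaric package and using python -m awslambdaric as the entrypoint, with main.lambda_handler as the command. Whether that is the missing piece here is untested.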
Response to Rob's potential answer
When I run that it errors with the following:
cat Dockerfile2
FROM public.ecr.aws/lambda/python:3.10
RUN pip install rasterio
docker --version
Docker version 24.0.6, build ed223bc
MacOS 12.7.2
docker build -t testing-run-api-dependencies-2 -f ./Dockerfile2 . --progress=plain --no-cache
#0 building with "desktop-linux" instance using docker driver
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s
#2 [internal] load build definition from Dockerfile2
#2 transferring dockerfile: 101B done
#2 DONE 0.0s
#3 [internal] load metadata for public.ecr.aws/lambda/python:3.10
#3 DONE 1.1s
#4 [1/2] FROM public.ecr.aws/lambda/python:3.10@sha256:f95780930513037d252b6b6165720381a1014096c3be9f2eac620776c8f0d167
#4 CACHED
#5 [2/2] RUN pip install rasterio
#5 1.173 Collecting rasterio
#5 1.229 Downloading rasterio-1.3.9.tar.gz (411 kB)
#5 1.309 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 411.7/411.7 kB 5.5 MB/s eta 0:00:00
#5 1.406 Installing build dependencies: started
#5 8.663 Installing build dependencies: finished with status 'done'
#5 8.666 Getting requirements to build wheel: started
#5 8.934 Getting requirements to build wheel: finished with status 'error'
#5 8.939 error: subprocess-exited-with-error
#5 8.939
#5 8.939 × Getting requirements to build wheel did not run successfully.
#5 8.939 │ exit code: 1
#5 8.939 ╰─> [3 lines of output]
#5 8.939 <string>:22: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
#5 8.939 WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
#5 8.939 ERROR: A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
#5 8.939 [end of output]
#5 8.939
#5 8.939 note: This error originates from a subprocess, and is likely not a problem with pip.
#5 8.942 error: subprocess-exited-with-error
#5 8.942
#5 8.942 × Getting requirements to build wheel did not run successfully.
#5 8.942 │ exit code: 1
#5 8.942 ╰─> See above for output.
#5 8.942
#5 8.942 note: This error originates from a subprocess, and is likely not a problem with pip.
#5 8.947
#5 8.947 [notice] A new release of pip is available: 23.0.1 -> 24.0
#5 8.947 [notice] To update, run: pip install --upgrade pip
#5 ERROR: process "/bin/sh -c pip install rasterio" did not complete successfully: exit code: 1
------
> [2/2] RUN pip install rasterio:
8.942 error: subprocess-exited-with-error
8.942
8.942 × Getting requirements to build wheel did not run successfully.
8.942 │ exit code: 1
8.942 ╰─> See above for output.
8.942
8.942 note: This error originates from a subprocess, and is likely not a problem with pip.
8.947
8.947 [notice] A new release of pip is available: 23.0.1 -> 24.0
8.947 [notice] To update, run: pip install --upgrade pip
------
Dockerfile2:2
--------------------
1 | FROM public.ecr.aws/lambda/python:3.10
2 | >>> RUN pip install rasterio
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install rasterio" did not complete successfully: exit code: 1
Best Answer
Perhaps I am misunderstanding, but maybe this is something specific to your build machine/Docker version. When I try your Approach 1 above verbatim to build the container locally it succeeds:
Now, in theory, docker builds shouldn't have any dependencies on the local machine: a docker build that works in one place should work in another. But maybe you have a stale cached dependency? Maybe try docker build on a different machine if you have access to another environment, or run docker system prune -a (expensive, clears all your unused cached images) and rebuild.
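One concrete way to compare the two build environments is to print the CPU architecture and Python platform from inside each container. This is only a diagnostic sketch with a hypothetical function name. The pip log in the question shows a source archive (rasterio-1.3.9.tar.gz) being downloaded, and the yum error mentions an aarch64 package, so it may be worth checking whether the failing build targets a different architecture from the one that succeeds. That explanation is a guess, not something established in this thread.

```python
import platform
import sysconfig


def describe_build_target():
    """Collect basic facts about the interpreter and CPU architecture."""
    return {
        "machine": platform.machine(),         # e.g. x86_64 or aarch64
        "platform": sysconfig.get_platform(),  # e.g. linux-x86_64
        "python": platform.python_version(),
        "libc": "-".join(platform.libc_ver()),
    }


if __name__ == "__main__":
    for key, value in describe_build_target().items():
        print(f"{key}: {value}")
```

Running this with the image's Python in both environments and comparing the machine values would show whether the two builds target the same architecture.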