Deploying PyTorch on AWS Lambda

By Davy Neven on April 14th, 2020

Deploying PyTorch models cost-efficiently in the cloud is not straightforward. GPU-accelerated servers can deliver results in real time, but they are expensive. CPU-only servers, on the other hand, are cheaper, but lack performance due to the computation-intensive nature of deep learning. Serverless functions like AWS Lambda offer a good alternative: they make up for their slower per-request performance with massive parallelism (up to 1000 concurrent executions), and you are only charged for the time they actually run. However, fitting within Lambda’s deployment package size limit is not trivial given PyTorch’s large codebase.

In this article, we show how to import the PyTorch library into your Lambda functions by implementing an image classifier as an example. In a follow-up article, we will discuss a faster, more lightweight alternative that converts PyTorch models to ONNX, and weigh the pros and cons.

Image classification with PyTorch on AWS Lambda

While PyTorch is a popular deep learning research library, it is not optimized for a production setting. Its codebase of around 370 MB far exceeds AWS Lambda’s 250 MB limit on the unzipped deployment package. In this section we cover the steps to circumvent this limit, albeit at the expense of the function’s initialization time (the dreaded cold-start delay for which serverless functions are infamous). However, with the right “pre-warm” strategy it might just work for you. Let’s begin!

Exporting the PyTorch model

The first thing to do is to export/trace your PyTorch model into a TorchScript representation. This bundles the model definition together with the weights into a compact graph-like representation. In this example we use a pretrained ResNet-34 model, but the same steps apply to other models as well:

import torch
from torchvision.models import resnet
# define the model
model = resnet.resnet34(pretrained=True)
model.eval()
# trace model with a dummy input
traced_model = torch.jit.trace(model, torch.randn(1,3,224,224))
traced_model.save('resnet34.pt')

The exported model is then uploaded to an S3 bucket, so that it can be accessed from within the Lambda function.
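
As a minimal sketch of this step (the bucket name and key below are placeholders rather than values from the setup above), the upload can be done with boto3:

import boto3

# upload the traced model; bucket and key are hypothetical, use the same values
# the Lambda handler will later download the model from
s3 = boto3.client('s3')
s3.upload_file('resnet34.pt', 'my-model-bucket', 'models/resnet34.pt')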

Preparing the deployment package

To upload our codebase, Lambda requires us to create a deployment package: a zip archive containing the Python code together with the necessary dependencies. We start by creating a new virtual environment and activating it:

$ python3 -m venv venv
$ source venv/bin/activate

Using this clean environment makes it easier to include exactly the right dependencies later on. We also create a deploy.sh file and a folder “code”, initialized with two files, main.py and unzip_torch.py, which we will fill in later. The file structure should look as follows:

|code
    |main.py
    |unzip_torch.py
|venv
deploy.sh

Next we install PyTorch (v1.4.0) and TorchVision (v0.5.0), using the CPU-only builds:

$ pip install torch==1.4.0+cpu torchvision==0.5.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

Defining the Lambda function

In the main.py file we write the code the Lambda function will execute. In this example we implement a basic image classifier which will return the class id of the requested image:

# unzip_torch extracts the zipped torch library to /tmp; the guarded import
# keeps this file importable outside the deployment package as well
try:
    import unzip_torch
except ImportError:
    pass
import io
import os
import time

import boto3
import requests
import torch
from PIL import Image
from torchvision import transforms

s3_resource = boto3.resource('s3')

img_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])
])


def download_image(url):
    try:
        r = requests.get(url)
        if r.status_code == 200:
            f = io.BytesIO(r.content)
            img = Image.open(f)
            return img
        else:
            return None
    except (requests.RequestException, OSError):
        return None


def download_model(bucket='', key=''):
    location = f'/tmp/{os.path.basename(key)}'
    if not os.path.exists(location):
        s3_resource.Object(bucket, key).download_file(location)
    return location


def classify_image(model_path, img):
    model = torch.jit.load(model_path)
    img = img_transforms(img).unsqueeze(0)
    cl = model(img).argmax().item()
    return cl


def lambda_handler(event, context):
    # download model
    model_path = download_model(
        bucket='segmentsai-dl', key='models/pytorch_model.pt')
    # download image
    img = download_image(event['url'])
    # classify image
    if img:
        cl = classify_image(model_path, img)
        return {
            'statusCode': 200,
            'class': cl
        }
    else:
        return {
            'statusCode': 404,
            'class': None
        }

Unfortunately, the PyTorch codebase is too large (~370 MB) to fit within the 250 MB size limit of the deployment package. Therefore, we compress the library by zipping it, decreasing its size to 120 MB. At runtime we unzip the library under the /tmp directory, where we have an additional 500 MB of storage.

The unzipping is defined in unzip_torch.py, and called at the top of main.py:

import os
import shutil
import sys
import zipfile

torch_dir = '/tmp/torch'
# add torch_dir to sys.path so Python can find the unzipped library
sys.path.append(torch_dir)
if not os.path.exists(torch_dir):
    tempdir = '/tmp/_torch'
    if os.path.exists(tempdir):
        shutil.rmtree(tempdir)
    # torch.zip sits at the root of the deployment package (the Lambda working directory)
    zipfile.ZipFile('torch.zip', 'r').extractall(tempdir)
    os.rename(tempdir, torch_dir)

Zipping up the deployment package

The Lambda deployment package consists of the Python files in the code directory, together with the required libraries found under ./venv/lib64/python3.6/site-packages/. As mentioned, we zip the torch library to stay within the size limit, and we also remove some unnecessary files to make the package a bit smaller. Finally, we upload the deployment package to an S3 bucket and update our Lambda function accordingly. All of this is done in the deploy.sh file, which contains the following code:

#!/bin/bash
mkdir -p packages
cp -r ./venv/lib64/python3.6/site-packages/* packages
cd packages
find . -type d -name "tests" -exec rm -rf {} +
find . -type d -name "__pycache__" -exec rm -rf {} +
rm -rf ./{caffe2,wheel,wheel-*,pkg_resources,boto*,aws*,pip,pip-*,pipenv,setuptools}
rm -rf ./{*.egg-info,*.dist-info}
find . -name \*.pyc -delete
#zip up torch
zip -r9 torch.zip torch
rm -r torch
# zip everything up
zip -r9 ${OLDPWD}/pytorch_fn.zip .
cd $OLDPWD;
cd ./code
zip -rg ${OLDPWD}/pytorch_fn.zip .
cd $OLDPWD
rm -r packages
# copy to s3 and update lambda function
aws s3 cp pytorch_fn.zip s3://lambda-functions/
aws lambda update-function-code --function-name pytorch_example \
    --s3-bucket lambda-functions --s3-key pytorch_fn.zip

Timings

Once the Lambda function is updated, we can test its performance. Here we make a distinction between a “cold start”, meaning there is no cached container available, and a “warm start”, where a container is still alive and waiting to be reused. Since our deployment package is quite large, it takes a long time before Lambda has set up our environment: in our example this initialization took ~30 seconds on average, which is not workable for a production setting. A warm start, on the other hand, was quite fast, averaging around 670 ms in total, thanks to PyTorch and the model file already being cached in memory.
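
One way to reproduce such measurements is to invoke the function directly; here is a minimal sketch using boto3 (the image URL is a placeholder, and the function name matches the one used in deploy.sh):

import json
import time

import boto3

client = boto3.client('lambda')
payload = json.dumps({'url': 'https://example.com/cat.jpg'})

start = time.time()
response = client.invoke(FunctionName='pytorch_example', Payload=payload)
print(json.loads(response['Payload'].read()))
print(f'round-trip time: {time.time() - start:.2f}s')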

Conclusion

We managed to include PyTorch in our Lambda functions by adding it as a zipped archive to the deployment package and unzipping it on the fly. However, a cold start averaging around 30 seconds is not practical for production settings. In contrast, a warm start is quite fast (670 ms for ResNet-34) and can be achieved by applying a good “pre-warm” strategy (sketched below). Hence, if really necessary, you can use PyTorch within your Lambda functions, albeit at the expense of warm-up calls. In all other cases we suggest exporting your PyTorch model to a leaner runtime like ONNX, which we will discuss in a separate article (stay tuned!).
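
As a rough sketch of such a pre-warm strategy (the "warmup" field and the scheduled trigger are our own assumptions, not part of the setup above), you could have a CloudWatch Events / EventBridge rule invoke the function periodically with a payload like {"warmup": true}, and return early for those calls so the container, with PyTorch already unzipped and the model cached, stays warm:

def lambda_handler(event, context):
    # return early on scheduled warm-up invocations so the container stays warm
    if event.get('warmup'):
        return {'statusCode': 200, 'warm': True}
    # ... continue with the normal classification path shown earlier ...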
