10 Jul, 2019

How We Built It -- CORS and AWS API Gateway

by Jonn Callahan

This post is the first in a series entitled “How We Built It.” The goal is to highlight how nVisium, a security-centric company, took the lessons learned from years of application security work and applied them to building secure code within our own software platform. In this first post, I’ll explore how we built a robust Cross-Origin Resource Sharing (CORS) implementation within an AWS Lambda and API Gateway serverless architecture. Specifically, I’ll demonstrate how we handle whitelisting a set of domains, as opposed to just a single domain.

CORS in a Nutshell

CORS is the specification that controls how browsers decide when scripts from domainA.com are allowed to invoke services hosted on domainB.com. It is a security-centric specification, designed to keep rogue or malicious scripts on one domain from abusing a user’s active sessions on a different domain. In that sense, it is a CSRF-like protection mechanism.

I don’t want to get into the weeds of the CORS spec, since I assume most readers are already familiar with it, but I do want to give a brief overview of the challenge I (and many others) face. If your forehead is already dented from slamming into the keyboard while fighting the CORS spec, save yourself the pain of rehashing it and skip this section entirely.

In short, the CORS spec primarily involves three HTTP response headers:

  • Access-Control-Allow-Origin
  • Access-Control-Allow-Credentials
  • Access-Control-Allow-Headers

I won’t spend time deep-diving into what each of these headers controls, but I do want to comment on a bizarre behavior of the first one. When a script from domainA.com attempts to make an HTTP request to domainB.com that is anything more than a simple request, the browser first sends a pre-flight request to domainB.com with domainA.com passed via the Origin header. This, in effect, is the browser asking the service for permission on the script’s behalf. The service on domainB.com then returns an Access-Control-Allow-Origin (ACAO) response header containing an origin. If that origin matches the one hosting the script, the browser sends the actual HTTP request. Otherwise, the actual request is never sent.

Pretty neat, right? On the surface, this appears to be a simple but effective little security hurdle, implemented via just a few HTTP response headers. But what happens when you need to allow multiple domains to interact with a service? If you’re like most people, your first thought is probably along the lines of:

No problem, atticuss! Let me pass the list of allowed domains back via the ACAO header.

Nope! Sorry, but the spec does not allow you to pass a list of authorized domains. I don’t know why this decision was made, but I’d bet a substantial sum of money that it’s the main reason so many services end up with dangerous CORS configurations. Some developers, in the interest of getting things working, decide to just send a wildcard as the ACAO value. This actually isn’t terrible, because browsers refuse to honor a wildcard on requests that carry credentials, whatever the Access-Control-Allow-Credentials header says. But that’s another story. The real issue is when developers, clearly hellbent on ruining my ability to have a peaceful night’s rest, decide to simply reflect the value of the Origin header back in the ACAO response header.

Well here’s a great irony: that’s exactly what I’ve done. And I feel dirty.
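To make the three tempting strategies concrete, here is a minimal sketch of how a browser evaluates the ACAO value. It is a deliberately simplified model, not the full algorithm from the spec, and the function name and example origins are made up for illustration.

```python
def acao_permits(origin, acao, with_credentials):
    """Simplified model of a browser's Access-Control-Allow-Origin check."""
    if acao == "*":
        # The wildcard is only honored for requests made without credentials.
        return not with_credentials
    # Anything else must match the requesting origin exactly. A comma- or
    # space-separated list is just one long string that matches nothing.
    return acao == origin


origin = "https://app.example.com"  # hypothetical origin hosting the script

# 1. Sending a list of domains: rejected.
print(acao_permits(origin, "https://app.example.com, https://b.example.com", True))  # False
# 2. Sending a wildcard: fine without credentials, rejected with them.
print(acao_permits(origin, "*", False))  # True
print(acao_permits(origin, "*", True))   # False
# 3. Blindly reflecting the Origin header: every origin passes, even hostile ones.
print(acao_permits("https://evil.example", "https://evil.example", True))  # True
```

The last case is why unconditional reflection is dangerous: the check still runs, but it can never fail.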

So You Wanna CORS Your API Gateway, Eh?

AWS API Gateway paired with Lambda is an incredibly powerful architecture. It scales horizontally with very little effort and bills flexibly, charging you for what you actually use rather than for idle capacity. Conveniently, it also provides a turnkey CORS configuration. The problem, however, is that the turnkey solution only lets you specify a single domain in the ACAO response header. If you need to support multiple domains, it won’t work.

nVisium recently ran into this issue while re-architecting our Secure Development Training Platform, frequently referred to as “ODTP” within source code samples. The end goal was to permit any subdomain of nvisium.com via CORS. Additionally, we wanted a debug mode that permits any domain. The solution is really straightforward, but I wanted to share it because there don’t seem to be many soup-to-nuts write-ups around.

The first thing to note is that the architecture leverages the new Lambda Layers functionality for sharing common code across all Lambda functions. As such, putting the CORS response logic within a Layer was the obvious choice. Also, since Python is the primary language in use, we make heavy use of decorators to wrap Lambda handlers. All of this led to the creation of the @add_cors_response decorator:

import os

from functools import wraps

def add_cors_response(f):
    @wraps(f)
    def wrapper(*args, **kwargs):
        resp = f(*args, **kwargs)

        reflect_origin = False
        # can't actually pass bools in via env vars, so gotta str -> bool cast
        if os.environ.get("CORS_REFLECT_ORIGIN", "") == "true":
            reflect_origin = True

        origin = None
        event = args[0]
        if "Origin" in event["headers"]:
            origin = event["headers"]["Origin"]
        elif "origin" in event["headers"]:
            origin = event["headers"]["origin"]

        if origin is None:
            return resp

        if "headers" not in resp.keys():
            resp["headers"] = {}

        top_domain = '.'.join(origin.rsplit(".", 2)[-2:]) #foo.bar.baz.com -> baz.com
        if top_domain == "nvisium.com" or reflect_origin:
            resp["headers"]["Access-Control-Allow-Origin"] = origin
        else:
            resp["headers"]["Access-Control-Allow-Origin"] = "https://nvisium.com"

        resp["headers"]["Access-Control-Allow-Credentials"] = "true"
        resp["headers"]["Access-Control-Allow-Headers"] = "ODTP_SESSION"
        resp["headers"]["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"

        return resp

    return wrapper

Most of this code is self-explanatory, but let’s walk through it. First up, calling resp = f(*args, **kwargs) will execute the function I wrap before hooking into the response. Next, I check the value of the CORS_REFLECT_ORIGIN environment variable:

reflect_origin = False
# can't actually pass bools in via env vars, so gotta str -> bool cast
if os.environ.get("CORS_REFLECT_ORIGIN", "") == "true":
    reflect_origin = True

Functionally, this acts as a debug mode switch, only ever being set to "true" within dev environments. Next, I pull the event parameter out of args in order to fish out the request’s Origin header:

origin = None
event = args[0]
if "Origin" in event["headers"]:
    origin = event["headers"]["Origin"]
elif "origin" in event["headers"]:
    origin = event["headers"]["origin"]

if origin is None:
    return resp

Oddly, I ran into cases where the keys of the event["headers"] dictionary were lowercased. I never pinned down exactly when this occurred, but HTTP header names are case-insensitive, so a client is free to send either form, and checking for both "origin" and "Origin" resolved it. Additionally, the check for origin is None exists because not every request carries an Origin header: GET requests don’t trigger pre-flight requests, and same-origin requests or non-browser clients may not send Origin at all. If no Origin header is sent in the request, we don’t set any CORS headers in the response. Finally, I parse the origin domain and add our CORS headers as appropriate:

top_domain = '.'.join(origin.rsplit(".", 2)[-2:]) #foo.bar.baz.com -> baz.com
if top_domain == "nvisium.com" or reflect_origin:
    resp["headers"]["Access-Control-Allow-Origin"] = origin
else:
    resp["headers"]["Access-Control-Allow-Origin"] = "https://nvisium.com"

resp["headers"]["Access-Control-Allow-Credentials"] = "true"
resp["headers"]["Access-Control-Allow-Headers"] = "ODTP_SESSION"
resp["headers"]["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"

return resp

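Specifically, I pull out the last two labels of the origin’s hostname (the registered domain, not just the TLD) and check whether they are nvisium.com. If they are, or if we’re in debug mode, we reflect the origin. Otherwise, we return https://nvisium.com, which does not match the requesting origin, so the browser blocks the request.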
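The string split works, but it has a few quirks. For the bare apex origin https://nvisium.com there is only one dot, so the scheme stays attached and the comparison fails (the else branch happens to return the same value). An origin with an explicit port never matches either. A stricter variant is sketched below. This is not the code we shipped: the helper names, the https-only rule, and the decision to accept the apex domain and any port are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

ALLOWED_APEX = "nvisium.com"


def extract_origin(event):
    """Case-insensitive lookup of the Origin header in a proxy event."""
    headers = event.get("headers") or {}  # tolerate a missing or null map
    for name, value in headers.items():
        if name.lower() == "origin":
            return value
    return None


def is_allowed_origin(origin, apex=ALLOWED_APEX):
    """True for https origins on the apex domain or any subdomain of it."""
    try:
        parts = urlsplit(origin)
        host = parts.hostname  # lowercased, with any port stripped
    except ValueError:
        return False  # malformed origin, e.g. unbalanced IPv6 brackets
    if parts.scheme != "https" or host is None:
        return False
    # The leading dot stops look-alikes such as evilnvisium.com from matching.
    return host == apex or host.endswith("." + apex)


print(is_allowed_origin("https://foo.bar.nvisium.com"))  # True
print(is_allowed_origin("https://evilnvisium.com"))      # False
print(is_allowed_origin("http://foo.nvisium.com"))       # False (not https)
```

Parsing the origin as a URL keeps the scheme, host, and port separate, so each can be checked on its own instead of relying on where the dots fall.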

Integrating with Lambda

Each service contained a cors.py file which handled explicit pre-flight CORS requests:

try:
    from odtp_cors import add_cors_response
except ModuleNotFoundError:
    from odtplayers.util_layer.python.odtp_cors import add_cors_response

@add_cors_response
def lambda_handler(event, context, **kwargs):
    return {"statusCode": 200}

Pretty straightforward. Return a 200 on all OPTIONS pre-flight requests and let the decorator do the heavy lifting. Centralizing this code as a decorator within a Layer also makes it easy to update the allowed domains down the road, instead of requiring me to update a stand-alone CORS response function spread across five different repositories.
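To see the pre-flight handler in action without deploying anything, the sketch below invokes it locally. The decorator is a condensed copy of the one shown earlier, included only so the snippet runs on its own. The event is a hypothetical, heavily trimmed stand-in for what API Gateway really sends, and training.nvisium.com is just an example subdomain.

```python
import os
from functools import wraps


def add_cors_response(f):
    """Condensed copy of the decorator above, kept so this runs standalone."""
    @wraps(f)
    def wrapper(*args, **kwargs):
        resp = f(*args, **kwargs)
        reflect_origin = os.environ.get("CORS_REFLECT_ORIGIN", "") == "true"
        headers = args[0]["headers"]
        origin = headers.get("Origin") or headers.get("origin")
        if origin is None:
            return resp
        resp.setdefault("headers", {})
        top_domain = ".".join(origin.rsplit(".", 2)[-2:])
        if top_domain == "nvisium.com" or reflect_origin:
            resp["headers"]["Access-Control-Allow-Origin"] = origin
        else:
            resp["headers"]["Access-Control-Allow-Origin"] = "https://nvisium.com"
        resp["headers"]["Access-Control-Allow-Credentials"] = "true"
        resp["headers"]["Access-Control-Allow-Headers"] = "ODTP_SESSION"
        resp["headers"]["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE"
        return resp
    return wrapper


@add_cors_response
def lambda_handler(event, context, **kwargs):
    return {"statusCode": 200}


def fake_preflight(origin):
    """Hypothetical, trimmed-down API Gateway proxy event."""
    return {"httpMethod": "OPTIONS", "headers": {"Origin": origin}}


os.environ.pop("CORS_REFLECT_ORIGIN", None)  # make sure debug mode is off

allowed = lambda_handler(fake_preflight("https://training.nvisium.com"), None)
blocked = lambda_handler(fake_preflight("https://evil.example"), None)

print(allowed["headers"]["Access-Control-Allow-Origin"])  # https://training.nvisium.com
print(blocked["headers"]["Access-Control-Allow-Origin"])  # https://nvisium.com
```

Both calls return a 200. The difference is only in the ACAO value, and it is the browser that turns the mismatch in the second case into a blocked request.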

Additionally, this decorator is added to any Lambda functions which handle GET requests. For example:

@add_cors_response
@configure_logger()
@default_error_catch
@require_all_claims(["custom:Role"])
@require_one_role_of(["ODTPAdmin", "OrgAdmin"])
def lambda_handler(event, context, **kwargs):
    logger = kwargs["logger"]
    org_id = event["pathParameters"]["orgId"]

If you’re curious what those other decorators do, or why we’re trying two different imports for the same resource, then stay tuned for future write-ups.
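One detail worth noting in the meantime is the decorator order. Python applies decorators bottom-up, so the outermost one sees the response last. The sketch below uses two made-up stand-ins (they are not our real decorators) to show the practical effect: a response produced by an inner error handler still passes through the outer wrapper, so it would still pick up the CORS headers.

```python
from functools import wraps


def add_marker_header(f):
    """Stand-in for add_cors_response: stamps a header on whatever comes back."""
    @wraps(f)
    def wrapper(event, context, **kwargs):
        resp = f(event, context, **kwargs)
        resp.setdefault("headers", {})["X-Marker"] = "set"
        return resp
    return wrapper


def catch_errors(f):
    """Hypothetical error handler: turns any exception into a 500 response."""
    @wraps(f)
    def wrapper(event, context, **kwargs):
        try:
            return f(event, context, **kwargs)
        except Exception:
            return {"statusCode": 500}
    return wrapper


@add_marker_header  # outermost: runs last on the way out
@catch_errors       # inner: converts the exception into a response first
def lambda_handler(event, context, **kwargs):
    raise RuntimeError("boom")


resp = lambda_handler({}, None)
print(resp)  # {'statusCode': 500, 'headers': {'X-Marker': 'set'}}
```

This matters for CORS because a browser hides any cross-origin response that lacks the right headers from the calling script, error responses included.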

SAM Integration

nVisium has also been leveraging AWS Serverless Application Model (SAM), which is a templating syntax that’s a superset of CloudFormation. This syntax is incredibly powerful, but also very verbose. Due to how the CORS spec works, an OPTIONS method must be defined for every API Gateway route that consumes a POST, PUT, or DELETE.

Originally, I was going to dump a small SAM template that defined a single login function with CORS enabled. However, that “small” template ended up being nearly 200 lines long. Instead, let’s walk through the CORS-specific things you need to define.

First off, I need a way to dynamically specify the Lambda Layer ARN, so let’s use a parameter:

Parameters: 
  UtilLayerArn: 
    Type: String
    Description: The ODTP Util Layer ARN

Next, you’ll need to add the OPTIONS method to each route. Since nVisium also leverages request body verification, our templates use the more verbose API Gateway definitions. For example:

ODTPAuthnAPI:
  Type: AWS::Serverless::Api
  Properties:
    Name: ODTPAuthnAPI
    StageName: Prod
    DefinitionBody:
      swagger: "2.0"
      info:
        title: odtpauthnSwaggerDoc
      x-amazon-apigateway-request-validators:
        body:
          validateRequestBody: true
          validateRequestParameters: false
      paths:
        "/login":
          post:
            consumes:
              - application/json
            x-amazon-apigateway-request-validator: body
            parameters:
              - in: body
                name: Login
                required: true
                schema:
                  type: object
                  properties:
                    username:
                      type: string
                    password:
                      type: string
                  required:
                    - username
                    - password
            x-amazon-apigateway-integration:
              httpMethod: POST
              type: aws_proxy
              uri:
                Fn::Sub: arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${LoginFunction.Arn}/invocations
            responses: {}
          options:
            consumes:
              - application/json
            x-amazon-apigateway-integration:
              httpMethod: POST
              type: aws_proxy
              uri:
                Fn::Sub: arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${CORSFunction.Arn}/invocations
            responses: {}
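Since the options block is identical for every route, one way to cut down on the copy-and-paste is to generate it. The sketch below is not part of our build. It assumes the swagger paths section has already been loaded into a plain Python dict (loading and dumping a template with CloudFormation tags is not shown), and the function names are made up.

```python
import copy

UNSAFE_METHODS = {"post", "put", "delete"}


def options_method(cors_function_logical_id):
    """Build the swagger 'options' entry that proxies to the CORS Lambda."""
    return {
        "consumes": ["application/json"],
        "x-amazon-apigateway-integration": {
            "httpMethod": "POST",
            "type": "aws_proxy",
            "uri": {
                "Fn::Sub": (
                    "arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31"
                    "/functions/${" + cors_function_logical_id + ".Arn}/invocations"
                )
            },
        },
        "responses": {},
    }


def add_options_methods(paths, cors_function_logical_id="CORSFunction"):
    """Return a copy of `paths` with an options method added to every route
    that defines POST, PUT, or DELETE and does not already have one."""
    result = copy.deepcopy(paths)
    for methods in result.values():
        if UNSAFE_METHODS & set(methods) and "options" not in methods:
            methods["options"] = options_method(cors_function_logical_id)
    return result


# Hypothetical routes: only /login needs a pre-flight handler.
paths = {"/login": {"post": {}}, "/health": {"get": {}}}
patched = add_options_methods(paths)
print(sorted(patched["/login"]))   # ['options', 'post']
print(sorted(patched["/health"]))  # ['get']
```

The input dict is left untouched, so the helper can be run repeatedly without stacking changes.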

Then we need to define the CORS Lambda itself:

CORSFunction:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: ./odtpauthn/
    Handler: cors.lambda_handler
    FunctionName: ODTP-Authn-CORS

If this looks a bit terse, it’s because several Properties fields are set in the Globals section:

Globals:
  Function:
    Timeout: 10
    Environment:
      Variables:
        CORS_REFLECT_ORIGIN: "false"
    Layers:
    - !Ref UtilLayerArn
    Runtime: python3.6

Finally, you need to grant permission to API Gateway to invoke the Lambda:

CORSAPIGWLambdaPermission:
  Type: "AWS::Lambda::Permission"
  DependsOn:
    - ODTPAuthnAPI
    - CORSFunction
  Properties:
    Action: lambda:InvokeFunction
    FunctionName: !Ref CORSFunction
    Principal: apigateway.amazonaws.com

And that’s it. Now you’ve got a CORS implementation for your AWS serverless architecture that is far more robust than the turnkey solution offered by AWS.
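Once the stack is deployed, it is worth confirming the behavior from outside a browser. Here is a minimal sketch using only the standard library. The endpoint URL is a placeholder for your own stage’s invoke URL, and the function names are made up for illustration.

```python
import urllib.request


def build_preflight_request(url, origin, method="POST"):
    """Build (but do not send) an OPTIONS pre-flight request."""
    return urllib.request.Request(
        url,
        method="OPTIONS",
        headers={
            "Origin": origin,
            "Access-Control-Request-Method": method,
        },
    )


def fetch_allow_origin(request):
    """Send the request and return the Access-Control-Allow-Origin header.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Access-Control-Allow-Origin")


# Hypothetical endpoint; substitute the invoke URL of your own stage.
req = build_preflight_request(
    "https://api.example.com/Prod/login", "https://training.nvisium.com"
)
print(req.get_method(), req.full_url)
# fetch_allow_origin(req) would then perform the real network call.
```

Running it once with an nvisium.com subdomain and once with an unrelated origin should return the reflected origin in the first case and https://nvisium.com in the second.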

Looking For More?

This is the first post in a new series describing how we solved various security challenges on a serverless AWS architecture. Future installments will cover centralized logging, tracing Lambda invocations, Cognito-backed and decorator-wrapped authorization, and more. Stay tuned!