What I Learned from 1 Year of AWS Custom Resources


The title of this article could also have been “Using AWS Custom Resources” or maybe “AWS Custom Resource Gotchas and How to Avoid Them.” Either way, the goal is to talk about usage and gotchas. This article is meant to supplement the official AWS documentation, linked below in the References. I will highlight a few important aspects of how AWS Custom Resources work.

What is an ACR

An AWS Custom Resource [1] (ACR) is a mechanism in AWS CloudFormation (CFN) that allows an arbitrary payload to be sent to a compute function.

There are two ACR types:

  • AWS Lambda - sync - must always return a response (pass or fail)
  • Amazon SNS Topic - async

This article covers AWS Lambda as this is the common case.

ACRs have a Request [2] and Response [3] structure.

The first thing you learn

The first major lesson of AWS Lambda ACRs is that they must always return a Response. If an ACR errors out and fails to send a Response, the parent CloudFormation (CFN) stack invoking the ACR will hang for 1 hour, then time out and fail.

The ACR Error Kernel

Making sure the ACR always sends a Response could be described as the “Error Kernel”, a concept that Joe Armstrong [11] introduces in his talk “The Do’s and Don’ts of Error Handling” [12].

How to always send a Response

To always send a Response, wrap the invocation of your code in a top-level try/catch where all errors are caught and an ACR Response of fail is sent.

You could wrap your code in the following:

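try:
  result = f(event)         # your resource logic
  send_success(result)      # placeholder: send an ACR Response of success
except Exception as error:  # catch everything, so a Response is always sent
  send_fail(error)          # placeholder: send an ACR Response of fail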
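Fleshed out a little, the wrapper might look like the sketch below. This is not the only way to do it: `do_work` is a hypothetical stand-in for your own logic, `send_response` is my own helper name, and the empty Content-Type header is an assumption based on how the pre-signed response URL is commonly called. The response fields follow the documented response object [3].

```python
import json
import urllib.request


def send_response(event, context, status, data=None, reason=None):
    """PUT the outcome to the pre-signed ResponseURL carried in the request."""
    body = json.dumps({
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch log stream: " + context.log_stream_name,
        "PhysicalResourceId": event.get("PhysicalResourceId") or context.log_stream_name,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }).encode("utf-8")
    request = urllib.request.Request(
        event["ResponseURL"],
        data=body,
        method="PUT",
        headers={"Content-Type": ""},  # assumption: keep the content type empty
    )
    urllib.request.urlopen(request)


def do_work(event):
    """Hypothetical placeholder for your own resource logic (f in the text)."""
    return {"Message": "ok"}


def handler(event, context, work=do_work, send=send_response):
    """Lambda entry point: whatever happens in work(), a Response goes out."""
    try:
        data = work(event)
    except Exception as err:  # the error kernel: nothing escapes unreported
        send(event, context, "FAILED", reason=str(err))
        return
    send(event, context, "SUCCESS", data=data)
```

Passing `work` and `send` as parameters is only there so the wrapper can be exercised locally without touching the network.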

Long Responses

Your code may sometimes generate a long response. Per the current docs, the response must be truncated to a maximum size of 4096 bytes. Otherwise the CFN stack deploy fails with “Custom Resource Response Too Long” and no useful error message from the custom resource.
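One way to stay under the limit is to shorten the Reason text, since in my experience a long error message is the usual culprit. The sketch below assumes that is acceptable for your use case; `fit_response` and `MAX_RESPONSE_BYTES` are names I made up for illustration.

```python
import json

MAX_RESPONSE_BYTES = 4096  # the limit quoted in the text above


def fit_response(response, limit=MAX_RESPONSE_BYTES):
    """Serialize the response, shortening Reason if the body exceeds the limit."""
    body = json.dumps(response).encode("utf-8")
    overflow = len(body) - limit
    if overflow <= 0:
        return body
    reason = response.get("Reason", "")
    # Every character of Reason costs at least one byte once serialized, so
    # cutting `overflow` characters is always enough when Reason is that long.
    if overflow > len(reason):
        # Conservative: Reason alone cannot absorb the overflow, so fail loudly
        # and let the top-level try/catch send a short failure instead.
        raise ValueError("response too large even without a Reason; shrink Data")
    shortened = dict(response, Reason=reason[: len(reason) - overflow])
    return json.dumps(shortened).encode("utf-8")
```

Raising on an oversized Data map, rather than silently dropping keys, is a deliberate choice here: a missing attribute would otherwise surface later as a confusing template error.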

Success case

In the success case, the ACR Response can return data as an arbitrary single-level key/value map.

This data is then referenceable using the CFN intrinsic function `Fn::GetAtt` [8].
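If your code naturally produces nested results, one option is to flatten them before putting them in the Response. The helper below is a sketch only: the name `flatten`, the dotted key convention, and the conversion of values to strings are my own choices, and I would check how your templates reference such keys before relying on the separator.

```python
def flatten(data, prefix="", sep="."):
    """Flatten nested dicts into a single-level map of string values."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse, carrying the parent key as a prefix
            flat.update(flatten(value, prefix=name + sep, sep=sep))
        else:
            flat[name] = str(value)
    return flat
```

For example, `flatten({"Name": "main", "Endpoint": {"Port": 5432}})` yields the keys `Name` and `Endpoint.Port`, each mapped to a string.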

Fail case

In the failure case, an Error Reason and Error Message can be sent.

This will result in a ROLLBACK by your parent CFN stack.

ACR Request

An ACR Request consists of the ACR endpoint, e.g. the AWS Lambda ARN, optional key/value arguments, and metadata.

The Lambda ARN

This payload will be sent as an Event to the Lambda ARN.

Optional Key/Value arguments

These arguments will appear as top level key/value arguments in the ResourceProperties section of the Lambda Event.

The value of these arguments can be a nested data structure.

Metadata

AWS metadata is also sent in the ACR Request. This will include the parent CFN Stack Id, PhysicalResourceId, and so on.
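To make the shape concrete, here is roughly what such an Event looks like once it reaches the Lambda, following the documented request object [2]. All values are made-up placeholders, and `user_arguments` is a hypothetical helper. Note the string `"3"`: as far as I have seen, property values arrive as strings, so treat that as something to verify for your own templates.

```python
SAMPLE_EVENT = {
    "RequestType": "Update",
    "ResponseURL": "https://example.invalid/presigned-url",  # placeholder
    "StackId": "arn:aws:cloudformation:us-east-1:123456789012:stack/demo/guid",
    "RequestId": "unique-request-id",
    "ResourceType": "Custom::Demo",
    "LogicalResourceId": "DemoResource",
    "PhysicalResourceId": "demo-v1",  # sent on Update and Delete
    "ResourceProperties": {
        "ServiceToken": "arn:aws:lambda:us-east-1:123456789012:function:demo",
        "Retries": "3",
        "Tags": {"Team": "platform"},  # nested values are allowed
    },
    "OldResourceProperties": {"Retries": "1"},  # sent on Update only
}


def user_arguments(event):
    """Return the template-supplied arguments without the ServiceToken entry."""
    props = dict(event.get("ResourceProperties", {}))
    props.pop("ServiceToken", None)
    return props
```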

The PhysicalResourceId

This value can be changed in order to trigger replacement behavior. First, CFN processes a RequestType of Update, and the Response carries the new PhysicalResourceId. Then CFN sends an Event with a RequestType of Delete, carrying the old PhysicalResourceId and the previous properties. This Delete is sent after the update has succeeded, during the UPDATE_COMPLETE_CLEANUP_IN_PROGRESS phase.
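One way to control this is to derive the PhysicalResourceId from the properties that define the resource's identity, so the id changes exactly when you want the old resource cleaned up. The sketch below assumes that approach; `physical_id` and `update_is_replacement` are hypothetical helper names, and which properties count as "identity" is your call.

```python
import hashlib
import json


def physical_id(logical_id, identity_props):
    """Derive a stable id from the properties that define the resource."""
    # sort_keys makes the id independent of key order in the properties
    canonical = json.dumps(identity_props, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(canonical).hexdigest()[:12]
    return f"{logical_id}-{digest}"


def update_is_replacement(event, new_physical_id):
    """True when the id about to be returned differs from the one CFN sent."""
    return event.get("PhysicalResourceId") != new_physical_id
```

Logging the result of `update_is_replacement` during an update makes it easier to explain the follow-up delete Event when it shows up later.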

ACR RequestType

There are 3 request types, which correspond to what the parent CFN stack is doing:

  • Create
  • Update
  • Delete
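A small dispatch table keeps the three cases apart. In the sketch below the `on_*` functions are hypothetical placeholders for your own logic; the keys use the spelling from the documented request object [2], where the RequestType values are `Create`, `Update`, and `Delete`.

```python
def on_create(event):
    return {"Action": "created"}  # placeholder logic


def on_update(event):
    return {"Action": "updated"}  # placeholder logic


def on_delete(event):
    return {"Action": "deleted"}  # placeholder logic


HANDLERS = {"Create": on_create, "Update": on_update, "Delete": on_delete}


def dispatch(event):
    """Route the Event to the function matching its RequestType."""
    request_type = event.get("RequestType")
    try:
        handle = HANDLERS[request_type]
    except KeyError:
        # Raise so the top-level try/catch reports a failure to CFN
        raise ValueError(f"unsupported RequestType: {request_type!r}") from None
    return handle(event)
```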

Deploy it

An ACR, like any other AWS resource, can be deployed with CFN [4], AWS SAM [5], or some other deployment method.

Replace an ACR

An ACR can NOT be replaced by tearing it down and rebuilding it if it is referenced in another CFN stack, regardless of whether the ACR is referenced directly or as a variable using an SSM Parameter [9] or CFN ImportValue [10]. You must solve for this.

Rollback

By default, rollback uses the same AWS Lambda code. This is a problem if the Lambda suffers a bad update, e.g. a coding error: the ACR fails, and if the rollback takes the same code path, the rollback fails too. You must solve for this as well.

Some strategies for this could be to use a stable version of the Lambda on rollback, or to always roll forward, and so on.

Finally

Okay, now that I know everything about ACRs, can I go write some code?

Yes!

Thank you.

References

[1] - AWS Custom Resource

[2] - AWS Custom resource request objects

[3] - AWS Custom resource response objects

[4] - AWS::CloudFormation::CustomResource

[5] - AWS Serverless Application Model (SAM)

[6] - AWS::Lambda::Function

[7] - AWS::Lambda::LayerVersion

[8] - Fn::GetAtt

[9] - AWS::SSM::Parameter

[10] - Fn::ImportValue

[11] - Joe Armstrong

[12] - The Do’s and Don’ts of Error Handling
