Managing shared resources across AWS CDK stacks can lead to deployment deadlocks due to cross-stack references. This article explores strategies to decouple stacks to prevent the “Deadly Embrace”.
Introduction
This article covers intermediate topics related to AWS CloudFormation and AWS CDK. Before diving into the topic, ensure you have a solid understanding of:
- AWS CloudFormation (CFn) basic concepts and Infrastructure as Code (IaC) in general.
- How to synthesize and deploy CFn templates using AWS CDK.
- A CDK-supported programming language such as TypeScript (helpful, but not strictly required).
If you're new to any of these topics, don't worry. Check out the AWS documentation and experiment a bit with CloudFormation and CDK or follow one of the linked tutorials. After that you should feel confident tackling the deadly embrace.
When starting a new CDK project, managing all resources in a single stack might seem ideal—and under the right conditions, it can be. However, as requirements grow, so does complexity.
You may need to split your single stack into multiple stacks for several reasons. Perhaps you want to speed up deployments by only updating specific resources. You might want to handle stateful resources like databases more carefully by separating them and implementing specific approval processes. Grouping resources into separate stacks reduces the blast radius of each deployment and improves scalability and maintenance.
Team structure provides another reason for splitting stacks. As your project and infrastructure grow, your team typically expands too. Different teams may be responsible for distinct parts of your system—networking, databases, application services, and so on. Structuring stacks around these ownership domains enables teams to work independently and enforces clear separation of concerns, reducing conflicts.
Sometimes you might even hit the resource limit of your single stack and need to refactor it. Whatever the reason, you'll likely need to handle multiple stacks and cross-stack dependencies sooner or later.
The Problem
When working with AWS CDK, you might have been delighted to find that, unlike plain CloudFormation, it makes cross-stack resource references straightforward. Instead of explicitly creating an Output in the "producing" stack and importing it with Fn::ImportValue in the "consuming" stack, you can simply pass references between stacks as arguments in your chosen programming language. However, this convenience comes with risks. The Output (with its export) and the Fn::ImportValue still exist in the synthesized CloudFormation templates; they are just hidden from your CDK code. While the initial deployment works well, removing a cross-stack reference later can cause problems.
Let's look at an example: A ProducingStack creates an S3 Bucket MyBucket, and a ConsumingStack creates a Lambda Function MyLambda. The Lambda Function needs to read from MyBucket, so we use the Bucket construct's grantRead method to grant access. We pass the Bucket construct to the ConsumingStack and call props.bucket.grantRead(lambdaFunction).
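import * as cdk from 'aws-cdk-lib';
import { Stack, StackProps } from 'aws-cdk-lib';
import { Function } from 'aws-cdk-lib/aws-lambda';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

class ProducingStack extends Stack {
  public readonly bucket: Bucket;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    this.bucket = new Bucket(this, 'MyBucket', {
      // omitted for brevity
    });
  }
}

// The bucket is handed over as a construct reference.
interface ConsumingStackProps extends StackProps {
  bucket: Bucket;
}

class ConsumingStack extends Stack {
  constructor(scope: Construct, id: string, props: ConsumingStackProps) {
    super(scope, id, props);
    const lambdaFunction = new Function(this, 'MyLambda', {
      // omitted for brevity
    });
    // This call makes CDK generate the hidden export/import pair.
    props.bucket.grantRead(lambdaFunction);
  }
}

const app = new cdk.App();
const producingStack = new ProducingStack(app, 'ProducingStack');
new ConsumingStack(app, 'ConsumingStack', {
  bucket: producingStack.bucket,
});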
Later, when MyLambda is modified to read from a different source, we remove the grantRead call since it's no longer needed. But when we try to deploy, we hit a problem. The ProducingStack deployment fails with "Export cannot be deleted as it is in use by another Stack." This is unexpected, since we haven't changed the ProducingStack code. The issue lies in the synthesized CFn templates: removing grantRead removed the last use of the bucket in ConsumingStack, so CDK also dropped the implicit Output (and its export) from the ProducingStack template. CloudFormation tries to delete this export but can't, because the deployed ConsumingStack still imports it. And the updated ConsumingStack, which no longer imports it, never gets deployed because ProducingStack must be deployed first. The result is a deadly embrace.
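To make the mechanics concrete, here is a small toy model of the rule that causes the failure. It does not use CDK or CloudFormation; the FakeCloudFormation class, the StackTemplate shape and the export name MyBucketArn are all made up for illustration, and the real service checks far more than this.

```typescript
// Toy model (an assumption, not an AWS API): an export may only be removed
// while no other deployed stack imports it.
interface StackTemplate {
  exports: string[];
  imports: string[];
}

class FakeCloudFormation {
  private deployed: Record<string, StackTemplate> = {};

  deploy(stackName: string, template: StackTemplate): void {
    const current = this.deployed[stackName];
    const previousExports = current ? current.exports : [];
    for (const exportName of previousExports) {
      if (template.exports.indexOf(exportName) !== -1) continue; // export is kept
      for (const otherName of Object.keys(this.deployed)) {
        const importsIt = this.deployed[otherName].imports.indexOf(exportName) !== -1;
        if (otherName !== stackName && importsIt) {
          throw new Error(`Export ${exportName} cannot be deleted as it is in use by ${otherName}`);
        }
      }
    }
    this.deployed[stackName] = template;
  }
}

const cfn = new FakeCloudFormation();

// Initial deployment: producer exports, consumer imports.
cfn.deploy('ProducingStack', { exports: ['MyBucketArn'], imports: [] });
cfn.deploy('ConsumingStack', { exports: [], imports: ['MyBucketArn'] });

// After removing grantRead, neither synthesized template mentions the export.
// The producer is deployed first and fails; the consumer update is never reached.
let deadlockMessage = '';
try {
  cfn.deploy('ProducingStack', { exports: [], imports: [] });
  cfn.deploy('ConsumingStack', { exports: [], imports: [] });
} catch (error) {
  deadlockMessage = (error as Error).message;
}
console.log(deadlockMessage);
```

In this model, the deadlock only disappears if the consumer stops importing before the producer stops exporting, which is exactly the order a normal producer-first deployment does not give you.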
Escaping the "Deadly Embrace"
If you are reading this blog post, there is a good chance that you have already run into the deadly embrace and are currently stuck with cross-stack references blocking your deployment. While this article is more concerned with preventing the deadlock by reorganizing your infrastructure code, you might be interested in this great article by Adam Ruka on how to unblock cross-stack references.
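One way out that the CDK itself offers is the exportValue method on Stack, used in two deployments. The sketch below shows the first one. It assumes that the value CDK exported implicitly is the bucket ARN (which is what grantRead references); check the Outputs section of your own synthesized template before relying on that.

```typescript
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

class ProducingStack extends Stack {
  public readonly bucket: Bucket;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    this.bucket = new Bucket(this, 'MyBucket');

    // Deployment 1: the consumer no longer references the bucket, but this
    // call keeps the export in the producer's template so the update succeeds.
    // Deployment 2 (later): delete this line to remove the now unused export.
    this.exportValue(this.bucket.bucketArn);
  }
}

const app = new App();
new ProducingStack(app, 'ProducingStack');
```

After the first deployment, ConsumingStack no longer imports anything, so the second deployment can drop the export safely.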
Preventing the "Deadly Embrace"
Now, on to the topic of actually preventing this insidious roadblock: I want to present three strategies from the trenches that might help you. Be aware, though, that each of these strategies brings its own pitfalls.
Strategy 1: Don't Share
No, seriously: avoid cross-stack references as much as possible. Use nested stacks if you hit the resource limit in your synthesized CloudFormation templates, or create custom constructs if you want logical groups of resources to structure your code. Also consider duplicating resources you can afford to duplicate. Of course, some resources such as VPCs, databases and ECS/EKS clusters often still have to be shared across stacks, but others such as IAM roles and policies, CloudWatch log groups or S3 buckets (with non-critical data) can easily be duplicated.
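As a rough sketch of the custom construct option, the example from earlier can be regrouped into two constructs that live in one stack. The names Storage, Processing and ApplicationStack are invented for this illustration, and the Lambda runtime and inline handler are placeholders.

```typescript
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Code, Function, Runtime } from 'aws-cdk-lib/aws-lambda';
import { Bucket, IBucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

// Hypothetical construct that groups the storage resources.
class Storage extends Construct {
  public readonly bucket: Bucket;

  constructor(scope: Construct, id: string) {
    super(scope, id);
    this.bucket = new Bucket(this, 'MyBucket');
  }
}

interface ProcessingProps {
  readonly bucket: IBucket;
}

// Hypothetical construct that groups the compute resources.
class Processing extends Construct {
  constructor(scope: Construct, id: string, props: ProcessingProps) {
    super(scope, id);
    const lambdaFunction = new Function(this, 'MyLambda', {
      runtime: Runtime.NODEJS_20_X, // placeholder runtime
      handler: 'index.handler',
      code: Code.fromInline('exports.handler = async () => "ok";'), // placeholder code
    });
    props.bucket.grantRead(lambdaFunction);
  }
}

// Both constructs share one stack, so the reference stays inside one template.
class ApplicationStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    const storage = new Storage(this, 'Storage');
    new Processing(this, 'Processing', { bucket: storage.bucket });
  }
}

const app = new App();
new ApplicationStack(app, 'ApplicationStack');
```

The code keeps its logical grouping, but because the bucket and the function end up in the same template, removing grantRead later is an ordinary single-stack update.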
Pro:
- Eliminates or at least greatly reduces cross-stack dependencies, and with them the risk of a deadly embrace.
- Self-contained stacks are easier to debug and maintain.
- Self-contained stacks can be deployed in parallel, thereby increasing deployment speed.
Contra:
- Having to duplicate resources can lead to higher costs.
- Synchronizing data across your system might prove challenging.
- There are some resources you have to manage in a centralized location. Specifically, networking and routing related resources like VPCs. You can't fully decouple these types of stacks.
When to use this strategy?
Use this strategy for resources that can easily be duplicated without much overhead or cost.
When to avoid this strategy?
Don’t use this strategy when resource duplication could cause trouble, such as increased costs or data that has to be kept in sync.
Strategy 2: Decouple by hard-coding predictable values
When using CDK, you will notice that most constructs offer static methods (such as Bucket.fromBucketName) that create a construct representing an existing resource. These helper methods generally require you to provide an ARN (Amazon Resource Name) or a resource identifier (like a bucket name). Both ARNs and resource names tend to be predictable and (in practice) rarely change, provided you assign the names explicitly instead of letting CDK generate them.
In your infrastructure code, you can take advantage of this when importing your shared resources and hard-code the required identifiers. This way you avoid hard dependencies on other stacks that may lead to a deadly embrace. Be aware that you still have a (soft) dependency on the imported resource. If the resource your stack depends on does not exist, then your deployment will still fail.
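Considering the previous code example again, the shared S3 bucket has a predictable ARN. It always follows the format arn:<PARTITION>:s3:::<NAME-OF-YOUR-BUCKET>, e.g. arn:aws:s3:::my-bucket. In the case of a bucket, we can even reference it just by its bucket name. Note that the name is only predictable if ProducingStack sets it explicitly; otherwise CDK generates one. S3 bucket names must also be lowercase and globally unique, so my-bucket is only a placeholder here.
import * as cdk from 'aws-cdk-lib';
import { Stack, StackProps } from 'aws-cdk-lib';
import { Function } from 'aws-cdk-lib/aws-lambda';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { Construct } from 'constructs';

class ProducingStack extends Stack {
  public readonly bucket: Bucket;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    this.bucket = new Bucket(this, 'MyBucket', {
      bucketName: 'my-bucket', // fixed physical name, so consumers can hard-code it
      // omitted for brevity
    });
  }
}

class ConsumingStack extends Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);
    const lambdaFunction = new Function(this, 'MyLambda', {
      // omitted for brevity
    });
    // Alternative: import by ARN instead of by name.
    // const myBucket = Bucket.fromBucketArn(this, 'MyBucket', 'arn:aws:s3:::my-bucket');
    const myBucket = Bucket.fromBucketName(this, 'MyBucket', 'my-bucket');
    myBucket.grantRead(lambdaFunction);
  }
}

const app = new cdk.App();
new ProducingStack(app, 'ProducingStack');
new ConsumingStack(app, 'ConsumingStack');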
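If you go down this route, it can help to keep the hard-coded identifiers in one place rather than scattering string literals across stacks. The following is a minimal sketch of that idea; the helper s3BucketArn, the constant SHARED_BUCKET_NAME and the bucket name are invented for this example and are not part of CDK.

```typescript
// AWS partitions commonly seen in ARNs; extend the union if you deploy elsewhere.
type Partition = 'aws' | 'aws-cn' | 'aws-us-gov';

// Hypothetical helper: S3 bucket ARNs carry neither a region nor an account ID,
// so partition and bucket name are enough to build one.
function s3BucketArn(bucketName: string, partition: Partition = 'aws'): string {
  return `arn:${partition}:s3:::${bucketName}`;
}

// Single source of truth for the shared identifier (placeholder name).
const SHARED_BUCKET_NAME = 'my-bucket';

console.log(s3BucketArn(SHARED_BUCKET_NAME));
console.log(s3BucketArn(SHARED_BUCKET_NAME, 'aws-cn'));
```

Both the producing and the consuming stack can then import the same constant, so a rename only has to happen in one file.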
Pro:
- Makes imports explicit.
- Eliminates cross-stack dependencies, and with them the risk of a deadly embrace.
- Self-contained stacks are easier to debug and maintain.
- Self-contained stacks can be deployed in parallel, thereby increasing deployment speed.
Contra:
- This strategy is only suitable for "stable" resources that will rarely or never change.
- In case the resource does change you have to remember to update the references as well.
- Can lead to misconfiguration and configuration drift if not properly documented.
When to use this strategy?
Use this strategy for stable resources with predictable names that are very unlikely to change.
When to avoid this strategy?
Don't use this strategy for resources that change frequently. Also, never expose secrets in your code.
Strategy 3: Decouple using SSM Parameter Store
In a very similar approach to Strategy 2 you can also make use of the AWS SSM Parameter Store to make resource identifiers accessible across stacks. The Parameter Store is an AWS service that allows you to securely store configuration data and secrets in a centralised location. Storing your configurations or secrets here will enable you to reference them in your code and change any values on demand without having to change your code. This approach is generally superior to hardcoding values in your code as it offers you more flexibility by allowing updates without modifying your stack.
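In the code example below you can see that we now fetch the bucket’s name from the SSM Parameter Store instead of hard-coding it. How the value is read matters, though. A synth-time lookup (StringParameter.valueFromLookup) results in a request to your target AWS account during synthesis. This might require additional permissions in your CI/CD environment so that your pipelines have read access to the Parameter Store. It also carries a risk of misconfiguration, because CDK caches the fetched parameter values in the local cdk.context.json. Reading the value through StringParameter.fromStringParameterName(...).stringValue, as the example does, instead lets CloudFormation resolve the parameter at deploy time and does not involve cdk.context.json.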
import * as cdk from 'aws-cdk-lib';
import { Stack, StackProps } from 'aws-cdk-lib';
import { Function } from 'aws-cdk-lib/aws-lambda';
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { StringParameter } from 'aws-cdk-lib/aws-ssm';
import { Construct } from 'constructs';

class ProducingStack extends Stack {
  public readonly bucket: Bucket;

  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    this.bucket = new Bucket(this, 'MyBucket', {
      // omitted for brevity
    });
    // Publish the identifiers so that other stacks can read them.
    new StringParameter(this, 'BucketArn', {
      parameterName: 'MyBucketArn',
      stringValue: this.bucket.bucketArn,
    });
    new StringParameter(this, 'BucketName', {
      parameterName: 'MyBucketName',
      stringValue: this.bucket.bucketName,
    });
  }
}

class ConsumingStack extends Stack {
  constructor(scope: Construct, id: string) {
    super(scope, id);
    const lambdaFunction = new Function(this, 'MyLambda', {
      // omitted for brevity
    });
    // Alternative: read the ARN parameter and import the bucket by ARN.
    // const myBucketArnParam = StringParameter.fromStringParameterName(this, 'MyBucketArn', 'MyBucketArn');
    // const myBucket = Bucket.fromBucketArn(this, 'MyBucket', myBucketArnParam.stringValue);
    const myBucketNameParam = StringParameter.fromStringParameterName(this, 'MyBucketName', 'MyBucketName');
    const myBucket = Bucket.fromBucketName(this, 'MyBucket', myBucketNameParam.stringValue);
    myBucket.grantRead(lambdaFunction);
  }
}

const app = new cdk.App();
new ProducingStack(app, 'ProducingStack');
new ConsumingStack(app, 'ConsumingStack');
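The example above reads the parameter through fromStringParameterName, which CloudFormation resolves at deploy time. The cdk.context.json caching discussed in this section comes into play when the value is looked up during synthesis instead. A minimal sketch of that variant follows; the class name LookupConsumingStack and the account and region values are placeholders, and a lookup like this needs a stack with an explicit environment.

```typescript
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Bucket, IBucket } from 'aws-cdk-lib/aws-s3';
import { StringParameter } from 'aws-cdk-lib/aws-ssm';
import { Construct } from 'constructs';

class LookupConsumingStack extends Stack {
  public readonly bucketName: string;
  public readonly bucket: IBucket;

  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);
    // Looked up by the CDK CLI during synthesis and cached in cdk.context.json.
    // Until the lookup has run, CDK hands back a placeholder value.
    this.bucketName = StringParameter.valueFromLookup(this, 'MyBucketName');
    this.bucket = Bucket.fromBucketName(this, 'MyBucket', this.bucketName);
    // Grant access as before, e.g. this.bucket.grantRead(lambdaFunction).
  }
}

const app = new App();
const consumingStack = new LookupConsumingStack(app, 'ConsumingStack', {
  // Placeholder environment; context lookups require a concrete account and region.
  env: { account: '123456789012', region: 'eu-central-1' },
});
console.log(consumingStack.bucketName);
```

With this variant the bucket name ends up as a literal in the synthesized template, so a stale cache entry silently points the stack at the old bucket until the context is refreshed.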
Pro:
- Makes imports explicit.
- Decouples the stacks, but in a more flexible way than hard-coding values.
- Referenced identifiers can be updated in a centralized location.
- Self-contained stacks are easier to debug and maintain.
- Self-contained stacks can be deployed in parallel, thereby increasing deployment speed.
Contra:
- Can lead to misconfiguration when not properly documented.
- Can lead to misconfiguration when your cdk.context.json becomes stale (relevant when values are looked up at synth time).
When to use this strategy?
Use this strategy to decouple stacks that use shared resources which might change over time. If you look up values at synth time, it is also a good idea to commit your cdk.context.json to your project's git repository to prevent misconfigurations.
When to avoid this strategy?
Be careful, as this approach might create significant overhead when used at scale - particularly in an environment with multiple AWS accounts. Each new value in the Parameter Store becomes a piece of configuration you have to document, secure and maintain.
Conclusion
To recap: when working with AWS CDK, and by proxy with AWS CloudFormation, you will sooner or later need to split your infrastructure into multiple stacks. These stacks will likely depend on each other. It is then your job to handle these cross-stack dependencies without causing any deadlocks. In this article we covered three strategies to prevent the so-called deadly embrace:
- Don't Share: avoid cross-stack dependencies in general wherever possible.
- Decouple by hard-coding predictable values: avoid cross-stack dependencies by hard-coding values rather than sharing them.
- Decouple using SSM Parameter Store: avoid cross-stack dependencies by replacing them with references to Parameter Store rather than sharing them.
You will very likely use a mixture of these strategies based on your specific architecture, team structure and requirements. In the end, you have to make a deliberate decision about how much risk of a deadly embrace you can tolerate, weighing it against the pitfalls of each strategy.