Custom Runtime Cold Start Optimization
AWS Lambda cold starts can kill application performance. Here’s how we achieved 90% faster cold starts using custom runtimes and shell scripts.
Understanding Lambda Cold Starts
A cold start happens when AWS Lambda creates a new execution environment for your function. This includes:
- Environment setup - downloading your code
- Runtime initialization - starting the language runtime
- Function initialization - loading your code and dependencies
Standard runtimes like Node.js, Python, and Java have significant initialization overhead. Custom runtimes let you optimize this process.
Cold Start Performance Comparison
Real Production Metrics
Node.js 18.x Runtime:
Init Duration: 247.82 ms
Duration: 1,240 ms
Memory Used: 152 MB
Python 3.11 Runtime:
Init Duration: 189.45 ms
Duration: 890 ms
Memory Used: 128 MB
Shell Custom Runtime:
Init Duration: 22.11 ms
Duration: 316 ms
Memory Used: 36 MB
Results:
- 90% faster cold starts vs Node.js
- 88% faster cold starts vs Python
- 75% less memory usage
- Significantly lower costs
Why Custom Runtimes Are Faster
1. Minimal Bootstrap Process
Standard runtimes initialize entire language environments. Custom runtimes start with just what you need:
#!/bin/bash
# Custom runtime bootstrap - the entire event loop
while true; do
  HEADERS=$(mktemp)
  EVENT=$(curl -sS -D "$HEADERS" "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/next")
  REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '\r' | cut -d: -f2 | tr -d ' ')
  RESPONSE=$(./handler.sh "$EVENT")
  curl -sS "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/$REQUEST_ID/response" \
    -d "$RESPONSE"
  rm -f "$HEADERS"
done
2. No Dependency Loading
Node.js loads npm modules, Python imports packages. Shell scripts use pre-compiled binaries:
# These are already compiled C binaries
curl -sS "https://api.example.com/data" | jq '.results[]'
3. Smaller Image Sizes
Our custom runtime images:
- tiny: 132MB (jq, curl, http-cli)
- micro: 221MB (adds AWS tools)
- full: 417MB (complete AWS CLI)
Compare to standard runtimes:
- Node.js: 500MB+
- Python: 400MB+
- Java: 800MB+
Lambda Cold Start Optimization Techniques
1. Use Provisioned Concurrency Strategically
For critical functions, provisioned concurrency eliminates cold starts entirely:
resource "aws_lambda_provisioned_concurrency_config" "example" {
function_name = aws_lambda_function.example.function_name
provisioned_concurrent_executions = 5
qualifier = aws_lambda_function.example.version
}
Cost consideration: Only use for high-traffic, latency-sensitive functions.
2. Optimize Memory Allocation
Lambda allocates CPU power in proportion to memory, so more memory means more CPU and faster initialization:
resource "aws_lambda_function" "optimized" {
memory_size = 1024 # Sweet spot for most functions
timeout = 30
}
Benchmark your functions to find the optimal memory setting.
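One hedged way to run that benchmark is a small AWS CLI wrapper; the function name, payload, and memory sizes below are placeholders, and `--cli-binary-format` assumes AWS CLI v2:

```shell
#!/bin/bash
# Sketch: redeploy a function at several memory sizes and time a test
# invocation at each (function name and sizes are placeholders).
benchmark_memory() {
  local fn="$1"; shift
  local size start end
  for size in "$@"; do
    aws lambda update-function-configuration \
      --function-name "$fn" --memory-size "$size" >/dev/null
    aws lambda wait function-updated --function-name "$fn"
    start=$(date +%s%3N)
    aws lambda invoke --function-name "$fn" \
      --cli-binary-format raw-in-base64-out \
      --payload '{}' /dev/null >/dev/null
    end=$(date +%s%3N)
    echo "${size}MB: $((end - start))ms"
  done
}
# Usage: benchmark_memory my-function 256 512 1024
```

A single timed invocation is noisy; repeat each size several times, and force a fresh environment (for example by updating an environment variable) if cold-start time is what you are measuring.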
3. Minimize Package Size
Smaller packages download faster:
# Multi-stage build for minimal images
# Note: apk-installed curl/jq are dynamically linked; FROM scratch only
# works with statically linked builds (or copy their shared libraries too)
FROM alpine:latest AS builder
RUN apk add --no-cache curl jq
FROM scratch
COPY --from=builder /usr/bin/curl /usr/bin/curl
COPY --from=builder /usr/bin/jq /usr/bin/jq
4. Use Lambda Layers Efficiently
Layers are cached across functions, reducing cold start time:
resource "aws_lambda_layer_version" "tools" {
filename = "tools-layer.zip"
layer_name = "common-tools"
source_code_hash = filebase64sha256("tools-layer.zip")
compatible_runtimes = ["provided.al2"]
}
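Layers are extracted under /opt in the execution environment, and /opt/bin is on the PATH for custom runtimes, so the tools-layer.zip referenced above can be assembled roughly like this (the jq copy assumes a statically linked binary):

```shell
#!/bin/bash
# Sketch: assemble tools-layer.zip with binaries under bin/ so they land
# on the function's PATH at /opt/bin (binary choice is illustrative).
build_tools_layer() {
  local staging
  staging=$(mktemp -d)
  mkdir -p "$staging/bin"
  cp "$(command -v jq)" "$staging/bin/"
  (cd "$staging" && zip -qr "$OLDPWD/tools-layer.zip" .)
  rm -rf "$staging"
}
```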
Building Your Own Fast Custom Runtime
Step 1: Create the Bootstrap
#!/bin/bash
# bootstrap
set -euo pipefail
while true; do
  HEADERS=$(mktemp)
  EVENT=$(curl -sS -LD "$HEADERS" \
    "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/next")
  REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '\r' | cut -d: -f2 | tr -d ' ')
  RESPONSE=$(timeout 30s ./handler.sh "$EVENT" 2>&1 || echo '{"error":"timeout"}')
  curl -sS "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/$REQUEST_ID/response" \
    -d "$RESPONSE"
  rm -f "$HEADERS"
done
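The Runtime API also exposes an error endpoint; a hedged refinement of the loop above is to report handler failures there instead of posting them as a normal response:

```shell
#!/bin/bash
# Sketch: report a failed invocation through the Runtime API error
# endpoint (same REQUEST_ID as in the bootstrap loop above).
post_error() {
  local request_id="$1" message="$2"
  curl -sS "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/$request_id/error" \
    -H 'Lambda-Runtime-Function-Error-Type: Runtime.HandlerError' \
    -d "{\"errorMessage\": \"$message\", \"errorType\": \"Runtime.HandlerError\"}"
}
# In the loop:
#   RESPONSE=$(./handler.sh "$EVENT") || post_error "$REQUEST_ID" "handler failed"
```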
Step 2: Create Your Handler
#!/bin/bash
# handler.sh
EVENT="$1"
NAME=$(echo "$EVENT" | jq -r '.name // "World"')
echo '{"message": "Hello, '"$NAME"'!", "timestamp": "'$(date -Iseconds)'"}'
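The hand-built echo above breaks if the event's name contains quotes; letting jq construct the response handles escaping (a drop-in alternative, assuming jq is present):

```shell
#!/bin/bash
# Safer response construction: jq escapes interpolated values for us
NAME="World"
RESPONSE=$(jq -n --arg name "$NAME" --arg ts "$(date -Iseconds)" \
  '{message: ("Hello, " + $name + "!"), timestamp: $ts}')
echo "$RESPONSE"
```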
Step 3: Build Optimized Container
FROM public.ecr.aws/lambda/provided:al2023
# al2023 images use dnf (yum is only a compatibility alias); curl ships with the base image
RUN dnf install -y jq && dnf clean all
COPY bootstrap handler.sh ./
RUN chmod +x bootstrap handler.sh
CMD ["handler.sh"]
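The provided base images bundle the Lambda Runtime Interface Emulator, so the container can be smoke-tested locally before deploying; the image tag, container name, and port here are placeholders:

```shell
#!/bin/bash
# Sketch: invoke the function locally through the Runtime Interface
# Emulator built into the base image (tag/port are placeholders).
invoke_local() {
  docker run -d --rm -p 9000:8080 --name runtime-test my-fast-runtime
  sleep 1
  curl -sS -d '{"name": "local"}' \
    "http://localhost:9000/2015-03-31/functions/function/invocations"
  docker stop runtime-test >/dev/null
}
```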
Advanced Cold Start Optimization
1. Caching in /tmp
A shell process can't hold connections open between invocations, but /tmp persists across warm invocations, so cache expensive responses there:
# Cache an expensive call in /tmp across warm invocations
CACHE_FILE="/tmp/api_response"
if [[ ! -f "$CACHE_FILE" ]]; then
  curl -sS --connect-timeout 5 "https://api.example.com/health" > "$CACHE_FILE"
fi
2. Lazy Loading
Load resources only when needed:
load_config() {
if [[ -z "${CONFIG:-}" ]]; then
CONFIG=$(aws ssm get-parameter --name "/app/config" --query 'Parameter.Value' --output text)
fi
echo "$CONFIG"
}
3. Warm-up Strategies
Keep functions warm with scheduled invocations:
resource "aws_cloudwatch_event_rule" "warmup" {
name = "lambda-warmup"
description = "Keep Lambda functions warm"
schedule_expression = "rate(5 minutes)"
}
resource "aws_cloudwatch_event_target" "lambda" {
rule = aws_cloudwatch_event_rule.warmup.name
target_id = "WarmupTarget"
arn = aws_lambda_function.example.arn
input = jsonencode({"warmup": true})
}
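Two notes on the wiring above: EventBridge also needs an aws_lambda_permission before it can invoke the function, and the handler should short-circuit the {"warmup": true} payload before doing real work. A hedged check (assuming jq) might look like:

```shell
#!/bin/bash
# Detect the warmup payload so the handler can return immediately
is_warmup() {
  echo "$1" | jq -e '.warmup == true' >/dev/null 2>&1
}
# In handler.sh, before the real work:
#   if is_warmup "$EVENT"; then echo '{"warmed": true}'; exit 0; fi
```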
Monitoring Cold Start Performance
CloudWatch Metrics
Track these key values from each invocation's REPORT log line:
- Init Duration - cold start time (present only on cold starts)
- Duration - total execution time
- Max Memory Used - actual memory usage
Custom Metrics
# In your function
START_TIME=$(date +%s%3N)
# ... function logic ...
END_TIME=$(date +%s%3N)
DURATION=$((END_TIME - START_TIME))
aws cloudwatch put-metric-data \
--namespace "Lambda/Performance" \
--metric-data MetricName=ExecutionTime,Value=$DURATION,Unit=Milliseconds
Cost Impact of Cold Start Optimization
Before optimization (Node.js):
- Average duration: 1,240ms
- Memory: 152MB
- Monthly cost (1M requests): $52.18
After optimization (Custom runtime):
- Average duration: 316ms
- Memory: 36MB
- Monthly cost (1M requests): $8.53
Savings: 84% cost reduction
When Cold Start Optimization Matters Most
Critical use cases:
- User-facing APIs - every millisecond counts
- Real-time processing - IoT, streaming data
- Synchronous workflows - API Gateway integrations
- Cost-sensitive applications - high-volume, low-margin
Less critical:
- Batch processing - cold start amortized over long runs
- Scheduled tasks - no user is waiting, and provisioned concurrency is an option
- Internal tools - users tolerate some latency
Getting Started
Try our optimized custom runtime:
# Use our pre-built runtime
docker pull ghcr.io/ql4b/lambda-shell-runtime:tiny
# Or build your own
git clone https://github.com/ql4b/lambda-shell-runtime
cd lambda-shell-runtime
docker build -t my-fast-runtime .
Result: 90% faster cold starts, 84% lower costs, same functionality.
Lambda cold start optimization isn’t just about performance - it’s about cost efficiency and user experience. Custom runtimes give you the control to optimize for what matters most.