Custom Runtime Cold Start Optimization
AWS Lambda cold starts can kill application performance. Here’s how we achieved 90% faster cold starts using custom runtimes and shell scripts.
Understanding Lambda Cold Starts
A cold start happens when AWS Lambda creates a new execution environment for your function. This includes:
- Environment setup - downloading your code
- Runtime initialization - starting the language runtime
- Function initialization - loading your code and dependencies
Standard runtimes like Node.js, Python, and Java have significant initialization overhead. Custom runtimes let you optimize this process.
Cold Start Performance Comparison
Real Production Metrics
Node.js 18.x Runtime:
Init Duration: 247.82 ms
Duration: 1,240 ms
Memory Used: 152 MB
Python 3.11 Runtime:
Init Duration: 189.45 ms
Duration: 890 ms
Memory Used: 128 MB
Shell Custom Runtime:
Init Duration: 22.11 ms
Duration: 316 ms
Memory Used: 36 MB
Results:
- 90% faster cold starts vs Node.js
- 88% faster cold starts vs Python
- 75% less memory usage
- Significantly lower costs
Why Custom Runtimes Are Faster
1. Minimal Bootstrap Process
Standard runtimes initialize entire language environments. Custom runtimes start with just what you need:
#!/bin/bash
# Custom runtime bootstrap - the entire event loop
while true; do
  HEADERS=$(mktemp)
  EVENT=$(curl -sS -D "$HEADERS" "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/next")
  REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '\r' | cut -d: -f2 | tr -d ' ')
  RESPONSE=$(./handler.sh "$EVENT")
  curl -sS "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/$REQUEST_ID/response" \
    -d "$RESPONSE"
  rm -f "$HEADERS"
done
2. No Dependency Loading
Node.js loads npm modules, Python imports packages. Shell scripts use pre-compiled binaries:
# These are already compiled C binaries
curl -sS "https://api.example.com/data" | jq '.results[]'
3. Smaller Image Sizes
Our custom runtime images:
- tiny: 132MB (jq, curl, http-cli)
- micro: 221MB (adds AWS tools)
- full: 417MB (complete AWS CLI)
Compare to standard runtimes:
- Node.js: 500MB+
- Python: 400MB+
- Java: 800MB+
Lambda Cold Start Optimization Techniques
1. Use Provisioned Concurrency Strategically
For critical functions, provisioned concurrency eliminates cold starts entirely:
resource "aws_lambda_provisioned_concurrency_config" "example" {
function_name = aws_lambda_function.example.function_name
provisioned_concurrent_executions = 5
qualifier = aws_lambda_function.example.version
}
Cost consideration: Only use for high-traffic, latency-sensitive functions.
2. Optimize Memory Allocation
Lambda allocates CPU power in proportion to memory, so more memory means more CPU and faster initialization:
resource "aws_lambda_function" "optimized" {
memory_size = 1024 # Sweet spot for most functions
timeout = 30
}
Benchmark your functions to find the optimal memory setting.
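One hedged way to run that benchmark is a small AWS CLI wrapper; the function name, payload, and memory sizes below are placeholders, and `--cli-binary-format` assumes AWS CLI v2:

```shell
#!/bin/bash
# Sketch: redeploy a function at several memory sizes and time a test
# invocation at each (function name and sizes are placeholders).
benchmark_memory() {
  local fn="$1"; shift
  local size start end
  for size in "$@"; do
    aws lambda update-function-configuration \
      --function-name "$fn" --memory-size "$size" >/dev/null
    aws lambda wait function-updated --function-name "$fn"
    start=$(date +%s%3N)
    aws lambda invoke --function-name "$fn" \
      --cli-binary-format raw-in-base64-out \
      --payload '{}' /dev/null >/dev/null
    end=$(date +%s%3N)
    echo "${size}MB: $((end - start))ms"
  done
}
# Usage: benchmark_memory my-function 256 512 1024
```

A single timed invocation is noisy; repeat each size several times, and force a fresh environment (for example by updating an environment variable) if cold-start time is what you are measuring.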
3. Minimize Package Size
Smaller packages download faster:
# Multi-stage build for minimal images
# Note: apk-installed curl/jq are dynamically linked; FROM scratch only
# works with statically linked builds (or copy their shared libraries too)
FROM alpine:latest AS builder
RUN apk add --no-cache curl jq
FROM scratch
COPY --from=builder /usr/bin/curl /usr/bin/curl
COPY --from=builder /usr/bin/jq /usr/bin/jq
4. Use Lambda Layers Efficiently
Layers are cached across functions, reducing cold start time:
resource "aws_lambda_layer_version" "tools" {
filename = "tools-layer.zip"
layer_name = "common-tools"
source_code_hash = filebase64sha256("tools-layer.zip")
compatible_runtimes = ["provided.al2"]
}
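Layers are extracted under /opt in the execution environment, and /opt/bin is on the PATH for custom runtimes, so the tools-layer.zip referenced above can be assembled roughly like this (the jq copy assumes a statically linked binary):

```shell
#!/bin/bash
# Sketch: assemble tools-layer.zip with binaries under bin/ so they land
# on the function's PATH at /opt/bin (binary choice is illustrative).
build_tools_layer() {
  local staging
  staging=$(mktemp -d)
  mkdir -p "$staging/bin"
  cp "$(command -v jq)" "$staging/bin/"
  (cd "$staging" && zip -qr "$OLDPWD/tools-layer.zip" .)
  rm -rf "$staging"
}
```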
Building Your Own Fast Custom Runtime
Step 1: Create the Bootstrap
#!/bin/bash
# bootstrap
set -euo pipefail
while true; do
  HEADERS=$(mktemp)
  EVENT=$(curl -sS -LD "$HEADERS" \
    "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/next")
  REQUEST_ID=$(grep -Fi Lambda-Runtime-Aws-Request-Id "$HEADERS" | tr -d '\r' | cut -d: -f2 | tr -d ' ')
  RESPONSE=$(timeout 30s ./handler.sh "$EVENT" 2>&1 || echo '{"error":"timeout"}')
  curl -sS "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/$REQUEST_ID/response" \
    -d "$RESPONSE"
  rm -f "$HEADERS"
done
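The Runtime API also exposes an error endpoint; a hedged refinement of the loop above is to report handler failures there instead of posting them as a normal response:

```shell
#!/bin/bash
# Sketch: report a failed invocation through the Runtime API error
# endpoint (same REQUEST_ID as in the bootstrap loop above).
post_error() {
  local request_id="$1" message="$2"
  curl -sS "http://$AWS_LAMBDA_RUNTIME_API/2018-06-01/runtime/invocation/$request_id/error" \
    -H 'Lambda-Runtime-Function-Error-Type: Runtime.HandlerError' \
    -d "{\"errorMessage\": \"$message\", \"errorType\": \"Runtime.HandlerError\"}"
}
# In the loop:
#   RESPONSE=$(./handler.sh "$EVENT") || post_error "$REQUEST_ID" "handler failed"
```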
Step 2: Create Your Handler
#!/bin/bash
# handler.sh
EVENT="$1"
NAME=$(echo "$EVENT" | jq -r '.name // "World"')
echo '{"message": "Hello, '"$NAME"'!", "timestamp": "'$(date -Iseconds)'"}'
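The hand-built echo above breaks if the event's name contains quotes; letting jq construct the response handles escaping (a drop-in alternative, assuming jq is present):

```shell
#!/bin/bash
# Safer response construction: jq escapes interpolated values for us
NAME="World"
RESPONSE=$(jq -n --arg name "$NAME" --arg ts "$(date -Iseconds)" \
  '{message: ("Hello, " + $name + "!"), timestamp: $ts}')
echo "$RESPONSE"
```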
Step 3: Build Optimized Container
FROM public.ecr.aws/lambda/provided:al2023
# al2023 images use dnf (yum is only a compatibility alias); curl ships with the base image
RUN dnf install -y jq && dnf clean all
COPY bootstrap handler.sh ./
RUN chmod +x bootstrap handler.sh
CMD ["handler.sh"]
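The provided base images bundle the Lambda Runtime Interface Emulator, so the container can be smoke-tested locally before deploying; the image tag, container name, and port here are placeholders:

```shell
#!/bin/bash
# Sketch: invoke the function locally through the Runtime Interface
# Emulator built into the base image (tag/port are placeholders).
invoke_local() {
  docker run -d --rm -p 9000:8080 --name runtime-test my-fast-runtime
  sleep 1
  curl -sS -d '{"name": "local"}' \
    "http://localhost:9000/2015-03-31/functions/function/invocations"
  docker stop runtime-test >/dev/null
}
```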
Advanced Cold Start Optimization
1. Caching in /tmp
A shell process can't hold connections open between invocations, but /tmp persists across warm invocations, so cache expensive responses there:
# Cache an expensive call in /tmp across warm invocations
CACHE_FILE="/tmp/api_response"
if [[ ! -f "$CACHE_FILE" ]]; then
  curl -sS --connect-timeout 5 "https://api.example.com/health" > "$CACHE_FILE"
fi
2. Lazy Loading
Load resources only when needed:
load_config() {
if [[ -z "${CONFIG:-}" ]]; then
CONFIG=$(aws ssm get-parameter --name "/app/config" --query 'Parameter.Value' --output text)
fi
echo "$CONFIG"
}
3. Warm-up Strategies
Keep functions warm with scheduled invocations:
resource "aws_cloudwatch_event_rule" "warmup" {
name = "lambda-warmup"
description = "Keep Lambda functions warm"
schedule_expression = "rate(5 minutes)"
}
resource "aws_cloudwatch_event_target" "lambda" {
rule = aws_cloudwatch_event_rule.warmup.name
target_id = "WarmupTarget"
arn = aws_lambda_function.example.arn
input = jsonencode({"warmup": true})
}
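Two notes on the wiring above: EventBridge also needs an aws_lambda_permission before it can invoke the function, and the handler should short-circuit the {"warmup": true} payload before doing real work. A hedged check (assuming jq) might look like:

```shell
#!/bin/bash
# Detect the warmup payload so the handler can return immediately
is_warmup() {
  echo "$1" | jq -e '.warmup == true' >/dev/null 2>&1
}
# In handler.sh, before the real work:
#   if is_warmup "$EVENT"; then echo '{"warmed": true}'; exit 0; fi
```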
Monitoring Cold Start Performance
CloudWatch Metrics
Track these key values from each invocation's REPORT log line:
- Init Duration - cold start time (present only on cold starts)
- Duration - total execution time
- Max Memory Used - actual memory usage
Custom Metrics
# In your function
START_TIME=$(date +%s%3N)
# ... function logic ...
END_TIME=$(date +%s%3N)
DURATION=$((END_TIME - START_TIME))
aws cloudwatch put-metric-data \
--namespace "Lambda/Performance" \
--metric-data MetricName=ExecutionTime,Value=$DURATION,Unit=Milliseconds
Cost Impact of Cold Start Optimization
Before optimization (Node.js):
- Average duration: 1,240ms
- Memory: 152MB
- Monthly cost (1M requests): $52.18
After optimization (Custom runtime):
- Average duration: 316ms
- Memory: 36MB
- Monthly cost (1M requests): $8.53
Savings: 84% cost reduction
When Cold Start Optimization Matters Most
Critical use cases:
- User-facing APIs - every millisecond counts
- Real-time processing - IoT, streaming data
- Synchronous workflows - API Gateway integrations
- Cost-sensitive applications - high-volume, low-margin
Less critical:
- Batch processing - cold start amortized over long runs
- Scheduled tasks - no user is waiting, and provisioned concurrency is an option
- Internal tools - users tolerate some latency
Getting Started
Try our optimized custom runtime:
# Use our pre-built runtime
docker pull ghcr.io/ql4b/lambda-shell-runtime:tiny
# Or build your own
git clone https://github.com/ql4b/lambda-shell-runtime
cd lambda-shell-runtime
docker build -t my-fast-runtime .
Result: 90% faster cold starts, 84% lower costs, same functionality.
Lambda cold start optimization isn’t just about performance - it’s about cost efficiency and user experience. Custom runtimes give you the control to optimize for what matters most.