tessl/npm-pulumi--aws

A Pulumi package for creating and managing Amazon Web Services (AWS) cloud resources with infrastructure-as-code.


AWS Database Services Overview

Guide to selecting and using AWS database services with Pulumi.

Service Categories

Relational Databases (SQL)

Structured data with ACID transactions

  • RDS - Managed relational databases (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server)
  • Aurora - MySQL/PostgreSQL-compatible, up to 5x MySQL / 3x PostgreSQL throughput
  • Redshift - Petabyte-scale data warehouse

NoSQL Databases

Flexible schemas, horizontal scaling

  • DynamoDB - Key-value and document database, millisecond latency
  • DocumentDB - MongoDB-compatible document database
  • Keyspaces - Apache Cassandra-compatible

In-Memory Caching

Microsecond latency, session stores

  • ElastiCache - Redis and Memcached
  • MemoryDB for Redis - Redis-compatible, durable in-memory database
  • DAX - DynamoDB Accelerator, microsecond latency

Specialized Databases

Purpose-built for specific workloads

  • Neptune - Graph database (Gremlin, SPARQL)
  • QLDB - Ledger database, immutable transaction log
  • Timestream - Time-series database
  • OpenSearch - Search and analytics

Decision Tree

Choose Your Database Service

Start: What's your data model?

├─ Relational (SQL)
│  ├─ Need AWS-optimized? → Yes: Aurora | No: RDS
│  ├─ Analytics/warehousing? → Redshift
│  └─ Specific engine?
│     ├─ PostgreSQL → Aurora PostgreSQL or RDS PostgreSQL
│     ├─ MySQL → Aurora MySQL or RDS MySQL
│     ├─ Oracle → RDS Oracle
│     └─ SQL Server → RDS SQL Server
│
├─ NoSQL
│  ├─ Key-value? → DynamoDB (serverless, millisecond latency)
│  ├─ Document?
│  │  ├─ MongoDB compatible? → DocumentDB
│  │  └─ Simple key-value → DynamoDB
│  ├─ Wide-column? → Keyspaces (Cassandra)
│  └─ Graph? → Neptune
│
├─ Caching
│  ├─ DynamoDB acceleration? → DAX
│  ├─ Redis features? → ElastiCache Redis or MemoryDB
│  └─ Simple caching? → ElastiCache Memcached
│
└─ Specialized
   ├─ Time-series data → Timestream
   ├─ Search/analytics → OpenSearch
   ├─ Immutable ledger → QLDB
   └─ Graph relationships → Neptune

Service Selection Guide

Use RDS When

  • Need traditional relational database
  • ACID transactions required
  • Complex queries and joins
  • Existing application using MySQL, PostgreSQL, Oracle, or SQL Server
  • Want managed service with automated backups, patches

Engine selection:

  • PostgreSQL - Advanced features, JSON, full-text search
  • MySQL - Most popular, wide compatibility
  • MariaDB - MySQL fork, additional features
  • Oracle - Enterprise features, Oracle compatibility
  • SQL Server - Windows applications, .NET integration

Instance classes:

  • db.t3/t4g - Burstable, development/test
  • db.m5/m6g - General purpose, balanced
  • db.r5/r6g - Memory-optimized, large datasets
  • db.x2 - Extreme memory, in-memory databases

When to use Aurora instead:

  • Need 5x MySQL or 3x PostgreSQL performance
  • High availability requirements
  • Read-heavy workloads (15 read replicas)
  • Global database requirements
  • Serverless workloads (Aurora Serverless)

Use Aurora When

  • Need high performance (5x MySQL, 3x PostgreSQL)
  • High availability critical (6 copies across 3 AZs)
  • Read-heavy workloads (up to 15 read replicas)
  • Global applications (Aurora Global Database)
  • Variable workloads (Aurora Serverless v2)
  • Want automatic scaling storage (up to 128 TB)

Aurora vs RDS:

  • Aurora: Higher performance, better HA, auto-scaling storage, more expensive
  • RDS: Standard performance, manual scaling, lower cost

Aurora Serverless v2:

  • Automatically scales compute capacity
  • Scales in fine-grained increments
  • Good for variable workloads
  • Instant scaling without warmup

Use DynamoDB When

  • Need single-digit millisecond latency
  • Serverless, fully managed database
  • Massive scale (10 trillion requests/day)
  • Simple access patterns (key-value, simple queries)
  • Variable workload patterns
  • Want automatic scaling

Access patterns:

  • Point lookups - Get item by primary key
  • Range queries - Query items by partition key + sort key
  • Secondary indexes - Alternative access patterns (GSI, LSI)
  • Avoid complex joins and aggregations

Capacity modes:

  • On-demand - Pay per request, automatic scaling
  • Provisioned - Reserve capacity, predictable cost

Best practices:

  • Design partition keys for even data distribution
  • Use composite sort keys for flexibility
  • Implement access patterns with GSIs
  • Use DynamoDB Streams for change data capture
  • Consider DynamoDB Accelerator (DAX) for caching
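
The composite-sort-key practice above can be sketched in plain TypeScript (a hypothetical single-table layout; the `USER#`/`ORDER#` prefixes and helper names are illustrative, not an AWS API):

```typescript
// Hypothetical single-table key design: one partition per user,
// sort keys prefixed by entity type so range queries stay cheap.
interface Keys {
    pk: string; // partition key: one per user, evenly distributed
    sk: string; // composite sort key: entity type + natural ordering
}

function userProfileKey(userId: string): Keys {
    return { pk: `USER#${userId}`, sk: "PROFILE" };
}

function orderKey(userId: string, isoDate: string, orderId: string): Keys {
    // ISO-8601 dates sort lexicographically, so a range query on the
    // sort key returns a user's orders in chronological order.
    return { pk: `USER#${userId}`, sk: `ORDER#${isoDate}#${orderId}` };
}
```

A query for one user's 2024 orders then becomes `pk = "USER#42"` with `begins_with(sk, "ORDER#2024")` — a single Query operation, no scan or extra GSI required.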

Use ElastiCache When

  • Need sub-millisecond latency
  • Session store for web applications
  • Real-time analytics dashboard
  • Caching database query results
  • Leaderboards and gaming

Redis vs Memcached:

Redis:

  • Rich data structures (strings, hashes, lists, sets, sorted sets)
  • Persistence (snapshots, AOF)
  • Replication and automatic failover
  • Pub/sub messaging
  • Lua scripting
  • Transactions

Memcached:

  • Simple key-value caching
  • Multi-threaded performance
  • Less memory overhead
  • No persistence
  • Good for simple caching needs

When to use MemoryDB instead:

  • Need Redis durability (writes persisted)
  • Primary database with Redis compatibility
  • Microsecond read, single-digit millisecond writes
  • Multi-AZ durability
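
A minimal MemoryDB cluster in Pulumi might look like the sketch below (a sketch only: `privateSubnetIds` and `cacheSg` are assumed to exist elsewhere, as in the patterns later in this guide, and the `open-access` ACL is AWS's default — define your own ACL for real authentication):

```typescript
import * as aws from "@pulumi/aws";

// Sketch only: privateSubnetIds and cacheSg assumed to exist elsewhere.
const memSubnets = new aws.memorydb.SubnetGroup("mem-subnet", {
    subnetIds: privateSubnetIds,
});

const memdb = new aws.memorydb.Cluster("primary-store", {
    aclName: "open-access",   // default ACL; create your own for auth
    nodeType: "db.t4g.small",
    numShards: 1,
    numReplicasPerShard: 1,   // Multi-AZ durability
    subnetGroupName: memSubnets.name,
    securityGroupIds: [cacheSg.id],
    tlsEnabled: true,
    snapshotRetentionLimit: 7,
});

export const memdbArn = memdb.arn;
```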

Use Redshift When

  • Data warehousing and analytics
  • OLAP (Online Analytical Processing)
  • Business intelligence queries
  • Petabyte-scale datasets
  • Complex aggregations and joins
  • Historical data analysis

Redshift vs RDS:

  • Redshift: Columnar storage, analytics, complex queries, petabyte scale
  • RDS: Row-based storage, OLTP, simple queries, smaller datasets

Deployment options:

  • Provisioned clusters - Reserved capacity, predictable cost
  • Serverless - Automatic scaling, pay per use

Best practices:

  • Use columnar compression
  • Define sort and distribution keys
  • Leverage Redshift Spectrum for S3 queries
  • Use materialized views for repeated queries
  • Implement workload management

Use DocumentDB When

  • Need MongoDB compatibility
  • Migrating from MongoDB
  • Document-based data model
  • JSON/BSON documents
  • Want managed MongoDB-compatible service

DocumentDB vs DynamoDB:

  • DocumentDB: MongoDB API, complex queries, transactions
  • DynamoDB: AWS-native, simpler API, higher scale

Use Neptune When

  • Graph data model (nodes and relationships)
  • Social networks
  • Recommendation engines
  • Knowledge graphs
  • Fraud detection patterns
  • Network analysis

Query languages:

  • Apache TinkerPop Gremlin - Graph traversal
  • SPARQL - RDF graphs
  • openCypher - Property graph queries
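
A minimal Neptune cluster in Pulumi might be sketched as follows (subnet and security group wiring omitted for brevity; all three query languages above connect to the same cluster endpoint, by default on port 8182):

```typescript
import * as aws from "@pulumi/aws";

const graphCluster = new aws.neptune.Cluster("graph", {
    engine: "neptune",
    backupRetentionPeriod: 7,
    storageEncrypted: true,
    iamDatabaseAuthenticationEnabled: true,
    skipFinalSnapshot: true, // dev/test convenience; keep snapshots in prod
});

const graphInstance = new aws.neptune.ClusterInstance("graph-node", {
    clusterIdentifier: graphCluster.id,
    instanceClass: "db.r5.large",
    engine: "neptune",
});

// Gremlin/SPARQL/openCypher clients connect to this endpoint.
export const neptuneEndpoint = graphCluster.endpoint;
```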

Use Timestream When

  • Time-series data (IoT, DevOps, analytics)
  • High write throughput
  • Time-based queries and aggregations
  • Automatic data lifecycle management
  • Built-in time-series analytics

Use cases:

  • IoT sensor data
  • Application monitoring
  • DevOps metrics
  • Financial tick data
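
The lifecycle management mentioned above is configured per table; a minimal Timestream database and table sketch (retention values are illustrative):

```typescript
import * as aws from "@pulumi/aws";

const tsDb = new aws.timestreamwrite.Database("metrics", {
    databaseName: "iot_metrics",
});

const tsTable = new aws.timestreamwrite.Table("sensor-data", {
    databaseName: tsDb.databaseName,
    tableName: "sensor_data",
    retentionProperties: {
        memoryStoreRetentionPeriodInHours: 24,   // hot tier for recent queries
        magneticStoreRetentionPeriodInDays: 365, // cold tier, automatic move
    },
});
```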

Common Patterns

Pattern 1: Web Application Database

RDS Multi-AZ with Read Replicas

import * as aws from "@pulumi/aws";

// Subnet group for RDS
const subnetGroup = new aws.rds.SubnetGroup("db-subnet", {
    subnetIds: privateSubnetIds,
    tags: { Name: "Main DB subnet group" },
});

// Security group
const dbSg = new aws.ec2.SecurityGroup("db-sg", {
    vpcId: vpc.id,
    ingress: [{
        protocol: "tcp",
        fromPort: 5432,
        toPort: 5432,
        securityGroups: [appSg.id],
    }],
});

// Primary database instance
const db = new aws.rds.Instance("primary-db", {
    engine: "postgres",
    engineVersion: "15.4",
    instanceClass: "db.t3.medium",
    allocatedStorage: 100,
    storageType: "gp3",
    dbName: "myapp",
    username: "dbadmin", // "admin" is a reserved word for the postgres engine
    password: dbPassword.result, // e.g. a random.RandomPassword or Secrets Manager value
    multiAz: true, // High availability
    dbSubnetGroupName: subnetGroup.name,
    vpcSecurityGroupIds: [dbSg.id],
    backupRetentionPeriod: 7,
    backupWindow: "03:00-04:00",
    maintenanceWindow: "sun:04:00-sun:05:00",
    enabledCloudwatchLogsExports: ["postgresql", "upgrade"],
    storageEncrypted: true,
    skipFinalSnapshot: false,
    finalSnapshotIdentifier: "final-snapshot",
});

// Read replica for scaling reads
const readReplica = new aws.rds.Instance("read-replica", {
    replicateSourceDb: db.identifier, // same-region replicas reference the identifier, not the ARN
    instanceClass: "db.t3.medium",
    publiclyAccessible: false,
});

export const dbEndpoint = db.endpoint;
export const replicaEndpoint = readReplica.endpoint;

Use when: Traditional web applications, ACID requirements

Pattern 2: Serverless API Backend

DynamoDB + Lambda

import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";

// DynamoDB table
const table = new aws.dynamodb.Table("users", {
    // Declare only attributes used in a key schema or index; extra
    // definitions (e.g. a createdAt column) fail table validation.
    attributes: [
        { name: "userId", type: "S" },
        { name: "email", type: "S" },
    ],
    hashKey: "userId",
    billingMode: "PAY_PER_REQUEST", // On-demand scaling
    globalSecondaryIndexes: [{
        name: "EmailIndex",
        hashKey: "email",
        projectionType: "ALL",
    }],
    streamEnabled: true,
    streamViewType: "NEW_AND_OLD_IMAGES",
    pointInTimeRecovery: { enabled: true },
    serverSideEncryption: { enabled: true },
    tags: { Name: "Users table" },
});

// Lambda function
const handler = new aws.lambda.Function("api", {
    runtime: "nodejs20.x",
    handler: "index.handler",
    role: lambdaRole.arn,
    code: new pulumi.asset.FileArchive("./app"),
    environment: {
        variables: {
            TABLE_NAME: table.name,
        },
    },
});

// DynamoDB stream processor
const streamProcessor = new aws.lambda.Function("stream-processor", {
    runtime: "nodejs20.x",
    handler: "stream.handler",
    role: streamRole.arn,
    code: new pulumi.asset.FileArchive("./stream"),
});

const eventSourceMapping = new aws.lambda.EventSourceMapping("dynamodb-stream", {
    eventSourceArn: table.streamArn,
    functionName: streamProcessor.arn,
    startingPosition: "LATEST",
});

export const tableName = table.name;

Use when: Serverless applications, variable workloads, high scale

Pattern 3: High-Performance API

Aurora Serverless v2 + Connection Pooling

import * as aws from "@pulumi/aws";

// Aurora Serverless v2 cluster
const cluster = new aws.rds.Cluster("aurora-cluster", {
    engine: "aurora-postgresql",
    engineMode: "provisioned",
    engineVersion: "15.4",
    databaseName: "myapp",
    masterUsername: "dbadmin", // "admin" is a reserved word for the PostgreSQL engine
    masterPassword: dbPassword.result,
    dbSubnetGroupName: subnetGroup.name,
    vpcSecurityGroupIds: [dbSg.id],
    serverlessv2ScalingConfiguration: {
        minCapacity: 0.5, // 0.5 ACU minimum
        maxCapacity: 16,  // 16 ACU maximum
    },
    enabledCloudwatchLogsExports: ["postgresql"],
    backupRetentionPeriod: 7,
    storageEncrypted: true,
});

// Serverless v2 instances
const writer = new aws.rds.ClusterInstance("writer", {
    clusterIdentifier: cluster.id,
    instanceClass: "db.serverless",
    engine: cluster.engine,
    engineVersion: cluster.engineVersion,
});

const reader = new aws.rds.ClusterInstance("reader", {
    clusterIdentifier: cluster.id,
    instanceClass: "db.serverless",
    engine: cluster.engine,
    engineVersion: cluster.engineVersion,
});

// RDS Proxy for connection pooling
const proxy = new aws.rds.Proxy("db-proxy", {
    name: "aurora-proxy",
    engineFamily: "POSTGRESQL",
    auths: [{
        authScheme: "SECRETS",
        secretArn: dbSecret.arn,
    }],
    roleArn: proxyRole.arn,
    vpcSubnetIds: privateSubnetIds,
    vpcSecurityGroupIds: [proxySg.id],
    requireTls: true,
});

const proxyTarget = new aws.rds.ProxyDefaultTargetGroup("target", {
    dbProxyName: proxy.name,
    connectionPoolConfig: {
        maxConnectionsPercent: 100,
        maxIdleConnectionsPercent: 50,
        connectionBorrowTimeout: 120,
    },
});

const proxyTargetGroupAttachment = new aws.rds.ProxyTarget("attachment", {
    dbProxyName: proxy.name,
    targetGroupName: proxyTarget.name,
    dbClusterIdentifier: cluster.id,
});

export const proxyEndpoint = proxy.endpoint;

Use when: Variable workloads, Lambda functions, high connection counts

Pattern 4: Caching Layer

ElastiCache Redis + RDS

import * as aws from "@pulumi/aws";

// ElastiCache subnet group
const cacheSubnetGroup = new aws.elasticache.SubnetGroup("cache-subnet", {
    subnetIds: privateSubnetIds,
});

// Redis cluster
const redis = new aws.elasticache.ReplicationGroup("redis-cluster", {
    replicationGroupId: "app-cache",
    description: "Application cache layer",
    engine: "redis",
    engineVersion: "7.0",
    nodeType: "cache.t3.micro",
    numCacheClusters: 2, // Primary + 1 replica
    automaticFailoverEnabled: true,
    multiAzEnabled: true,
    subnetGroupName: cacheSubnetGroup.name,
    securityGroupIds: [cacheSg.id],
    atRestEncryptionEnabled: true,
    transitEncryptionEnabled: true,
    authToken: cachePassword.result,
    snapshotRetentionLimit: 5,
    snapshotWindow: "03:00-05:00",
    maintenanceWindow: "sun:05:00-sun:07:00",
    autoMinorVersionUpgrade: true,
});

// Application uses cache-aside pattern
export const redisEndpoint = redis.primaryEndpointAddress;
export const redisPort = redis.port;

// Example application code pattern:
// 1. Check Redis for cached data
// 2. If miss, query RDS
// 3. Store result in Redis with TTL
// 4. Return data

Use when: Read-heavy workloads, reduce database load, session storage
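
The cache-aside steps outlined in the comments above can be sketched as generic application code (the `Cache` and `Db` interfaces are illustrative stand-ins for a Redis client such as ioredis and a SQL query against RDS):

```typescript
// Minimal cache-aside sketch. In production Cache wraps a Redis client
// and Db a database query; here they are plain interfaces so the
// control flow stands alone.
interface Cache {
    get(key: string): Promise<string | undefined>;
    set(key: string, value: string, ttlSeconds: number): Promise<void>;
}
interface Db {
    query(key: string): Promise<string>;
}

async function getWithCache(cache: Cache, db: Db, key: string): Promise<string> {
    const hit = await cache.get(key);  // 1. check the cache
    if (hit !== undefined) return hit;
    const value = await db.query(key); // 2. on miss, query the database
    await cache.set(key, value, 300);  // 3. store the result with a TTL
    return value;                      // 4. return data
}

// In-memory stand-ins to demonstrate the flow (TTL ignored here):
const store = new Map<string, string>();
const cache: Cache = {
    get: async k => store.get(k),
    set: async (k, v) => { store.set(k, v); },
};
let dbCalls = 0;
const db: Db = { query: async k => { dbCalls++; return `row:${k}`; } };
```

With these stand-ins, a second read of the same key is served from the cache and the database is queried only once.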

Pattern 5: Analytics Data Warehouse

Redshift + S3 Data Lake

import * as aws from "@pulumi/aws";

// Redshift subnet group
const redshiftSubnetGroup = new aws.redshift.SubnetGroup("redshift-subnet", {
    subnetIds: privateSubnetIds,
    tags: { Name: "Redshift subnet group" },
});

// Redshift cluster
const dataWarehouse = new aws.redshift.Cluster("analytics", {
    clusterIdentifier: "data-warehouse",
    databaseName: "analytics",
    masterUsername: "admin",
    masterPassword: redshiftPassword.result,
    nodeType: "dc2.large",
    numberOfNodes: 2,
    clusterSubnetGroupName: redshiftSubnetGroup.name,
    vpcSecurityGroupIds: [redshiftSg.id],
    encrypted: true,
    enhancedVpcRouting: true,
    automatedSnapshotRetentionPeriod: 7,
    skipFinalSnapshot: false,
    finalSnapshotIdentifier: "final-snapshot",
});

// S3 bucket for data lake (bucket names are globally unique)
const dataLake = new aws.s3.BucketV2("data-lake", {
    bucketPrefix: "analytics-data-lake-",
});

// IAM role for Redshift to access S3
const redshiftRole = new aws.iam.Role("redshift-role", {
    assumeRolePolicy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [{
            Effect: "Allow",
            Principal: { Service: "redshift.amazonaws.com" },
            Action: "sts:AssumeRole",
        }],
    }),
});

const s3Policy = new aws.iam.RolePolicyAttachment("s3-access", {
    role: redshiftRole.name,
    policyArn: "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
});

const clusterIamRole = new aws.redshift.ClusterIamRoles("cluster-roles", {
    clusterIdentifier: dataWarehouse.id,
    iamRoleArns: [redshiftRole.arn],
});

// Query S3 data with Redshift Spectrum
// CREATE EXTERNAL SCHEMA spectrum_schema
// FROM DATA CATALOG DATABASE 'spectrum_db'
// IAM_ROLE 'arn:aws:iam::account:role/role'
// CREATE EXTERNAL DATABASE IF NOT EXISTS;

export const redshiftEndpoint = dataWarehouse.endpoint;

Use when: Business intelligence, analytics, data warehousing

Performance Guidelines

RDS Performance

  • Use provisioned IOPS (io1/io2) for high-performance workloads
  • Enable Performance Insights for monitoring
  • Use read replicas to scale read traffic
  • Optimize queries with proper indexing
  • Consider Aurora for higher performance

DynamoDB Performance

  • Design partition keys for even distribution
  • Use eventually consistent reads when possible
  • Implement caching with DAX for read-heavy workloads
  • Use batch operations for multiple items
  • Monitor consumed capacity with CloudWatch
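
Batch operations are capped per request — BatchWriteItem accepts at most 25 items and BatchGetItem at most 100 — so callers need to chunk. A generic helper (the limit constants mirror those documented caps):

```typescript
// Split items into DynamoDB-sized batches. BatchWriteItem accepts at
// most 25 items per request; BatchGetItem at most 100.
function chunk<T>(items: T[], size: number): T[][] {
    const out: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        out.push(items.slice(i, i + size));
    }
    return out;
}

const writes = Array.from({ length: 60 }, (_, i) => ({ id: i }));
const batches = chunk(writes, 25); // 3 batches of 25, 25, and 10 items
```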

ElastiCache Performance

  • Use cluster mode for Redis (horizontal scaling)
  • Choose appropriate node types for workload
  • Use connection pooling in applications
  • Monitor cache hit ratio
  • Set appropriate TTLs

Aurora Performance

  • Use reader endpoints for read traffic
  • Enable query cache
  • Use parallel query for analytics
  • Implement connection pooling with RDS Proxy
  • Monitor with Performance Insights

Cost Optimization

RDS Cost Reduction

  1. Right-size instances - Use Performance Insights
  2. Use Reserved Instances - 1 or 3-year commitments (up to 69% savings)
  3. Delete unused snapshots - Set retention policies
  4. Stop dev/test instances - Save ~70% when not in use
  5. Use Aurora Serverless - For variable workloads
  6. Leverage read replicas - Scale reads instead of upgrading primary

DynamoDB Cost Reduction

  1. Use on-demand for variable workloads - Pay per request
  2. Use provisioned for predictable traffic - Lower cost per request
  3. Enable auto-scaling - Match capacity to demand
  4. Archive old data - Export to S3 + DynamoDB archival
  5. Use DynamoDB Standard-IA - For infrequently accessed data
  6. Delete unused tables and GSIs - Reduce storage costs
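
Item 3 above (auto-scaling provisioned capacity) is configured through Application Auto Scaling; a sketch for read capacity, assuming `table` is an existing aws.dynamodb.Table with `billingMode: "PROVISIONED"` (the capacity bounds and target are illustrative):

```typescript
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";

// Scale read capacity between 5 and 100 RCU, targeting 70% utilization.
const readTarget = new aws.appautoscaling.Target("read-target", {
    minCapacity: 5,
    maxCapacity: 100,
    resourceId: pulumi.interpolate`table/${table.name}`,
    scalableDimension: "dynamodb:table:ReadCapacityUnits",
    serviceNamespace: "dynamodb",
});

const readPolicy = new aws.appautoscaling.Policy("read-policy", {
    policyType: "TargetTrackingScaling",
    resourceId: readTarget.resourceId,
    scalableDimension: readTarget.scalableDimension,
    serviceNamespace: readTarget.serviceNamespace,
    targetTrackingScalingPolicyConfiguration: {
        predefinedMetricSpecification: {
            predefinedMetricType: "DynamoDBReadCapacityUtilization",
        },
        targetValue: 70, // keep consumed/provisioned capacity near 70%
    },
});
```

A matching pair with `WriteCapacityUnits` covers writes; GSIs scale separately with `index/...` resource IDs.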

ElastiCache Cost Reduction

  1. Right-size nodes - Monitor memory and CPU
  2. Use Reserved Nodes - Up to 55% savings
  3. Use t3/t4g for dev/test - Burstable instances
  4. Delete unused clusters - Monitor idle resources
  5. Use Graviton nodes - Better price/performance

Redshift Cost Reduction

  1. Use Reserved Nodes - Up to 75% savings
  2. Use Redshift Serverless - Pay only for usage
  3. Compress data - Reduce storage costs
  4. Delete old snapshots - Set retention policies
  5. Use S3 for cold data - Redshift Spectrum queries

Quick Links

Core Services

  • RDS - Relational Databases
  • DynamoDB - NoSQL Database
  • ElastiCache - In-Memory Cache
  • Redshift - Data Warehouse
  • Neptune - Graph Database
  • DocumentDB - MongoDB Compatible


Install with Tessl CLI

npx tessl i tessl/npm-pulumi--aws
