tessl/npm-pulumi--aws

A Pulumi package for creating and managing Amazon Web Services (AWS) cloud resources with infrastructure-as-code.

docs/storage/overview.md

AWS Storage Services Overview

Guide to selecting and using AWS storage services with Pulumi.

Service Categories

Object Storage

Scalable storage for unstructured data - Files, images, backups, data lakes

  • S3 - Industry-leading object storage with 99.999999999% durability
  • S3 Glacier - Long-term archival storage, low cost

Block Storage

High-performance storage for EC2 - Databases, file systems, boot volumes

  • EBS - Persistent block storage volumes for EC2 instances
  • EC2 Instance Store - Temporary block storage physically attached to host

File Storage

Shared file systems - Multi-instance access, POSIX compliance

  • EFS - Elastic, scalable NFS file system
  • FSx for Lustre - High-performance computing file system
  • FSx for Windows - Fully managed Windows file servers
  • FSx for NetApp ONTAP - Enterprise file storage
  • FSx for OpenZFS - High-performance ZFS file system

Backup & Archive

Data protection and compliance

  • AWS Backup - Centralized backup management
  • S3 Glacier Deep Archive - Lowest-cost archival storage

Data Transfer & Sync

Move data to and from AWS

  • DataSync - Automated data transfer
  • Transfer Family - SFTP, FTPS, FTP for S3 and EFS
  • Storage Gateway - Hybrid cloud storage

Decision Tree

Choose Your Storage Service

Start: What type of data?

┌─ Files/Objects (unstructured data)
│  ├─ Need frequent access?
│  │  ├─ Yes → Access patterns?
│  │  │  ├─ Web/mobile apps, analytics → S3 Standard
│  │  │  ├─ Infrequent (< 1/month) → S3 Standard-IA
│  │  │  └─ Archive (< 1/year) → S3 Glacier
│  │  └─ No → How fast retrieval?
│  │     ├─ Milliseconds to hours → S3 Glacier Instant/Flexible Retrieval
│  │     └─ Hours → S3 Glacier Deep Archive
│  │
│  └─ Need shared file system?
│     ├─ Linux/NFS → EFS
│     ├─ Windows SMB → FSx for Windows
│     ├─ High-performance computing → FSx for Lustre
│     ├─ Enterprise features → FSx for NetApp ONTAP
│     └─ ZFS compatibility → FSx for OpenZFS
│
├─ Block storage (for EC2)
│  ├─ Performance needs?
│  │  ├─ Highest IOPS (> 64,000) → EBS io2 Block Express
│  │  ├─ High IOPS (< 64,000) → EBS io2 or io1
│  │  ├─ Balanced → EBS gp3 (default choice)
│  │  ├─ Throughput-optimized → EBS st1
│  │  └─ Cold storage → EBS sc1
│  │
│  ├─ Temporary data? → EC2 Instance Store
│  └─ Boot volumes → EBS gp3
│
└─ Backup/DR
   ├─ Centralized management → AWS Backup
   ├─ Application-specific → S3 + lifecycle policies
   └─ Long-term retention → S3 Glacier

Service Selection Guide

Use S3 When

  • Storing files, images, videos, logs
  • Building data lakes
  • Hosting static websites
  • Backing up data
  • Distributing software/media
  • Serving content via CloudFront

Storage classes:

  • S3 Standard - Frequent access, low latency (default)
  • S3 Intelligent-Tiering - Automatic cost optimization
  • S3 Standard-IA - Infrequent access, immediate retrieval
  • S3 One Zone-IA - Infrequent, single AZ (recreatable data)
  • S3 Glacier Instant Retrieval - Archive, millisecond retrieval
  • S3 Glacier Flexible Retrieval - Archive, minutes-hours retrieval
  • S3 Glacier Deep Archive - Lowest cost, 12-hour retrieval

Cost optimization:

  • Use lifecycle policies to transition data
  • Enable S3 Intelligent-Tiering for unpredictable access
  • Delete incomplete multipart uploads
  • Use S3 Storage Lens for visibility
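The first three points above can be expressed directly as lifecycle rules. A minimal sketch (bucket and rule names are illustrative; the empty `filter` applies the rules to every object):

```typescript
import * as aws from "@pulumi/aws";

const bucket = new aws.s3.BucketV2("optimized", { bucket: "my-optimized-bucket" });

new aws.s3.BucketLifecycleConfigurationV2("cost-rules", {
    bucket: bucket.id,
    rules: [
        {
            // Clean up multipart uploads that were started but never completed
            id: "abort-incomplete-uploads",
            status: "Enabled",
            filter: {},
            abortIncompleteMultipartUpload: { daysAfterInitiation: 7 },
        },
        {
            // Let S3 optimize tiering automatically for unpredictable access
            id: "intelligent-tiering",
            status: "Enabled",
            filter: {},
            transitions: [{ days: 0, storageClass: "INTELLIGENT_TIERING" }],
        },
    ],
});
```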

Use EBS When

  • Running databases on EC2 (MySQL, PostgreSQL, MongoDB)
  • Boot volumes for EC2 instances
  • Need consistent, low-latency performance
  • Require point-in-time snapshots
  • Single-instance storage

Volume types:

  • gp3 - General purpose SSD (best price/performance)
  • gp2 - General purpose SSD (legacy, use gp3)
  • io2 Block Express - Highest performance, > 64,000 IOPS
  • io2/io1 - Provisioned IOPS SSD, up to 64,000 IOPS
  • st1 - Throughput-optimized HDD (big data, log processing)
  • sc1 - Cold HDD (infrequent access)

Sizing guidance:

  • Start with gp3 for most workloads
  • Use io2 for IOPS-intensive databases
  • Enable encryption by default
  • Use snapshots for backups
  • Consider RAID 0 for higher throughput
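As a sketch of the sizing guidance above (names and numbers are illustrative): gp3 decouples IOPS and throughput from volume size, so both can be provisioned independently above the free baseline.

```typescript
import * as aws from "@pulumi/aws";

const volume = new aws.ebs.Volume("app-data", {
    availabilityZone: "us-east-1a",
    size: 200,        // GB
    type: "gp3",
    iops: 6000,       // above the 3,000 IOPS baseline included with gp3
    throughput: 250,  // MB/s, above the 125 MB/s baseline
    encrypted: true,  // enable encryption by default
    tags: { Name: "app-data" },
});
```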

Use EFS When

  • Need shared file system across multiple EC2 instances
  • Container storage (ECS, EKS)
  • Content management systems
  • Web serving
  • Development environments
  • Home directories

Performance modes:

  • General Purpose - Latency-sensitive (default)
  • Max I/O - Higher aggregate throughput

Throughput modes:

  • Elastic - Automatic scaling (recommended)
  • Bursting - Scales with file system size
  • Provisioned - Fixed throughput independent of size

Storage classes:

  • Standard - Frequent access
  • Infrequent Access (IA) - Cost-optimized for infrequent access
  • Use lifecycle policies to automatically transition
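Putting the recommended modes together, a minimal file system with Elastic throughput and IA lifecycle management might look like this (resource names are illustrative):

```typescript
import * as aws from "@pulumi/aws";

const fs = new aws.efs.FileSystem("app-fs", {
    encrypted: true,
    performanceMode: "generalPurpose",  // latency-sensitive default
    throughputMode: "elastic",          // scales automatically, pay per use
    lifecyclePolicies: [
        { transitionToIa: "AFTER_30_DAYS" },                    // cold files → IA
        { transitionToPrimaryStorageClass: "AFTER_1_ACCESS" },  // back on access
    ],
});
```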

Use FSx When

FSx for Lustre:

  • High-performance computing (HPC)
  • Machine learning training
  • Media processing
  • Financial modeling
  • Genomics research
  • Need 100s of GB/s throughput and millions of IOPS

FSx for Windows File Server:

  • Windows applications requiring SMB
  • Active Directory integration
  • SQL Server databases
  • SharePoint
  • Windows home directories
  • Legacy Windows apps

FSx for NetApp ONTAP:

  • Multi-protocol access (NFS, SMB, iSCSI)
  • Advanced data management features
  • SnapMirror replication
  • Application migration from on-premises
  • Enterprise features (deduplication, compression)

FSx for OpenZFS:

  • NFS workloads
  • Need ZFS features (snapshots, cloning)
  • Sub-millisecond latencies
  • Point-in-time data recovery

Use Glacier When

  • Archival storage (compliance, regulatory)
  • Disaster recovery backups
  • Media archives
  • Scientific data
  • Financial records
  • Medical imaging archives

Retrieval times:

  • Instant Retrieval - Milliseconds, quarterly access
  • Flexible Retrieval - Minutes to hours
    • Expedited: 1-5 minutes
    • Standard: 3-5 hours
    • Bulk: 5-12 hours
  • Deep Archive - 12 hours, lowest cost

Use AWS Backup When

  • Need centralized backup management
  • Backing up multiple services (EBS, RDS, DynamoDB, EFS)
  • Compliance requirements for backup retention
  • Cross-region and cross-account backups
  • Automated backup scheduling

Supported services:

  • EC2, EBS, EFS, FSx
  • RDS, Aurora, DynamoDB, DocumentDB, Neptune
  • S3, Storage Gateway
  • And more

Common Patterns

Pattern 1: Static Website Hosting

S3 + CloudFront

import * as aws from "@pulumi/aws";

// S3 bucket for website content
const bucket = new aws.s3.BucketV2("website", {
    bucket: "my-website.com",
});

// Enable website hosting
const websiteConfig = new aws.s3.BucketWebsiteConfigurationV2("website-config", {
    bucket: bucket.id,
    indexDocument: { suffix: "index.html" },
    errorDocument: { key: "error.html" },
});

// Allow public reads (simplest setup; use CloudFront OAC instead to keep the bucket private)
const publicAccess = new aws.s3.BucketPublicAccessBlock("public", {
    bucket: bucket.id,
    blockPublicAcls: false,
    blockPublicPolicy: false,
    ignorePublicAcls: false,
    restrictPublicBuckets: false,
});

// CloudFront distribution
const cdn = new aws.cloudfront.Distribution("cdn", {
    origins: [{
        domainName: bucket.bucketRegionalDomainName,
        originId: "S3Origin",
        s3OriginConfig: { originAccessIdentity: "" }, // empty: public bucket, no OAI
    }],
    enabled: true,
    defaultRootObject: "index.html",
    defaultCacheBehavior: {
        targetOriginId: "S3Origin",
        viewerProtocolPolicy: "redirect-to-https",
        allowedMethods: ["GET", "HEAD"],
        cachedMethods: ["GET", "HEAD"],
        forwardedValues: {
            queryString: false,
            cookies: { forward: "none" },
        },
    },
    restrictions: { geoRestriction: { restrictionType: "none" } },
    viewerCertificate: { cloudfrontDefaultCertificate: true },
});

export const websiteUrl = cdn.domainName;

Use when: Hosting static sites, SPAs, documentation

Pattern 2: Data Lake

S3 + Lifecycle Policies + Athena

// Data lake bucket
const dataLake = new aws.s3.BucketV2("data-lake", {
    bucket: "my-data-lake",
});

// Lifecycle policy for cost optimization
const lifecycle = new aws.s3.BucketLifecycleConfigurationV2("lifecycle", {
    bucket: dataLake.id,
    rules: [{
        id: "archive-old-data",
        status: "Enabled",
        filter: {}, // apply to every object in the bucket
        transitions: [
            {
                days: 30,
                storageClass: "STANDARD_IA",
            },
            {
                days: 90,
                storageClass: "GLACIER_IR", // Glacier Instant Retrieval
            },
            {
                days: 365,
                storageClass: "DEEP_ARCHIVE",
            },
        ],
        noncurrentVersionTransitions: [{
            noncurrentDays: 30,
            storageClass: "GLACIER", // Glacier Flexible Retrieval
        }],
    }],
});

// Enable versioning for data protection
const versioning = new aws.s3.BucketVersioningV2("versioning", {
    bucket: dataLake.id,
    versioningConfiguration: { status: "Enabled" },
});

// Server-side encryption
const encryption = new aws.s3.BucketServerSideEncryptionConfigurationV2("encryption", {
    bucket: dataLake.id,
    rules: [{
        applyServerSideEncryptionByDefault: {
            sseAlgorithm: "AES256",
        },
    }],
});

Use when: Analytics, data warehousing, log aggregation

Pattern 3: Database Storage

EC2 + EBS with Snapshots

// High-performance EBS volume for database
const dbVolume = new aws.ebs.Volume("db-volume", {
    availabilityZone: "us-east-1a",
    size: 100, // GB
    type: "io2",
    iops: 10000,
    encrypted: true,
    tags: { Name: "production-db", Purpose: "database" },
});

// Attach to EC2 instance
const attachment = new aws.ec2.VolumeAttachment("db-attachment", {
    deviceName: "/dev/xvdf",
    volumeId: dbVolume.id,
    instanceId: instance.id, // `instance` is an existing aws.ec2.Instance (not shown)
});

// DLM policy for automated snapshots
const snapshotRole = new aws.iam.Role("dlm-role", {
    assumeRolePolicy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [{
            Effect: "Allow",
            Principal: { Service: "dlm.amazonaws.com" },
            Action: "sts:AssumeRole",
        }],
    }),
});

// Grant the role the AWS-managed permissions DLM needs to manage snapshots
new aws.iam.RolePolicyAttachment("dlm-policy", {
    role: snapshotRole.name,
    policyArn: "arn:aws:iam::aws:policy/service-role/AWSDataLifecycleManagerServiceRole",
});

const snapshotPolicy = new aws.dlm.LifecyclePolicy("db-snapshots", {
    description: "Daily database snapshots",
    executionRoleArn: snapshotRole.arn,
    state: "ENABLED",
    policyDetails: {
        resourceTypes: ["VOLUME"],
        schedules: [{
            name: "Daily snapshots",
            createRule: { interval: 24, intervalUnit: "HOURS", times: ["03:00"] },
            retainRule: { count: 7 },
            tagsToAdd: { SnapshotType: "automated" },
        }],
        targetTags: { Purpose: "database" },
    },
});

Use when: Running databases on EC2, need high IOPS

Pattern 4: Shared File System

EFS for Multi-Instance Access

// EFS file system
const fileSystem = new aws.efs.FileSystem("shared-fs", {
    encrypted: true,
    lifecyclePolicies: [{
        transitionToIa: "AFTER_30_DAYS",
    }],
    tags: { Name: "shared-storage" },
});

// Mount targets in each AZ (`subnetIds` and `securityGroup` come from your VPC setup)
const mountTargets = subnetIds.map((subnetId, index) =>
    new aws.efs.MountTarget(`mount-${index}`, {
        fileSystemId: fileSystem.id,
        subnetId: subnetId,
        securityGroups: [securityGroup.id],
    })
);

// Access point for application
const accessPoint = new aws.efs.AccessPoint("app-access", {
    fileSystemId: fileSystem.id,
    posixUser: {
        uid: 1000,
        gid: 1000,
    },
    rootDirectory: {
        path: "/app-data",
        creationInfo: {
            ownerUid: 1000,
            ownerGid: 1000,
            permissions: "755",
        },
    },
});

// Use in ECS task definition
const taskDef = new aws.ecs.TaskDefinition("task", {
    family: "app",
    volumes: [{
        name: "shared-storage",
        efsVolumeConfiguration: {
            fileSystemId: fileSystem.id,
            transitEncryption: "ENABLED",
            authorizationConfig: {
                accessPointId: accessPoint.id,
            },
        },
    }],
    containerDefinitions: JSON.stringify([{
        name: "app",
        image: "nginx",
        mountPoints: [{
            sourceVolume: "shared-storage",
            containerPath: "/data",
        }],
    }]),
    requiresCompatibilities: ["FARGATE"],
    cpu: "256",
    memory: "512",
    networkMode: "awsvpc",
});

Use when: Shared storage for containers, web servers, CMS

Pattern 5: Backup Strategy

AWS Backup for Centralized Protection

// Backup vault (`kmsKey` is an existing aws.kms.Key, defined elsewhere)
const vault = new aws.backup.Vault("main-vault", {
    name: "primary-backup-vault",
    kmsKeyArn: kmsKey.arn,
});

// Backup plan
const plan = new aws.backup.Plan("daily-backup", {
    name: "daily-backup-plan",
    rules: [{
        ruleName: "DailyBackup",
        targetVaultName: vault.name,
        schedule: "cron(0 5 ? * * *)", // 5 AM UTC daily
        lifecycle: {
            coldStorageAfter: 7, // Move to cold storage after 7 days
            deleteAfter: 97,     // AWS requires ≥ 90 days in cold storage before deletion
        },
    }],
});

// Backup selection (what to back up; `backupRole` and the referenced resources are defined elsewhere)
const selection = new aws.backup.Selection("resources", {
    name: "production-resources",
    planId: plan.id,
    iamRoleArn: backupRole.arn,
    resources: [
        rdsInstance.arn,
        efsFileSystem.arn,
        dynamoTable.arn,
    ],
});

// Or use tags to select resources
const tagSelection = new aws.backup.Selection("by-tags", {
    name: "tag-based-selection",
    planId: plan.id,
    iamRoleArn: backupRole.arn,
    selectionTags: [{
        type: "STRINGEQUALS",
        key: "Backup",
        value: "true",
    }],
});

Use when: Centralized backup management, compliance requirements

Performance Guidelines

S3 Performance

  • Throughput: 3,500 PUT/COPY/POST/DELETE, 5,500 GET/HEAD per prefix per second
  • No limit on prefixes: Virtually unlimited aggregate throughput
  • Use multipart upload for files > 100 MB
  • Use S3 Transfer Acceleration for global uploads
  • Consider CloudFront for read-heavy workloads
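For the multipart-upload recommendation, the AWS SDK for JavaScript v3 ships a helper that splits large uploads into parallel parts automatically. A sketch, assuming placeholder bucket, key, and file names:

```typescript
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import { createReadStream } from "fs";

async function uploadLargeFile() {
    // Upload handles multipart chunking, parallelism, and retries
    const upload = new Upload({
        client: new S3Client({ region: "us-east-1" }),
        params: {
            Bucket: "my-bucket",            // placeholder
            Key: "backups/archive.tar.gz",  // placeholder
            Body: createReadStream("archive.tar.gz"),
        },
        queueSize: 4,                // concurrent parts
        partSize: 16 * 1024 * 1024,  // 16 MB parts (minimum part size is 5 MB)
    });
    await upload.done();
}
```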

EBS Performance

  • gp3: Up to 16,000 IOPS, 1,000 MB/s throughput (independent)
  • io2: Up to 64,000 IOPS, 1,000 MB/s (256,000 IOPS with io2 Block Express)
  • Use EBS-optimized instances
  • Pre-warm volumes created from snapshots
  • Use RAID 0 for higher throughput (no redundancy)

EFS Performance

  • Elastic throughput: Automatically scales (recommended)
  • Bursting: Up to 100 MB/s per TiB stored
  • Provisioned: Fixed throughput regardless of size
  • Use General Purpose mode for latency-sensitive workloads
  • Use Max I/O for maximum aggregate throughput

FSx Performance

  • Lustre: 100s of GB/s, millions of IOPS
  • Windows: Up to 2 GB/s, 100,000s IOPS
  • ONTAP: Up to 2 GB/s, 160,000 IOPS
  • OpenZFS: Up to 1 GB/s, 160,000 IOPS

Cost Optimization

S3 Cost Reduction

  1. Use lifecycle policies - Automatic transitions to cheaper storage classes
  2. Enable Intelligent-Tiering - Automatic optimization for unpredictable access
  3. Delete old versions - Use lifecycle rules for non-current versions
  4. Analyze with S3 Storage Lens - Identify optimization opportunities
  5. Use Requester Pays - Have data consumers pay for transfers

EBS Cost Reduction

  1. Delete unused volumes - Identify unattached volumes
  2. Snapshot and delete old volumes - Archive infrequently used data
  3. Use gp3 instead of gp2 - 20% cheaper, better performance
  4. Right-size volumes - Use CloudWatch metrics
  5. Delete old snapshots - Automated via Data Lifecycle Manager

EFS Cost Reduction

  1. Use lifecycle management - Move to IA storage class after 30 days
  2. Use Elastic throughput - Pay only for what you use
  3. Monitor with CloudWatch - Track actual usage
  4. Consider S3 for cold data - Much cheaper for infrequent access

Quick Links

Core Services

  • S3 - Object Storage
  • EBS - Block Storage
  • EFS - File Systems
  • FSx - Managed File Systems
  • Glacier - Archive Storage
  • AWS Backup


Install with Tessl CLI

npx tessl i tessl/npm-pulumi--aws@7.16.0
