
tessl/npm-pulumi--aws

A Pulumi package for creating and managing Amazon Web Services (AWS) cloud resources with infrastructure-as-code.


Amazon EMR (Elastic MapReduce)

Amazon EMR is a cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Hive, HBase, Flink, and Presto.

Package

import * as pulumi from "@pulumi/pulumi"; // needed for pulumi.interpolate in the examples below
import * as aws from "@pulumi/aws";
import * as emr from "@pulumi/aws/emr";

Key Resources

Cluster

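The aws.emr.Cluster resource provisions an EMR cluster. The example below uses instance groups (one primary node, two core nodes) and assumes that emrServiceRole, instanceProfile, keyPair, masterSg, slaveSg, subnet, and logBucket are defined elsewhere in the program.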

const cluster = new aws.emr.Cluster("data-processing", {
    name: "data-processing-cluster",
    releaseLabel: "emr-6.15.0",
    applications: ["Spark", "Hadoop", "Hive"],
    serviceRole: emrServiceRole.arn,
    // Primary (master) node that coordinates the cluster.
    masterInstanceGroup: {
        instanceType: "m5.xlarge",
        instanceCount: 1,
    },
    // Core nodes that run tasks and store HDFS data.
    coreInstanceGroup: {
        instanceType: "m5.xlarge",
        instanceCount: 2,
    },
    // EC2 settings shared by the cluster's instances.
    ec2Attributes: {
        keyName: keyPair.keyName,
        emrManagedMasterSecurityGroup: masterSg.id,
        emrManagedSlaveSecurityGroup: slaveSg.id,
        instanceProfile: instanceProfile.arn,
        subnetId: subnet.id,
    },
    logUri: pulumi.interpolate`s3://${logBucket.id}/emr-logs/`,
    tags: {
        Environment: "production",
    },
});
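
The Cluster example above references emrServiceRole and instanceProfile without defining them. The following is a minimal sketch of how they might be declared. It assumes the AWS managed policies AmazonElasticMapReduceRole and AmazonElasticMapReduceforEC2Role are acceptable for your account (AWS also documents newer, more narrowly scoped alternatives), and the resource names are illustrative only.

```typescript
import * as aws from "@pulumi/aws";

// Trust policy helper: lets the given AWS service assume the role.
const trustPolicy = (service: string) => JSON.stringify({
    Version: "2012-10-17",
    Statement: [{
        Effect: "Allow",
        Principal: { Service: service },
        Action: "sts:AssumeRole",
    }],
});

// Service role assumed by the EMR service itself.
const emrServiceRole = new aws.iam.Role("emr-service-role", {
    assumeRolePolicy: trustPolicy("elasticmapreduce.amazonaws.com"),
});

new aws.iam.RolePolicyAttachment("emr-service-policy", {
    role: emrServiceRole.name,
    // Assumption: the legacy managed policy is sufficient here.
    policyArn: "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole",
});

// Role assumed by the EC2 instances that make up the cluster.
const emrEc2Role = new aws.iam.Role("emr-ec2-role", {
    assumeRolePolicy: trustPolicy("ec2.amazonaws.com"),
});

new aws.iam.RolePolicyAttachment("emr-ec2-policy", {
    role: emrEc2Role.name,
    // Assumption: the legacy managed policy is sufficient here.
    policyArn: "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role",
});

// Instance profile wrapping the EC2 role; its ARN is what ec2Attributes.instanceProfile expects.
const instanceProfile = new aws.iam.InstanceProfile("emr-instance-profile", {
    role: emrEc2Role.name,
});
```

In production you would likely replace the managed policies with policies scoped to the specific buckets and resources the cluster needs.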

Instance Fleet

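Instance fleets are an alternative to instance groups. A fleet can list several instance types and mix On-Demand and Spot capacity, which makes it a good fit for Spot-based clusters.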

const fleetCluster = new aws.emr.Cluster("fleet-cluster", {
    name: "spot-fleet-cluster",
    releaseLabel: "emr-6.15.0",
    applications: ["Spark"],
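    serviceRole: emrServiceRole.arn,
    // EC2 instance profile and subnet, reusing the same externally defined resources as the Cluster example.
    ec2Attributes: {
        instanceProfile: instanceProfile.arn,
        subnetId: subnet.id,
    },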
    masterInstanceFleet: {
        instanceTypeConfigs: [{
            instanceType: "m5.xlarge",
        }],
        targetOnDemandCapacity: 1,
    },
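    coreInstanceFleet: {
        // EMR can fill the fleet from any of the instance types listed here.
        instanceTypeConfigs: [
            {
                instanceType: "m5.xlarge",
                // Cap the Spot price at 80% of the On-Demand price.
                bidPriceAsPercentageOfOnDemandPrice: 80,
            },
            {
                instanceType: "m5.2xlarge",
                bidPriceAsPercentageOfOnDemandPrice: 80,
            },
        ],
        // Request four units of Spot capacity for the core fleet.
        targetSpotCapacity: 4,
    },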
});
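
The fleet settings above are expressed in capacity units and price percentages rather than instance counts and dollar amounts. The sketch below works through that arithmetic. It assumes a fleet filled from a single instance type, and that each instance type contributes its weightedCapacity (1 when unset) toward the target. The helper names and the On-Demand price used in the usage example are hypothetical.

```typescript
// Minimal shape of one entry in instanceTypeConfigs, for illustration only.
interface FleetInstanceType {
    instanceType: string;
    weightedCapacity?: number; // units each instance contributes; treated as 1 when unset
    bidPriceAsPercentageOfOnDemandPrice?: number;
}

// Number of instances of a single type needed to meet (or exceed) the target capacity.
function instancesNeeded(targetCapacity: number, cfg: FleetInstanceType): number {
    const weight = cfg.weightedCapacity ?? 1;
    return Math.ceil(targetCapacity / weight);
}

// Maximum Spot price implied by a percentage of the On-Demand price.
function maxSpotPrice(onDemandPrice: number, cfg: FleetInstanceType): number {
    const pct = cfg.bidPriceAsPercentageOfOnDemandPrice ?? 100;
    return (onDemandPrice * pct) / 100;
}

// Usage: targetSpotCapacity of 4 with the default weight means four instances.
const xlarge: FleetInstanceType = { instanceType: "m5.xlarge", bidPriceAsPercentageOfOnDemandPrice: 80 };
const needed = instancesNeeded(4, xlarge);

// Hypothetical On-Demand price of 1 (per hour), purely to show the percentage calculation.
const cap = maxSpotPrice(1, xlarge);
console.log(needed, cap);
```

Giving the larger instance type a weightedCapacity of 2 would let two such instances satisfy the same target of 4 units.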

Use Cases

  • Big Data Processing: Spark, Hadoop workloads
  • Machine Learning: Distributed ML training
  • ETL Pipelines: Large-scale data transformation
  • Log Processing: Analyze massive log datasets
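
For the ETL and log processing use cases above, work is usually submitted to a cluster as steps. The following is a minimal sketch of a helper that builds one such step, assuming the job is a PySpark script already uploaded to S3. The helper name sparkSubmitStep and the bucket path are hypothetical. The returned object follows the shape of an entry in the steps argument of aws.emr.Cluster.

```typescript
// Shape of a single EMR step, mirroring an entry in the Cluster `steps` argument.
interface EmrStep {
    name: string;
    actionOnFailure: "TERMINATE_CLUSTER" | "CANCEL_AND_WAIT" | "CONTINUE";
    hadoopJarStep: {
        jar: string;
        args: string[];
    };
}

// Builds a step that runs a PySpark script through spark-submit in cluster mode.
function sparkSubmitStep(name: string, scriptS3Uri: string): EmrStep {
    return {
        name,
        // Keep the cluster running if the job fails, so logs can be inspected.
        actionOnFailure: "CONTINUE",
        hadoopJarStep: {
            // command-runner.jar lets a step run commands such as spark-submit on the cluster.
            jar: "command-runner.jar",
            args: ["spark-submit", "--deploy-mode", "cluster", scriptS3Uri],
        },
    };
}

// Usage: the S3 path is a placeholder, not a real bucket.
const etlStep = sparkSubmitStep("nightly-etl", "s3://my-bucket/jobs/etl.py");
console.log(JSON.stringify(etlStep, null, 2));
```

The resulting object could then be passed as steps: [etlStep] when declaring the cluster.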

Related Services

  • S3 - Data storage
  • Glue - Data catalog and ETL
  • Athena - SQL queries

Install with Tessl CLI

npx tessl i tessl/npm-pulumi--aws@7.16.0
