A Pulumi package for creating and managing Amazon Web Services (AWS) cloud resources with infrastructure-as-code.
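As a quick orientation, a complete Pulumi program using this package can be a single resource declaration. The sketch below is illustrative (it assumes the Pulumi CLI is installed and AWS credentials are configured); the bucket name is a placeholder:

```typescript
import * as aws from "@pulumi/aws";

// Declare an S3 bucket; Pulumi provisions it on `pulumi up`
const bucket = new aws.s3.Bucket("my-bucket", {
    tags: { environment: "dev" },
});

// Stack output: the generated bucket name
export const bucketName = bucket.id;
```

Running `pulumi up` previews and applies the change; `pulumi destroy` tears it down.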
## Analytics

AWS analytics services for big data processing, streaming, search, and business intelligence.
### Amazon Athena

Query data in S3 using standard SQL without infrastructure management.

```typescript
import { athena } from "@pulumi/aws";

// Create a workgroup for query isolation
const workgroup = new athena.Workgroup("analytics", {
    configuration: {
        resultConfiguration: {
            outputLocation: `s3://${queryResultsBucket.id}/results/`,
        },
    },
});

// Define a named query
const salesQuery = new athena.NamedQuery("daily-sales", {
    database: "analytics_db",
    query: "SELECT date, SUM(amount) FROM sales GROUP BY date",
    workgroup: workgroup.id,
});
```

**Key Resources:** Database, NamedQuery, Workgroup, DataCatalog, PreparedStatement

**Use Cases:** Ad-hoc SQL queries, log analysis, data exploration, cost-effective analytics
### Amazon EMR

Process vast amounts of data using Hadoop, Spark, Hive, and other big data frameworks.

```typescript
import { emr } from "@pulumi/aws";

// Create an EMR cluster for Spark jobs
const cluster = new emr.Cluster("data-processing", {
    releaseLabel: "emr-6.15.0",
    applications: ["Spark", "Hadoop", "Hive"],
    serviceRole: emrServiceRole.arn,
    ec2Attributes: {
        instanceProfile: emrInstanceProfile.arn,
        subnetId: subnet.id,
    },
    masterInstanceGroup: {
        instanceType: "m5.xlarge",
    },
    coreInstanceGroup: {
        instanceType: "m5.xlarge",
        instanceCount: 2,
    },
});
```

**Key Resources:** Cluster, InstanceFleet, InstanceGroup, SecurityConfiguration, ManagedScalingPolicy, Studio

**Use Cases:** Large-scale data processing, machine learning pipelines, ETL workflows, genomics analysis
### Amazon Kinesis

Collect, process, and analyze real-time streaming data at scale.

```typescript
import { kinesis } from "@pulumi/aws";

// Create a data stream
const stream = new kinesis.Stream("events", {
    shardCount: 2,
    retentionPeriod: 24,
    streamModeDetails: {
        streamMode: "PROVISIONED",
    },
});

// Create Kinesis Data Firehose for S3 delivery
const firehose = new kinesis.FirehoseDeliveryStream("to-s3", {
    destination: "extended_s3",
    extendedS3Configuration: {
        roleArn: firehoseRole.arn,
        bucketArn: dataBucket.arn,
        prefix: "data/year=!{timestamp:yyyy}/month=!{timestamp:MM}/",
        bufferingSize: 5,
        bufferingInterval: 300,
    },
});

// Kinesis Analytics for real-time processing
const application = new kinesis.AnalyticsApplication("processor", {
    inputs: {
        namePrefix: "SOURCE_SQL_STREAM",
        kinesisStream: {
            resourceArn: stream.arn,
            roleArn: analyticsRole.arn,
        },
        schema: {
            recordColumns: [
                { name: "event_type", sqlType: "VARCHAR(64)" },
                { name: "timestamp", sqlType: "TIMESTAMP" },
            ],
        },
    },
});
```

**Key Resources:** Stream, FirehoseDeliveryStream, AnalyticsApplication, ResourcePolicy, StreamConsumer

**Use Cases:** Clickstream analysis, IoT data ingestion, log aggregation, real-time dashboards
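The stream above uses provisioned capacity with an explicit shard count. For spiky or unpredictable workloads, Kinesis also supports an on-demand capacity mode, in which AWS manages shard scaling and no `shardCount` is specified. A minimal sketch:

```typescript
import { kinesis } from "@pulumi/aws";

// On-demand stream: shard scaling is managed by AWS
const onDemandStream = new kinesis.Stream("events-on-demand", {
    retentionPeriod: 24,
    streamModeDetails: {
        streamMode: "ON_DEMAND",
    },
});
```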
### AWS Glue

Serverless ETL service with an integrated data catalog for data discovery.

```typescript
import { glue } from "@pulumi/aws";

// Create a Glue database
const database = new glue.CatalogDatabase("analytics", {
    name: "analytics_db",
});

// Define a crawler to discover schema
const crawler = new glue.Crawler("s3-data", {
    databaseName: database.name,
    role: crawlerRole.arn,
    s3Targets: [{
        path: `s3://${dataBucket.id}/raw/`,
    }],
    schedule: "cron(0 1 * * ? *)",
});

// Create an ETL job
const etlJob = new glue.Job("transform", {
    roleArn: glueRole.arn,
    glueVersion: "4.0",
    command: {
        scriptLocation: `s3://${scriptsBucket.id}/transform.py`,
        pythonVersion: "3",
    },
    defaultArguments: {
        "--TempDir": `s3://${tempBucket.id}/temp/`,
        "--job-language": "python",
    },
});
```

**Key Resources:** CatalogDatabase, CatalogTable, Crawler, Job, Trigger, Connection, Workflow, Schema, Registry

**Use Cases:** Data cataloging, schema discovery, ETL pipelines, data preparation, data quality
### Amazon OpenSearch Service

Managed search and analytics engine (successor to Amazon Elasticsearch Service) for log analytics, application monitoring, and search.

```typescript
import { opensearch } from "@pulumi/aws";

// Create OpenSearch domain
const domain = new opensearch.Domain("logs", {
    engineVersion: "OpenSearch_2.11",
    clusterConfig: {
        instanceType: "r6g.large.search",
        instanceCount: 3,
        zoneAwarenessEnabled: true,
        zoneAwarenessConfig: {
            availabilityZoneCount: 3,
        },
    },
    ebsOptions: {
        ebsEnabled: true,
        volumeSize: 100,
        volumeType: "gp3",
    },
    encryptAtRest: {
        enabled: true,
    },
    nodeToNodeEncryption: {
        enabled: true,
    },
});
```

**Key Resources:** Domain, DomainPolicy, DomainSamlOptions, ServerlessCollection, ServerlessAccessPolicy

**Use Cases:** Full-text search, log analytics, application monitoring, security analytics, operational intelligence
### Amazon QuickSight

Cloud-native BI service for interactive dashboards and visualizations.

```typescript
import { quicksight } from "@pulumi/aws";

// Create a data source
const dataSource = new quicksight.DataSource("athena-source", {
    dataSourceId: "athena-analytics",
    name: "Athena Analytics",
    type: "ATHENA",
    parameters: {
        athena: {
            workGroup: athenaWorkgroup.name,
        },
    },
});

// Create a dataset
const dataset = new quicksight.DataSet("sales", {
    dataSetId: "sales-data",
    name: "Sales Data",
    importMode: "DIRECT_QUERY",
    physicalTableMaps: [{
        physicalTableMapId: "sales-table",
        relationalTable: {
            dataSourceArn: dataSource.arn,
            schema: "analytics_db",
            name: "sales",
        },
    }],
});
```

**Key Resources:** Analysis, Dashboard, DataSet, DataSource, Template, Theme, User, Group

**Use Cases:** Interactive dashboards, ad-hoc analysis, embedded analytics, mobile BI
### Amazon DataZone

Catalog, discover, share, and govern data across your organization.

```typescript
import { datazone } from "@pulumi/aws";

// Create a DataZone domain
const domain = new datazone.Domain("data-portal", {
    name: "enterprise-data",
    description: "Enterprise data catalog",
    domainExecutionRole: dataZoneRole.arn,
});

// Create a project
const project = new datazone.Project("analytics-project", {
    domainIdentifier: domain.id,
    name: "Analytics Team",
    description: "Analytics team project",
});
```

**Key Resources:** Domain, Project, Environment, EnvironmentBlueprintConfiguration, FormType, Glossary

**Use Cases:** Data cataloging, data discovery, access control, data governance, collaboration
### AWS Lake Formation

Centrally manage permissions and set up data lakes on S3.

```typescript
import { lakeformation } from "@pulumi/aws";

// Register S3 location as data lake
const resource = new lakeformation.Resource("data-lake", {
    arn: dataBucket.arn,
});

// Grant permissions on database
const permissions = new lakeformation.Permissions("analytics-access", {
    principal: analyticsRole.arn,
    permissions: ["SELECT", "DESCRIBE"],
    database: {
        name: glueDatabaseName,
    },
});
```

**Key Resources:** Resource, Permissions, DataLakeSettings, LfTag, LfTagAssociation

**Use Cases:** Data lake security, fine-grained access control, centralized permissions, data sharing
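Beyond named-resource grants like the one above, Lake Formation also supports tag-based access control through the LfTag resource. A minimal sketch, with an illustrative tag key and values:

```typescript
import { lakeformation } from "@pulumi/aws";

// Define an LF-Tag; associate it with databases/tables via LfTagAssociation,
// then grant permissions on the tag rather than on each resource
const envTag = new lakeformation.LfTag("environment", {
    key: "environment",
    values: ["dev", "staging", "prod"],
});
```

Tag-based grants scale better than per-resource grants once a data lake contains many databases and tables.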
### Amazon MSK

Fully managed Apache Kafka service for streaming data pipelines and applications.

```typescript
import { msk } from "@pulumi/aws";

// Create MSK cluster
const cluster = new msk.Cluster("events", {
    clusterName: "event-streaming",
    kafkaVersion: "3.5.1",
    numberOfBrokerNodes: 3,
    brokerNodeGroupInfo: {
        instanceType: "kafka.m5.large",
        clientSubnets: subnetIds,
        storageInfo: {
            ebsStorageInfo: {
                volumeSize: 1000,
            },
        },
    },
    encryptionInfo: {
        encryptionInTransit: {
            clientBroker: "TLS",
            inCluster: true,
        },
    },
});
```

**Key Resources:** Cluster, Configuration, ClusterPolicy, Replicator, ServerlessCluster, VpcConnection

**Use Cases:** Event streaming, log aggregation, real-time analytics, change data capture, microservices communication
### Amazon Redshift

Fully managed, petabyte-scale data warehouse for analytics.

```typescript
import { redshift } from "@pulumi/aws";

// Create Redshift cluster
const cluster = new redshift.Cluster("warehouse", {
    clusterIdentifier: "analytics-warehouse",
    nodeType: "ra3.xlplus",
    numberOfNodes: 2,
    databaseName: "analytics",
    masterUsername: "admin",
    masterPassword: adminPassword,
    encrypted: true,
    kmsKeyId: kmsKey.id,
});

// Create Redshift Serverless namespace and workgroup
const namespace = new redshift.ServerlessNamespace("analytics", {
    namespaceName: "analytics-serverless",
    dbName: "analytics",
    adminUsername: "admin",
    adminUserPassword: adminPassword,
});

const workgroup = new redshift.ServerlessWorkgroup("analytics", {
    workgroupName: "analytics-workgroup",
    namespaceName: namespace.namespaceName,
    baseCapacity: 32,
});
```

**Key Resources:** Cluster, ServerlessNamespace, ServerlessWorkgroup, ParameterGroup, SubnetGroup, SnapshotSchedule

**Use Cases:** Business intelligence, complex queries, historical data analysis, data consolidation
For the complete service list, see All Services A-Z.
## Install with Tessl CLI

```shell
npx tessl i tessl/npm-pulumi--aws@7.16.0
```