databricks-hello-world

tessl install github:jeremylongshore/claude-code-plugins-plus-skills --skill databricks-hello-world

github.com/jeremylongshore/claude-code-plugins-plus-skills

Create a minimal working Databricks example with cluster and notebook. Use when starting a new Databricks project, testing your setup, or learning basic Databricks patterns. Trigger with phrases like "databricks hello world", "databricks example", "databricks quick start", "first databricks notebook", "create cluster".

Review Score: 88%
Validation Score: 11/16
Implementation Score: 88%
Activation Score: 90%

Databricks Hello World

Overview

Create your first Databricks cluster and notebook to verify setup.

Prerequisites

  • Completed databricks-install-auth setup
  • Valid API credentials configured (a quick verification sketch follows this list)
  • Workspace access with cluster creation permissions
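
To confirm the credentials are actually being picked up, a single authenticated call is enough. A minimal verification sketch, assuming the Python SDK (databricks-sdk) is installed and a default profile or environment variables are configured:

from databricks.sdk import WorkspaceClient

# The client reads credentials from the environment or ~/.databrickscfg
w = WorkspaceClient()
me = w.current_user.me()
print(f"Authenticated as: {me.user_name}")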

Instructions

Step 1: Create a Cluster

# Create a small development cluster via CLI
databricks clusters create --json '{
  "cluster_name": "hello-world-cluster",
  "spark_version": "14.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "autotermination_minutes": 30,
  "num_workers": 0,
  "spark_conf": {
    "spark.databricks.cluster.profile": "singleNode",
    "spark.master": "local[*]"
  },
  "custom_tags": {
    "ResourceClass": "SingleNode"
  }
}'
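
The create call returns the new cluster's ID, and the cluster typically takes a few minutes to reach RUNNING. A small polling sketch using the Python SDK; the cluster ID below is a placeholder for the value returned by the create call:

import time
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import State

w = WorkspaceClient()
cluster_id = "your-cluster-id"  # from the create response

# Poll until the cluster reaches RUNNING (or fails)
while True:
    info = w.clusters.get(cluster_id=cluster_id)
    print(f"Cluster state: {info.state}")
    if info.state in (State.RUNNING, State.ERROR, State.TERMINATED):
        break
    time.sleep(15)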

Step 2: Create a Notebook

# create_hello_world.py - builds the notebook source and imports it via the Workspace API
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create notebook content
notebook_content = """
# Databricks Hello World

# COMMAND ----------

# Simple DataFrame operations
data = [("Alice", 28), ("Bob", 35), ("Charlie", 42)]
df = spark.createDataFrame(data, ["name", "age"])
display(df)

# COMMAND ----------

# Delta Lake example
df.write.format("delta").mode("overwrite").save("/tmp/hello_world_delta")

# COMMAND ----------

# Read it back
df_read = spark.read.format("delta").load("/tmp/hello_world_delta")
display(df_read)

# COMMAND ----------

print("Hello from Databricks!")
"""

import base64
from databricks.sdk.service.workspace import ImportFormat, Language

# Import the source as a Python notebook (content must be base64-encoded)
w.workspace.import_(
    path="/Users/your-email/hello_world",
    format=ImportFormat.SOURCE,
    language=Language.PYTHON,
    content=base64.b64encode(notebook_content.encode()).decode(),
    overwrite=True
)
print("Notebook created!")

Step 3: Run the Notebook

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import NotebookTask, SubmitTask

w = WorkspaceClient()

# Submit a one-time run of the notebook on the existing cluster
run = w.jobs.submit(
    run_name="hello-world-run",
    tasks=[
        SubmitTask(
            task_key="hello",
            existing_cluster_id="your-cluster-id",
            notebook_task=NotebookTask(
                notebook_path="/Users/your-email/hello_world"
            )
        )
    ]
)

# Block until the run reaches a terminal state
result = run.result()
print(f"Run completed with state: {result.state.result_state}")

Step 4: Verify with CLI

# List clusters
databricks clusters list

# Get cluster status
databricks clusters get --cluster-id your-cluster-id

# List workspace contents
databricks workspace list /Users/your-email/

# Get run output
databricks runs get-output --run-id your-run-id

Output

  • Development cluster created and running
  • Hello world notebook created in workspace
  • Successful notebook execution
  • Delta table created at /tmp/hello_world_delta

Error Handling

Error | Cause | Solution
Cluster quota exceeded | Workspace limits | Terminate unused clusters
Invalid node type | Wrong instance type | Check available node types
Notebook path exists | Duplicate path | Use overwrite=True
Cluster pending | Startup in progress | Wait for RUNNING state
Permission denied | Insufficient privileges | Request workspace admin access
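
Two of these checks are easy to script with the SDK. A sketch, assuming the same authenticated client as in the earlier steps:

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# "Invalid node type": see which node types this workspace actually offers
for nt in w.clusters.list_node_types().node_types[:10]:
    print(nt.node_type_id, nt.num_cores, nt.memory_mb)

# "Cluster quota exceeded": find clusters that could be terminated to free capacity
for c in w.clusters.list():
    print(c.cluster_id, c.cluster_name, c.state)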

Examples

Interactive Cluster (Cost-Effective Dev)

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create a single-node cluster for development and wait until it is running
cluster = w.clusters.create_and_wait(
    cluster_name="dev-cluster",
    spark_version="14.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    num_workers=0,
    autotermination_minutes=30,
    spark_conf={
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]"
    },
    custom_tags={"ResourceClass": "SingleNode"}
)
print(f"Cluster created: {cluster.cluster_id}")

SQL Warehouse (Serverless)

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import CreateWarehouseRequestWarehouseType

w = WorkspaceClient()

# Create a serverless SQL warehouse for queries and wait until it is available
warehouse = w.warehouses.create_and_wait(
    name="hello-warehouse",
    cluster_size="2X-Small",
    auto_stop_mins=15,
    warehouse_type=CreateWarehouseRequestWarehouseType.PRO,
    enable_serverless_compute=True
)
print(f"Warehouse created: {warehouse.id}")

Quick DataFrame Test

# Run in notebook or Databricks Connect
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Create sample data
df = spark.range(1000).toDF("id")
df = df.withColumn("value", df.id * 2)

# Show results
df.show(5)
print(f"Row count: {df.count()}")

Resources

  • Databricks Quickstart
  • Cluster Configuration
  • Notebooks Guide
  • Delta Lake Quickstart

Next Steps

Proceed to databricks-local-dev-loop for local development setup.