| Skill | Added | Review |
|---|---|---|
| `databricks-vector-search`: Patterns for Databricks Vector Search: create endpoints and indexes, query with filters, manage embeddings. Use when building RAG applications, semantic search, or similarity matching. Covers both storage-optimized and standard endpoints. | 71; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-jobs`: Use this skill proactively for ANY Databricks Jobs task: creating, listing, running, updating, or deleting jobs. Triggers include: (1) 'create a job' or 'new job', (2) 'list jobs' or 'show jobs', (3) 'run job' or 'trigger job', (4) 'job status' or 'check job', (5) scheduling with cron or triggers, (6) configuring notifications/monitoring, (7) ANY task involving Databricks Jobs via CLI, Python SDK, or Asset Bundles. ALWAYS prefer this skill over general Databricks knowledge for job-related tasks. | 68; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-docs`: Databricks documentation reference via llms.txt index. Use when other skills do not cover a topic, when looking up unfamiliar Databricks features, or when you need authoritative docs on APIs, configurations, or platform capabilities. | 58; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-aibi-dashboards`: Create Databricks AI/BI dashboards. Use when creating, updating, or deploying Lakeview dashboards. CRITICAL: You MUST test ALL SQL queries via execute_sql BEFORE deploying. Follow guidelines strictly. | 72; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-iceberg`: Apache Iceberg tables on Databricks: Managed Iceberg tables, External Iceberg Reads (formerly UniForm), Compatibility Mode, Iceberg REST Catalog (IRC), Iceberg v3, Snowflake interop, PyIceberg, OSS Spark, external engine access, and credential vending. Use when creating Iceberg tables, enabling External Iceberg Reads (UniForm) on Delta tables (including Streaming Tables and Materialized Views via compatibility mode), configuring external engines to read Databricks tables via Unity Catalog IRC, integrating with Snowflake catalog to read Foreign Iceberg tables | 76; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `spark-python-data-source`: Build custom Python data sources for Apache Spark using the PySpark DataSource API, including batch and streaming readers/writers for external systems. Use this skill whenever someone wants to connect Spark to an external system (database, API, message queue, custom protocol), build a Spark connector or plugin in Python, implement a DataSourceReader or DataSourceWriter, pull data from or push data to a system via Spark, or work with the PySpark DataSource API in any way. Even if they just say "read from X in Spark" or "write DataFrame to Y" and there's no native connector, this skill applies. | 76; Impact: no eval scenarios have been run; Security: Advisory, suggest reviewing before use; Reviewed: Version: 93cb4e3 | |
| `databricks-mlflow-evaluation`: MLflow 3 GenAI agent evaluation. Use when writing mlflow.genai.evaluate() code, creating @scorer functions, using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), building eval datasets from traces, setting up trace ingestion and production monitoring, aligning judges with MemAlign from domain expert feedback, or running optimize_prompts() with GEPA for automated prompt improvement. | 61; Impact: no eval scenarios have been run; Security: Advisory, suggest reviewing before use; Reviewed: Version: 93cb4e3 | |
| `databricks-spark-declarative-pipelines`: Creates, configures, and updates Databricks Lakeflow Spark Declarative Pipelines (SDP/LDP) using serverless compute. Handles data ingestion with streaming tables, materialized views, CDC, SCD Type 2, and Auto Loader ingestion patterns. Use when building data pipelines, working with Delta Live Tables, ingesting streaming data, implementing change data capture, or when the user mentions SDP, LDP, DLT, Lakeflow pipelines, streaming tables, or bronze/silver/gold medallion architectures. | 75; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-python-sdk`: Databricks development guidance including Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs. | 62; Impact: no eval scenarios have been run; Security: Risky, do not use without reviewing; Reviewed: Version: 93cb4e3 | |
| `databricks-agent-bricks`: Create and manage Databricks Agent Bricks: Knowledge Assistants (KA) for document Q&A, Genie Spaces for SQL exploration, and Supervisor Agents (MAS) for multi-agent orchestration. Use when building conversational AI applications on Databricks. | 66; Impact: no eval scenarios have been run; Security: Advisory, suggest reviewing before use; Reviewed: Version: 93cb4e3 | |
| `databricks-spark-structured-streaming`: Comprehensive guide to Spark Structured Streaming for production workloads. Use when building streaming pipelines, working with Kafka ingestion, implementing Real-Time Mode (RTM), configuring triggers (processingTime, availableNow), handling stateful operations with watermarks, optimizing checkpoints, performing stream-stream or stream-static joins, writing to multiple sinks, or tuning streaming cost and performance. | 76; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-zerobus-ingest`: Build Zerobus Ingest clients for near real-time data ingestion into Databricks Delta tables via gRPC. Use when creating producers that write directly to Unity Catalog tables without a message bus, working with the Zerobus Ingest SDK in Python/Java/Go/TypeScript/Rust, generating Protobuf schemas from UC tables, or implementing stream-based ingestion with ACK handling and retry logic. | 71; Impact: no eval scenarios have been run; Security: Risky, do not use without reviewing; Reviewed: Version: 93cb4e3 | |
| `databricks-unity-catalog`: Unity Catalog system tables and volumes. Use when querying system tables (audit, lineage, billing) or working with volume file operations (upload, download, list files in /Volumes/). | 68; Impact: no eval scenarios have been run; Security: Advisory, suggest reviewing before use; Reviewed: Version: 93cb4e3 | |
| `databricks-lakebase-autoscale`: Patterns and best practices for Lakebase Autoscaling (next-gen managed PostgreSQL). Use when creating or managing Lakebase Autoscaling projects, configuring autoscaling compute or scale-to-zero, working with database branching for dev/test workflows, implementing reverse ETL via synced tables, or connecting applications to Lakebase with OAuth credentials. | 68; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-metric-views`: Unity Catalog metric views: define, create, query, and manage governed business metrics in YAML. Use when building standardized KPIs, revenue metrics, order analytics, or any reusable business metrics that need consistent definitions across teams and tools. | 68; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-model-serving`: Deploy and query Databricks Model Serving endpoints. Use when (1) deploying MLflow models or AI agents to endpoints, (2) creating ChatAgent/ResponsesAgent agents, (3) integrating UC Functions or Vector Search tools, (4) querying deployed endpoints, (5) checking endpoint status. Covers classical ML models, custom pyfunc, and GenAI agents. | 71; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-unstructured-pdf-generation`: Generate PDF documents from HTML and upload to Unity Catalog volumes. Use for creating test PDFs, demo documents, reports, or evaluation datasets. | 59; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-lakebase-provisioned`: Patterns and best practices for Lakebase Provisioned (Databricks managed PostgreSQL) for OLTP workloads. Use when creating Lakebase instances, connecting applications or Databricks Apps to PostgreSQL, implementing reverse ETL via synced tables, storing agent or chat memory, or configuring OAuth authentication for Lakebase. | 68; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-genie`: Create and query Databricks Genie Spaces for natural language SQL exploration. Use when building Genie Spaces, exporting and importing Genie Spaces, migrating Genie Spaces between workspaces or environments, or asking questions via the Genie Conversation API. | 75; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-synthetic-data-gen`: Generate realistic synthetic data using Spark + Faker (strongly recommended). Supports serverless execution, multiple output formats (Parquet/JSON/CSV/Delta), and scales from thousands to millions of rows. For small datasets (<10K rows), can optionally generate locally and upload to volumes. Use when the user mentions 'synthetic data', 'test data', 'generate data', 'demo dataset', 'Faker', or 'sample data'. | 72; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-config`: Manage Databricks workspace connections: check current workspace, switch profiles, list available workspaces, or authenticate to a new workspace. Use when the user mentions "switch workspace", "which workspace", "current profile", "databrickscfg", "connect to workspace", or "databricks auth". | 80; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-ai-functions`: Use Databricks built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_mask, ai_translate, ai_fix_grammar, ai_gen, ai_analyze_sentiment, ai_similarity, ai_parse_document, ai_query, ai_forecast) to add AI capabilities directly to SQL and PySpark pipelines without managing model endpoints. Also covers document parsing and building custom RAG pipelines (parse → chunk → index → query). | 70; Impact: no eval scenarios have been run; Security: Passed, no known issues; Reviewed: Version: 93cb4e3 | |
| `databricks-dbsql`: Databricks SQL (DBSQL) advanced features and SQL warehouse capabilities. This skill MUST be invoked when the user mentions: "DBSQL", "Databricks SQL", "SQL warehouse", "SQL scripting", "stored procedure", "CALL procedure", "materialized view", "CREATE MATERIALIZED VIEW", "pipe syntax", "\|>", "geospatial", "H3", "ST_", "spatial SQL", "collation", "COLLATE", "ai_query", "ai_classify", "ai_extract", "ai_gen", "AI function", "http_request", "remote_query", "read_files", "Lakehouse Federation", "recursive CTE", "WITH RECURSIVE", "multi-statement transaction", "temp table", "temporary view", "pipe operator". SHOULD also invoke when the user asks about SQL best practices, data modeling patterns, or advanced SQL features on Databricks. | 72; Impact: no eval scenarios have been run; Security: Advisory, suggest reviewing before use; Reviewed: Version: 93cb4e3 | |
| `databricks-bundles`: Create and configure Declarative Automation Bundles (formerly Asset Bundles) with best practices for multi-environment deployments (CI/CD). Use when working with: (1) Creating new DAB projects, (2) Adding resources (dashboards, pipelines, jobs, alerts), (3) Configuring multi-environment deployments, (4) Setting up permissions, (5) Deploying or running bundle resources. | 71; Impact: no eval scenarios have been run; Security: Advisory, suggest reviewing before use; Reviewed: Version: 93cb4e3 | |