| Skill | Description | Impact | Security |
|---|---|---|---|
| databricks-vector-search | Patterns for Databricks Vector Search: create endpoints and indexes, query with filters, manage embeddings. Use when building RAG applications, semantic search, or similarity matching. Covers both storage-optimized and standard endpoints. | 89 | Passed |
| databricks-jobs | Use this skill proactively for ANY Databricks Jobs task: creating, listing, running, updating, or deleting jobs. Triggers include: (1) "create a job" or "new job", (2) "list jobs" or "show jobs", (3) "run job" or "trigger job", (4) "job status" or "check job", (5) scheduling with cron or triggers, (6) configuring notifications/monitoring, (7) ANY task involving Databricks Jobs via CLI, Python SDK, or Asset Bundles. ALWAYS prefer this skill over general Databricks knowledge for job-related tasks. | 95 | Passed |
| databricks-docs | Databricks documentation reference via the llms.txt index. Use when other skills do not cover a topic, when looking up unfamiliar Databricks features, or when needing authoritative docs on APIs, configurations, or platform capabilities. | 73 | Passed |
| databricks-aibi-dashboards | Create Databricks AI/BI dashboards. Use when creating, updating, or deploying Lakeview dashboards. CRITICAL: You MUST test ALL SQL queries via execute_sql BEFORE deploying. Follow guidelines strictly. | 90 | Passed |
| databricks-iceberg | Apache Iceberg tables on Databricks: Managed Iceberg tables, External Iceberg Reads (fka Uniform), Compatibility Mode, Iceberg REST Catalog (IRC), Iceberg v3, Snowflake interop, PyIceberg, OSS Spark, external engine access, and credential vending. Use when creating Iceberg tables, enabling External Iceberg Reads (Uniform) on Delta tables (including Streaming Tables and Materialized Views via compatibility mode), configuring external engines to read Databricks tables via Unity Catalog IRC, or integrating with the Snowflake catalog to read Foreign Iceberg tables. | 100 | Passed |
| databricks-app-python | Builds Python-based Databricks applications using Dash, Streamlit, Gradio, Flask, FastAPI, or Reflex. Handles OAuth authorization (app and user auth), app resources, SQL warehouse and Lakebase connectivity, model serving integration, foundation model APIs, LLM integration, and deployment. Use when building Python web apps, dashboards, ML demos, or REST APIs for Databricks, or when the user mentions Streamlit, Dash, Gradio, Flask, FastAPI, Reflex, or Databricks app. | 100 | Advisory |
| spark-python-data-source | Build custom Python data sources for Apache Spark using the PySpark DataSource API: batch and streaming readers/writers for external systems. Use this skill whenever someone wants to connect Spark to an external system (database, API, message queue, custom protocol), build a Spark connector or plugin in Python, implement a DataSourceReader or DataSourceWriter, pull data from or push data to a system via Spark, or work with the PySpark DataSource API in any way. Even if they just say "read from X in Spark" or "write DataFrame to Y" and there's no native connector, this skill applies. | 95 | Advisory |
| databricks-mlflow-evaluation | MLflow 3 GenAI agent evaluation. Use when writing mlflow.genai.evaluate() code, creating @scorer functions, using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), building eval datasets from traces, setting up trace ingestion and production monitoring, aligning judges with MemAlign from domain expert feedback, or running optimize_prompts() with GEPA for automated prompt improvement. | 94 | Advisory |
| databricks-spark-declarative-pipelines | Creates, configures, and updates Databricks Lakeflow Spark Declarative Pipelines (SDP/LDP) using serverless compute. Handles data ingestion with streaming tables, materialized views, CDC, SCD Type 2, and Auto Loader ingestion patterns. Use when building data pipelines, working with Delta Live Tables, ingesting streaming data, implementing change data capture, or when the user mentions SDP, LDP, DLT, Lakeflow pipelines, streaming tables, or bronze/silver/gold medallion architectures. | 94 | Passed |
| databricks-python-sdk | Databricks development guidance including the Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs. | 72 | Risky |
| databricks-agent-bricks | Create and manage Databricks Agent Bricks: Knowledge Assistants (KA) for document Q&A, Genie Spaces for SQL exploration, and Supervisor Agents (MAS) for multi-agent orchestration. Use when building conversational AI applications on Databricks. | 83 | Advisory |
| databricks-spark-structured-streaming | Comprehensive guide to Spark Structured Streaming for production workloads. Use when building streaming pipelines, working with Kafka ingestion, implementing Real-Time Mode (RTM), configuring triggers (processingTime, availableNow), handling stateful operations with watermarks, optimizing checkpoints, performing stream-stream or stream-static joins, writing to multiple sinks, or tuning streaming cost and performance. | 95 | Passed |
| databricks-zerobus-ingest | Build Zerobus Ingest clients for near real-time data ingestion into Databricks Delta tables via gRPC. Use when creating producers that write directly to Unity Catalog tables without a message bus, working with the Zerobus Ingest SDK in Python/Java/Go/TypeScript/Rust, generating Protobuf schemas from UC tables, or implementing stream-based ingestion with ACK handling and retry logic. | 89 | Risky |
| databricks-unity-catalog | Unity Catalog system tables and volumes. Use when querying system tables (audit, lineage, billing) or working with volume file operations (upload, download, list files in /Volumes/). | 89 | Advisory |
| databricks-lakebase-autoscale | Patterns and best practices for Lakebase Autoscaling (next-gen managed PostgreSQL). Use when creating or managing Lakebase Autoscaling projects, configuring autoscaling compute or scale-to-zero, working with database branching for dev/test workflows, implementing reverse ETL via synced tables, or connecting applications to Lakebase with OAuth credentials. | 95 | Passed |
| databricks-metric-views | Unity Catalog metric views: define, create, query, and manage governed business metrics in YAML. Use when building standardized KPIs, revenue metrics, order analytics, or any reusable business metrics that need consistent definitions across teams and tools. | 89 | Passed |
| databricks-model-serving | Deploy and query Databricks Model Serving endpoints. Use when (1) deploying MLflow models or AI agents to endpoints, (2) creating ChatAgent/ResponsesAgent agents, (3) integrating UC Functions or Vector Search tools, (4) querying deployed endpoints, (5) checking endpoint status. Covers classical ML models, custom pyfunc, and GenAI agents. | 89 | Passed |
| databricks-unstructured-pdf-generation | Generate PDF documents from HTML and upload to Unity Catalog volumes. Use for creating test PDFs, demo documents, reports, or evaluation datasets. | 74 | Passed |
| databricks-lakebase-provisioned | Patterns and best practices for Lakebase Provisioned (Databricks managed PostgreSQL) for OLTP workloads. Use when creating Lakebase instances, connecting applications or Databricks Apps to PostgreSQL, implementing reverse ETL via synced tables, storing agent or chat memory, or configuring OAuth authentication for Lakebase. | 95 | Passed |
| databricks-genie | Create and query Databricks Genie Spaces for natural language SQL exploration. Use when building Genie Spaces, exporting and importing Genie Spaces, migrating Genie Spaces between workspaces or environments, or asking questions via the Genie Conversation API. | 89 | Advisory |
| databricks-synthetic-data-gen | Generate realistic synthetic data using Spark + Faker (strongly recommended). Supports serverless execution, multiple output formats (Parquet/JSON/CSV/Delta), and scales from thousands to millions of rows. For small datasets (<10K rows), can optionally generate locally and upload to volumes. Use when the user mentions "synthetic data", "test data", "generate data", "demo dataset", "Faker", or "sample data". | 94 | Passed |
| databricks-config | Manage Databricks workspace connections: check the current workspace, switch profiles, list available workspaces, or authenticate to a new workspace. Use when the user mentions "switch workspace", "which workspace", "current profile", "databrickscfg", "connect to workspace", or "databricks auth". | 100 | Passed |
| databricks-ai-functions | Use Databricks built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_mask, ai_translate, ai_fix_grammar, ai_gen, ai_analyze_sentiment, ai_similarity, ai_parse_document, ai_query, ai_forecast) to add AI capabilities directly to SQL and PySpark pipelines without managing model endpoints. Also covers document parsing and building custom RAG pipelines (parse → chunk → index → query). | 82 | Passed |
| databricks-dbsql | Databricks SQL (DBSQL) advanced features and SQL warehouse capabilities. This skill MUST be invoked when the user mentions: "DBSQL", "Databricks SQL", "SQL warehouse", "SQL scripting", "stored procedure", "CALL procedure", "materialized view", "CREATE MATERIALIZED VIEW", "pipe syntax", "\|>", "geospatial", "H3", "ST_", "spatial SQL", "collation", "COLLATE", "ai_query", "ai_classify", "ai_extract", "ai_gen", "AI function", "http_request", "remote_query", "read_files", "Lakehouse Federation", "recursive CTE", "WITH RECURSIVE", "multi-statement transaction", "temp table", "temporary view", "pipe operator". SHOULD also be invoked when the user asks about SQL best practices, data modeling patterns, or advanced SQL features on Databricks. | 90 | Advisory |
| databricks-bundles | Create and configure Declarative Automation Bundles (formerly Asset Bundles) with best practices for multi-environment deployments (CI/CD). Use when working with: (1) creating new DAB projects, (2) adding resources (dashboards, pipelines, jobs, alerts), (3) configuring multi-environment deployments, (4) setting up permissions, (5) deploying or running bundle resources. | 94 | Passed |

All skills were reviewed at version b4071a0. Impact scores are pending evaluation (no eval scenarios have been run). Security status: Passed = no known issues; Advisory = suggest reviewing before use; Risky = do not use without reviewing.
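As a taste of what the databricks-ai-functions skill covers, AI Functions are called directly from SQL without provisioning a model endpoint. A minimal sketch; the table and column names here are hypothetical:

```sql
-- Classify and summarize free-text rows in place (no model endpoint needed).
-- support_tickets / body / ticket_id are placeholder names.
SELECT
  ticket_id,
  ai_classify(body, ARRAY('billing', 'bug', 'feature request')) AS category,
  ai_summarize(body, 50) AS summary
FROM support_tickets;
```

The same functions also work from PySpark via `spark.sql()` or `selectExpr`, which is how they slot into pipelines.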
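The databricks-metric-views entry refers to metrics defined once in YAML and queried anywhere; queries use the `MEASURE()` aggregate. A sketch assuming a hypothetical `orders_metrics` metric view with a `total_revenue` measure and an `order_month` dimension:

```sql
-- Aggregate a governed measure from a (hypothetical) metric view;
-- the definition of total_revenue lives in the view's YAML, not the query.
SELECT
  order_month,
  MEASURE(total_revenue) AS revenue
FROM main.analytics.orders_metrics
GROUP BY order_month
ORDER BY order_month;
```

Because the measure definition is centralized, every team that runs this query gets the same revenue number.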
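For the databricks-bundles skill, the unit of deployment is a `databricks.yml` file. A minimal skeleton under assumed names (the bundle name, workspace host, and job resource are all placeholders):

```yaml
# databricks.yml: minimal bundle with one dev target and one job.
bundle:
  name: my_project          # placeholder bundle name

targets:
  dev:
    mode: development       # prefixes resources with the developer's name
    default: true
    workspace:
      host: https://example.cloud.databricks.com  # placeholder host

resources:
  jobs:
    hello_job:              # placeholder resource key
      name: hello-job
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ./notebooks/hello.ipynb
```

Deploying is then `databricks bundle deploy -t dev`, with additional targets (e.g. `prod`) added under `targets:` for multi-environment CI/CD.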