Deploy and query Databricks Model Serving endpoints. Use when (1) deploying MLflow models or AI agents to endpoints, (2) creating ChatAgent/ResponsesAgent agents, (3) integrating UC Functions or Vector Search tools, (4) querying deployed endpoints, (5) checking endpoint status. Covers classical ML models, custom pyfunc, and GenAI agents.
71
86%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines its scope around Databricks Model Serving endpoints. It provides specific actions, comprehensive trigger scenarios via a numbered 'Use when' list, and uses domain-specific terminology that makes it highly distinguishable. The description is concise yet thorough, covering both the breadth of capabilities and the conditions for activation.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: deploying MLflow models, creating ChatAgent/ResponsesAgent agents, integrating UC Functions or Vector Search tools, querying deployed endpoints, and checking endpoint status. Also specifies model types covered (classical ML, custom pyfunc, GenAI agents). | 3 / 3 |
| Completeness | Clearly answers both 'what' (deploy and query Databricks Model Serving endpoints, covering classical ML, pyfunc, and GenAI agents) and 'when' with an explicit 'Use when' clause listing five specific trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'Databricks', 'Model Serving', 'endpoints', 'MLflow', 'ChatAgent', 'ResponsesAgent', 'UC Functions', 'Vector Search', 'deploy', 'query', 'pyfunc', 'GenAI agents'. These cover the terms a user working in this domain would naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche around Databricks Model Serving specifically. The combination of Databricks-specific terminology (UC Functions, Vector Search, MLflow, ChatAgent/ResponsesAgent) makes it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
72%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-organized skill with strong progressive disclosure and high actionability through concrete code examples and MCP tool calls. The main weaknesses are some content redundancy (duplicate query examples across sections) and missing validation checkpoints in the multi-step deployment workflow. The Foundation Model API table is comprehensive but adds significant length that could be offloaded to a reference file.
Suggestions
Add explicit validation checkpoints to the GenAI Agent quick start workflow (e.g., 'Verify upload: manage_workspace_files(action="list")' after Step 3, expected test output criteria after Step 4)
Remove the 'Common Workflows' section since its examples duplicate the MCP Tools query examples above it, or consolidate into a single examples section
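The first suggestion can be pictured with a minimal Python sketch of a checkpointed workflow. It is only an illustration: `run_with_checkpoints`, `fake_list_files`, the step names, and the return shapes are all hypothetical and do not come from the skill. The stand-in listing function takes the place of a call such as `manage_workspace_files(action="list")`, whose real output format is not shown in this review.

```python
from typing import Callable, List, Tuple

# Hypothetical step shape: (name, action, check). Not part of the skill itself.
Step = Tuple[str, Callable[[], object], Callable[[object], bool]]


def run_with_checkpoints(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failed validation check."""
    completed: List[str] = []
    for name, action, check in steps:
        result = action()
        if not check(result):
            # Halting here keeps later steps from running on a bad state.
            raise RuntimeError(f"Checkpoint failed after step: {name}")
        completed.append(name)
    return completed


def fake_list_files() -> List[str]:
    """Stand-in for a workspace file listing; the return shape is assumed."""
    return ["agent.py"]


steps: List[Step] = [
    # Checkpoint after upload: the expected file must appear in the listing.
    ("upload agent code", fake_list_files, lambda files: "agent.py" in files),
    # Checkpoint after the local test: the output must report success.
    ("test agent locally", lambda: {"status": "ok"}, lambda out: out.get("status") == "ok"),
]

completed = run_with_checkpoints(steps)
print(completed)
```

Pairing each step with its own check is one way to express the "verify upload" and "expected test output" criteria the suggestion asks for; the skill's authors may prefer plain prose checkpoints instead.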
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is generally well-structured but includes some redundancy: the 'Common Workflows' section largely duplicates examples already shown in the MCP Tools section. The Foundation Model API endpoint table is very long and could arguably be in a reference file, though it serves the stated purpose of preventing guessing. Some sections like 'Quick Start: Deploy a Classical ML Model' are lean and efficient. | 2 / 3 |
| Actionability | The skill provides concrete, executable code examples throughout: MCP tool calls with exact parameters, Python code for ML model training, specific package versions to install, and exact endpoint names. The ResponsesAgent output format section with WRONG vs CORRECT examples is particularly actionable. | 3 / 3 |
| Workflow Clarity | The GenAI Agent quick start has a clear 7-step sequence, but lacks explicit validation checkpoints between steps: there's no 'verify upload succeeded' after Step 3, no 'check test output' criteria after Step 4, and Step 6 defers entirely to another file. The deployment workflow involves destructive/batch operations but doesn't include feedback loops for error recovery within the main skill. | 2 / 3 |
| Progressive Disclosure | Excellent structure with a decision table up front, concise overview content in the main file, and clearly signaled one-level-deep references to 9 topic-specific files with 'When to Read' guidance. The Reference Files table makes navigation easy and the content appropriately splits detailed topics into separate files. | 3 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
93cb4e3