Migrate an MLflow ResponsesAgent from Databricks Model Serving to Databricks Apps. Use when: (1) User wants to migrate from Model Serving to Apps, (2) User has a ResponsesAgent with predict()/predict_stream() methods, (3) User wants to convert to @invoke/@stream decorators.
Overall quality: 81%

Does it follow best practices?

Impact: Pending (no eval scenarios have been run)
Advisory: Suggest reviewing before use

Quality
Discovery
Score: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines a narrow, specific task with concrete technical details. It uses proper third-person voice, includes an explicit 'Use when' clause with three well-defined trigger conditions, and contains highly specific terminology that makes it virtually impossible to confuse with other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: migrating an MLflow ResponsesAgent, converting from Model Serving to Apps, converting predict()/predict_stream() methods to @invoke/@stream decorators. These are highly specific technical actions. | 3 / 3 |
| Completeness | Clearly answers both 'what' (migrate an MLflow ResponsesAgent from Model Serving to Apps) and 'when', with an explicit 'Use when:' clause listing three specific trigger conditions. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'MLflow', 'ResponsesAgent', 'Databricks Model Serving', 'Databricks Apps', 'predict()', 'predict_stream()', '@invoke', '@stream', 'migrate'. These are the exact terms a user working on this task would use. | 3 / 3 |
| Distinctiveness / Conflict Risk | An extremely specific niche combining MLflow, ResponsesAgent, Databricks Model Serving, and Databricks Apps migration. It is unlikely to conflict with any other skill, given the highly specialized domain and specific technology stack. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
Score: 62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is highly actionable with excellent workflow clarity: every step has concrete commands, clear sequencing, and validation checkpoints. However, it is far too verbose for an LLM audience, explaining basic concepts (async/await patterns, file copying, what generators are) and repeating patterns unnecessarily. The monolithic structure would also benefit from splitting reference material, migration patterns, and troubleshooting into separate files.
Suggestions
- Reduce verbosity by 40-50%: remove explanations of concepts Claude already knows (async patterns, what generators are, basic bash commands like cp/mkdir), trim the overview/deliverables section, and consolidate the sync/async comparison tables that appear multiple times.
- Extract the 'Reference: Common Migration Patterns' and 'Troubleshooting' sections into separate files (e.g., PATTERNS.md, TROUBLESHOOTING.md) and reference them from the main skill to improve progressive disclosure.
- Consolidate the repeated sync vs async branching: instead of showing both paths inline at every step, define the transformation rules once and reference them, or use a single conditional note per step rather than full duplicate code blocks.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at 600+ lines. It explains concepts Claude already knows (what async/await is, what generators are, basic file copying with cp), includes extensive troubleshooting sections, repeats the same patterns multiple times (sync vs async shown redundantly), and has significant padding, such as explaining what deliverables are and what directory structures look like. The task list management instructions and tool usage (AskUserQuestion, TaskCreate) add substantial overhead that could be dramatically condensed. | 1 / 3 |
| Actionability | The skill provides fully executable bash commands and Python code throughout, from downloading artifacts with mlflow, to curl commands for testing, to databricks CLI commands for deployment. Code examples are copy-paste ready with clear placeholder conventions (<profile>, <app-name>), and the resource mapping table provides concrete field-level guidance. | 3 / 3 |
| Workflow Clarity | The multi-step migration process is clearly sequenced (Steps 1-6) with explicit validation checkpoints: authenticate before proceeding, verify downloaded artifacts, test locally before deploying, validate the bundle before deploy, and a verification checklist in Step 5.5. The task tracking system provides additional progress visibility, and there are feedback loops for error recovery (re-authenticate, troubleshooting sections). | 3 / 3 |
| Progressive Disclosure | The content is largely monolithic: all migration patterns, troubleshooting, reference tables, and detailed examples are inline in a single massive document. The Reference sections at the end are a good structural choice, but the core steps contain extensive inline detail (e.g., full async vs sync pattern comparisons, stateful agent handling) that could be split into separate referenced files. The mention of a 'deploy skill' for troubleshooting is good progressive disclosure, but most content that should be separate is inline. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation
Score: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure: 10 / 11 passed
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (968 lines); consider splitting into references/ and linking | Warning |
| Total | | 10 / 11 Passed |