Execute Databricks secondary workflow: MLflow model training and deployment. Use when building ML pipelines, training models, or deploying to production. Trigger with phrases like "databricks ML", "mlflow training", "databricks model", "feature store", "model registry".
Optimize this skill with Tessl:

npx tessl skill review --optimize ./plugins/saas-packs/databricks-pack/skills/databricks-core-workflow-b/SKILL.md

Quality
Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description with strong trigger terms and completeness. Its main weakness is that the specificity of capabilities could be improved—'execute secondary workflow' is vague, and the concrete actions (beyond training and deployment) are not well articulated. The description would benefit from listing more specific actions like 'register models in MLflow registry, manage feature store tables, track experiments'.
Suggestions
Replace the vague 'Execute Databricks secondary workflow' with specific concrete actions such as 'Trains ML models, registers them in MLflow model registry, manages feature store tables, and deploys models to production endpoints'.
Clarify what 'secondary workflow' means or remove it—this term adds confusion rather than clarity about the skill's purpose.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Databricks/MLflow) and some actions (model training, deployment), but doesn't list multiple concrete specific actions. 'Execute secondary workflow' is vague, and the actual capabilities like feature store operations or model registry interactions are only mentioned as trigger terms, not described as actions. | 2 / 3 |
| Completeness | Clearly answers both 'what' (MLflow model training and deployment on Databricks) and 'when' (explicit 'Use when' clause for ML pipelines, training, deploying, plus explicit trigger phrases). Both components are present and explicit. | 3 / 3 |
| Trigger Term Quality | Includes a good range of natural trigger terms: 'databricks ML', 'mlflow training', 'databricks model', 'feature store', 'model registry'. These cover multiple variations a user might naturally say when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with the Databricks + MLflow combination creating a clear niche. The specific trigger terms like 'feature store', 'model registry', and 'databricks ML' are unlikely to conflict with generic ML or generic Databricks skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation: 64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a comprehensive, highly actionable skill with executable code covering the full Databricks ML lifecycle. Its main weaknesses are the lack of validation checkpoints between critical steps (feature table creation → training → registry → deployment) and the monolithic structure that packs too much detail into a single file. Trimming some verbosity and adding explicit verification steps would significantly improve it.
Suggestions
Add explicit validation checkpoints between steps—e.g., verify feature table exists after Step 1, confirm model version registered after Step 2, check endpoint state is READY after Step 4 with a retry loop.
Split the batch inference (Step 6), hyperparameter sweep example, and serving query examples into a separate referenced file to reduce the main skill's token footprint.
Remove the overview paragraph's restated concepts (e.g., 'discoverable features', 'real-time inference via REST API') that Claude already understands from the code examples themselves.
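The checkpoint suggestion above can be sketched as a small, generic polling helper. This is a hedged sketch: the commented-out Databricks and MLflow calls are illustrative assumptions (the table, model, and endpoint names are placeholders), not code from the skill under review.

```python
# Minimal validate-retry helper for gating pipeline steps.
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout_s: float,
               poll_s: float = 30.0, desc: str = "condition") -> None:
    """Poll `check` until it returns True, or raise after `timeout_s`."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{desc} not met within {timeout_s}s")
        time.sleep(poll_s)


# Hypothetical gates between the skill's steps (names are placeholders):
#   wait_until(lambda: spark.catalog.tableExists("cat.schema.features"),
#              timeout_s=60, desc="feature table created")       # Step 1 -> 2
#   wait_until(lambda: client.get_model_version(name, v).status == "READY",
#              timeout_s=300, desc="model version registered")   # Step 2 -> 4
#   wait_until(lambda: str(w.serving_endpoints.get(name).state.ready)
#              .endswith("READY"), timeout_s=1200, desc="endpoint READY")
```

A single helper like this keeps the retry logic in one place, so each step of the pipeline can fail fast with a clear message instead of silently proceeding on a half-created resource.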
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly efficient but includes some unnecessary verbosity—e.g., the overview paragraph restates concepts Claude already knows, and the churn prediction example is more elaborate than needed to convey the pattern. The code examples are thorough but could be tightened (e.g., the full metrics dictionary, the dual REST/SDK query examples). | 2 / 3 |
| Actionability | Every step provides fully executable, copy-paste-ready Python code with concrete imports, function calls, and realistic parameters. The examples cover feature engineering, training, registry, serving, querying, and batch inference with specific API calls and expected outputs. | 3 / 3 |
| Workflow Clarity | The 6-step sequence is clearly laid out and logically ordered. However, there are no explicit validation checkpoints between steps—no verification that the feature table was created successfully before training, no check that the model registered correctly before deploying, and no validate-fix-retry loops for the deployment step, which can fail. For a multi-step pipeline involving production deployment, this is a notable gap. | 2 / 3 |
| Progressive Disclosure | The skill has reasonable structure with sections for prerequisites, steps, error handling, examples, and resources. However, at ~180 lines of dense code, much of the content (e.g., the hyperparameter sweep example, batch inference, detailed serving query examples) could be split into referenced files. The references to external skills (databricks-install-auth, databricks-core-workflow-a, databricks-common-errors) are good, but the main file itself is monolithic. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Validation: 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Skill structure validation: 9 / 11 checks passed
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |
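The two warnings above suggest a frontmatter cleanup along the lines of the sketch below. This is an assumption-laden illustration, not the reviewed file's actual contents: it assumes the skill spec accepts `name`, `description`, `allowed-tools`, and a `metadata` block, and every value shown is a placeholder (the description reuses the wording suggested in the Discovery section).

```yaml
# Hypothetical cleaned SKILL.md frontmatter; all values are placeholders.
name: databricks-core-workflow-b
description: Trains ML models, registers them in the MLflow model registry,
  manages feature store tables, and deploys models to production endpoints.
# Keep only tool names the validator recognizes:
allowed-tools: Bash, Read, Write
# Move unknown top-level keys under metadata instead of deleting them:
metadata:
  pack: saas-packs/databricks-pack
```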