Execute Databricks secondary workflow: MLflow model training and deployment. Use when building ML pipelines, training models, or deploying to production. Trigger with phrases like "databricks ML", "mlflow training", "databricks model", "feature store", "model registry".
Overall score: 77%
- Impact: Pending
- Evals: No eval scenarios have been run
- Issues: Passed (no known issues)
Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./plugins/saas-packs/databricks-pack/skills/databricks-core-workflow-b/SKILL.md
```

Quality
Discovery: 89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description with strong trigger terms, clear completeness, and excellent distinctiveness. Its main weakness is that the specificity of capabilities could be improved—'execute secondary workflow' is vague, and the concrete actions (beyond training and deployment) are not well enumerated. The trigger terms and explicit 'Use when' / 'Trigger with' clauses compensate well for skill selection purposes.
Suggestions
- Replace the vague 'Execute Databricks secondary workflow' with specific concrete actions like 'Trains MLflow models, registers models in the model registry, manages feature store tables, and deploys models to production endpoints.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Databricks/MLflow) and some actions (model training, deployment), but doesn't list multiple concrete specific actions. 'Execute secondary workflow' is vague, and the actual capabilities like feature store operations or model registry interactions are only mentioned as trigger terms, not as described capabilities. | 2 / 3 |
| Completeness | Clearly answers both 'what' (MLflow model training and deployment on Databricks) and 'when' (building ML pipelines, training models, deploying to production) with explicit trigger phrases provided. The 'Use when' and 'Trigger with' clauses are both present and explicit. | 3 / 3 |
| Trigger Term Quality | Includes a good range of natural trigger terms: 'databricks ML', 'mlflow training', 'databricks model', 'feature store', 'model registry'. These are terms users would naturally use when requesting ML-related Databricks work, covering multiple variations and related concepts. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with a clear niche: Databricks + MLflow model training/deployment. The combination of platform (Databricks) and specific tooling (MLflow, feature store, model registry) makes it very unlikely to conflict with generic ML or generic Databricks skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation: 64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a comprehensive, highly actionable skill covering the full Databricks ML lifecycle with executable code at every step. Its main weaknesses are the lack of explicit validation checkpoints between critical steps (especially before production deployment) and the monolithic structure that could benefit from splitting advanced examples into separate files. The error handling table is a strong addition but doesn't compensate for missing inline verification steps in the workflow.
Suggestions
- Add explicit validation checkpoints between steps — e.g., verify the feature table exists after Step 1, validate that model metrics meet a threshold before promoting to champion in Step 3, and check endpoint health/readiness before querying in Step 5.
- Split advanced examples (hyperparameter sweep, batch inference) into a separate bundle file and reference them from the main SKILL.md to reduce the monolithic feel and improve progressive disclosure.
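The checkpoint suggestion can be sketched as a small guard helper. This is a minimal sketch, not the skill's actual code: the table name, metric values, and endpoint state below are illustrative stand-ins (in a real workspace they would come from `spark.catalog.tableExists`, the MLflow run's logged metrics, and the serving-endpoint status API).

```python
# Hypothetical validation checkpoints for a multi-step ML pipeline.
# All concrete values here are illustrative, not from the reviewed skill.

def checkpoint(name, condition, detail=""):
    """Fail fast with a clear message instead of continuing a broken pipeline."""
    if not condition:
        raise RuntimeError(f"Checkpoint failed: {name}. {detail}".strip())
    print(f"Checkpoint passed: {name}")

def feature_table_exists(table_name, known_tables=("ml.features.customer_features",)):
    # stand-in: on Databricks, use spark.catalog.tableExists(table_name)
    return table_name in known_tables

# Step 1 -> 2: feature table must exist before training starts
checkpoint("feature table exists",
           feature_table_exists("ml.features.customer_features"))

# Step 3: metrics must clear a threshold before promoting to champion
metrics = {"accuracy": 0.91, "f1": 0.88}  # e.g. read back from the MLflow run
checkpoint("model meets accuracy threshold",
           metrics["accuracy"] >= 0.85,
           detail=f"accuracy={metrics['accuracy']}")

# Step 5: endpoint must report READY before it is queried
endpoint_state = "READY"  # e.g. from the serving-endpoint status API
checkpoint("endpoint is ready", endpoint_state == "READY")
```

The design choice is simply to make each step's precondition explicit and raise before the pipeline proceeds, rather than relying on downstream errors to surface the problem.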
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long (~180 lines of code-heavy content) and includes some unnecessary commentary (e.g., 'Unity Catalog model registry replaces legacy stages with aliases'), but most content is executable code that earns its place. The feature engineering example with specific column names adds bulk that could be more generic, but overall it's reasonably efficient for the scope covered. | 2 / 3 |
| Actionability | Every step includes fully executable, copy-paste-ready Python code with specific imports, method calls, and realistic parameters. The code covers the complete ML lifecycle from feature engineering through serving endpoint queries, with concrete examples including payload formats and expected outputs. | 3 / 3 |
| Workflow Clarity | The 6-step sequence is clearly numbered and logically ordered, and the error handling table is helpful. However, there are no explicit validation checkpoints between steps — for instance, no verification that the feature table was created successfully before training, no model validation before deployment, and no health check after endpoint creation beyond printing state. For a multi-step pipeline involving production deployment, feedback loops are missing. | 2 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and a useful error table, but it's essentially a monolithic document with ~180 lines of inline code. The Resources section links to external docs and Next Steps references another skill, but there are no bundle files to offload detailed content like the hyperparameter sweep example or batch inference patterns. The inline content could benefit from splitting advanced patterns into separate files. | 2 / 3 |
| Total | | 9 / 12 Passed |
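The missing post-deployment health check noted under Workflow Clarity could look like the following minimal polling sketch. `get_state` is a placeholder for whatever endpoint-status call the skill already makes (e.g. the Databricks serving-endpoints API), not a real client method.

```python
import time

def wait_until_ready(get_state, timeout_s=600, poll_s=5, sleep=time.sleep):
    """Poll until the endpoint reports READY; raise if timeout_s elapses."""
    waited = 0
    while waited < timeout_s:
        state = get_state()
        if state == "READY":
            return state
        sleep(poll_s)
        waited += poll_s
    raise TimeoutError(f"endpoint not READY after {timeout_s}s")

# simulated endpoint that becomes ready on the third poll
states = iter(["NOT_READY", "NOT_READY", "READY"])
print(wait_until_ready(states.__next__, sleep=lambda _: None))  # prints: READY
```

Gating the query step on this call (instead of just printing the state) turns "create endpoint, then query" into a verified transition.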
Validation: 81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
9 / 11 checks passed.
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
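Both warnings are typically resolved in the SKILL.md frontmatter. A minimal sketch follows; the actual frontmatter of this skill is not shown in the review, so every value here is an assumption. The point is restricting `allowed-tools` to recognized tool names and nesting nonstandard top-level keys under `metadata`, as the second warning suggests.

```yaml
---
name: databricks-core-workflow-b
description: Trains MLflow models, registers models in the model registry,
  manages feature store tables, and deploys models to production endpoints.
# keep only recognized tool names here (hypothetical list)
allowed-tools: Bash, Read, Write
# unknown top-level keys moved under metadata, per the warning above
metadata:
  pack: databricks-pack
---
```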