Deploy production recommendation systems with feature stores, caching, A/B testing. Use for personalization APIs, low latency serving, or encountering cache invalidation, experiment tracking, quality monitoring issues.
79
75%
Does it follow best practices?
Impact: Pending (no eval scenarios have been run)
Advisory: Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./plugins/recommendation-system/skills/recommendation-system/SKILL.md
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly identifies a specific domain (production recommendation systems), lists concrete capabilities (feature stores, caching, A/B testing), and provides explicit trigger guidance via a 'Use for...' clause. The trigger terms are natural and domain-appropriate, covering both the building and troubleshooting aspects of recommendation system deployment.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions and components: 'feature stores, caching, A/B testing, personalization APIs, low latency serving, cache invalidation, experiment tracking, quality monitoring.' These are concrete, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (deploy production recommendation systems with feature stores, caching, A/B testing) and 'when' (Use for personalization APIs, low latency serving, or encountering cache invalidation, experiment tracking, quality monitoring issues). Has an explicit 'Use for...' clause. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'recommendation systems', 'feature stores', 'caching', 'A/B testing', 'personalization APIs', 'low latency serving', 'cache invalidation', 'experiment tracking', 'quality monitoring'. These cover the domain well with terms practitioners naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly specific niche targeting production recommendation systems with distinct triggers like 'feature stores', 'A/B testing', 'personalization APIs', and 'cache invalidation'. Unlikely to conflict with general ML or deployment skills due to the specificity of the recommendation system domain. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
50%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill demonstrates strong actionability with executable, production-relevant code examples covering feature stores, caching, A/B testing, and monitoring. However, it suffers significantly from verbosity and redundancy: many patterns are repeated across sections, and substantial content that belongs in reference files is inlined. The workflow would benefit from explicit validation checkpoints, especially since this is a multi-service production deployment.
Suggestions
- Move Known Issues #1-7 and the Core Components details into the referenced files (references/caching-strategies.md, etc.), and keep only 1-2 sentence summaries with links in the main skill.
- Remove duplicate code patterns: the monitoring example appears 3 times and caching logic appears in 4 places. Consolidate them into the Common Patterns section only.
- Add validation checkpoints to the Quick Start: verify Redis is running (redis-cli ping) and verify that the API health endpoint responds before testing recommendations.
- Remove the 'When to Use This Skill' section entirely. This is metadata that belongs in frontmatter, not content Claude needs in order to follow the skill.
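The validation checkpoints suggested above could be sketched roughly as follows. This is a minimal illustration, not the skill's own code: the host, port, and `http://localhost:8000/health` URL are assumptions, and `redis_is_up`, `api_is_healthy`, and `preflight` are hypothetical helper names. It uses only the Python standard library and sends an inline `PING`, which is what `redis-cli ping` does at the protocol level.

```python
import socket
import urllib.error
import urllib.request


def redis_is_up(host: str = "localhost", port: int = 6379, timeout: float = 1.0) -> bool:
    """Send an inline PING and expect the simple-string reply +PONG.

    Assumption: no authentication is required. A password-protected
    server answers with an error reply, which this check reports as down.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return conn.recv(64).startswith(b"+PONG")
    except OSError:
        return False


def api_is_healthy(url: str = "http://localhost:8000/health", timeout: float = 2.0) -> bool:
    """Return True if the (assumed) health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


def preflight(checks):
    """Run (name, check) pairs in order and return the names that failed."""
    return [name for name, check in checks if not check()]
```

A Quick Start step could then call `preflight([("redis", redis_is_up), ("api", api_is_healthy)])` and stop with a clear message if the returned list is non-empty, before any recommendation request is sent.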
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at 350+ lines. There is significant redundancy: the monitoring pattern appears three times (Known Issues #5, Common Patterns, and the Quick Start's mention of Prometheus), and caching logic is repeated across Quick Start, Core Components, and Known Issues. The 'When to Use This Skill' section is unnecessary padding. Many code examples overlap substantially (e.g., RecommendationService in Common Patterns duplicates the Quick Start flow). | 1 / 3 |
| Actionability | The skill provides fully executable code throughout: the Quick Start is copy-paste ready with specific pip install commands, Docker commands, and a complete FastAPI app. All code examples use real libraries with concrete implementations rather than pseudocode. The A/B testing sample size calculator even includes scipy usage. | 3 / 3 |
| Workflow Clarity | The Quick Start has a clear 5-step sequence, and the RecommendationService pattern shows a numbered workflow. However, there are no validation checkpoints: no step to verify Redis is running before starting the app, no verification that the API is healthy after launch, and no error recovery guidance in the deployment flow. For a production system involving caching and multiple services, this is a notable gap. | 2 / 3 |
| Progressive Disclosure | The 'When to Load References' section properly signals external files with clear descriptions, which is good. However, the main file itself is bloated with content that should be in those reference files: the Known Issues section alone has 7 detailed subsections with full code examples that could be referenced rather than inlined. The Core Components section duplicates what the references claim to contain. | 2 / 3 |
| Total | 8 / 12 Passed | |
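To illustrate the consolidation the Conciseness row points toward, the repeated caching logic could live in one shared cache-aside helper that every section references. The sketch below is an assumption about what such a helper might look like, not code from the skill: `TTLCache` is a hypothetical in-memory stand-in for Redis so the example runs on its own, and `get_or_compute` is a hypothetical name.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory stand-in for Redis, used only to keep the sketch runnable."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def get_or_compute(cache: TTLCache, key: str, compute: Callable[[], Any],
                   ttl_seconds: float = 300.0) -> Any:
    """Cache-aside: return the cached value, else compute, store, and return it.

    Note: a computed value of None is stored but reads back as a miss,
    so it is effectively never cached.
    """
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = compute()
    cache.set(key, value, ttl_seconds)
    return value
```

With a single helper like this in Common Patterns, the Quick Start and Known Issues sections could call it by name instead of restating the lookup, compute, and store steps each time.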
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 10 / 11 Passed | |
88da5ff