Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations. Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications.
Quality — 56%
Does it follow best practices?

Impact: Pending — no eval scenarios have been run
Passed — no known issues

Discovery
82% — Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is complete, with an explicit trigger clause ('Use PROACTIVELY for…') and the trigger terms users would naturally use. However, it tries to cover an extremely broad domain (LLM apps, RAG, agents, vector search, multimodal AI, enterprise integrations), which reduces distinctiveness and makes the listed capabilities read more like a category listing than concrete actions. The description would benefit from a narrower scope, or from being more precise about the concrete operations the skill performs.
Suggestions
Narrow the scope or add more concrete actions — instead of 'enterprise AI integrations', specify what integrations (e.g., 'connects to OpenAI, Anthropic, and Pinecone APIs')
Reduce overlap risk by clarifying boundaries — e.g., 'Use for building new LLM-powered features from scratch, not for fine-tuning models or ML training pipelines'
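A sharpened description along those lines might look like the following SKILL.md frontmatter sketch; the skill name and the specific integrations (OpenAI, Anthropic, Pinecone) are illustrative assumptions, not confirmed details of this skill:

```yaml
name: llm-app-builder   # hypothetical name
description: >-
  Build LLM-powered application features: RAG pipelines (chunking,
  embeddings, vector search), chatbots, and tool-using agents.
  Connects to OpenAI, Anthropic, and Pinecone APIs (illustrative).
  Use when adding LLM features, chatbots, or AI agents to an app;
  not for model fine-tuning or ML training pipelines.
```

Note how the rewrite keeps the strong trigger terms while stating an explicit boundary, which addresses both suggestions above.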
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (LLM applications, RAG systems, agents) and lists some actions (vector search, multimodal AI, agent orchestration, enterprise AI integrations), but many terms are broad categories rather than concrete actions like 'extract', 'fill', or 'merge'. 'Build production-ready LLM applications' is somewhat specific, but 'enterprise AI integrations' is vague. | 2 / 3 |
| Completeness | Clearly answers both 'what' (build LLM applications, RAG systems, agents, vector search, multimodal AI, agent orchestration) and 'when', with an explicit trigger clause ('Use PROACTIVELY for LLM features, chatbots, AI agents, or AI-powered applications'). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'LLM', 'chatbots', 'AI agents', 'RAG', 'vector search', 'AI-powered applications'. These cover common variations of how users would describe needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | While it targets a specific domain (LLM/AI applications), the scope is very broad — covering chatbots, agents, RAG, vector search, multimodal AI, and enterprise integrations. This could easily overlap with more specialized skills for any of those individual areas. Terms like 'AI-powered applications' are particularly broad. | 2 / 3 |
| Total | | 10 / 12 — Passed |
Implementation
0% — Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill reads as a capability catalog or persona description rather than actionable instructions. It lists hundreds of technologies, frameworks, and concepts without providing any concrete code, commands, or step-by-step guidance. The content is almost entirely things Claude already knows, making it an extremely inefficient use of context window tokens.
Suggestions
Replace the extensive technology lists with 2-3 concrete, executable code examples for the most common tasks (e.g., a production RAG pipeline setup, an agent workflow with LangGraph).
Add specific validation checkpoints to the workflow, such as 'Test retrieval quality with sample queries before deploying' or 'Verify embedding dimensions match vector DB configuration'.
Split the monolithic content into focused sub-files (e.g., RAG.md, AGENTS.md, SAFETY.md) and keep SKILL.md as a concise overview with clear navigation links.
Remove the Capabilities, Knowledge Base, Behavioral Traits, and Example Interactions sections entirely—they describe what Claude already knows rather than providing new, actionable instructions.
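To make the first two suggestions concrete, here is a minimal, dependency-free sketch of a retrieval step with an explicit embedding-dimension checkpoint. The toy `embed` function and in-memory `VectorStore` are hypothetical stand-ins for a real embedding API and managed vector database, not part of the reviewed skill:

```python
import math

EMBED_DIM = 4  # stand-in for a real model's dimension, e.g. 1536


def embed(text: str) -> list[float]:
    # Toy deterministic embedding; a real skill would call an embedding API here.
    vec = [0.0] * EMBED_DIM
    for i, ch in enumerate(text.lower()):
        vec[i % EMBED_DIM] += ord(ch)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class VectorStore:
    def __init__(self, dim: int):
        self.dim = dim
        self.docs: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        vec = embed(text)
        # Validation checkpoint: embedding dimension must match the store config.
        if len(vec) != self.dim:
            raise ValueError(f"embedding dim {len(vec)} != store dim {self.dim}")
        self.docs.append((text, vec))

    def query(self, text: str, k: int = 1) -> list[str]:
        # Rank stored documents by dot product with the query embedding.
        qv = embed(text)
        scored = sorted(self.docs, key=lambda d: -sum(a * b for a, b in zip(qv, d[1])))
        return [t for t, _ in scored[:k]]


store = VectorStore(dim=EMBED_DIM)
store.add("LangGraph orchestrates multi-step agent workflows.")
store.add("Pinecone is a managed vector database.")
top = store.query("Which service is a vector database?", k=1)
```

A skill rewritten this way would pair each such snippet with the checkpoint it enforces ('verify embedding dimensions match vector DB configuration') rather than listing framework names.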
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose, with extensive lists of technologies, model names, and capabilities that Claude already knows. The 'Capabilities' section is essentially a resume listing tools and frameworks rather than providing actionable instructions. Most of the content (Knowledge Base, Behavioral Traits, Example Interactions) adds no instructional value. | 1 / 3 |
| Actionability | No concrete code examples, commands, or executable guidance anywhere. The entire skill is abstract descriptions and bullet-point lists of technologies. The 'Instructions' section has only four vague steps like 'Design the AI architecture, data flow, and model selection', with no specifics on how to do any of it. | 1 / 3 |
| Workflow Clarity | The four-step 'Instructions' workflow is extremely vague, with no validation checkpoints, no error recovery, and no concrete sequencing. 'Response Approach' lists 8 abstract steps, but none have specific actions or verification points. For a skill involving production AI systems with potentially destructive or costly operations, the lack of any validation steps is a critical gap. | 1 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files despite covering an enormous scope. All content is inline in one massive document with no navigation structure. The breadth of topics (RAG, agents, multimodal, safety, pipelines, APIs) desperately needs to be split into focused sub-documents with clear cross-references. | 1 / 3 |
| Total | | 4 / 12 — Passed |
Validation
90% — Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 — Passed |
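Assuming the validator expects a version field under `metadata` in the SKILL.md frontmatter, the warning can likely be cleared with an entry like the following; the version value is a placeholder:

```yaml
metadata:
  version: "1.0.0"  # hypothetical version value
```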