Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
74
66%
Does it follow best practices?
Impact
83%
1.66x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/langchain-architecture/SKILL.md
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description that clearly identifies its technology niche (LangChain 1.x / LangGraph) and includes an explicit 'Use when' clause with relevant trigger terms. Its main weakness is that the capability listing is somewhat high-level—it could benefit from more specific concrete actions beyond 'Design'. Overall it performs well for skill selection purposes.
Suggestions
Add more specific concrete actions such as 'create chains, configure retrieval pipelines, set up streaming, define agent tools' to improve specificity beyond the general 'Design' verb.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (LLM applications, LangChain, LangGraph) and mentions some capabilities (agents, memory, tool integration), but doesn't list multiple concrete actions; 'Design' is somewhat vague and there are no specific operations like 'create chains', 'configure retrieval', 'set up streaming', etc. | 2 / 3 |
| Completeness | Clearly answers both 'what' (design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration) and 'when' (explicit 'Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows'). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'LangChain', 'LangGraph', 'AI agents', 'LLM workflows', 'memory', 'tool integration'. These cover the main terms a developer would use when seeking help with this technology stack. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to specific technology references (LangChain 1.x, LangGraph) which clearly differentiate it from generic coding skills or other AI framework skills. Unlikely to conflict with other skills. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent, executable code examples covering a wide range of LangChain/LangGraph patterns, which is its primary strength. However, it is severely bloated—much of the descriptive content (core concepts, feature lists, 'when to use') is unnecessary for Claude and wastes token budget. The lack of any bundle structure means everything is crammed into one file with no progressive disclosure, and workflows lack explicit validation/error-recovery steps needed for production reliability.
Suggestions
Cut the 'When to Use This Skill', 'Core Concepts' descriptive sections, and feature bullet lists—Claude already knows what vector stores, callbacks, and memory are. Keep only the code patterns and brief headers.
Split the monolithic file into separate bundle files: e.g., PATTERNS.md (architecture patterns), MEMORY.md (memory management), PRODUCTION.md (checklist, testing, optimization), and reference them from a concise SKILL.md overview.
Add explicit validation checkpoints to workflows—e.g., after RAG retrieval, verify documents were returned before generating; in multi-agent flows, add error handling nodes and retry logic.
Remove the 'Resources' section with external URLs (Claude can't browse them) and replace with actionable inline guidance or bundle file references.
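The validation-checkpoint and retry suggestions above can be sketched roughly as follows. This is a framework-agnostic illustration, not code from the reviewed skill: `RAGState`, `route_after_retrieve`, `with_retry`, and `run_pipeline` are hypothetical names, and the fallback message is an assumption. In a LangGraph application, a router function of this shape would typically be passed to `StateGraph.add_conditional_edges`.

```python
from typing import Callable, TypedDict


class RAGState(TypedDict):
    """Hypothetical state shape for a retrieve -> generate workflow."""
    question: str
    documents: list[str]
    answer: str


Node = Callable[[RAGState], RAGState]


def route_after_retrieve(state: RAGState) -> str:
    """Validation checkpoint: only generate when retrieval returned documents."""
    return "generate" if state["documents"] else "fallback"


def with_retry(fn: Node, attempts: int = 3) -> Node:
    """Wrap a node so failures are retried before the error propagates."""
    def wrapped(state: RAGState) -> RAGState:
        last_error = None
        for _ in range(attempts):
            try:
                return fn(state)
            except Exception as exc:  # broad on purpose: this is only a sketch
                last_error = exc
        raise RuntimeError(f"node failed after {attempts} attempts") from last_error
    return wrapped


def run_pipeline(state: RAGState, retrieve: Node, generate: Node) -> RAGState:
    """Retrieve, check the result, then either generate or take the fallback path."""
    state = with_retry(retrieve)(state)
    if route_after_retrieve(state) == "fallback":
        # Assumed fallback behaviour: answer explicitly instead of generating from nothing.
        return {**state, "answer": "No relevant documents were found."}
    return with_retry(generate)(state)
```

The point of the design is that the empty-retrieval case becomes an explicit branch rather than an unchecked input to generation, and that retries live in one wrapper instead of being repeated in each node.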
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Explains concepts Claude already knows (what StateGraph is, what callbacks are, what vector stores do). Lists features like bullet-point marketing copy ('Durable Execution', 'Human-in-the-Loop') without actionable detail. The 'Core Concepts' section is largely descriptive padding. Memory types are listed without meaningful differentiation. The 'When to Use This Skill' section is unnecessary filler. | 1 / 3 |
| Actionability | The code examples are concrete, executable, and copy-paste ready. The ReAct agent, RAG pipeline, multi-agent orchestration, structured tools, memory management, streaming, testing, and caching patterns all include complete, runnable Python code with proper imports and realistic usage patterns. | 3 / 3 |
| Workflow Clarity | The patterns show clear multi-step sequences (e.g., RAG: retrieve → generate, multi-step workflow with routing), but there are no explicit validation checkpoints, error recovery loops, or verification steps. For production-grade LLM workflows involving state management and external services, the absence of validation/feedback loops is a notable gap. The production checklist is helpful but is a static list rather than a workflow. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no bundle files to offload content to. All patterns, memory management, callbacks, streaming, testing, optimization, and production checklists are inlined in a single massive file. There are external URL references but no internal file references for progressive disclosure. Content like the 4 architecture patterns, memory management details, and callback system should be in separate files. | 1 / 3 |
| Total | 7 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (667 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
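A line-count warning like `skill_md_line_count` could be reproduced locally before running the review. The sketch below is a hypothetical helper, not part of the Tessl tooling; `MAX_LINES = 500` is an assumed threshold, since the validator's actual limit is not stated in this report.

```python
from pathlib import Path

# Assumed threshold; the real validator limit is not given in this review.
MAX_LINES = 500


def check_line_count(path: Path, max_lines: int = MAX_LINES) -> tuple[int, bool]:
    """Return (line_count, within_limit) for the file at `path`."""
    count = len(path.read_text(encoding="utf-8").splitlines())
    return count, count <= max_lines
```

Calling `check_line_count(Path("SKILL.md"))` on a 667-line file would report the count and flag it as over the assumed limit.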
b09ec7f