Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
74
66%
Does it follow best practices?
Impact
83%
1.66x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/langchain-architecture/SKILL.md
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description that clearly identifies its technology niche (LangChain 1.x / LangGraph) and includes an explicit 'Use when' clause with relevant trigger terms. Its main weakness is that the capability listing is somewhat high-level—it could benefit from more specific concrete actions beyond 'Design'. Overall it performs well for skill selection purposes.
Suggestions
Add more specific concrete actions such as 'create chains, configure retrieval pipelines, set up streaming, define agent tools' to improve specificity beyond the general 'Design' verb.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (LLM applications, LangChain, LangGraph) and mentions some capabilities (agents, memory, tool integration), but doesn't list multiple concrete actions—'Design' is somewhat vague and there are no specific operations like 'create chains', 'configure retrieval', 'set up streaming', etc. | 2 / 3 |
| Completeness | Clearly answers both 'what' (design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration) and 'when' (explicit 'Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows'). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'LangChain', 'LangGraph', 'AI agents', 'LLM workflows', 'memory', 'tool integration'. These cover the main terms a developer would use when seeking help with this technology stack. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to specific technology references (LangChain 1.x, LangGraph) which clearly differentiate it from generic coding skills or other AI framework skills. Unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent, executable code examples across many LangChain/LangGraph patterns, which is its primary strength. However, it is severely bloated—much of the descriptive content (core concepts, memory system descriptions, 'when to use' lists) explains things Claude already knows and wastes token budget. The lack of content splitting into separate files and absence of validation checkpoints in workflows are significant weaknesses.
Suggestions
Cut the 'Core Concepts' section drastically—remove descriptive bullet lists about what StateGraph, memory systems, document processing, and callbacks are. Keep only non-obvious configuration details or gotchas.
Split into multiple files: keep SKILL.md as a concise overview with Quick Start + Common Pitfalls, then reference separate files like PATTERNS.md (RAG, multi-agent, workflows), MEMORY.md, and PRODUCTION.md.
Remove the 'When to Use This Skill' section entirely—it lists obvious use cases that add no value.
Add validation/verification steps to workflows, e.g., checking retrieval quality in RAG, adding max-iteration guards for multi-agent loops, and verifying tool outputs before proceeding.
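The validation steps suggested above can be sketched in plain Python, independent of any framework. This is a minimal illustration, not code from the skill under review: `filter_retrieved`, `run_supervisor`, `MaxIterationsExceeded`, the `"FINISH"` sentinel, and the default thresholds are all hypothetical names and values chosen for the example.

```python
class MaxIterationsExceeded(RuntimeError):
    """Raised when the supervisor loop does not finish within its step budget."""


def filter_retrieved(scored_docs, min_score=0.5, min_docs=1):
    """Keep documents whose relevance score meets the threshold.

    scored_docs is a list of (document, score) pairs. The threshold values
    are illustrative assumptions and would need tuning per retriever.
    """
    kept = [doc for doc, score in scored_docs if score >= min_score]
    if len(kept) < min_docs:
        # Fail loudly instead of generating an answer from weak context.
        raise ValueError(
            f"only {len(kept)} document(s) scored >= {min_score}; "
            f"expected at least {min_docs}"
        )
    return kept


def run_supervisor(route, workers, state, max_iterations=5):
    """Run a supervisor loop with a hard cap on routing decisions.

    route(state) returns a worker name or the sentinel "FINISH";
    workers maps each name to a function that returns the next state.
    """
    for _ in range(max_iterations):
        next_worker = route(state)
        if next_worker == "FINISH":
            return state
        state = workers[next_worker](state)
    # The router never said FINISH within the budget: stop rather than loop forever.
    raise MaxIterationsExceeded(
        f"supervisor did not finish within {max_iterations} iterations"
    )
```

In a LangGraph application the same safeguard is usually available without a hand-written loop: passing `config={"recursion_limit": N}` when invoking a compiled graph caps the number of steps it may take. The explicit loop here only makes the idea visible.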
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Explains concepts Claude already knows (what callbacks are, what document loaders do, what vector stores are). Sections like 'Core Concepts' are largely descriptive bullet lists that add no actionable value. The 'When to Use This Skill' section lists obvious use cases. Memory system descriptions (ConversationBufferMemory, etc.) are just restating documentation Claude already has access to. | 1 / 3 |
| Actionability | The code examples are concrete, executable, and copy-paste ready. The ReAct agent, RAG pipeline, multi-agent orchestration, structured tools, memory management, streaming, testing, and caching examples all contain fully functional Python code with proper imports and realistic implementations. | 3 / 3 |
| Workflow Clarity | The patterns show clear multi-step workflows (RAG pipeline, multi-step workflow with StateGraph), but there are no validation checkpoints or error recovery feedback loops. For example, the RAG pattern has no step to verify retrieval quality, and the multi-agent pattern has no safeguard against infinite supervisor loops. The production checklist is helpful but is a static list rather than a sequenced workflow. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with everything inline. There are no references to separate files for detailed patterns, API references, or examples. The entire content (~500+ lines) is in a single file covering package structure, core concepts, quick start, 4 architecture patterns, memory management, callbacks, streaming, testing, performance optimization, and production checklists. Much of this should be split into separate reference files. | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (667 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
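A line-count check of this kind is simple to reproduce locally before submitting a skill for review. The sketch below is not Tessl's implementation: the function name `check_line_count` and the 500-line threshold are assumptions made for illustration, since the report does not state the actual limit.

```python
def check_line_count(text, max_lines=500):
    """Return a (result, line_count) pair for the text of a SKILL.md file.

    max_lines is a hypothetical threshold; the real validator's limit
    is not given in the report.
    """
    line_count = len(text.splitlines())
    result = "Warning" if line_count > max_lines else "Passed"
    return result, line_count
```

Reading the file with `pathlib.Path("SKILL.md").read_text()` and passing the result to this function would give a quick local signal about whether splitting content into separate reference files is worth considering.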
6e3d68c