Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
Overall score: 74

Quality: 66% (Does it follow best practices?)
Impact: 82% (2.34x average score across 3 eval scenarios)
Risk: Risky. Do not use without reviewing.

Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./plugins/llm-application-dev/skills/langchain-architecture/SKILL.md
```

Quality
Discovery
Discovery: 89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description that clearly identifies its technology niche (LangChain 1.x/LangGraph) and includes an explicit "Use when" clause with relevant trigger terms. Its main weakness is that the capability listing is somewhat high-level; it could benefit from more specific, concrete actions beyond "Design". The strong technology-specific keywords and explicit trigger guidance make it effective for skill selection.
Suggestions
- Add more specific concrete actions such as "build retrieval chains, configure agent tool calling, set up conversation memory, define LangGraph state machines" to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (LLM applications, LangChain, LangGraph) and mentions some capabilities (agents, memory, tool integration), but doesn't list multiple concrete actions: "Design" is somewhat vague, and there are no specific operations like "create chains", "configure retrieval", or "set up streaming". | 2 / 3 |
| Completeness | Clearly answers both "what" (design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration) and "when" (explicit "Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows"). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: "LangChain", "LangGraph", "AI agents", "LLM workflows", "tool integration", "memory". These cover the main terms a developer would use when seeking help with this technology stack. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to specific technology references (LangChain 1.x, LangGraph), which clearly differentiate it from generic coding skills or other AI framework skills. Unlikely to conflict with other skills. | 3 / 3 |
| **Total** | | **11 / 12 (Passed)** |
Implementation
Implementation: 42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent, executable code examples across many LangChain/LangGraph patterns, which is its primary strength. However, it is severely bloated: much of the descriptive content (Core Concepts bullet lists, feature enumerations) adds no value for Claude and wastes token budget. The monolithic structure with no progressive disclosure, and the lack of validation and error-handling steps in workflows, are significant weaknesses.
Suggestions
- Cut the "Core Concepts" section drastically: remove descriptive bullet lists (memory types, document processing components, callbacks features) that Claude already knows, keeping only non-obvious configuration details or gotchas.
- Split into multiple files: keep SKILL.md as a concise overview with Quick Start, then reference separate files like PATTERNS.md (RAG, multi-agent, workflows), MEMORY.md, TESTING.md, and PERFORMANCE.md.
- Add explicit validation and error-handling steps to workflows, e.g., checking retriever results before generation in RAG, verifying tool outputs, and handling LLM API failures with retries.
- Remove the "When to Use This Skill" section entirely; it is pure padding that describes obvious use cases Claude can infer from the skill title.
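The validation-and-retry suggestion above can be sketched framework-agnostically. In this minimal sketch, `retriever` and `llm` stand in for whatever objects the skill's RAG pattern constructs; the `.invoke` method and `page_content` attribute mirror common LangChain conventions but are illustrative assumptions here, not a prescribed API.

```python
import time


class EmptyRetrievalError(Exception):
    """Raised when the retriever returns nothing useful for a query."""


def answer_with_validation(retriever, llm, query, max_retries=3):
    # Validation checkpoint: inspect retriever output before generating,
    # instead of silently passing an empty context to the LLM.
    docs = retriever.invoke(query)
    if not docs:
        raise EmptyRetrievalError(f"No documents retrieved for: {query!r}")

    context = "\n\n".join(d.page_content for d in docs)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    # Error handling: retry transient LLM API failures with exponential backoff.
    for attempt in range(max_retries):
        try:
            return llm.invoke(prompt)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the failure to the caller
            time.sleep(2 ** attempt)
```

The same shape (validate inputs, generate, retry on transient failure) applies to tool-calling nodes in a LangGraph workflow as well.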
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Explains concepts Claude already knows (what StateGraph is, what memory types do, what document loaders are). Lists features like bullet-point marketing copy ("Durable Execution", "Human-in-the-Loop") without actionable detail. The "Core Concepts" section is largely descriptive padding. Multiple patterns could be consolidated or moved to separate files. | 1 / 3 |
| Actionability | The code examples are concrete, executable, and copy-paste ready. The ReAct agent, RAG pipeline, multi-agent orchestration, structured tools, memory management, streaming, testing, and caching patterns all include complete, runnable Python code with proper imports and realistic usage. | 3 / 3 |
| Workflow Clarity | Multi-step workflows like RAG and multi-agent orchestration are shown with clear graph construction sequences, but there are no validation checkpoints, error-handling feedback loops, or verification steps. For production-grade LLM applications involving external services (databases, vector stores, email), the absence of error recovery and validation caps this at 2. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files. All patterns, memory management, callbacks, streaming, testing, and optimization are inlined in a single massive document. Content like individual architecture patterns, testing strategies, and performance optimization should be split into separate referenced files. | 1 / 3 |
| **Total** | | **7 / 12 (Passed)** |
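The progressive-disclosure split recommended in the suggestions could look roughly like this; the file names are the ones proposed above, and the layout is illustrative rather than prescribed:

```
skills/langchain-architecture/
├── SKILL.md         # concise overview + Quick Start, linking to the files below
├── PATTERNS.md      # RAG, multi-agent orchestration, workflow graphs
├── MEMORY.md        # memory management patterns
├── TESTING.md       # testing strategies for chains and agents
└── PERFORMANCE.md   # caching and optimization
```

With this layout, SKILL.md stays within the line-count warning threshold and Claude loads the detailed pattern files only when a task calls for them.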
Validation
Validation: 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 10 / 11 passed

| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (635 lines); consider splitting into references/ and linking | Warning |
| **Total** | 10 / 11 (Passed) | |