Design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration. Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows.
Overall score: 74

Quality: 66% (Does it follow best practices?)
Impact: 82% (2.34x average score across 3 eval scenarios)
Risk: Risky. Do not use without reviewing.

Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./plugins/llm-application-dev/skills/langchain-architecture/SKILL.md
```

Quality
Discovery
Discovery: 89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid skill description that clearly identifies its technology niche (LangChain 1.x/LangGraph) and includes an explicit "Use when" clause with relevant trigger terms. Its main weakness is that the capability listing is somewhat high-level; it could benefit from more specific, concrete actions beyond "Design". The strong technology-specific keywords and explicit trigger guidance make it effective for skill selection.
Suggestions
- Add more specific concrete actions such as "build retrieval chains, configure agent tool calling, set up conversation memory, define LangGraph state machines" to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (LLM applications, LangChain, LangGraph) and mentions some capabilities (agents, memory, tool integration), but doesn't list multiple concrete actions: "Design" is somewhat vague, and there are no specific operations like "create chains", "configure retrieval", or "set up streaming". | 2 / 3 |
| Completeness | Clearly answers both "what" (design LLM applications using LangChain 1.x and LangGraph for agents, memory, and tool integration) and "when" (explicit "Use when building LangChain applications, implementing AI agents, or creating complex LLM workflows"). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: "LangChain", "LangGraph", "AI agents", "LLM workflows", "tool integration", "memory". These cover the main terms a developer would use when seeking help with this technology stack. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to specific technology references (LangChain 1.x, LangGraph), which clearly differentiate it from generic coding skills or other AI framework skills. Unlikely to conflict with other skills. | 3 / 3 |
| **Total** | | **11 / 12 (Passed)** |
Implementation
Implementation: 42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent, executable code examples across many LangChain/LangGraph patterns, which is its primary strength. However, it is severely bloated: much of the descriptive content (Core Concepts bullet lists, feature enumerations) adds no value for Claude and wastes token budget. The monolithic structure with no progressive disclosure, and the lack of validation and error-handling steps in workflows, are significant weaknesses.
Suggestions
- Cut the "Core Concepts" section drastically: remove descriptive bullet lists (memory types, document processing components, callbacks features) that Claude already knows, keeping only non-obvious configuration details or gotchas.
- Split into multiple files: keep SKILL.md as a concise overview with Quick Start, then reference separate files like PATTERNS.md (RAG, multi-agent, workflows), MEMORY.md, TESTING.md, and PERFORMANCE.md.
- Add explicit validation and error-handling steps to workflows, e.g., checking retriever results before generation in RAG, verifying tool outputs, and handling LLM API failures with retries.
- Remove the "When to Use This Skill" section entirely; it is pure padding that describes obvious use cases Claude can infer from the skill title.
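The validation-and-retry suggestion above can be sketched framework-agnostically. In this minimal sketch, `retriever` and `llm` stand in for whatever objects the skill's RAG pattern constructs; the `.invoke` method and `page_content` attribute mirror common LangChain conventions but are illustrative assumptions here, not a prescribed API.

```python
import time


class EmptyRetrievalError(Exception):
    """Raised when the retriever returns nothing useful for a query."""


def answer_with_validation(retriever, llm, query, max_retries=3):
    # Validation checkpoint: inspect retriever output before generating,
    # instead of silently passing an empty context to the LLM.
    docs = retriever.invoke(query)
    if not docs:
        raise EmptyRetrievalError(f"No documents retrieved for: {query!r}")

    context = "\n\n".join(d.page_content for d in docs)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    # Error handling: retry transient LLM API failures with exponential backoff.
    for attempt in range(max_retries):
        try:
            return llm.invoke(prompt)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the failure to the caller
            time.sleep(2 ** attempt)
```

The same shape (validate inputs, generate, retry on transient failure) applies to tool-calling nodes in a LangGraph workflow as well.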
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Explains concepts Claude already knows (what StateGraph is, what memory types do, what document loaders are). Lists features like bullet-point marketing copy ("Durable Execution", "Human-in-the-Loop") without actionable detail. The "Core Concepts" section is largely descriptive padding. Multiple patterns could be consolidated or moved to separate files. | 1 / 3 |
| Actionability | The code examples are concrete, executable, and copy-paste ready. The ReAct agent, RAG pipeline, multi-agent orchestration, structured tools, memory management, streaming, testing, and caching patterns all include complete, runnable Python code with proper imports and realistic usage. | 3 / 3 |
| Workflow Clarity | Multi-step workflows like RAG and multi-agent orchestration are shown with clear graph construction sequences, but there are no validation checkpoints, error-handling feedback loops, or verification steps. For production-grade LLM applications involving external services (databases, vector stores, email), the absence of error recovery and validation caps this at 2. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files. All patterns, memory management, callbacks, streaming, testing, and optimization are inlined in a single massive document. Content like individual architecture patterns, testing strategies, and performance optimization should be split into separate referenced files. | 1 / 3 |
| **Total** | | **7 / 12 (Passed)** |
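The progressive-disclosure split recommended in the suggestions could look roughly like this; the file names are the ones proposed above, and the layout is illustrative rather than prescribed:

```
skills/langchain-architecture/
├── SKILL.md         # concise overview + Quick Start, linking to the files below
├── PATTERNS.md      # RAG, multi-agent orchestration, workflow graphs
├── MEMORY.md        # memory management patterns
├── TESTING.md       # testing strategies for chains and agents
└── PERFORMANCE.md   # caching and optimization
```

With this layout, SKILL.md stays within the line-count warning threshold and Claude loads the detailed pattern files only when a task calls for them.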
Validation
Validation: 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 10 / 11 passed

| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (635 lines); consider splitting into references/ and linking | Warning |
| **Total** | 10 / 11 (Passed) | |