neo4j-document-import-skill

Ingests unstructured and semi-structured documents into Neo4j as a knowledge graph. Use when chunking PDFs, HTML, plain text, or Markdown; extracting entities and relationships from text with an LLM (SimpleKGPipeline, neo4j-graphrag); loading JSON via apoc.load.json; building Document→Chunk→Entity graph structures; or connecting LangChain/LlamaIndex document loaders to Neo4j. Covers neo4j-graphrag SimpleKGPipeline, LLM Graph Builder web UI, entity resolution, chunking strategies, and graph schema design for RAG pipelines. Does NOT handle structured CSV/relational import — use neo4j-import-skill. Does NOT handle GraphRAG retrieval after ingestion — use neo4j-graphrag-skill. Does NOT handle vector index creation — use neo4j-vector-search-skill.

71

Quality

88%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that excels across all dimensions. It provides highly specific capabilities, rich trigger terms spanning both user-facing concepts and technical tool names, explicit 'Use when' guidance, and clear boundary delineation with related skills via 'Does NOT handle' exclusions. The description uses proper third-person voice throughout and is comprehensive without being padded.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: chunking PDFs/HTML/text/Markdown, extracting entities and relationships with LLM, loading JSON via apoc.load.json, building Document→Chunk→Entity graph structures, connecting LangChain/LlamaIndex document loaders to Neo4j. Very detailed and actionable. | 3 / 3 |
| Completeness | Clearly answers both 'what' (ingests unstructured/semi-structured documents into Neo4j as a knowledge graph, with detailed sub-capabilities) and 'when' (explicit 'Use when' clause listing specific trigger scenarios). The 'Does NOT handle' exclusions further clarify when to use this vs. other skills. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'knowledge graph', 'chunking PDFs', 'HTML', 'Markdown', 'extracting entities', 'relationships', 'SimpleKGPipeline', 'neo4j-graphrag', 'apoc.load.json', 'LangChain', 'LlamaIndex', 'document loaders', 'entity resolution', 'RAG pipelines', 'LLM Graph Builder'. Covers both high-level concepts and specific tool names. | 3 / 3 |
| Distinctiveness / Conflict Risk | Exceptionally distinctive with explicit boundary-setting via 'Does NOT handle' clauses that name the exact alternative skills for CSV import, GraphRAG retrieval, and vector index creation. This makes it very unlikely to conflict with related Neo4j skills. | 3 / 3 |

Total: 12 / 12 (Passed)

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a high-quality, comprehensive skill with excellent actionability and workflow clarity. Every major approach (SimpleKGPipeline, APOC, LangChain, LLM Graph Builder) has executable code and the verification checklist is thorough. The main weakness is length — at 350+ lines with some redundancy between sections (schema defined twice, resolver imports repeated), it could be more token-efficient by splitting secondary approaches into referenced files.

Suggestions

Move LangChain integration, APOC JSON ingestion, and LLM Graph Builder sections into separate reference files to reduce the main SKILL.md length and improve progressive disclosure.

Consolidate the schema definition — the GraphSchema API section (≥1.7.1) partially duplicates Step 1; merge them or clearly mark Step 1 as deprecated patterns and point to the current API.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is generally well-structured and avoids explaining basic concepts, but it's quite long (~350+ lines) with some redundancy, e.g., schema options are explained in Step 1 and then again in the GraphSchema section; entity resolution imports are repeated. Some sections like LangChain integration are verbose with full boilerplate that could be trimmed. | 2 / 3 |
| Actionability | Excellent actionability throughout: every section provides executable code (Python, Cypher, bash), specific library imports, concrete parameter values, and copy-paste-ready examples. The decision table, chunking guidance table, and error table all provide specific, concrete guidance. | 3 / 3 |
| Workflow Clarity | The skill follows a clear numbered step sequence (Steps 1–5) with explicit validation checkpoints: constraints must be ONLINE before ingestion, entity resolution runs after ingestion, and a verification checklist at the end covers all critical checks. The 'SHOW INDEXES' polling pattern and the 'If rows returned: wait' instruction demonstrate proper feedback loops. | 3 / 3 |
| Progressive Disclosure | The skill references one bundle file (references/kg-construction.md) and provides external documentation links, but no bundle files were provided to verify the reference exists. The main SKILL.md itself is quite long and could benefit from splitting out the LangChain integration, APOC JSON ingestion, and LLM Graph Builder sections into separate reference files, keeping the core SimpleKGPipeline workflow leaner. | 2 / 3 |

Total: 10 / 12 (Passed)
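
The Workflow Clarity reasoning credits the skill's 'SHOW INDEXES' polling pattern and its 'If rows returned: wait' instruction. The skill's own code is not reproduced on this page, so the following is only a minimal sketch of what such a feedback loop could look like. The function name `wait_until_online`, its parameters, and the exact query text are assumptions made for illustration; `run_query` is a hypothetical callable that executes Cypher and returns rows as dictionaries.

```python
import time
from typing import Callable, Sequence

# Assumed query: list indexes that are not yet ONLINE.
# An empty result is read as "safe to start ingestion".
PENDING_INDEXES_QUERY = (
    "SHOW INDEXES YIELD name, state WHERE state <> 'ONLINE' RETURN name, state"
)


def wait_until_online(
    run_query: Callable[[str], Sequence[dict]],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the query returns no rows; return the number of polls made."""
    deadline = time.monotonic() + timeout_s
    polls = 0
    while True:
        polls += 1
        pending = run_query(PENDING_INDEXES_QUERY)
        if not pending:
            # No rows returned: every index reports ONLINE.
            return polls
        failed = [row["name"] for row in pending if row.get("state") == "FAILED"]
        if failed:
            # Waiting longer will not repair a failed index, so stop early.
            raise RuntimeError(f"indexes failed to build: {failed}")
        if time.monotonic() >= deadline:
            names = [row["name"] for row in pending]
            raise TimeoutError(f"indexes still not ONLINE: {names}")
        # Rows returned: wait, then check again.
        sleep(poll_interval_s)
```

With the official Neo4j Python driver, `run_query` could plausibly be something like `lambda q: [record.data() for record in session.run(q)]`, where `session` is an open driver session. Injecting the callable keeps the loop testable without a running database.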

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (501 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |

Total: 9 / 11 (Passed)
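
The two warnings above come from structural checks on SKILL.md: one on file length and one on frontmatter keys. The validator's implementation, its line limit, and its list of accepted keys are not shown on this page, so the sketch below uses assumed values (`MAX_LINES = 500` and a hypothetical `KNOWN_KEYS` set) purely to illustrate how checks of this kind could be written. It handles only simple `key: value` frontmatter and is not a full YAML parser.

```python
from typing import List

# Assumed values for illustration only; the real limit and key list
# used by the validator are not stated in the review.
MAX_LINES = 500
KNOWN_KEYS = {"name", "description", "license", "metadata", "allowed-tools"}


def frontmatter_keys(text: str) -> List[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys: List[str] = []
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start in column 0; indented lines are nested values.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return []  # no closing delimiter: treat the file as having no frontmatter


def validate_skill(text: str) -> List[str]:
    """Return warning strings named after the two criteria in the table."""
    warnings: List[str] = []
    line_count = len(text.splitlines())
    if line_count > MAX_LINES:
        warnings.append(
            f"skill_md_line_count: SKILL.md is long ({line_count} lines)"
        )
    unknown = sorted(set(frontmatter_keys(text)) - KNOWN_KEYS)
    if unknown:
        warnings.append(f"frontmatter_unknown_keys: {', '.join(unknown)}")
    return warnings
```

Returning warnings as a list, rather than raising, mirrors the review's treatment of both findings as non-blocking: the skill still shows as passed with two warnings.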

Repository
neo4j-contrib/neo4j-skills
Reviewed
