g14wxz/pgvector-hnsw-index-selection

Enforces HNSW index selection over IVFFlat and correct distance operator usage for pgvector.

Quality: 100% (Does it follow best practices?)

Impact: Pending (no eval scenarios have been run)

Security by Snyk: Passed (no known issues)

rules/pgvector-rules.md

Pgvector Rules

FATAL Constraints

  • NEVER create an IVFFlat index. HNSW MUST be the default and only index type for vector columns.
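  • NEVER use Cosine Distance (<=>) on normalized embeddings. MUST use Inner Product (<#>) instead. Note that <#> returns the negative inner product, so ordering ascending still ranks the most similar vectors first.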
  • NEVER create an HNSW index without specifying m and ef_construction parameters. Relying on defaults is forbidden.
  • NEVER use L2 Distance (<->) unless the human operator has explicitly confirmed that Euclidean distance is required.
  • NEVER query a vector column that lacks an HNSW index. HALT and create the index first.
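
The constraints above can be sketched as a small statement builder. This is a minimal illustration under stated assumptions, not part of the rules file: the helper name `build_hnsw_index_sql`, the `l2_confirmed` flag, and the table and column names `items` and `embedding` are hypothetical. Only the `CREATE INDEX ... USING hnsw (...) WITH (m = ..., ef_construction = ...)` syntax and the operator class names come from pgvector.

```python
# Hypothetical helper: emits HNSW DDL that always states m and ef_construction.
# Identifiers are interpolated as-is, so this sketch assumes trusted table and
# column names; it never emits an IVFFlat index.

OPCLASS_BY_OPERATOR = {
    "<#>": "vector_ip_ops",      # inner product, for normalized embeddings
    "<=>": "vector_cosine_ops",  # cosine distance, for non-normalized embeddings
    "<->": "vector_l2_ops",      # L2 distance, only when explicitly confirmed
}


def build_hnsw_index_sql(table, column, operator, m, ef_construction,
                         l2_confirmed=False):
    """Return a CREATE INDEX statement for an HNSW index with explicit parameters."""
    if operator not in OPCLASS_BY_OPERATOR:
        raise ValueError(f"unsupported distance operator: {operator!r}")
    if operator == "<->" and not l2_confirmed:
        raise ValueError("L2 distance requires explicit confirmation")
    opclass = OPCLASS_BY_OPERATOR[operator]
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


print(build_hnsw_index_sql("items", "embedding", "<#>", 16, 64))
```

Keeping the operator-to-operator-class mapping in one table makes it harder for the index and the query operator to drift apart.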

Mandatory Behaviors

  • The agent MUST verify embedding normalization status before selecting a distance operator. Check with SELECT vector_norm(embedding) FROM your_table LIMIT 5; norms of approximately 1.0 indicate normalized embeddings.
  • HNSW indexes MUST use m = 16, ef_construction = 64 as baseline parameters. These values match pgvector's defaults but MUST still be written out explicitly. For tables exceeding 1M rows, MUST increase to m = 24, ef_construction = 128.
  • Every vector similarity query MUST be validated with EXPLAIN ANALYZE to confirm index usage. A sequential scan on a vector column is a defect.
  • Operator class MUST match the chosen distance operator: vector_ip_ops for <#>, vector_cosine_ops for <=>, vector_l2_ops for <->.
  • When the embedding model changes, the agent MUST re-evaluate normalization status and rebuild the index if the operator class changes.
  • The vector extension MUST be confirmed enabled before any vector DDL. HALT if missing.
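
The decision logic in these behaviors can be sketched as a few pure functions. This is an illustrative sketch, not a definitive implementation: all function names are hypothetical, the 1e-3 tolerance for "normalized" is an assumption the rules do not specify, and the sample norms are made-up values standing in for the output of the vector_norm query. The `pg_extension` catalog lookup and the "Seq Scan" node label are standard PostgreSQL.

```python
import math

# Run before any vector DDL; an empty result means the extension is missing (HALT).
EXTENSION_CHECK_SQL = "SELECT 1 FROM pg_extension WHERE extname = 'vector';"


def is_normalized(norms, tol=1e-3):
    """True when every sampled norm is within tol of 1.0 (tol is an assumed threshold)."""
    return all(math.isclose(n, 1.0, abs_tol=tol) for n in norms)


def choose_operator(norms, l2_confirmed=False):
    """Pick a distance operator from sampled vector_norm() values."""
    if not norms:
        raise ValueError("no sampled norms; normalization status cannot be verified")
    if l2_confirmed:
        return "<->"  # only when Euclidean distance was explicitly confirmed
    return "<#>" if is_normalized(norms) else "<=>"


def hnsw_params(row_count):
    """Baseline parameters, raised for tables exceeding 1M rows."""
    if row_count > 1_000_000:
        return {"m": 24, "ef_construction": 128}
    return {"m": 16, "ef_construction": 64}


def uses_seq_scan(plan_text):
    """True when EXPLAIN ANALYZE text output contains a sequential scan node."""
    return "Seq Scan" in plan_text


sample_norms = [1.0, 0.9999, 1.0001]  # illustrative values, not real data
print(choose_operator(sample_norms), hnsw_params(2_500_000))
```

Because the operator determines the operator class, re-running `choose_operator` after an embedding model change shows whether the index needs to be rebuilt.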
