tessl install github:jezweb/claude-skills --skill cloudflare-vectorize
github.com/jezweb/claude-skills
Build semantic search with Cloudflare Vectorize V2. Covers async mutations, 5M vectors/index, 31ms latency, returnMetadata enum changes, and V1 deprecation. Prevents 14 errors including dimension mismatches, TypeScript types, testing setup. Use when: building RAG or semantic search, troubleshooting returnMetadata, V2 timing, metadata index, dimension errors, vitest setup, or wrangler --json output.
Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.
Status: Production Ready ✅
Last Updated: 2026-01-21
Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings)
Latest Versions: wrangler@4.59.3, @cloudflare/workers-types@4.20260109.0
Token Savings: ~70%
Errors Prevented: 14
Dev Time Saved: ~4 hours
IMPORTANT: Vectorize V2 became GA in September 2024 with significant breaking changes.
Performance Improvements:
Breaking API Changes:
Async Mutations - All mutations now asynchronous:
// V2: Returns mutationId
const result = await env.VECTORIZE_INDEX.insert(vectors);
console.log(result.mutationId); // "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
// Vector inserts/deletes may take a few seconds to be reflected
returnMetadata Parameter - Boolean → String enum:
// ❌ V1 (deprecated)
{ returnMetadata: true }
// ✅ V2 (required)
{ returnMetadata: 'all' | 'indexed' | 'none' }
Metadata Indexes Required Before Insert:
V1 Deprecation Timeline:
wrangler vectorize --deprecated-v1 flag for V1 operations
Wrangler Version Required:
// Get index info to check last mutation processed
const info = await env.VECTORIZE_INDEX.describe();
console.log(info.mutationId); // Last mutation ID
console.log(info.processedUpToMutation); // Last processed timestamp
# 1. Create the index with FIXED dimensions and metric
npx wrangler vectorize create my-index \
--dimensions=768 \
--metric=cosine
# 2. Create metadata indexes IMMEDIATELY (before inserting vectors!)
npx wrangler vectorize create-metadata-index my-index \
--property-name=category \
--type=string
npx wrangler vectorize create-metadata-index my-index \
--property-name=timestamp \
--type=number
Why: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.
# Dimensions MUST match your embedding model output:
# - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions
# - OpenAI text-embedding-3-small: 1536 dimensions
# - OpenAI text-embedding-3-large: 3072 dimensions
# Metrics determine similarity calculation:
# - cosine: Best for normalized embeddings (most common)
# - euclidean: Absolute distance between vectors
# - dot-product: For non-normalized vectors
wrangler.jsonc:
{
"name": "my-vectorize-worker",
"main": "src/index.ts",
"compatibility_date": "2025-10-21",
"vectorize": [
{
"binding": "VECTORIZE_INDEX",
"index_name": "my-index"
}
],
"ai": {
"binding": "AI"
}
}
export interface Env {
VECTORIZE_INDEX: VectorizeIndex;
AI: Ai;
}
interface VectorizeVector {
id: string;
values: number[] | Float32Array | Float64Array;
namespace?: string;
metadata?: Record<string, string | number | boolean | string[]>;
}
interface VectorizeMatches {
matches: Array<{
id: string;
score: number;
values?: number[];
metadata?: Record<string, any>;
namespace?: string;
}>;
count: number;
}
Vectorize V2 supports advanced metadata filtering with range queries:
// Equality (implicit $eq)
{ category: "docs" }
// Not equals
{ status: { $ne: "archived" } }
// In/Not in arrays
{ category: { $in: ["docs", "tutorials"] } }
{ category: { $nin: ["deprecated", "draft"] } }
// Range queries (numbers) - NEW in V2
{ timestamp: { $gte: 1704067200, $lt: 1735689600 } }
// Range queries (strings) - prefix searching
{ url: { $gte: "/docs/workers", $lt: "/docs/workersz" } }
// Nested metadata with dot notation
{ "author.id": "user123" }
// Multiple conditions (implicit AND)
{ category: "docs", language: "en", "metadata.published": true }
Low Cardinality (Good for $eq filters):
// Few unique values - efficient filtering
metadata: {
category: "docs", // ~10 categories
language: "en", // ~5 languages
published: true // 2 values (boolean)
}
High Cardinality (Avoid in range queries):
// Many unique values - avoid large range scans
metadata: {
user_id: "uuid-v4...", // Millions of unique values
timestamp_ms: 1704067200123 // Use seconds instead
}
Current Limit: 1536 dimensions per vector
Source: GitHub Issue #8729
Supported Embedding Models:
@cf/baai/bge-base-en-v1.5: 768 dimensions ✅
text-embedding-3-small: 1536 dimensions ✅
text-embedding-3-large: 3072 dimensions ❌ (requires dimension reduction)
Unsupported Models (>1536 dimensions):
nomic-embed-code: 3584 dimensions
Qodo-Embed-1-7B: >1536 dimensions
Workaround: Use dimensionality reduction (e.g., PCA) to compress embeddings to 1536 or fewer dimensions, though this may reduce semantic quality.
Feature Request: Higher dimension support is under consideration. Use Limit Increase Request Form if this blocks your use case.
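A dimension mismatch is cheap to catch before calling insert(). The sketch below is a hypothetical pre-flight check, not part of the Vectorize API: the `VectorLike` shape and the `findDimensionMismatches` name are illustrative, and the 768 value assumes an index created for @cf/baai/bge-base-en-v1.5.

```typescript
// Hypothetical helper (not a Vectorize API): checks vector lengths locally.
interface VectorLike {
  id: string;
  values: ArrayLike<number>;
}

/** Returns the IDs of vectors whose length differs from the index dimensions. */
function findDimensionMismatches(
  vectors: VectorLike[],
  indexDimensions: number
): string[] {
  return vectors
    .filter((v) => v.values.length !== indexDimensions)
    .map((v) => v.id);
}

// Assumed index size: 768 (matches bge-base-en-v1.5 in the list above).
const sampleVectors: VectorLike[] = [
  { id: "ok", values: new Float32Array(768) },
  { id: "too-long", values: new Array<number>(1536).fill(0) },
];
const mismatched = findDimensionMismatches(sampleVectors, 768);
```

Running the check on each batch before it is sent lets you log or drop the offending IDs instead of having the whole request rejected.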
// ❌ INVALID metadata keys
metadata: {
"": "value", // Empty key
"user.name": "John", // Contains dot (reserved for nesting)
"$admin": true, // Starts with $
"key\"with\"quotes": 1 // Contains quotes
}
// ✅ VALID metadata keys
metadata: {
"user_name": "John",
"isAdmin": true,
"nested": { "allowed": true } // Access as "nested.allowed" in filters
}
Critical: Use batch size of 5000 vectors for optimal performance.
Performance Data:
Why 5000?
Optimal Pattern:
const BATCH_SIZE = 5000;
async function insertVectors(env: Env, vectors: VectorizeVector[]) {
  for (let i = 0; i < vectors.length; i += BATCH_SIZE) {
    const batch = vectors.slice(i, i + BATCH_SIZE);
    // Use the binding name declared in wrangler.jsonc (VECTORIZE_INDEX)
    const result = await env.VECTORIZE_INDEX.insert(batch);
    console.log(`Inserted batch ${i / BATCH_SIZE + 1}, mutationId: ${result.mutationId}`);
    // Optional: Rate limiting delay
    if (i + BATCH_SIZE < vectors.length) {
      await new Promise(resolve => setTimeout(resolve, 100));
    }
  }
}
Sources:
Vectorize uses approximate nearest neighbor (ANN) search by default with ~80% accuracy compared to exact search.
Default Mode: Approximate scoring (~80% accuracy)
High-Precision Mode: Near 100% accuracy
returnValues: true
Trade-off Example:
// Fast, ~80% accuracy, topK up to 100
const results = await env.VECTORIZE_INDEX.query(embedding, {
  topK: 50,
  returnValues: false // Default
});
// Slower, ~100% accuracy, topK max 20
const preciseResults = await env.VECTORIZE_INDEX.query(embedding, {
  topK: 10,
  returnValues: true // High-precision scoring
});
When to Use High-Precision:
Source: Cloudflare Blog - Building Vectorize
Problem: Filtering doesn't work on existing vectors
Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting

Problem: "Vector dimensions do not match index configuration"
Solution: Ensure embedding model output matches index dimensions:
- Workers AI bge-base: 768
- OpenAI small: 1536
- OpenAI large: 3072 (above the 1536-dimension limit; reduce dimensions first)

Problem: "Invalid metadata key"
Solution: Keys cannot:
- Be empty
- Contain . (dot)
- Contain " (quote)
- Start with $ (dollar sign)

Problem: "Filter exceeds 2048 bytes"
Solution: Simplify filter or split into multiple queries

Problem: Slow queries or reduced accuracy
Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps

Problem: Updates not reflecting in index
Solution: Use upsert() to overwrite existing vectors, not insert()

Problem: "VECTORIZE_INDEX is not defined"
Solution: Add a "vectorize" binding entry to wrangler.jsonc (the [[vectorize]] table if you use wrangler.toml)

Problem: Unclear when to use namespace vs metadata filtering
Solution:
- Namespace: Partition key, applied BEFORE metadata filters
- Metadata: Flexible key-value filtering within namespace

Problem: Inserted vectors not immediately queryable
Solution: V2 mutations are asynchronous - vectors may take a few seconds to be reflected
- Use mutationId to track mutation status
- Check env.VECTORIZE_INDEX.describe() for processedUpToMutation timestamp

Problem: "returnMetadata must be 'all', 'indexed', or 'none'"
Solution: V2 changed returnMetadata from boolean to string enum:
- ❌ V1: { returnMetadata: true }
- ✅ V2: { returnMetadata: 'all' }

Error: wrangler vectorize list --json output starts with log message, breaking JSON parsing
Source: GitHub Issue #11011
Affected Commands:
wrangler vectorize list --json
wrangler vectorize list-metadata-index --json
Problem:
$ wrangler vectorize list --json
📋 Listing Vectorize indexes...
[
{ "created_on": "2025-10-18T13:28:30.259277Z", ... }
]
The log message makes output invalid JSON, breaking piping to jq or other tools.
Solution: Strip first line before parsing:
# Using tail
wrangler vectorize list --json | tail -n +2 | jq '.'
# Using sed
wrangler vectorize list --json | sed '1d' | jq '.'
Error: wrangler types generates incomplete VectorizeVectorMetadataFilterOp type
Source: GitHub Issue #10092
Status: OPEN (tracked internally as VS-461)
Problem:
Generated type only includes $eq and $ne, missing V2 operators: $in, $nin, $lt, $lte, $gt, $gte
Impact: TypeScript shows false errors when using valid V2 metadata filter operators:
const vectorizeRes = await env.VECTORIZE_INDEX.queryById(imgId, {
  filter: { gender: { $in: genderFilters } }, // ❌ TS error but works!
  topK,
  returnMetadata: 'indexed',
});
Workaround: Manual type override until wrangler types is fixed:
// Add to your types file
type VectorizeMetadataFilter = Record<string,
| string
| number
| boolean
| {
$eq?: string | number | boolean;
$ne?: string | number | boolean;
$in?: (string | number | boolean)[];
$nin?: (string | number | boolean)[];
$lt?: number | string;
$lte?: number | string;
$gt?: number | string;
$gte?: number | string;
}
>;
Error: ENOENT: no such file or directory when running wrangler dev on Windows
Source: GitHub Issue #10383
Status: FIXED in wrangler@4.32.0
Problem: Wrangler attempted to create external worker files with colons in the name (invalid on Windows):
Error: ENOENT: ... '__WRANGLER_EXTERNAL_VECTORIZE_WORKER:<project>:<binding>'
Solution: Update to wrangler@4.32.0 or later:
npm install -g wrangler@latest
Error: topK exceeds maximum allowed value
Source: Vectorize Limits
Problem: Maximum topK value changes based on query options:
| Configuration | Max topK |
|---|---|
| returnValues: false, returnMetadata: 'none' | 100 |
| returnValues: true OR returnMetadata: 'all' | 20 |
| returnMetadata: 'indexed' | 100 |
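The conditional limits in the table can be encoded in a small guard that runs before each query. This is a minimal sketch under the limits stated above (20 when returnValues is true or returnMetadata is 'all', otherwise 100); `QueryShape`, `maxTopK`, and `clampTopK` are hypothetical names introduced for illustration, not Vectorize APIs.

```typescript
// Hypothetical guard (not a Vectorize API): mirrors the limits table above.
type ReturnMetadata = "all" | "indexed" | "none";

interface QueryShape {
  topK?: number;
  returnValues?: boolean;
  returnMetadata?: ReturnMetadata;
}

/** Largest topK the table allows for a given combination of options. */
function maxTopK(opts: QueryShape): number {
  const heavy = opts.returnValues === true || opts.returnMetadata === "all";
  return heavy ? 20 : 100;
}

/** Returns a copy of the options with topK lowered to the allowed maximum. */
function clampTopK<T extends QueryShape>(opts: T): T {
  if (opts.topK === undefined) return opts; // leave the service default untouched
  return { ...opts, topK: Math.min(opts.topK, maxTopK(opts)) };
}

const clampedPrecise = clampTopK({ topK: 100, returnValues: true });
const clampedIndexed = clampTopK({ topK: 100, returnMetadata: "indexed" as const });
```

Clamping silently changes how many matches come back, so a caller that needs exactly N results may prefer to throw when `opts.topK > maxTopK(opts)` instead.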
Common Error:
// ❌ ERROR - topK too high with returnValues
await env.VECTORIZE_INDEX.query(embedding, {
  topK: 100, // Exceeds limit!
  returnValues: true // Max topK=20 when true
});
Solution:
// ✅ OK - respects conditional limit
await env.VECTORIZE_INDEX.query(embedding, {
  topK: 20,
  returnValues: true
});
// ✅ OK - higher topK without values
await env.VECTORIZE_INDEX.query(embedding, {
  topK: 100,
  returnValues: false,
  returnMetadata: 'indexed'
});
If migrating from V1 to V2:
npm install -g wrangler@latest)
returnMetadata boolean → string enum ('all', 'indexed', 'none')
mutationId in responses)
V1 Deprecation:
wrangler vectorize --deprecated-v1 for V1 operations
Issue: Using @cloudflare/vitest-pool-workers with Vectorize or Workers AI bindings causes runtime failure.
Source: GitHub Issue #7434
Error: wrapped binding module can't be resolved
Workaround:
wrangler-test.jsonc without Vectorize/AI bindings
Example:
// wrangler-test.jsonc (no Vectorize binding)
{
"name": "my-worker-test",
"main": "src/index.ts",
"compatibility_date": "2025-10-21"
// No vectorize binding
}
// vitest.config.ts
import { defineWorkersProject } from '@cloudflare/vitest-pool-workers/config';
export default defineWorkersProject({
test: {
poolOptions: {
workers: {
wrangler: {
configPath: "./wrangler-test.jsonc"
}
}
}
}
});
// Mock in tests
import { vi } from 'vitest';
const mockVectorize = {
query: vi.fn().mockResolvedValue({
matches: [
{ id: 'test-1', score: 0.95, metadata: { category: 'docs' } }
],
count: 1
}),
insert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }),
upsert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" })
};
// Use mock in tests
test('vector search', async () => {
const env = { VECTORIZE_INDEX: mockVectorize };
// ... test logic
});
Note: These tips come from community discussions and official blog posts. Verify against your Vectorize version.
Source: Query Best Practices
Confidence: MEDIUM
Applies to: Datasets with ~10M+ vectors
Range queries ($lt, $lte, $gt, $gte) on large datasets may experience reduced accuracy.
Optimization Strategy:
// ❌ High-cardinality range at scale
metadata: {
  timestamp_ms: 1704067200123
}
filter: { timestamp_ms: { $gte: 1704067200000 } }
// ✅ Bucketed into discrete values
metadata: {
  timestamp_bucket: "2024-01-01-00:00", // 1-hour buckets (UTC hour of timestamp_ms)
  timestamp_ms: 1704067200123 // Original (non-indexed)
}
filter: {
  timestamp_bucket: {
    $in: ["2024-01-01-00:00", "2024-01-01-01:00"]
  }
}
When This Matters:
Alternative: Use equality filters ($eq, $in) with bucketed values.
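The bucket strings for an $in filter can be derived from a time window. The sketch below assumes 1-hour UTC buckets in the "YYYY-MM-DD-HH:00" format used above; `hourBucket` and `bucketsBetween` are hypothetical helper names, and `timestamp_bucket` must already have a string metadata index for the filter to apply.

```typescript
const HOUR_MS = 3_600_000;

/** Formats a Unix timestamp in milliseconds as a UTC hour bucket string. */
function hourBucket(timestampMs: number): string {
  const iso = new Date(timestampMs).toISOString(); // e.g. "2024-01-01T00:00:00.123Z"
  return `${iso.slice(0, 10)}-${iso.slice(11, 13)}:00`;
}

/** Lists every hour bucket that overlaps the inclusive window [startMs, endMs]. */
function bucketsBetween(startMs: number, endMs: number): string[] {
  const buckets: string[] = [];
  const firstHour = Math.floor(startMs / HOUR_MS) * HOUR_MS;
  for (let t = firstHour; t <= endMs; t += HOUR_MS) {
    buckets.push(hourBucket(t));
  }
  return buckets;
}

// Equality-style filter over discrete buckets instead of a numeric range.
const bucketFilter = {
  timestamp_bucket: { $in: bucketsBetween(1704067200123, 1704070800000) },
};
```

Long windows produce long $in lists, so it is worth checking the serialized filter against the 2048-byte filter limit noted in the troubleshooting list, and switching to coarser buckets (for example daily) when it gets close.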
Source: Vectorize Changelog
Vectorize V2 added support for the list-vectors operation for paginated iteration through vector IDs.
Use Cases:
API:
const result = await env.VECTORIZE_INDEX.list({
  limit: 1000, // Max 1000 per page
  cursor: undefined // Optional: cursor string returned by a previous page
});
// result.vectors: Array<{ id: string }>
// result.cursor: string | undefined
// result.count: number
// Pagination example
let cursor: string | undefined;
const allVectorIds: string[] = [];
do {
const result = await env.VECTORIZE_INDEX.list({
limit: 1000,
cursor
});
allVectorIds.push(...result.vectors.map(v => v.id));
cursor = result.cursor;
} while (cursor);
Limitations:
Status: Production Ready ✅ (Vectorize V2 GA - September 2024)
Last Updated: 2026-01-21
Token Savings: ~70%
Errors Prevented: 14 (includes V2 breaking changes, testing setup, TypeScript types)
Changes: Added 4 new errors (wrangler --json, TypeScript types, Windows dev, topK limits), batch performance best practices, query accuracy modes, testing setup, community tips on range queries and list-vectors operation.