Build semantic search with Cloudflare Vectorize V2. Covers async mutations, 5M vectors/index, 31ms latency, returnMetadata enum changes, and V1 deprecation. Prevents 14 errors including dimension mismatches, TypeScript types, testing setup. Use when: building RAG or semantic search, troubleshooting returnMetadata, V2 timing, metadata index, dimension errors, vitest setup, or wrangler --json output.
Install with Tessl CLI
npx tessl i github:jezweb/claude-skills --skill cloudflare-vectorize87
Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.
Status: Production Ready ✅ Last Updated: 2026-01-21 Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings) Latest Versions: wrangler@4.59.3, @cloudflare/workers-types@4.20260109.0 Token Savings: ~70% Errors Prevented: 14 Dev Time Saved: ~4 hours
IMPORTANT: Vectorize V2 became GA in September 2024 with significant breaking changes.
Performance Improvements:
- Up to 5M vectors per index
- Median query latency around 31ms
Breaking API Changes:
Async Mutations - All mutations now asynchronous:
// V2: Returns mutationId
const result = await env.VECTORIZE_INDEX.insert(vectors);
console.log(result.mutationId); // "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
// Vector inserts/deletes may take a few seconds to be reflected

returnMetadata Parameter - Boolean → String enum:
// ❌ V1 (deprecated)
{ returnMetadata: true }
// ✅ V2 (required)
{ returnMetadata: 'all' | 'indexed' | 'none' }

Metadata Indexes Required Before Insert:
V1 Deprecation Timeline:
- Use the wrangler vectorize --deprecated-v1 flag for V1 operations

Wrangler Version Required:
// Get index info to check last mutation processed
const info = await env.VECTORIZE_INDEX.describe();
console.log(info.mutationId); // Last mutation ID
console.log(info.processedUpToMutation); // Last processed timestamp

# 1. Create the index with FIXED dimensions and metric
npx wrangler vectorize create my-index \
--dimensions=768 \
--metric=cosine
# 2. Create metadata indexes IMMEDIATELY (before inserting vectors!)
npx wrangler vectorize create-metadata-index my-index \
--property-name=category \
--type=string
npx wrangler vectorize create-metadata-index my-index \
--property-name=timestamp \
  --type=number

Why: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.
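Because V2 mutations are asynchronous, it can help to poll describe() (shown earlier) until the index has caught up before querying freshly inserted vectors. This is an illustrative sketch, not an official API: the helper name, the polling interval, and the exact shape of the describe() result are assumptions based on the notes above.

```typescript
// Hypothetical shape of describe() output, per the notes above (assumption).
type IndexInfo = { mutationId?: string; processedUpToMutation?: string };

interface IndexLike {
  describe(): Promise<IndexInfo>;
}

// Poll until `done` reports the index has caught up (for example, the
// mutationId returned by insert() shows up as processed), or give up.
async function waitForIndex(
  index: IndexLike,
  done: (info: IndexInfo) => boolean,
  intervalMs = 250,
  maxAttempts = 20,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (done(await index.describe())) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false; // Timed out; the mutation may still be pending
}
```

Usage sketch: `const { mutationId } = await env.VECTORIZE_INDEX.insert(vectors); await waitForIndex(env.VECTORIZE_INDEX, (info) => info.mutationId === mutationId);` — the equality check assumes mutations are reported in order, which you should verify for your workload.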
# Dimensions MUST match your embedding model output:
# - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions
# - OpenAI text-embedding-3-small: 1536 dimensions
# - OpenAI text-embedding-3-large: 3072 dimensions
# Metrics determine similarity calculation:
# - cosine: Best for normalized embeddings (most common)
# - euclidean: Absolute distance between vectors
# - dot-product: For non-normalized vectors

wrangler.jsonc:
{
"name": "my-vectorize-worker",
"main": "src/index.ts",
"compatibility_date": "2025-10-21",
"vectorize": [
{
"binding": "VECTORIZE_INDEX",
"index_name": "my-index"
}
],
"ai": {
"binding": "AI"
}
}

export interface Env {
VECTORIZE_INDEX: VectorizeIndex;
AI: Ai;
}
interface VectorizeVector {
id: string;
values: number[] | Float32Array | Float64Array;
namespace?: string;
metadata?: Record<string, string | number | boolean | string[]>;
}
interface VectorizeMatches {
matches: Array<{
id: string;
score: number;
values?: number[];
metadata?: Record<string, any>;
namespace?: string;
}>;
count: number;
}

Vectorize V2 supports advanced metadata filtering with range queries:
// Equality (implicit $eq)
{ category: "docs" }
// Not equals
{ status: { $ne: "archived" } }
// In/Not in arrays
{ category: { $in: ["docs", "tutorials"] } }
{ category: { $nin: ["deprecated", "draft"] } }
// Range queries (numbers) - NEW in V2
{ timestamp: { $gte: 1704067200, $lt: 1735689600 } }
// Range queries (strings) - prefix searching
{ url: { $gte: "/docs/workers", $lt: "/docs/workersz" } }
// Nested metadata with dot notation
{ "author.id": "user123" }
// Multiple conditions (implicit AND)
{ category: "docs", language: "en", "metadata.published": true }

Low Cardinality (Good for $eq filters):
// Few unique values - efficient filtering
metadata: {
category: "docs", // ~10 categories
language: "en", // ~5 languages
published: true // 2 values (boolean)
}

High Cardinality (Avoid in range queries):
// Many unique values - avoid large range scans
metadata: {
user_id: "uuid-v4...", // Millions of unique values
timestamp_ms: 1704067200123 // Use seconds instead
}

Current Limit: 1536 dimensions per vector
Source: GitHub Issue #8729
Supported Embedding Models:
- @cf/baai/bge-base-en-v1.5: 768 dimensions ✅
- text-embedding-3-small: 1536 dimensions ✅
- text-embedding-3-large: 3072 dimensions ❌ (requires dimension reduction)

Unsupported Models (>1536 dimensions):
- nomic-embed-code: 3584 dimensions
- Qodo-Embed-1-7B: >1536 dimensions

Workaround: Use dimensionality reduction (e.g., PCA) to compress embeddings to 1536 or fewer dimensions, though this may reduce semantic quality.
Feature Request: Higher dimension support is under consideration. Use Limit Increase Request Form if this blocks your use case.
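One hedged workaround sketch for the 1536-dimension limit: OpenAI documents that its text-embedding-3 models tolerate shortening by truncating the vector and re-normalizing it. For other models this simple truncation is NOT valid — you would need a real reduction technique such as PCA — and semantic quality may suffer either way.

```typescript
// Truncate an embedding to `dims` dimensions and re-normalize to unit length.
// Only appropriate for models designed to support shortening (an assumption
// you must verify for your embedding model).
function truncateEmbedding(values: number[], dims: number): number[] {
  const cut = values.slice(0, dims);
  const norm = Math.hypot(...cut); // Euclidean length of the truncated vector
  return cut.map((v) => v / norm);
}
```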
// ❌ INVALID metadata keys
metadata: {
"": "value", // Empty key
"user.name": "John", // Contains dot (reserved for nesting)
"$admin": true, // Starts with $
"key\"with\"quotes": 1 // Contains quotes
}
// ✅ VALID metadata keys
metadata: {
"user_name": "John",
"isAdmin": true,
"nested": { "allowed": true } // Access as "nested.allowed" in filters
}

Critical: Use batch size of 5000 vectors for optimal performance.
Performance Data:
Why 5000?
Optimal Pattern:
const BATCH_SIZE = 5000;

async function insertVectors(env: Env, vectors: VectorizeVector[]) {
  for (let i = 0; i < vectors.length; i += BATCH_SIZE) {
    const batch = vectors.slice(i, i + BATCH_SIZE);
    const result = await env.VECTORIZE_INDEX.insert(batch);
    console.log(`Inserted batch ${i / BATCH_SIZE + 1}, mutationId: ${result.mutationId}`);
    // Optional: Rate limiting delay
    if (i + BATCH_SIZE < vectors.length) {
      await new Promise(resolve => setTimeout(resolve, 100));
    }
  }
}

Sources:
Vectorize uses approximate nearest neighbor (ANN) search by default with ~80% accuracy compared to exact search.
Default Mode: Approximate scoring (~80% accuracy)
High-Precision Mode: Near 100% accuracy
- Enabled via returnValues: true

Trade-off Example:
// Fast, ~80% accuracy, topK up to 100
const results = await env.VECTORIZE.query(embedding, {
topK: 50,
returnValues: false // Default
});
// Slower, ~100% accuracy, topK max 20
const preciseResults = await env.VECTORIZE.query(embedding, {
topK: 10,
returnValues: true // High-precision scoring
});

When to Use High-Precision:
Source: Cloudflare Blog - Building Vectorize
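Putting the pieces together, here is a hedged end-to-end sketch: embed the query text with Workers AI, then search the index. The minimal interfaces and the AI response shape (`{ data: number[][] }` for bge-base) follow the notes in this guide but should be treated as assumptions; keeping them narrow also lets the function be exercised with mocks.

```typescript
// Minimal shapes for the two bindings this sketch needs (assumptions).
interface AiLike {
  run(model: string, input: { text: string[] }): Promise<{ data: number[][] }>;
}
interface QueryIndexLike {
  query(
    vector: number[],
    opts: { topK: number; returnMetadata: 'all' | 'indexed' | 'none' },
  ): Promise<{ matches: { id: string; score: number; metadata?: Record<string, unknown> }[] }>;
}

// Embed the query text, then run a semantic search against the index.
async function semanticSearch(ai: AiLike, index: QueryIndexLike, text: string, topK = 5) {
  const { data } = await ai.run('@cf/baai/bge-base-en-v1.5', { text: [text] });
  const embedding = data[0]; // 768 dimensions for bge-base
  const { matches } = await index.query(embedding, { topK, returnMetadata: 'all' });
  return matches;
}
```

In a Worker you would call this as `semanticSearch(env.AI, env.VECTORIZE_INDEX, query)`.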
Problem: Filtering doesn't work on existing vectors
Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting

Problem: "Vector dimensions do not match index configuration"
Solution: Ensure embedding model output matches index dimensions:
- Workers AI bge-base: 768
- OpenAI small: 1536
- OpenAI large: 3072

Problem: "Invalid metadata key"
Solution: Keys cannot:
- Be empty
- Contain . (dot)
- Contain " (quote)
- Start with $ (dollar sign)

Problem: "Filter exceeds 2048 bytes"
Solution: Simplify filter or split into multiple queries

Problem: Slow queries or reduced accuracy
Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps

Problem: Updates not reflecting in index
Solution: Use upsert() to overwrite existing vectors, not insert()

Problem: "VECTORIZE_INDEX is not defined"
Solution: Add a vectorize binding to wrangler.jsonc (or [[vectorize]] in wrangler.toml)

Problem: Unclear when to use namespace vs metadata filtering
Solution:
- Namespace: Partition key, applied BEFORE metadata filters
- Metadata: Flexible key-value filtering within a namespace

Problem: Inserted vectors not immediately queryable
Solution: V2 mutations are asynchronous - vectors may take a few seconds to be reflected
- Use mutationId to track mutation status
- Check env.VECTORIZE_INDEX.describe() for the processedUpToMutation timestamp

Problem: "returnMetadata must be 'all', 'indexed', or 'none'"
Solution: V2 changed returnMetadata from boolean to string enum:
- ❌ V1: { returnMetadata: true }
- ✅ V2: { returnMetadata: 'all' }

Error: wrangler vectorize list --json output starts with a log message, breaking JSON parsing
Source: GitHub Issue #11011
Affected Commands:
- wrangler vectorize list --json
- wrangler vectorize list-metadata-index --json

Problem:
$ wrangler vectorize list --json
📋 Listing Vectorize indexes...
[
{ "created_on": "2025-10-18T13:28:30.259277Z", ... }
]

The log message makes the output invalid JSON, breaking piping to jq or other tools.
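If a script consumes this output from JavaScript rather than the shell, one way to tolerate leading log lines is to scan for the start of the JSON payload instead of stripping a fixed number of lines. The function name is illustrative, and this assumes no log line contains a stray bracket before the payload:

```typescript
// Extract and parse the JSON payload from wrangler output that may be
// prefixed by human-readable log lines.
function extractJson(output: string): unknown {
  const start = output.search(/[\[{]/); // first '[' or '{' in the output
  if (start === -1) throw new Error("no JSON found in wrangler output");
  return JSON.parse(output.slice(start));
}
```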
Solution: Strip first line before parsing:
# Using tail
wrangler vectorize list --json | tail -n +2 | jq '.'
# Using sed
wrangler vectorize list --json | sed '1d' | jq '.'

Error: wrangler types generates incomplete VectorizeVectorMetadataFilterOp type
Source: GitHub Issue #10092
Status: OPEN (tracked internally as VS-461)
Problem:
Generated type only includes $eq and $ne, missing V2 operators: $in, $nin, $lt, $lte, $gt, $gte
Impact: TypeScript shows false errors when using valid V2 metadata filter operators:
const vectorizeRes = env.VECTORIZE.queryById(imgId, {
filter: { gender: { $in: genderFilters } }, // ❌ TS error but works!
topK,
returnMetadata: 'indexed',
});

Workaround: Manual type override until wrangler types is fixed:
// Add to your types file
type VectorizeMetadataFilter = Record<string,
| string
| number
| boolean
| {
$eq?: string | number | boolean;
$ne?: string | number | boolean;
$in?: (string | number | boolean)[];
$nin?: (string | number | boolean)[];
$lt?: number | string;
$lte?: number | string;
$gt?: number | string;
$gte?: number | string;
}
>;

Error: ENOENT: no such file or directory when running wrangler dev on Windows
Source: GitHub Issue #10383
Status: FIXED in wrangler@4.32.0
Problem: Wrangler attempted to create external worker files with colons in the name (invalid on Windows):
Error: ENOENT: ... '__WRANGLER_EXTERNAL_VECTORIZE_WORKER:<project>:<binding>'

Solution: Update to wrangler@4.32.0 or later:
npm install -g wrangler@latest

Error: topK exceeds maximum allowed value
Source: Vectorize Limits
Problem: Maximum topK value changes based on query options:
| Configuration | Max topK |
|---|---|
| returnValues: false, returnMetadata: 'none' | 100 |
| returnValues: true OR returnMetadata: 'all' | 20 |
| returnMetadata: 'indexed' | 100 |
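The conditional limits in the table above can be encoded as a small guard. This helper is illustrative, not part of the Vectorize API:

```typescript
// Return the maximum topK allowed for a given combination of query options,
// per the conditional limits documented for Vectorize V2.
function maxTopK(opts: { returnValues?: boolean; returnMetadata?: 'all' | 'indexed' | 'none' }): number {
  // Returning raw vector values or full metadata caps topK at 20;
  // otherwise up to 100 results are allowed.
  return opts.returnValues || opts.returnMetadata === 'all' ? 20 : 100;
}
```

Usage sketch: clamp before querying, e.g. `topK: Math.min(requested, maxTopK(opts))`.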
Common Error:
// ❌ ERROR - topK too high with returnValues
query(embedding, {
topK: 100, // Exceeds limit!
returnValues: true // Max topK=20 when true
});

Solution:
// ✅ OK - respects conditional limit
query(embedding, {
topK: 20,
returnValues: true
});
// ✅ OK - higher topK without values
query(embedding, {
topK: 100,
returnValues: false,
returnMetadata: 'indexed'
});

If migrating from V1 to V2:
- Update wrangler (npm install -g wrangler@latest)
- Change returnMetadata boolean → string enum ('all', 'indexed', 'none')
- Handle async mutations (mutationId in responses)

V1 Deprecation:
- Use wrangler vectorize --deprecated-v1 for V1 operations

Issue: Using @cloudflare/vitest-pool-workers with Vectorize or Workers AI bindings causes a runtime failure.
Source: GitHub Issue #7434
Error: wrapped binding module can't be resolved
Workaround:
- Create a separate wrangler-test.jsonc without Vectorize/AI bindings

Example:
// wrangler-test.jsonc (no Vectorize binding)
{
"name": "my-worker-test",
"main": "src/index.ts",
"compatibility_date": "2025-10-21"
// No vectorize binding
}
// vitest.config.ts
import { defineWorkersProject } from '@cloudflare/vitest-pool-workers/config';
export default defineWorkersProject({
test: {
poolOptions: {
workers: {
wrangler: {
configPath: "./wrangler-test.jsonc"
}
}
}
}
});
// Mock in tests
import { vi } from 'vitest';
const mockVectorize = {
query: vi.fn().mockResolvedValue({
matches: [
{ id: 'test-1', score: 0.95, metadata: { category: 'docs' } }
],
count: 1
}),
insert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }),
upsert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" })
};
// Use mock in tests
test('vector search', async () => {
const env = { VECTORIZE_INDEX: mockVectorize };
// ... test logic
});

Note: These tips come from community discussions and official blog posts. Verify against your Vectorize version.
Source: Query Best Practices Confidence: MEDIUM Applies to: Datasets with ~10M+ vectors
Range queries ($lt, $lte, $gt, $gte) on large datasets may experience reduced accuracy.
Optimization Strategy:
// ❌ High-cardinality range at scale
metadata: {
timestamp_ms: 1704067200123
}
filter: { timestamp_ms: { $gte: 1704067200000 } }
// ✅ Bucketed into discrete values
metadata: {
timestamp_bucket: "2025-01-01-00:00", // 1-hour buckets
timestamp_ms: 1704067200123 // Original (non-indexed)
}
filter: {
timestamp_bucket: {
$in: ["2025-01-01-00:00", "2025-01-01-01:00"]
}
}

When This Matters:
Alternative: Use equality filters ($eq, $in) with bucketed values.
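The 1-hour bucketing shown above can be derived with a small helper. The UTC bucket format ("YYYY-MM-DD-HH:00") is an assumption matching the example metadata, not a Vectorize requirement:

```typescript
// Map a millisecond timestamp to a discrete 1-hour UTC bucket string,
// suitable for $eq/$in metadata filters instead of high-cardinality ranges.
function hourBucket(timestampMs: number): string {
  const d = new Date(timestampMs);
  const pad = (n: number) => String(n).padStart(2, "0");
  return `${d.getUTCFullYear()}-${pad(d.getUTCMonth() + 1)}-${pad(d.getUTCDate())}-${pad(d.getUTCHours())}:00`;
}
```

Store the bucket at insert time (e.g. `metadata: { timestamp_bucket: hourBucket(Date.now()) }`) and filter with `$in` over the buckets you need.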
Source: Vectorize Changelog
Vectorize V2 added support for the list-vectors operation for paginated iteration through vector IDs.
Use Cases:
API:
const result = await env.VECTORIZE_INDEX.list({
limit: 1000, // Max 1000 per page
cursor // Optional: string cursor from the previous page
});
// result.vectors: Array<{ id: string }>
// result.cursor: string | undefined
// result.count: number
// Pagination example
let cursor: string | undefined;
const allVectorIds: string[] = [];
do {
const result = await env.VECTORIZE_INDEX.list({
limit: 1000,
cursor
});
allVectorIds.push(...result.vectors.map(v => v.id));
cursor = result.cursor;
} while (cursor);

Limitations:
Status: Production Ready ✅ (Vectorize V2 GA - September 2024) Last Updated: 2026-01-21 Token Savings: ~70% Errors Prevented: 14 (includes V2 breaking changes, testing setup, TypeScript types) Changes: Added 4 new errors (wrangler --json, TypeScript types, Windows dev, topK limits), batch performance best practices, query accuracy modes, testing setup, community tips on range queries and list-vectors operation.