# Search Functionality

Client-side search index generation for fast documentation search in rendered HTML output. It provides full-text search of docstrings, signatures, and identifiers without requiring a server backend.

## Capabilities

### Search Index Generation

Create searchable indices from documentation objects for client-side search.

```python { .api }
def make_index(all_modules: dict[str, doc.Module]) -> dict:
    """
    Build search index data structure from documentation modules.

    Parameters:
    - all_modules: dict[str, doc.Module] - All modules to include in search

    Returns:
    - dict: Search index data structure containing:
      - 'identifiers': List of all searchable identifiers
      - 'docstrings': Full-text search data for docstrings
      - 'signatures': Function and method signatures
      - 'modules': Module hierarchy information

    Features:
    - Full-text indexing of docstrings and comments
    - Identifier-based search with fuzzy matching
    - Hierarchical module and class structure
    - Type annotation inclusion
    """

def precompile_index(all_modules: dict[str, doc.Module]) -> dict:
    """
    Create precompiled search index for optimized client-side performance.

    Parameters:
    - all_modules: dict[str, doc.Module] - Modules to index

    Returns:
    - dict: Precompiled search index with optimized data structures

    Features:
    - Compressed search data for faster loading
    - Pre-computed search rankings
    - Optimized for JavaScript consumption
    - Includes search metadata and configuration
    """
```
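
To make the documented return shape more concrete, here is a minimal sketch that builds a dictionary with the same four top-level keys from ordinary Python objects. It does not call pdoc. `build_toy_index`, the `greet` function, and the per-entry fields are hypothetical names chosen for illustration, and the real index may be organized differently.

```python
import inspect
import json


def build_toy_index(objects: dict[str, object]) -> dict:
    """Build a small index with the four documented top-level keys.

    `objects` maps a qualified name to a live Python object. This is a
    hypothetical stand-in for documentation modules, not pdoc's own logic.
    """
    index = {"identifiers": [], "docstrings": {}, "signatures": {}, "modules": []}
    for qualname, obj in objects.items():
        module_name = qualname.rpartition(".")[0]
        kind = "class" if inspect.isclass(obj) else "function" if callable(obj) else "variable"
        index["identifiers"].append({"name": qualname, "type": kind})
        index["docstrings"][qualname] = inspect.getdoc(obj) or ""
        if callable(obj):
            try:
                index["signatures"][qualname] = str(inspect.signature(obj))
            except (TypeError, ValueError):
                # Some builtins expose no introspectable signature.
                pass
        if module_name and module_name not in index["modules"]:
            index["modules"].append(module_name)
    return index


def greet(name: str, punctuation: str = "!") -> str:
    """Return a greeting for `name`."""
    return f"Hello, {name}{punctuation}"


toy_index = build_toy_index({"demo.greet": greet})
print(json.dumps(toy_index, indent=2))
```

Keeping the identifier list separate from the docstring and signature maps lets a client match names first and only then look up the heavier text fields.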

### JavaScript Search Index

Generate JavaScript code containing the search index for HTML documentation.

```python { .api }
def search_index(all_modules: dict[str, doc.Module]) -> str:
    """
    Generate JavaScript search index for client-side search.

    Parameters:
    - all_modules: dict[str, doc.Module] - All modules to make searchable

    Returns:
    - str: JavaScript code defining search index and search functions

    Features:
    - Self-contained JavaScript search engine
    - No external dependencies required
    - Fuzzy matching and ranking algorithms
    - Autocomplete and suggestion support
    """
```
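
One common way to hand an index to the browser is to serialize it as JSON and wrap it in a JavaScript declaration. The sketch below illustrates that general technique only. `render_index_js` and the `searchIndex` variable name are assumptions, not the output format of `search_index`.

```python
import json


def render_index_js(index: dict, variable_name: str = "searchIndex") -> str:
    """Serialize `index` as a JavaScript constant declaration.

    Hypothetical helper: the generated search script may use another layout.
    """
    # Compact separators keep the payload small for faster page loads.
    payload = json.dumps(index, separators=(",", ":"), ensure_ascii=True)
    # Escape "</" so the payload cannot close an inline <script> element.
    payload = payload.replace("</", "<\\/")
    return f"const {variable_name} = {payload};\n"


js_source = render_index_js(
    {"identifiers": [{"name": "demo.greet", "type": "function"}]}
)
print(js_source)
```

Because JSON is almost entirely valid JavaScript literal syntax, the browser can parse the declaration directly with no extra decoding step.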

## Usage Examples

### Basic Search Index Creation

```python
from pdoc.search import make_index, search_index
from pdoc.doc import Module

# Load multiple modules for comprehensive search
modules = {
    "core": Module.from_name("my_package.core"),
    "utils": Module.from_name("my_package.utils"),
    "plugins": Module.from_name("my_package.plugins")
}

# Create search index data structure
index_data = make_index(modules)

print("Search index contents:")
print(f"Identifiers: {len(index_data['identifiers'])}")
print(f"Modules: {len(index_data['modules'])}")

# Generate JavaScript for HTML documentation
js_search = search_index(modules)
print(f"JavaScript search code: {len(js_search)} characters")
```

### Comprehensive Documentation with Search

```python
from pdoc import render, doc
from pdoc.search import search_index
from pathlib import Path

def generate_searchable_docs(module_names: list[str], output_dir: str):
    """Generate complete documentation with search functionality"""

    # Load all modules
    all_modules = {}
    for name in module_names:
        all_modules[name] = doc.Module.from_name(name)

    # Configure rendering
    render.configure(
        docformat="google",
        show_source=True,
        math=True
    )

    # Create the output directory, including any missing parents
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    # Generate HTML for each module
    for module_name, module_obj in all_modules.items():
        html = render.html_module(module_obj, all_modules)
        (output_path / f"{module_name}.html").write_text(html)

    # Generate index page
    index_html = render.html_index(all_modules)
    (output_path / "index.html").write_text(index_html)

    # Generate search functionality
    search_js = search_index(all_modules)
    (output_path / "search.js").write_text(search_js)

    print(f"Documentation with search generated in {output_dir}")
    print(f"Modules: {', '.join(module_names)}")
    print(f"Search index size: {len(search_js)} characters")

# Usage
generate_searchable_docs(
    ["my_package", "my_package.core", "my_package.utils"],
    "./docs"
)
```

### Custom Search Index Processing

```python
from pdoc.search import make_index
from pdoc.doc import Module
import json
import time

def create_custom_search_index(modules: dict[str, Module]) -> dict:
    """Create custom search index with additional metadata"""

    # Generate base index
    base_index = make_index(modules)

    # Add custom search metadata
    custom_index = {
        **base_index,
        "custom_metadata": {
            "generation_time": time.time(),
            "module_count": len(modules),
            "total_identifiers": len(base_index["identifiers"])
        }
    }

    # Add category-based grouping
    categories = {
        "functions": [],
        "classes": [],
        "variables": [],
        "modules": []
    }

    for identifier in base_index["identifiers"]:
        if identifier["type"] == "function":
            categories["functions"].append(identifier)
        elif identifier["type"] == "class":
            categories["classes"].append(identifier)
        elif identifier["type"] == "variable":
            categories["variables"].append(identifier)
        elif identifier["type"] == "module":
            categories["modules"].append(identifier)

    custom_index["categories"] = categories

    return custom_index

# Create enhanced search index
modules = {"math": Module.from_name("math")}
enhanced_index = create_custom_search_index(modules)

# Save to JSON file for analysis
with open("search_index.json", "w") as f:
    json.dump(enhanced_index, f, indent=2)
```

### Search Performance Analysis

```python
from pdoc.search import make_index, precompile_index
from pdoc.doc import Module
import time

def benchmark_search_generation(module_names: list[str]):
    """Benchmark search index generation performance"""

    # Load modules
    print("Loading modules...")
    start_time = time.perf_counter()

    modules = {}
    for name in module_names:
        modules[name] = Module.from_name(name)

    load_time = time.perf_counter() - start_time
    print(f"Module loading: {load_time:.2f}s")

    # Generate standard index
    print("Generating standard search index...")
    start_time = time.perf_counter()

    standard_index = make_index(modules)

    standard_time = time.perf_counter() - start_time
    print(f"Standard index: {standard_time:.2f}s")

    # Generate precompiled index
    print("Generating precompiled search index...")
    start_time = time.perf_counter()

    precompiled_index = precompile_index(modules)

    precompiled_time = time.perf_counter() - start_time
    print(f"Precompiled index: {precompiled_time:.2f}s")

    # Compare sizes (length of the string representation is a rough proxy)
    standard_size = len(str(standard_index))
    precompiled_size = len(str(precompiled_index))

    print("\nIndex comparison:")
    print(f"Standard size: {standard_size:,} characters")
    print(f"Precompiled size: {precompiled_size:,} characters")
    print(f"Size ratio: {precompiled_size/standard_size:.2f}x")

# Benchmark with various module sets
benchmark_search_generation(["json", "urllib", "pathlib"])
```

### Advanced Search Features

```python
from pdoc.search import make_index
from pdoc.doc import Module
import re

class AdvancedSearchIndex:
    """Advanced search index with custom features"""

    def __init__(self, modules: dict[str, Module]):
        self.base_index = make_index(modules)
        self.modules = modules
        self._build_advanced_features()

    def _build_advanced_features(self):
        """Build advanced search features"""
        self.tag_index = self._build_tag_index()
        self.similarity_map = self._build_similarity_map()
        self.usage_patterns = self._extract_usage_patterns()

    def _build_tag_index(self) -> dict:
        """Build tag-based search index"""
        tags = {}

        for identifier in self.base_index["identifiers"]:
            # Extract tags such as "@deprecated" from docstrings
            docstring = identifier.get("docstring", "")
            found_tags = re.findall(r'@(\w+)', docstring)

            for tag in found_tags:
                if tag not in tags:
                    tags[tag] = []
                tags[tag].append(identifier["name"])

        return tags

    def _build_similarity_map(self) -> dict:
        """Build identifier similarity mapping"""
        similarity_map = {}
        # Use "entry" rather than "id" to avoid shadowing the builtin
        identifiers = [entry["name"] for entry in self.base_index["identifiers"]]

        for identifier in identifiers:
            similar = []
            for other in identifiers:
                if identifier != other:
                    # Simple similarity based on common prefixes/suffixes
                    if (identifier.startswith(other[:3]) or
                            identifier.endswith(other[-3:]) or
                            other.startswith(identifier[:3]) or
                            other.endswith(identifier[-3:])):
                        similar.append(other)

            if similar:
                similarity_map[identifier] = similar[:5]  # First 5 similar names

        return similarity_map

    def _extract_usage_patterns(self) -> dict:
        """Extract common usage patterns from docstrings"""
        patterns = {
            "common_imports": [],
            "typical_usage": [],
            "error_handling": []
        }

        for identifier in self.base_index["identifiers"]:
            docstring = identifier.get("docstring", "")

            # Find import patterns
            import_matches = re.findall(r'import\s+[\w.]+', docstring)
            patterns["common_imports"].extend(import_matches)

            # Find usage examples
            if "example" in docstring.lower():
                patterns["typical_usage"].append(identifier["name"])

            # Find error handling mentions
            if any(word in docstring.lower() for word in ["raise", "except", "error"]):
                patterns["error_handling"].append(identifier["name"])

        return patterns

    def search(self, query: str, search_type: str = "all") -> list:
        """Perform advanced search with multiple strategies"""
        results = []

        if search_type in ["all", "identifier"]:
            # Standard identifier search
            for identifier in self.base_index["identifiers"]:
                if query.lower() in identifier["name"].lower():
                    results.append({
                        "type": "identifier",
                        "match": identifier,
                        "score": self._calculate_score(query, identifier["name"])
                    })

        if search_type in ["all", "tag"]:
            # Tag-based search
            for tag, identifiers in self.tag_index.items():
                if query.lower() in tag.lower():
                    for identifier_name in identifiers:
                        results.append({
                            "type": "tag",
                            "match": {"name": identifier_name, "tag": tag},
                            "score": 1.0
                        })

        # Sort by score and return
        results.sort(key=lambda x: x["score"], reverse=True)
        return results[:20]  # Top 20 results

    def _calculate_score(self, query: str, target: str) -> float:
        """Calculate search relevance score"""
        query_lower = query.lower()
        target_lower = target.lower()

        if query_lower == target_lower:
            return 1.0
        elif target_lower.startswith(query_lower):
            return 0.9
        elif query_lower in target_lower:
            return 0.7
        else:
            return 0.1

# Usage
modules = {"json": Module.from_name("json")}
advanced_search = AdvancedSearchIndex(modules)

# Perform searches
results = advanced_search.search("load")
for result in results[:5]:
    print(f"Type: {result['type']}, Match: {result['match']['name']}, Score: {result['score']}")
```

## Search Index Structure

The generated search index contains the following data:

### Identifiers Index
- **Name**: Fully qualified name of the identifier
- **Type**: Object type (function, class, variable, module)
- **Module**: Parent module name
- **Docstring**: First paragraph of docstring
- **Signature**: Function/method signature
- **Line Number**: Source code line number
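
The fields listed above map naturally onto a small record type. The following sketch models one identifier entry as a dataclass and shows one way to extract the first paragraph of a docstring. `IdentifierEntry`, `first_paragraph`, and the sample values are hypothetical and only mirror the field list, not the actual serialized format.

```python
import re
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentifierEntry:
    """One record in the identifiers index (hypothetical layout)."""
    name: str                          # fully qualified name
    type: str                          # "function", "class", "variable" or "module"
    module: str                        # parent module name
    docstring: str = ""                # first paragraph only
    signature: Optional[str] = None    # only for functions and methods
    line_number: Optional[int] = None  # source line, if known


def first_paragraph(docstring: str) -> str:
    """Return the text before the first blank line, collapsed to one line."""
    paragraph = re.split(r"\n\s*\n", docstring.strip(), maxsplit=1)[0]
    return " ".join(paragraph.split())


raw_doc = """Load a configuration file
    from disk.

    Raises FileNotFoundError if the path does not exist.
    """

entry = IdentifierEntry(
    name="my_package.core.load",
    type="function",
    module="my_package.core",
    docstring=first_paragraph(raw_doc),
    signature="(path: str) -> dict",
    line_number=42,
)
print(asdict(entry))
```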

### Full-Text Index
- **Content**: Complete docstring text
- **Keywords**: Extracted keywords and tags
- **Code Examples**: Embedded code snippets
- **Cross-References**: Links to related identifiers
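
Full-text search over docstrings is typically backed by an inverted index that maps each token to the entries containing it. The sketch below shows that general technique on two sample docstrings. The tokenizer, `build_inverted_index`, and `lookup` are illustrative assumptions rather than the indexing scheme the generated search actually uses.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z0-9_]+")


def tokenize(text: str) -> list[str]:
    """Lowercase `text` and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each token to the set of identifier names whose text contains it."""
    inverted: dict[str, set[str]] = defaultdict(set)
    for name, text in docs.items():
        for token in tokenize(text):
            inverted[token].add(name)
    return dict(inverted)


def lookup(inverted: dict[str, set[str]], query: str) -> set[str]:
    """Return the names whose text contains every token of `query`."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(inverted.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= inverted.get(token, set())
    return result


docs = {
    "json.load": "Deserialize a file-like object to a Python object.",
    "json.dumps": "Serialize obj to a JSON formatted str.",
}
inverted = build_inverted_index(docs)
print(lookup(inverted, "python object"))
```

Intersecting the per-token sets gives AND semantics; a union would give OR semantics at the cost of noisier results.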

### Module Hierarchy
- **Module Tree**: Nested module structure
- **Package Information**: Package metadata
- **Import Paths**: Available import statements
- **Dependencies**: Module dependency graph
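
A nested module tree can be derived from dotted module names alone. This sketch shows one way to do that with nested dictionaries. `build_module_tree` is a hypothetical helper, and the real hierarchy data may carry more information per node.

```python
def build_module_tree(module_names: list[str]) -> dict:
    """Turn dotted module names into a nested dict keyed by name component."""
    tree: dict = {}
    for name in sorted(module_names):
        node = tree
        for part in name.split("."):
            # Descend into the child node, creating it on first use.
            node = node.setdefault(part, {})
    return tree


tree = build_module_tree([
    "my_package",
    "my_package.core",
    "my_package.utils",
    "my_package.utils.text",
])
print(tree)
```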

## Client-Side Search Features

The generated JavaScript search provides:

- **Instant Search**: Real-time results as you type
- **Fuzzy Matching**: Handles typos and partial matches
- **Autocomplete**: Suggests completions for partial queries
- **Category Filtering**: Filter by object type (function, class, etc.)
- **Keyboard Navigation**: Full keyboard accessibility
- **Result Ranking**: Relevance-based result ordering
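
To illustrate how fuzzy matching and relevance ranking can work together, here is a small Python sketch. It swaps in `difflib.SequenceMatcher` from the standard library as the similarity measure, which is an assumption made for illustration; the generated JavaScript may rank results quite differently. The `score` and `search` functions and their thresholds are hypothetical.

```python
from difflib import SequenceMatcher


def score(query: str, name: str) -> float:
    """Score `name` against `query` in the range 0.0 to 1.0."""
    q = query.lower()
    if not q:
        return 0.0
    full = name.lower()
    short = full.rpartition(".")[2]  # last dotted component
    if q == short:
        return 1.0
    if short.startswith(q):
        return 0.9
    if q in full:
        return 0.7
    # Fuzzy fallback: similarity ratio, scaled below the substring tier.
    return 0.6 * SequenceMatcher(None, q, short).ratio()


def search(query: str, names: list[str], limit: int = 5,
           threshold: float = 0.3) -> list[str]:
    """Return up to `limit` names, best score first, ties broken by name."""
    scored = ((score(query, name), name) for name in names)
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in kept[:limit]]


names = ["json.load", "json.loads", "json.dump", "json.dumps", "json.JSONDecoder"]
print(search("load", names))  # exact and prefix matches
print(search("laod", names))  # a typo still finds the load functions
```

Tiered scores keep exact and prefix matches above fuzzy ones, so a typo-tolerant fallback never outranks a literal hit.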