# Response Synthesis

Response generation strategies for combining retrieved context into coherent answers, covering several summarization approaches and synthesis modes.

## Capabilities

### Response Synthesizer Factory

Factory function for creating response synthesizers with different strategies and configurations.

```python { .api }
def get_response_synthesizer(
    response_mode="compact",
    service_context=None,
    text_qa_template=None,
    refine_template=None,
    summary_template=None,
    simple_template=None,
    use_async=False,
    streaming=False,
    structured_answer_filtering=False,
    **kwargs
):
    """
    Create a response synthesizer with the specified mode and configuration.

    Args:
        response_mode: Synthesis strategy ("compact", "refine", "tree_summarize",
            "simple_summarize", "accumulate", "generation")
        service_context: Service context (deprecated; use Settings instead)
        text_qa_template: Template for question answering
        refine_template: Template for iterative refinement
        summary_template: Template for summarization
        simple_template: Template for simple responses
        use_async: Enable asynchronous processing
        streaming: Enable streaming responses
        structured_answer_filtering: Filter responses for structured output

    Returns:
        BaseSynthesizer: Configured response synthesizer
    """
```

**Usage Example:**

```python
from llama_index.core import get_response_synthesizer

# Compact mode (default) - combines chunks efficiently
synthesizer = get_response_synthesizer(
    response_mode="compact",
    streaming=True
)

# Tree summarize mode - hierarchical summarization
tree_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
    use_async=True
)

# Refine mode - iterative improvement
refine_synthesizer = get_response_synthesizer(
    response_mode="refine",
    structured_answer_filtering=True
)

# Use with a query engine
query_engine = index.as_query_engine(
    response_synthesizer=synthesizer
)
```

### Compact Response Synthesis

Efficient synthesis mode that packs retrieved chunks into larger contexts to minimize LLM calls.

```python { .api }
class CompactAndRefine:
    """
    Compact-and-refine synthesis strategy.

    Combines chunks into larger contexts, then applies refinement for the final answer.

    Args:
        text_qa_template: Template for initial question answering
        refine_template: Template for iterative refinement
        max_prompt_size: Maximum prompt size in tokens
        callback_manager: Callback manager for events
        use_async: Enable asynchronous processing
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        text_qa_template=None,
        refine_template=None,
        max_prompt_size=None,
        callback_manager=None,
        use_async=False,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(
        self,
        query,
        nodes,
        additional_source_nodes=None,
        **kwargs
    ):
        """
        Synthesize a response from the query and retrieved nodes.

        Args:
            query: User query or QueryBundle
            nodes: List of retrieved NodeWithScore objects
            additional_source_nodes: Extra context nodes

        Returns:
            Response: Synthesized response with sources
        """

    async def asynthesize(self, query, nodes, **kwargs):
        """Async version of synthesize."""
```
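
The packing step behind compact mode can be illustrated with a short, self-contained sketch. This is not the library's implementation: the helper name and the character budget are illustrative assumptions (the real synthesizer budgets by tokens against the model's context window), but the greedy merge idea is the same.

```python
# Illustrative sketch only: "compact" mode greedily merges chunk texts into
# the fewest contexts that fit a size budget, so fewer LLM calls are needed
# before the refine step.
def compact_chunks(texts, max_chars=1000):
    contexts, current = [], ""
    for text in texts:
        candidate = f"{current}\n\n{text}" if current else text
        if len(candidate) <= max_chars:
            current = candidate  # still fits: keep packing
        else:
            if current:
                contexts.append(current)  # flush the full context
            current = text
    if current:
        contexts.append(current)
    return contexts

packed = compact_chunks(["a" * 400, "b" * 400, "c" * 400], max_chars=1000)
print(len(packed))  # 2 contexts instead of 3 separate calls
```

With three 400-character chunks and a 1000-character budget, the first two chunks share one context and the third gets its own, halving the number of question-answering calls.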

### Tree Summarization

Hierarchical summarization strategy that builds responses bottom-up through a tree structure.

```python { .api }
class TreeSummarize:
    """
    Tree-based summarization synthesis.

    Recursively summarizes chunks in a tree structure to produce comprehensive responses.

    Args:
        summary_template: Template for summarization steps
        text_qa_template: Template for final question answering
        use_async: Enable asynchronous processing
        callback_manager: Callback manager for events
    """
    def __init__(
        self,
        summary_template=None,
        text_qa_template=None,
        use_async=False,
        callback_manager=None,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Tree-based synthesis of the response."""

    async def asynthesize(self, query, nodes, **kwargs):
        """Async tree synthesis."""
```

**Usage Example:**

```python
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.prompts import PromptTemplate

# Custom summarization template
summary_template = PromptTemplate(
    "Context information is below:\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Summarize the key points relevant to: {query_str}\n"
    "Summary: "
)

tree_synthesizer = TreeSummarize(
    summary_template=summary_template,
    use_async=True
)

# Use with a query engine
query_engine = index.as_query_engine(
    response_synthesizer=tree_synthesizer,
    similarity_top_k=10  # More chunks for tree processing
)

response = query_engine.query("What are the main themes in the documents?")
```
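
The bottom-up recursion can be sketched in a few lines. This is an illustrative sketch, not the library's code: the stub `llm` callable and the fixed fan-in of 2 are assumptions (the real fan-in is determined by how many summaries fit in the context window).

```python
# Illustrative sketch only: tree_summarize summarizes groups of chunks, then
# summarizes the summaries, repeating until a single answer remains.
def tree_summarize_sketch(llm, query, texts, fan_in=2):
    if len(texts) == 1:
        return texts[0]
    next_level = [
        llm(f"Summarize for '{query}':\n" + "\n".join(texts[i:i + fan_in]))
        for i in range(0, len(texts), fan_in)
    ]
    return tree_summarize_sketch(llm, query, next_level, fan_in)

calls = []
def fake_llm(prompt):
    calls.append(prompt)
    return f"summary#{len(calls)}"

result = tree_summarize_sketch(fake_llm, "main themes", ["c1", "c2", "c3", "c4"])
print(result, len(calls))  # summary#3 after 3 calls: two leaf summaries, one root
```

Because each level's calls are independent, they parallelize well, which is why `use_async=True` pays off for this mode.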

### Iterative Refinement

Refine synthesis strategy that iteratively improves a response using additional context.

```python { .api }
class Refine:
    """
    Iterative refinement synthesis strategy.

    Starts with an initial response and refines it using additional retrieved chunks.

    Args:
        text_qa_template: Template for initial response
        refine_template: Template for refinement steps
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        text_qa_template=None,
        refine_template=None,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Iteratively refine the response using retrieved nodes."""

    async def asynthesize(self, query, nodes, **kwargs):
        """Async iterative refinement."""
```

**Usage Example:**

```python
from llama_index.core.response_synthesizers import Refine
from llama_index.core.prompts import PromptTemplate

# Custom refinement template
refine_template = PromptTemplate(
    "The original query is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better "
    "answer the query. If the context isn't useful, return the original answer.\n"
    "Refined Answer: "
)

refine_synthesizer = Refine(
    refine_template=refine_template,
    streaming=True
)

query_engine = index.as_query_engine(
    response_synthesizer=refine_synthesizer
)
```
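
The control flow behind the templates above can be sketched directly. This is an illustrative sketch, not the library's implementation; the stub `llm` callable and prompt wording are assumptions.

```python
# Illustrative sketch only: refine answers from the first chunk, then revises
# the answer once per remaining chunk, feeding each prior answer back in.
def refine_sketch(llm, query, texts):
    answer = llm(f"Context: {texts[0]}\nQuestion: {query}\nAnswer:")
    for text in texts[1:]:
        answer = llm(
            f"Query: {query}\nExisting answer: {answer}\n"
            f"New context: {text}\nRefined answer:"
        )
    return answer

calls = []
def fake_llm(prompt):
    calls.append(prompt)
    return f"draft#{len(calls)}"

final = refine_sketch(fake_llm, "policy?", ["chunk A", "chunk B", "chunk C"])
print(final, len(calls))  # draft#3 after 3 sequential calls
```

Note the calls are inherently sequential (each depends on the previous answer), so refine cannot be parallelized the way tree or accumulate modes can.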

### Simple Summarization

Direct summarization strategy for straightforward responses without complex processing.

```python { .api }
class SimpleSummarize:
    """
    Simple summarization synthesis.

    Directly summarizes all retrieved context in a single step.

    Args:
        text_qa_template: Template for question answering
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        text_qa_template=None,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Simple one-step summarization."""
```
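
The "single step" can be made concrete with a sketch. This is illustrative only (the stub `llm` and prompt wording are assumptions): all retrieved text is concatenated and sent in exactly one LLM call, which is fast but bounded by the model's context window.

```python
# Illustrative sketch only: simple_summarize concatenates every chunk and
# makes exactly one LLM call, regardless of how many chunks were retrieved.
def simple_summarize_sketch(llm, query, texts):
    context = "\n\n".join(texts)
    return llm(f"Context information is below:\n{context}\n\nQuery: {query}\nAnswer:")

calls = []
def fake_llm(prompt):
    calls.append(prompt)
    return "single-shot answer"

answer = simple_summarize_sketch(fake_llm, "overview?", ["alpha", "beta", "gamma"])
print(answer, len(calls))  # one call regardless of chunk count
```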

### Accumulate Responses

Accumulation strategy that concatenates individual responses from each retrieved chunk.

```python { .api }
class Accumulate:
    """
    Accumulate synthesis strategy.

    Generates individual responses for each chunk and concatenates them.

    Args:
        text_qa_template: Template for individual chunk responses
        output_cls: Structured output class
        callback_manager: Callback manager for events
        use_async: Enable asynchronous processing
    """
    def __init__(
        self,
        text_qa_template=None,
        output_cls=None,
        callback_manager=None,
        use_async=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Accumulate responses from individual chunks."""
```

**Usage Example:**

```python
from llama_index.core.response_synthesizers import Accumulate

accumulate_synthesizer = Accumulate(
    use_async=True  # Process chunks in parallel
)

# Good for gathering diverse perspectives
query_engine = index.as_query_engine(
    response_synthesizer=accumulate_synthesizer,
    similarity_top_k=5
)

response = query_engine.query("What are different opinions on this topic?")
print(response.response)  # Contains accumulated individual responses
```
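
The fan-out pattern can be sketched in two lines. This is illustrative only (the stub `llm` and the separator string are assumptions): each chunk is answered independently and the answers are joined, which is why the per-chunk calls parallelize cleanly under `use_async=True`.

```python
# Illustrative sketch only: accumulate answers each chunk independently,
# then joins the individual answers into one response string.
def accumulate_sketch(llm, query, texts, separator="\n---------------------\n"):
    answers = [llm(f"Context: {t}\nQuestion: {query}\nAnswer:") for t in texts]
    return separator.join(answers)

responses = accumulate_sketch(
    lambda prompt: f"answer to <{prompt.splitlines()[0]}>",
    "opinions?",
    ["source A", "source B"],
)
print(responses.count("answer to"))  # 2 independent answers in one string
```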

### Generation Strategy

Direct generation strategy that creates responses without using retrieved context.

```python { .api }
class Generation:
    """
    Generation synthesis strategy.

    Generates responses directly from the query without using retrieved context.

    Args:
        simple_template: Template for direct generation
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        simple_template=None,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Generate a response directly from the query."""
```
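
What "without using retrieved context" means in practice: the nodes are accepted for interface compatibility but never reach the prompt. A minimal sketch (stub `llm` is an assumption, not the library's API) makes this useful as a no-retrieval baseline when evaluating a RAG pipeline:

```python
# Illustrative sketch only: generation mode prompts the LLM with the query
# alone; retrieved nodes are accepted but ignored.
def generation_sketch(llm, query, nodes=None):
    # `nodes` is kept for interface compatibility but never used
    return llm(f"Query: {query}\nAnswer:")

answer = generation_sketch(lambda prompt: f"echo:{prompt}", "What is RAG?", nodes=["ignored"])
print("ignored" in answer)  # False: the retrieved context never reaches the prompt
```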

### Base Synthesizer Interface

Base class for implementing custom response synthesis strategies.

```python { .api }
class BaseSynthesizer:
    """
    Base class for response synthesizers.

    Args:
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(
        self,
        query,
        nodes,
        additional_source_nodes=None,
        **kwargs
    ):
        """
        Synthesize a response from the query and nodes.

        Args:
            query: User query string or QueryBundle
            nodes: List of NodeWithScore objects from retrieval
            additional_source_nodes: Extra source nodes for context

        Returns:
            Response: Generated response with metadata
        """

    async def asynthesize(self, query, nodes, **kwargs):
        """Async version of the synthesize method."""

    def get_prompts(self):
        """Get the prompt templates used by the synthesizer."""

    def update_prompts(self, prompts_dict):
        """Update prompt templates."""
```

### Structured Output Synthesis

Advanced synthesis with structured output generation for extracting information in specific formats.

```python { .api }
class StructuredResponseSynthesizer(BaseSynthesizer):
    """
    Structured response synthesizer for typed outputs.

    Args:
        output_cls: Pydantic model class for structured output
        llm: Language model for generation
        text_qa_template: Template for question answering
        streaming: Enable streaming (limited for structured output)
    """
    def __init__(
        self,
        output_cls,
        llm=None,
        text_qa_template=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Generate a structured response matching the output_cls schema."""
```

**Structured Output Example:**

```python
from pydantic import BaseModel
from typing import List
from llama_index.core.response_synthesizers import get_response_synthesizer

class SummaryOutput(BaseModel):
    main_points: List[str]
    sentiment: str
    confidence_score: float

# Create structured synthesizer
structured_synthesizer = get_response_synthesizer(
    response_mode="compact",
    output_cls=SummaryOutput,
    structured_answer_filtering=True
)

query_engine = index.as_query_engine(
    response_synthesizer=structured_synthesizer
)

response = query_engine.query("Summarize the main points")
structured_data = response.metadata.get("structured_response")
# structured_data is now a SummaryOutput instance
```

### Custom Synthesis Strategies

Framework for implementing custom response synthesis logic with full control over the generation process.

```python { .api }
class CustomSynthesizer(BaseSynthesizer):
    """
    Custom response synthesizer implementation.

    Args:
        custom_prompt: Custom prompt template
        processing_fn: Custom processing function
        **kwargs: BaseSynthesizer arguments
    """
    def __init__(
        self,
        custom_prompt=None,
        processing_fn=None,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Custom synthesis logic."""
        context_str = self._prepare_context(nodes)

        if self.processing_fn:
            return self.processing_fn(query, context_str, **kwargs)

        # Default processing
        return self._generate_response(query, context_str)

    def _prepare_context(self, nodes):
        """Prepare a context string from the nodes."""
        return "\n\n".join([node.node.get_content() for node in nodes])

    def _generate_response(self, query, context):
        """Generate a response using the LLM."""
        # Implementation details
        pass
```

**Custom Synthesizer Example:**

```python
from llama_index.core.response_synthesizers import BaseSynthesizer
from llama_index.core.base.response.schema import Response

class FactCheckSynthesizer(BaseSynthesizer):
    """Custom synthesizer that fact-checks responses."""

    def __init__(self, fact_check_threshold=0.8, **kwargs):
        super().__init__(**kwargs)
        self.fact_check_threshold = fact_check_threshold

    def synthesize(self, query, nodes, **kwargs):
        # Generate the initial response
        context_str = "\n\n".join([node.node.get_content() for node in nodes])

        initial_response = self._llm.complete(
            f"Context: {context_str}\n\nQuestion: {query}\n\nAnswer:"
        )

        # Fact-check the response
        fact_check_score = self._fact_check(initial_response.text, context_str)

        if fact_check_score < self.fact_check_threshold:
            # Generate a more conservative response
            refined_response = self._llm.complete(
                f"Based only on the provided context, answer: {query}\n"
                f"Context: {context_str}\n"
                f"Conservative Answer:"
            )
            response_text = refined_response.text
        else:
            response_text = initial_response.text

        return Response(
            response=response_text,
            source_nodes=nodes,
            metadata={"fact_check_score": fact_check_score}
        )

    def _fact_check(self, response_text, context_str):
        # Custom fact-checking logic
        # Return a confidence score in [0, 1]
        return 0.9  # Placeholder

# Use the custom synthesizer
fact_check_synthesizer = FactCheckSynthesizer(
    fact_check_threshold=0.85,
    streaming=False
)

query_engine = index.as_query_engine(
    response_synthesizer=fact_check_synthesizer
)
```

### Response Metadata and Source Tracking

Response objects carry the synthesized text along with comprehensive metadata and source attribution.

```python { .api }
class Response:
    """
    Response object with synthesis results and metadata.

    Attributes:
        response: Generated response text
        source_nodes: List of source nodes used
        metadata: Additional response metadata
    """
    response: str
    source_nodes: List[NodeWithScore]
    metadata: Dict[str, Any]

    def get_formatted_sources(self, length=100):
        """Get formatted source excerpts."""

    def __str__(self):
        """String representation of the response."""

class StreamingResponse:
    """
    Streaming response for real-time synthesis.

    Methods:
        response_gen: Generator yielding response tokens
        get_response: Get the complete response object
        print_response_stream: Print the streaming response
    """
    def response_gen(self):
        """Generate response tokens in real time."""

    def get_response(self):
        """Get the final complete response."""

    def print_response_stream(self):
        """Print the response as it is generated."""
```

**Response Usage Example:**

```python
# Regular response
response = query_engine.query("What is machine learning?")
print(f"Response: {response.response}")
print(f"Sources: {len(response.source_nodes)}")
print(f"Metadata: {response.metadata}")

# Streaming response
streaming_engine = index.as_query_engine(
    response_synthesizer=get_response_synthesizer(streaming=True)
)

streaming_response = streaming_engine.query("Explain neural networks")
streaming_response.print_response_stream()
```