
# Response Synthesis

Strategies for combining retrieved context into coherent answers, offering several summarization approaches and synthesis modes.

## Capabilities

### Response Synthesizer Factory

Factory function for creating response synthesizers with different strategies and configurations.

```python { .api }
def get_response_synthesizer(
    response_mode="compact",
    service_context=None,
    text_qa_template=None,
    refine_template=None,
    summary_template=None,
    simple_template=None,
    use_async=False,
    streaming=False,
    structured_answer_filtering=False,
    **kwargs
):
    """
    Create a response synthesizer with the specified mode and configuration.

    Args:
        response_mode: Synthesis strategy ("compact", "refine", "tree_summarize",
            "simple_summarize", "accumulate", "generation")
        service_context: Service context (deprecated, use Settings)
        text_qa_template: Template for question answering
        refine_template: Template for iterative refinement
        summary_template: Template for summarization
        simple_template: Template for simple responses
        use_async: Enable asynchronous processing
        streaming: Enable streaming responses
        structured_answer_filtering: Filter responses for structured output

    Returns:
        BaseSynthesizer: Configured response synthesizer
    """
```

**Usage Example:**

```python
from llama_index.core import get_response_synthesizer

# Compact mode (default) - combines chunks efficiently
synthesizer = get_response_synthesizer(
    response_mode="compact",
    streaming=True,
)

# Tree summarize mode - hierarchical summarization
tree_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
    use_async=True,
)

# Refine mode - iterative improvement
refine_synthesizer = get_response_synthesizer(
    response_mode="refine",
    structured_answer_filtering=True,
)

# Use with a query engine
query_engine = index.as_query_engine(
    response_synthesizer=synthesizer
)
```
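What `compact` mode buys you is fewer LLM calls: instead of one call per retrieved chunk, chunks are packed into as few prompts as fit the model's context window. A rough pure-Python sketch of that packing step — the `pack_chunks` helper and the character-based budget are illustrative stand-ins, not part of the LlamaIndex API:

```python
def pack_chunks(chunks, budget=1000):
    """Greedily pack text chunks into as few prompts as fit the budget.

    Mirrors the idea behind "compact" mode: merge retrieved chunks
    before calling the LLM, so fewer calls are needed overall.
    """
    prompts, current = [], ""
    for chunk in chunks:
        # Start a new prompt when adding this chunk would exceed the budget.
        if current and len(current) + len(chunk) + 2 > budget:
            prompts.append(current)
            current = chunk
        else:
            current = f"{current}\n\n{chunk}" if current else chunk
    if current:
        prompts.append(current)
    return prompts


chunks = ["a" * 400, "b" * 400, "c" * 400, "d" * 100]
packed = pack_chunks(chunks, budget=1000)
print(len(packed))  # → 2: four chunks collapse into two prompts, so two LLM calls
```

The real implementation budgets in tokens (see `max_prompt_size` on `CompactAndRefine` below) rather than characters, but the control flow is the same.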

### Compact Response Synthesis

Efficient synthesis mode that packs retrieved chunks into as few prompts as fit the context window, then applies refinement for the final answer.

```python { .api }
class CompactAndRefine:
    """
    Compact and refine synthesis strategy.

    Combines chunks into larger contexts, then applies refinement for the final answer.

    Args:
        text_qa_template: Template for initial question answering
        refine_template: Template for iterative refinement
        max_prompt_size: Maximum prompt size in tokens
        callback_manager: Callback manager for events
        use_async: Enable asynchronous processing
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        text_qa_template=None,
        refine_template=None,
        max_prompt_size=None,
        callback_manager=None,
        use_async=False,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(
        self,
        query,
        nodes,
        additional_source_nodes=None,
        **kwargs
    ):
        """
        Synthesize a response from the query and retrieved nodes.

        Args:
            query: User query or QueryBundle
            nodes: List of retrieved NodeWithScore objects
            additional_source_nodes: Extra context nodes

        Returns:
            Response: Synthesized response with sources
        """

    async def asynthesize(self, query, nodes, **kwargs):
        """Async version of synthesize."""
```

### Tree Summarization

Hierarchical summarization strategy that builds responses bottom-up through a tree structure.

```python { .api }
class TreeSummarize:
    """
    Tree-based summarization synthesis.

    Recursively summarizes chunks in a tree structure for comprehensive responses.

    Args:
        summary_template: Template for summarization steps
        text_qa_template: Template for final question answering
        use_async: Enable asynchronous processing
        callback_manager: Callback manager for events
    """
    def __init__(
        self,
        summary_template=None,
        text_qa_template=None,
        use_async=False,
        callback_manager=None,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Tree-based synthesis of a response."""

    async def asynthesize(self, query, nodes, **kwargs):
        """Async tree synthesis."""
```

**Usage Example:**

```python
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.prompts import PromptTemplate

# Custom summarization template
summary_template = PromptTemplate(
    "Context information is below:\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Summarize the key points relevant to: {query_str}\n"
    "Summary: "
)

tree_synthesizer = TreeSummarize(
    summary_template=summary_template,
    use_async=True,
)

# Use with a query engine
query_engine = index.as_query_engine(
    response_synthesizer=tree_synthesizer,
    similarity_top_k=10,  # More chunks for tree processing
)

response = query_engine.query("What are the main themes in the documents?")
```
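The bottom-up pass can be pictured as repeatedly summarizing groups of adjacent chunks until a single summary remains. A pure-Python sketch of that control flow — the `tree_summarize` function, the stub "LLM", and the fan-out of 2 are all illustrative, not LlamaIndex API:

```python
def tree_summarize(chunks, summarize, fanout=2):
    """Recursively summarize groups of chunks until one summary is left."""
    level = list(chunks)
    while len(level) > 1:
        # Summarize each group of `fanout` adjacent items into one parent node.
        level = [
            summarize(level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
    return level[0]


# Stub "LLM": joins its inputs so the tree structure is visible in the output.
stub = lambda group: "(" + "+".join(group) + ")"
print(tree_summarize(["c1", "c2", "c3", "c4"], stub))
# → ((c1+c2)+(c3+c4))
```

With `use_async=True`, the summaries within each level are independent and can run concurrently, which is why tree mode benefits from async processing.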

### Iterative Refinement

Refine synthesis strategy that iteratively improves a response using additional context.

```python { .api }
class Refine:
    """
    Iterative refinement synthesis strategy.

    Starts with an initial response and refines it using additional retrieved chunks.

    Args:
        text_qa_template: Template for the initial response
        refine_template: Template for refinement steps
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        text_qa_template=None,
        refine_template=None,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Iteratively refine the response using retrieved nodes."""

    async def asynthesize(self, query, nodes, **kwargs):
        """Async iterative refinement."""
```

**Usage Example:**

```python
from llama_index.core.response_synthesizers import Refine
from llama_index.core.prompts import PromptTemplate

# Custom refinement template
refine_template = PromptTemplate(
    "The original query is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer "
    "(only if needed) with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better "
    "answer the query. If the context isn't useful, return the original answer.\n"
    "Refined Answer: "
)

refine_synthesizer = Refine(
    refine_template=refine_template,
    streaming=True,
)

query_engine = index.as_query_engine(
    response_synthesizer=refine_synthesizer
)
```
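The refinement loop itself is sequential: the first chunk produces an initial answer, and each subsequent chunk gets a chance to revise it. A pure-Python sketch of that loop, with stub functions standing in for the templated LLM calls (none of these names are LlamaIndex API):

```python
def refine(query, chunks, answer_fn, refine_fn):
    """Answer from the first chunk, then refine with each remaining chunk."""
    answer = answer_fn(query, chunks[0])
    for chunk in chunks[1:]:
        # Each step sees the previous answer plus one new piece of context,
        # mirroring how refine_template receives {existing_answer} and {context_msg}.
        answer = refine_fn(query, answer, chunk)
    return answer


# Stub "LLM" calls that just record the control flow.
first = lambda q, c: f"answer({c})"
step = lambda q, a, c: f"refine({a},{c})"
print(refine("q", ["c1", "c2", "c3"], first, step))
# → refine(refine(answer(c1),c2),c3)
```

Because every step depends on the previous answer, refine mode cannot parallelize across chunks the way tree or accumulate modes can.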

### Simple Summarization

Direct summarization strategy for straightforward responses without multi-step processing.

```python { .api }
class SimpleSummarize:
    """
    Simple summarization synthesis.

    Directly summarizes all retrieved context in a single step.

    Args:
        text_qa_template: Template for question answering
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        text_qa_template=None,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Simple one-step summarization."""
```
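The single-step behavior is easy to see in a pure-Python sketch: all context is joined, anything that does not fit is dropped, and exactly one LLM call is made. The `simple_summarize` function, the character-based `max_context`, and the stub LLM below are illustrative, not LlamaIndex API:

```python
def simple_summarize(query, chunks, llm, max_context=1000):
    """One-shot synthesis: join all context, truncate to fit, call the LLM once."""
    context = "\n\n".join(chunks)
    # Unlike refine or tree_summarize, overflowing context is simply cut off.
    context = context[:max_context]
    return llm(f"Context: {context}\n\nQuestion: {query}\nAnswer:")


calls = []

def stub_llm(prompt):
    calls.append(prompt)  # record the prompt so the single call is visible
    return "stub answer"

answer = simple_summarize("q", ["a" * 800, "b" * 800], stub_llm)
print(len(calls))  # a single LLM call regardless of the number of chunks
```

This is the cheapest mode, at the cost of silently discarding context that exceeds the window.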

### Accumulate Responses

Accumulation strategy that concatenates individual responses generated from each retrieved chunk.

```python { .api }
class Accumulate:
    """
    Accumulate synthesis strategy.

    Generates an individual response for each chunk and concatenates them.

    Args:
        text_qa_template: Template for individual chunk responses
        output_cls: Structured output class
        callback_manager: Callback manager for events
        use_async: Enable asynchronous processing
    """
    def __init__(
        self,
        text_qa_template=None,
        output_cls=None,
        callback_manager=None,
        use_async=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Accumulate responses from individual chunks."""
```

**Usage Example:**

```python
from llama_index.core.response_synthesizers import Accumulate

accumulate_synthesizer = Accumulate(
    use_async=True  # Process chunks in parallel
)

# Good for gathering diverse perspectives
query_engine = index.as_query_engine(
    response_synthesizer=accumulate_synthesizer,
    similarity_top_k=5,
)

response = query_engine.query("What are different opinions on this topic?")
print(response.response)  # Contains the accumulated individual responses
```

### Generation Strategy

Direct generation strategy that produces responses without using retrieved context.

```python { .api }
class Generation:
    """
    Generation synthesis strategy.

    Generates responses directly from the query without using retrieved context.

    Args:
        simple_template: Template for direct generation
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        simple_template=None,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Generate a response directly from the query."""
```
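In this mode the retrieved nodes play no part in the prompt; the synthesizer is effectively a plain LLM call behind the query-engine interface. A minimal pure-Python sketch (the `generation` function and stub LLM are illustrative, not LlamaIndex API):

```python
def generation(query, nodes, llm, template="{query_str}"):
    """Synthesize without grounding: nodes are ignored, only the query is used."""
    del nodes  # intentionally unused in this mode
    return llm(template.format(query_str=query))


stub_llm = lambda prompt: f"llm({prompt})"
print(generation("What is ML?", ["chunk1", "chunk2"], stub_llm))
# → llm(What is ML?)
```

This is useful when you want a uniform query-engine interface but the answer should come from the model's own knowledge rather than the index.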

### Base Synthesizer Interface

Base class for implementing custom response synthesis strategies.

```python { .api }
class BaseSynthesizer:
    """
    Base class for response synthesizers.

    Args:
        callback_manager: Callback manager for events
        streaming: Enable streaming responses
    """
    def __init__(
        self,
        callback_manager=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(
        self,
        query,
        nodes,
        additional_source_nodes=None,
        **kwargs
    ):
        """
        Synthesize a response from the query and nodes.

        Args:
            query: User query string or QueryBundle
            nodes: List of NodeWithScore objects from retrieval
            additional_source_nodes: Extra source nodes for context

        Returns:
            Response: Generated response with metadata
        """

    async def asynthesize(self, query, nodes, **kwargs):
        """Async version of the synthesize method."""

    def get_prompts(self):
        """Get the prompt templates used by the synthesizer."""

    def update_prompts(self, prompts_dict):
        """Update the prompt templates."""
```

### Structured Output Synthesis

Synthesis with structured output generation for extracting information in specific formats.

```python { .api }
class StructuredResponseSynthesizer(BaseSynthesizer):
    """
    Structured response synthesizer for typed outputs.

    Args:
        output_cls: Pydantic model class for structured output
        llm: Language model for generation
        text_qa_template: Template for question answering
        streaming: Enable streaming (limited for structured output)
    """
    def __init__(
        self,
        output_cls,
        llm=None,
        text_qa_template=None,
        streaming=False,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Generate a structured response matching the output_cls schema."""
```

**Structured Output Example:**

```python
from pydantic import BaseModel
from typing import List
from llama_index.core.response_synthesizers import get_response_synthesizer

class SummaryOutput(BaseModel):
    main_points: List[str]
    sentiment: str
    confidence_score: float

# Create a structured synthesizer
structured_synthesizer = get_response_synthesizer(
    response_mode="compact",
    output_cls=SummaryOutput,
    structured_answer_filtering=True,
)

query_engine = index.as_query_engine(
    response_synthesizer=structured_synthesizer
)

response = query_engine.query("Summarize the main points")
structured_data = response.metadata.get("structured_response")
# structured_data is now a SummaryOutput instance
```

### Custom Synthesis Strategies

Framework for implementing custom response synthesis logic with full control over the generation process.

```python { .api }
class CustomSynthesizer(BaseSynthesizer):
    """
    Custom response synthesizer implementation.

    Args:
        custom_prompt: Custom prompt template
        processing_fn: Custom processing function
        **kwargs: BaseSynthesizer arguments
    """
    def __init__(
        self,
        custom_prompt=None,
        processing_fn=None,
        **kwargs
    ): ...

    def synthesize(self, query, nodes, **kwargs):
        """Custom synthesis logic."""
        context_str = self._prepare_context(nodes)

        if self.processing_fn:
            return self.processing_fn(query, context_str, **kwargs)

        # Default processing
        return self._generate_response(query, context_str)

    def _prepare_context(self, nodes):
        """Prepare a context string from the nodes."""
        return "\n\n".join([node.node.get_content() for node in nodes])

    def _generate_response(self, query, context):
        """Generate a response using the LLM."""
        # Implementation details
        pass
```

**Custom Synthesizer Example:**

```python
from llama_index.core.response_synthesizers import BaseSynthesizer
from llama_index.core.base.response.schema import Response

class FactCheckSynthesizer(BaseSynthesizer):
    """Custom synthesizer that fact-checks responses."""

    def __init__(self, fact_check_threshold=0.8, **kwargs):
        super().__init__(**kwargs)
        self.fact_check_threshold = fact_check_threshold

    def synthesize(self, query, nodes, **kwargs):
        # Generate the initial response
        context_str = "\n\n".join([node.node.get_content() for node in nodes])

        initial_response = self._llm.complete(
            f"Context: {context_str}\n\nQuestion: {query}\n\nAnswer:"
        )

        # Fact-check the response
        fact_check_score = self._fact_check(initial_response.text, context_str)

        if fact_check_score < self.fact_check_threshold:
            # Generate a more conservative response
            refined_response = self._llm.complete(
                f"Based only on the provided context, answer: {query}\n"
                f"Context: {context_str}\n"
                f"Conservative Answer:"
            )
            response_text = refined_response.text
        else:
            response_text = initial_response.text

        return Response(
            response=response_text,
            source_nodes=nodes,
            metadata={"fact_check_score": fact_check_score},
        )

    def _fact_check(self, response_text, context_str):
        # Custom fact-checking logic; return a confidence score in [0, 1]
        return 0.9  # Placeholder

# Use the custom synthesizer
fact_check_synthesizer = FactCheckSynthesizer(
    fact_check_threshold=0.85,
    streaming=False,
)

query_engine = index.as_query_engine(
    response_synthesizer=fact_check_synthesizer
)
```

### Response Metadata and Source Tracking

Response objects with comprehensive metadata and source attribution.

```python { .api }
class Response:
    """
    Response object with synthesis results and metadata.

    Attributes:
        response: Generated response text
        source_nodes: List of source nodes used
        metadata: Additional response metadata
    """
    response: str
    source_nodes: List[NodeWithScore]
    metadata: Dict[str, Any]

    def get_formatted_sources(self, length=100):
        """Get formatted source excerpts."""

    def __str__(self):
        """String representation of the response."""


class StreamingResponse:
    """
    Streaming response for real-time synthesis.

    Methods:
        response_gen: Generator yielding response tokens
        get_response: Get the complete response object
        print_response_stream: Print the streaming response
    """
    def response_gen(self):
        """Generate response tokens in real time."""

    def get_response(self):
        """Get the final complete response."""

    def print_response_stream(self):
        """Print the response as it is generated."""
```

**Response Usage Example:**

```python
# Regular response
response = query_engine.query("What is machine learning?")
print(f"Response: {response.response}")
print(f"Sources: {len(response.source_nodes)}")
print(f"Metadata: {response.metadata}")

# Streaming response
streaming_engine = index.as_query_engine(
    response_synthesizer=get_response_synthesizer(streaming=True)
)

streaming_response = streaming_engine.query("Explain neural networks")
streaming_response.print_response_stream()
```
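The streaming pattern itself is just generator consumption: tokens are printed as they arrive while being accumulated for the final text. A pure-Python sketch with a fake token source standing in for `response_gen` (the names here are illustrative, not LlamaIndex API):

```python
def fake_token_stream():
    """Stand-in for StreamingResponse.response_gen: yields tokens as produced."""
    for token in ["Neural ", "networks ", "learn ", "representations."]:
        yield token


# The same pattern print_response_stream / get_response expose: print tokens
# as they arrive while accumulating the full text.
pieces = []
for token in fake_token_stream():
    print(token, end="", flush=True)
    pieces.append(token)

full_response = "".join(pieces)
print()
```

Keeping the accumulated text around is what lets `get_response` return a complete `Response` object after the stream has been consumed.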