# Core Client Operations

Essential Solr operations that form the foundation for interacting with Solr servers. These operations handle client initialization, health monitoring, document management, and index maintenance.

## Capabilities

### Client Initialization

Create and configure a Solr client instance with connection settings, authentication, timeouts, and custom handlers.

```python { .api }
class Solr:
    def __init__(self, url, decoder=None, encoder=None, timeout=60, results_cls=Results,
                 search_handler="select", use_qt_param=False, always_commit=False,
                 auth=None, verify=True, session=None):
        """
        Initialize a Solr client.

        Parameters:
        - url (str): Solr server URL (e.g., 'http://localhost:8983/solr/core_name')
        - decoder (json.JSONDecoder, optional): Custom JSON decoder instance
        - encoder (json.JSONEncoder, optional): Custom JSON encoder instance
        - timeout (int): Request timeout in seconds (default: 60)
        - results_cls (type): Results class for search responses (default: Results)
        - search_handler (str): Default search handler name (default: "select")
        - use_qt_param (bool): Use qt parameter instead of handler path (default: False)
        - always_commit (bool): Auto-commit all update operations (default: False)
        - auth (tuple or requests auth object, optional): HTTP authentication
        - verify (bool): Enable SSL certificate verification (default: True)
        - session (requests.Session, optional): Custom requests session
        """
```

Usage:

```python
import pysolr

# Basic client
solr = pysolr.Solr('http://localhost:8983/solr/my_core')

# Client with timeout and authentication
solr = pysolr.Solr(
    'https://solr.example.com/solr/my_core',
    timeout=30,
    auth=('username', 'password'),
    always_commit=True
)

# Client with custom session and SSL settings
import requests
session = requests.Session()
session.headers.update({'User-Agent': 'MyApp/1.0'})

solr = pysolr.Solr(
    'https://solr.example.com/solr/my_core',
    session=session,
    verify='/path/to/ca-bundle.crt'
)
```
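
Connection settings often come from the environment rather than from literals. The sketch below shows one possible way to assemble the keyword arguments for `pysolr.Solr` from such settings. The helper `solr_kwargs_from_env` and the `SOLR_*` variable names are hypothetical and not part of pysolr; only the keyword names (`url`, `timeout`, `always_commit`, `auth`, `verify`) come from the signature above.

```python
import os


def solr_kwargs_from_env(env=None):
    """Build keyword arguments for pysolr.Solr from environment-style settings.

    The SOLR_* names are illustrative assumptions, not pysolr conventions.
    """
    env = os.environ if env is None else env
    kwargs = {
        "url": env.get("SOLR_URL", "http://localhost:8983/solr/my_core"),
        "timeout": int(env.get("SOLR_TIMEOUT", "60")),
        "always_commit": env.get("SOLR_ALWAYS_COMMIT", "false").lower() == "true",
    }
    user, password = env.get("SOLR_USER"), env.get("SOLR_PASSWORD")
    if user and password:
        # A (username, password) tuple is used for HTTP basic authentication.
        kwargs["auth"] = (user, password)
    ca_bundle = env.get("SOLR_CA_BUNDLE")
    if ca_bundle:
        # A CA bundle path can be passed as `verify`, as in the usage example.
        kwargs["verify"] = ca_bundle
    return kwargs


# A plain dict stands in for os.environ so the sketch runs anywhere.
settings = solr_kwargs_from_env({
    "SOLR_URL": "https://solr.example.com/solr/my_core",
    "SOLR_TIMEOUT": "30",
    "SOLR_USER": "username",
    "SOLR_PASSWORD": "password",
})
# solr = pysolr.Solr(**settings)  # requires pysolr and a reachable server
```

Keeping the parsing in a plain function makes the configuration easy to test without contacting a server.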

### Health Check

Test connectivity and server health with ping operations.

```python { .api }
def ping(self, handler="admin/ping", **kwargs):
    """
    Send a ping request to test server connectivity.

    Parameters:
    - handler (str): Ping handler path (default: "admin/ping")
    - **kwargs: Additional parameters passed to Solr

    Returns:
    str: Server response content

    Raises:
    SolrError: If ping fails or server is unreachable
    """
```

Usage:

```python
try:
    response = solr.ping()
    print("Solr server is healthy")
except pysolr.SolrError as e:
    print(f"Solr server is down: {e}")
```
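
A single ping can fail while a server is still starting up, so callers sometimes retry before giving up. The following is a minimal sketch of such a retry loop, not something pysolr provides: `wait_until_healthy` and its parameters are hypothetical names, and a fake ping function is used so the example runs without a server.

```python
import time


def wait_until_healthy(ping, attempts=5, delay=1.0, errors=(Exception,), sleep=time.sleep):
    """Call `ping` until it succeeds or `attempts` is exhausted.

    Returns True on the first successful call, False if every attempt raised.
    In real use `errors` would typically be (pysolr.SolrError,).
    """
    for attempt in range(1, attempts + 1):
        try:
            ping()
            return True
        except errors:
            if attempt < attempts:
                sleep(delay * attempt)  # linear backoff between attempts
    return False


# Offline demonstration: a fake ping that fails twice, then succeeds.
calls = {"count": 0}


def flaky_ping():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("not ready")
    return "OK"


healthy = wait_until_healthy(flaky_ping, attempts=5, delay=0, sleep=lambda s: None)
```

With a real client, the call would look like `wait_until_healthy(solr.ping, errors=(pysolr.SolrError,))`.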

### Document Indexing

Add or update documents in the Solr index with support for batch operations, field updates, and commit control.

```python { .api }
def add(self, docs, boost=None, fieldUpdates=None, commit=None, softCommit=False,
        commitWithin=None, waitFlush=None, waitSearcher=None, overwrite=None,
        handler="update", min_rf=None):
    """
    Add or update documents in the index.

    Parameters:
    - docs (list or dict): Document(s) to index. Each document is a dict with field names as keys
    - boost (dict, optional): Per-field boost values {"field_name": boost_value}
    - fieldUpdates (dict, optional): Field update operations {"field": "set"/"add"/"inc"}
    - commit (bool, optional): Force commit after operation (overrides always_commit)
    - softCommit (bool): Perform soft commit (default: False)
    - commitWithin (int, optional): Auto-commit within specified milliseconds
    - waitFlush (bool, optional): Wait for flush to complete
    - waitSearcher (bool, optional): Wait for new searcher
    - overwrite (bool, optional): Allow document overwrites (default: True)
    - handler (str): Update handler path (default: "update")
    - min_rf (int, optional): Minimum replication factor for SolrCloud

    Returns:
    str: Server response content

    Raises:
    SolrError: If indexing fails
    ValueError: If docs parameter is invalid
    """
```

Usage:

```python
# Single document
solr.add({
    "id": "doc_1",
    "title": "Sample Document",
    "content": "This is the document content.",
    "category": "example"
})

# Multiple documents
docs = [
    {"id": "doc_1", "title": "First Document", "content": "Content 1"},
    {"id": "doc_2", "title": "Second Document", "content": "Content 2"}
]
solr.add(docs)

# With field boosts
solr.add(
    {"id": "doc_1", "title": "Important Document", "content": "Key content"},
    boost={"title": 2.0, "content": 1.5}
)

# Atomic field updates
solr.add(
    {"id": "existing_doc", "category": "updated"},
    fieldUpdates={"category": "set"}
)

# With commit control
solr.add(docs, commit=True)  # Force immediate commit
solr.add(docs, commitWithin=5000)  # Auto-commit within 5 seconds
```
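
For large document sets, it can help to send documents in fixed-size batches and commit once at the end rather than after every call. The sketch below illustrates that pattern under stated assumptions: `batched`, `index_in_batches`, and `RecordingClient` are hypothetical helpers (the last one stands in for `pysolr.Solr` so the code runs offline), and the batch size of 500 is an arbitrary default.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def index_in_batches(client, docs, batch_size=500):
    """Send docs to client.add() in chunks, then commit once at the end.

    `client` is anything exposing add(docs, commit=...) and commit(),
    such as a pysolr.Solr instance. Returns the number of documents sent.
    """
    total = 0
    for batch in batched(docs, batch_size):
        client.add(batch, commit=False)  # defer the commit to avoid one per batch
        total += len(batch)
    if total:
        client.commit()
    return total


class RecordingClient:
    """Stand-in for pysolr.Solr so the sketch runs without a server."""

    def __init__(self):
        self.batches, self.commits = [], 0

    def add(self, docs, commit=None):
        self.batches.append(list(docs))

    def commit(self):
        self.commits += 1


fake = RecordingClient()
sent = index_in_batches(fake, ({"id": f"doc_{i}"} for i in range(5)), batch_size=2)
```

Because `docs` is consumed lazily, the same helper works with a generator that streams documents from a file or database.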

### Document Deletion

Remove documents from the index by ID or query with commit control options.

```python { .api }
def delete(self, id=None, q=None, commit=None, softCommit=False,
           waitFlush=None, waitSearcher=None, handler="update"):
    """
    Delete documents from the index.

    Parameters:
    - id (str, list, or None): Document ID(s) to delete. Can be single ID or list of IDs
    - q (str or None): Lucene query to select documents for deletion
    - commit (bool, optional): Force commit after deletion (overrides always_commit)
    - softCommit (bool): Perform soft commit (default: False)
    - waitFlush (bool, optional): Wait for flush to complete
    - waitSearcher (bool, optional): Wait for new searcher
    - handler (str): Update handler path (default: "update")

    Returns:
    str: Server response content

    Raises:
    SolrError: If deletion fails
    ValueError: If neither id nor q is specified, or both are specified
    """
```

Usage:

```python
# Delete by single ID
solr.delete(id='doc_1')

# Delete by multiple IDs
solr.delete(id=['doc_1', 'doc_2', 'doc_3'])

# Delete by query
solr.delete(q='category:obsolete')
solr.delete(q='*:*')  # Delete all documents

# With commit control
solr.delete(id='doc_1', commit=True)
```
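
When the value in a delete-by-query comes from user input, characters such as quotes or spaces can change what the query matches. One cautious approach, sketched below, is to wrap the value in double quotes and escape backslashes and quotes inside it. `quote_value` and `delete_query` are hypothetical helpers, and the sketch assumes the target field is an untokenized string field, where a quoted phrase matches the exact value.

```python
def quote_value(value):
    """Wrap a literal value in double quotes, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def delete_query(field, value):
    """Build a `field:"value"` query string suitable for delete(q=...)."""
    return f"{field}:{quote_value(value)}"


query = delete_query("category", 'old "archive" (2019)')
# solr.delete(q=query)  # requires pysolr and a reachable server
```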

### Index Commit

Force Solr to write pending changes to disk and make them searchable.

```python { .api }
def commit(self, softCommit=False, waitFlush=None, waitSearcher=None,
           expungeDeletes=None, handler="update"):
    """
    Force Solr to commit pending changes to disk.

    Parameters:
    - softCommit (bool): Perform soft commit (visible but not durable) (default: False)
    - waitFlush (bool, optional): Wait for flush to complete before returning
    - waitSearcher (bool, optional): Wait for new searcher before returning
    - expungeDeletes (bool, optional): Expunge deleted documents during commit
    - handler (str): Update handler path (default: "update")

    Returns:
    str: Server response content

    Raises:
    SolrError: If commit fails
    """
```

Usage:

```python
# Standard commit
solr.commit()

# Soft commit (fast, visible immediately but not durable)
solr.commit(softCommit=True)

# Hard commit with deleted document cleanup
solr.commit(expungeDeletes=True)

# Synchronous commit (wait for completion)
solr.commit(waitFlush=True, waitSearcher=True)
```
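
Several updates can share a single commit. The sketch below wraps that idea in a context manager that commits once when the block finishes without an error. `single_commit` and `RecordingClient` are hypothetical names (the latter stands in for `pysolr.Solr`), and this is not a transaction: if the block raises, updates already sent stay pending on the server and may still be committed later, for example by Solr's autoCommit settings.

```python
from contextlib import contextmanager


@contextmanager
def single_commit(client, **commit_kwargs):
    """Yield `client`, then call client.commit(**commit_kwargs) if no error occurred."""
    yield client
    # Reached only when the with-block finished without raising.
    client.commit(**commit_kwargs)


class RecordingClient:
    """Stand-in for pysolr.Solr so the sketch runs without a server."""

    def __init__(self):
        self.added, self.commit_calls = [], []

    def add(self, docs, commit=None):
        self.added.extend(docs)

    def commit(self, **kwargs):
        self.commit_calls.append(kwargs)


fake = RecordingClient()
with single_commit(fake, softCommit=True) as client:
    client.add([{"id": "doc_1"}], commit=False)
    client.add([{"id": "doc_2"}], commit=False)
```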

### Index Optimization

Optimize the Solr index by reducing the number of segments, improving query performance.

```python { .api }
def optimize(self, commit=True, waitFlush=None, waitSearcher=None,
             maxSegments=None, handler="update"):
    """
    Optimize the Solr index by merging segments.

    Parameters:
    - commit (bool): Commit after optimization (default: True)
    - waitFlush (bool, optional): Wait for flush to complete
    - waitSearcher (bool, optional): Wait for new searcher
    - maxSegments (int, optional): Maximum number of segments to merge down to
    - handler (str): Update handler path (default: "update")

    Returns:
    str: Server response content

    Raises:
    SolrError: If optimization fails
    """
```

Usage:

```python
# Basic optimization
solr.optimize()

# Optimize to specific segment count
solr.optimize(maxSegments=1)

# Asynchronous optimization
solr.optimize(waitFlush=False, waitSearcher=False)
```

### Content Extraction

Extract content and metadata from files using Apache Tika integration for rich document processing.

```python { .api }
def extract(self, file_obj, extractOnly=True, handler="update/extract", **kwargs):
    """
    Extract content and metadata from files using Apache Tika.

    Parameters:
    - file_obj (file-like object): File object with a 'name' attribute to extract from
    - extractOnly (bool): If True, only extract without indexing (default: True)
    - handler (str): Extract handler path (default: "update/extract")
    - **kwargs: Additional parameters passed to Solr ExtractingRequestHandler

    Returns:
    dict: Dictionary containing extracted content and metadata:
        - contents: Extracted full-text content (if applicable)
        - metadata: Key-value pairs of extracted metadata

    Raises:
    ValueError: If file_obj doesn't have a 'name' attribute
    SolrError: If extraction fails or server error occurs
    """
```

Usage:

```python
# Extract content from a PDF file
with open('document.pdf', 'rb') as pdf_file:
    extracted = solr.extract(pdf_file)
    print("Content:", extracted.get('contents', 'No content'))
    print("Metadata:", extracted.get('metadata', {}))

# Extract and index in one step
with open('document.docx', 'rb') as doc_file:
    result = solr.extract(
        doc_file,
        extractOnly=False,  # Index the document
        # Solr expects dotted names (literal.<field>), so pass them via a dict
        **{
            'literal.id': 'doc_123',  # Provide document ID
            'literal.title': 'Important Document'  # Add custom fields
        }
    )
```
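
The dictionary returned in extract-only mode is usually post-processed before anything is indexed. The following sketch shows one defensive way to reduce it to a short summary. `summarize_extraction` is a hypothetical helper, the sample dictionary is made-up illustrative data rather than real Tika output, and the code tolerates a missing result or missing keys in case a given server or version omits them.

```python
def summarize_extraction(result, preview_chars=80):
    """Return a small summary dict from an extract()-style result.

    Tolerates a missing or empty result by returning empty values.
    """
    result = result or {}
    contents = (result.get("contents") or "").strip()
    metadata = result.get("metadata") or {}
    return {
        "preview": contents[:preview_chars],
        "length": len(contents),
        "metadata_keys": sorted(metadata),
    }


# Illustrative data only; real metadata keys depend on the file and Tika version.
sample = {
    "contents": "  Quarterly report\nRevenue grew.  ",
    "metadata": {"title": ["Quarterly report"], "author": ["A. Writer"]},
}
summary = summarize_extraction(sample)
```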

## Types

```python { .api }
class SolrError(Exception):
    """Exception raised for Solr-related errors including network issues, timeouts, and server errors."""
    pass
```