tessl/pypi-pyes

Python Elastic Search driver providing a pythonic interface for interacting with ElasticSearch clusters

Workspace: tessl

Visibility: Public

Describes: pypipkg:pypi/pyes@0.99.x

To install, run

npx @tessl/cli install tessl/pypi-pyes@0.99.0

# PyES - Python ElasticSearch Driver

## Overview

PyES is a comprehensive Python client library for ElasticSearch that provides a pythonic interface for interacting with ElasticSearch clusters. First released in 2010, it offers extensive functionality for indexing, searching, and managing ElasticSearch infrastructure with support for both Python 2 and Python 3.

**Version**: 0.99.6

**License**: BSD

**Documentation**: http://pyes.rtfd.org/

**PyPI**: https://pypi.org/project/pyes/

## Installation

```bash
pip install pyes
```

## Core Imports

```python { .api }
# Main client class
from pyes import ES

# Query DSL classes
from pyes import (
    Query, Search, BoolQuery, MatchAllQuery, TermQuery, TermsQuery,
    RangeQuery, FilteredQuery, QueryStringQuery, MatchQuery,
    MultiMatchQuery, TextQuery, SimpleQueryStringQuery,
    FuzzyQuery, FuzzyLikeThisQuery, MoreLikeThisQuery,
    PrefixQuery, WildcardQuery, RegexTermQuery, IdsQuery,
    ConstantScoreQuery, DisMaxQuery, BoostingQuery,
    CustomScoreQuery, FunctionScoreQuery, HasChildQuery,
    HasParentQuery, TopChildrenQuery, NestedQuery,
    SpanTermQuery, SpanFirstQuery, SpanNearQuery,
    SpanNotQuery, SpanOrQuery, SpanMultiQuery,
    PercolatorQuery, RescoreQuery, Suggest
)

# Filter DSL classes
from pyes import (
    Filter, FilterList, ANDFilter, ORFilter, BoolFilter, NotFilter,
    TermFilter, TermsFilter, PrefixFilter, RegexTermFilter,
    ExistsFilter, MissingFilter, RangeFilter, LimitFilter,
    GeoDistanceFilter, GeoBoundingBoxFilter, GeoPolygonFilter,
    GeoShapeFilter, GeoIndexedShapeFilter, HasChildFilter,
    HasParentFilter, NestedFilter, TypeFilter, IdsFilter,
    QueryFilter, ScriptFilter, MatchAllFilter, RawFilter
)

# Facet and Aggregation classes
from pyes import (
    FacetFactory, TermFacet, DateHistogramFacet, HistogramFacet,
    RangeFacet, GeoDistanceFacet, StatisticalFacet, TermStatsFacet,
    QueryFacet, FilterFacet, AggFactory, Agg, BucketAgg,
    TermsAgg, DateHistogramAgg, HistogramAgg, RangeAgg,
    FilterAgg, FiltersAgg, NestedAgg, ReverseNestedAgg,
    MissingAgg, StatsAgg, ValueCountAgg, SumAgg, AvgAgg,
    MinAgg, MaxAgg, CardinalityAgg, TermStatsAgg
)

# Mapping classes
from pyes import (
    Mapper, AbstractField, StringField, NumericFieldAbstract,
    IntegerField, LongField, FloatField, DoubleField,
    DateField, BooleanField, BinaryField, IpField,
    ByteField, ShortField, GeoPointField, MultiField,
    ObjectField, NestedObject, DocumentObjectField,
    AttachmentField
)

# River classes
from pyes import (
    River, RabbitMQRiver, TwitterRiver, CouchDBRiver,
    JDBCRiver, MongoDBRiver
)

# Utility functions
from pyes import (
    file_to_attachment, make_path, make_id, clean_string,
    string_b64encode, string_b64decode, quote, ESRange,
    ESRangeOp, TermsLookup
)

# Exception classes
from pyes import (
    ElasticSearchException, QueryError, InvalidQuery,
    InvalidParameterQuery, IndexAlreadyExistsException,
    IndexMissingException, InvalidIndexNameException,
    TypeMissingException, DocumentAlreadyExistsException,
    DocumentMissingException, VersionConflictEngineException,
    BulkOperationException, SearchPhaseExecutionException,
    ReduceSearchPhaseException, ReplicationShardOperationFailedException,
    ClusterBlockException, MapperParsingException, NoServerAvailable
)
```

## Basic Usage Example

```python { .api }
from pyes import ES, TermQuery, Search

# Create ES client connection
es = ES('localhost:9200')

# Index a document
doc = {
    "title": "Python ElasticSearch Guide",
    "content": "Comprehensive guide to using PyES library",
    "tags": ["python", "elasticsearch", "search"],
    "published": "2023-01-15",
    "author": "John Doe"
}
es.index(doc, "blog", "post", id="1")

# Search for documents
query = Search(TermQuery("tags", "python"))
results = es.search(query, indices=["blog"])

# Process results
for hit in results:
    print(f"Title: {hit.title}")
    print(f"Score: {hit._meta.score}")
```

## Architecture Overview

PyES provides a layered architecture for ElasticSearch interaction:

1. **Client Layer** (`ES` class) - Connection management and high-level operations
2. **Query DSL** - Pythonic query construction with full ElasticSearch query support
3. **Filter DSL** - Filtering capabilities with logical and specialized filters
4. **Facets & Aggregations** - Data analysis and summarization tools
5. **Mapping System** - Schema definition and field type management
6. **River System** - Data streaming from external sources
7. **Bulk Operations** - High-performance batch processing
8. **Index Management** - Index lifecycle and cluster administration

## Core Capabilities

### ES Client Operations

The main `ES` class provides comprehensive ElasticSearch client functionality:

```python { .api }
# Initialize client with configuration
es = ES(
    server="localhost:9200",
    timeout=30.0,
    bulk_size=400,
    max_retries=3,
    basic_auth=("username", "password")
)

# Document operations
doc_id = es.index(document, "index_name", "doc_type", id="optional_id")
document = es.get("index_name", "doc_type", "doc_id")
es.update("index_name", "doc_type", "doc_id", script="ctx._source.views += 1")
es.delete("index_name", "doc_type", "doc_id")

# Bulk operations for performance
es.index(doc1, "index", "type", bulk=True)
es.index(doc2, "index", "type", bulk=True)
es.flush_bulk()  # Execute all buffered operations
```

**[→ Full ES Client Reference](client.md)**

### Query DSL Construction

Build complex search queries with the comprehensive query DSL:

```python { .api }
from pyes import Search, BoolQuery, TermQuery, RangeQuery, MatchQuery

# Complex boolean query
query = Search(
    BoolQuery(
        must=[MatchQuery("title", "python")],
        should=[TermQuery("tags", "tutorial")],
        must_not=[TermQuery("status", "draft")],
        filter=RangeQuery("published", gte="2023-01-01")
    )
).size(20).sort("published", order="desc")

results = es.search(query, indices=["blog"])
```
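For orientation, the boolean query above corresponds roughly to the Elasticsearch request body sketched below. This is a hand-built dictionary, not output captured from PyES; the helper name `build_bool_search` is hypothetical, and the `filter` clause inside `bool` assumes an Elasticsearch 2.x cluster.

```python
import json


def build_bool_search(size=20):
    """Hand-written request body mirroring the BoolQuery example (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": "python"}}],
                "should": [{"term": {"tags": "tutorial"}}],
                "must_not": [{"term": {"status": "draft"}}],
                # Assumption: a `filter` clause inside `bool` needs Elasticsearch 2.x.
                "filter": [{"range": {"published": {"gte": "2023-01-01"}}}],
            }
        },
        "size": size,
        "sort": [{"published": {"order": "desc"}}],
    }


body = build_bool_search()
print(json.dumps(body, indent=2))
```

Comparing a dictionary like this with what the client actually sends can help when a query returns unexpected results.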

**[→ Complete Query DSL Reference](query-dsl.md)**

### Filter DSL for Performance

Use filters for fast, non-scored filtering:

```python { .api }
from pyes import BoolFilter, TermFilter, RangeFilter, GeoDistanceFilter

# Geographic and term filtering
filter = BoolFilter(
    must=[
        TermFilter("category", "restaurant"),
        RangeFilter("rating", gte=4.0),
        GeoDistanceFilter(
            distance="5km",
            location={"lat": 40.7128, "lon": -74.0060}
        )
    ]
)

filtered_query = Search().filter(filter)
```
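As a rough guide to what a filter-only search looks like on the wire, the sketch below builds an Elasticsearch 1.x style `filtered` query by hand. It is an illustration rather than the body PyES generates: `filtered_search` is a hypothetical helper, and the `filtered` query type was deprecated in Elasticsearch 2.0 and removed in 5.0.

```python
def filtered_search(filter_body):
    """Wrap a filter in a `filtered` query (Elasticsearch 1.x style, hypothetical helper)."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},  # no scoring query, filters only
                "filter": filter_body,
            }
        }
    }


restaurant_filter = {
    "bool": {
        "must": [
            {"term": {"category": "restaurant"}},
            {"range": {"rating": {"gte": 4.0}}},
            {
                "geo_distance": {
                    "distance": "5km",
                    "location": {"lat": 40.7128, "lon": -74.0060},
                }
            },
        ]
    }
}

body = filtered_search(restaurant_filter)
```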

**[→ Complete Filter DSL Reference](filters.md)**

### Facets and Aggregations

Analyze and summarize data with facets and aggregations:

```python { .api }
from pyes import Search, TermsAgg, DateHistogramAgg, StatsAgg

# Multi-level aggregations
search = Search().add_aggregation(
    TermsAgg("categories", field="category.keyword", size=10)
    .add_aggregation(
        DateHistogramAgg("monthly", field="published", interval="month")
    )
).add_aggregation(
    StatsAgg("price_stats", field="price")
)

results = es.search(search, indices=["products"])
categories = results.facets.categories
monthly_trend = results.facets.categories.monthly
price_stats = results.facets.price_stats
```
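The nesting in the example above maps onto the `aggs` section of a search request, where a bucket aggregation carries its sub-aggregations under its own `aggs` key. The sketch below writes that structure out by hand; `nested_aggs_body` is a hypothetical helper, and the `interval` parameter assumes the Elasticsearch 1.x/2.x date histogram syntax.

```python
def nested_aggs_body():
    """Hand-written `aggs` section mirroring the example (hypothetical helper)."""
    return {
        "aggs": {
            "categories": {
                "terms": {"field": "category.keyword", "size": 10},
                # Sub-aggregation: one date histogram per category bucket.
                "aggs": {
                    "monthly": {
                        "date_histogram": {"field": "published", "interval": "month"}
                    }
                },
            },
            # Sibling aggregation computed over all matching documents.
            "price_stats": {"stats": {"field": "price"}},
        }
    }


body = nested_aggs_body()
```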

**[→ Complete Facets & Aggregations Reference](facets-aggregations.md)**

### Index Mapping Management

Define and manage index schemas with typed field mappings:

```python { .api }
from pyes import Mapper, StringField, IntegerField, DateField, GeoPointField

# Define document mapping
mapping = Mapper()
mapping.add_property("title", StringField(analyzer="standard"))
mapping.add_property("content", StringField(analyzer="english"))
mapping.add_property("views", IntegerField())
mapping.add_property("published", DateField())
mapping.add_property("location", GeoPointField())

# Apply mapping to index
es.indices.put_mapping("blog_post", mapping.as_dict(), indices=["blog"])
```
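For comparison, a mapping with the same five fields could be written directly as the JSON structure Elasticsearch 1.x/2.x expects. The sketch below is hand-written and may differ in detail from what `mapping.as_dict()` returns; `blog_post_mapping` is a hypothetical helper, and the `string` field type only exists in pre-5.x Elasticsearch.

```python
def blog_post_mapping():
    """Hand-written type mapping for the fields above (hypothetical helper)."""
    return {
        "blog_post": {
            "properties": {
                "title": {"type": "string", "analyzer": "standard"},
                "content": {"type": "string", "analyzer": "english"},
                "views": {"type": "integer"},
                "published": {"type": "date"},
                "location": {"type": "geo_point"},
            }
        }
    }


properties = blog_post_mapping()["blog_post"]["properties"]
for name, spec in sorted(properties.items()):
    print(name, spec["type"])
```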

**[→ Complete Mappings Reference](mappings.md)**

### Rivers for Data Streaming

Set up automated data ingestion from external sources:

```python { .api }
from pyes import CouchDBRiver, TwitterRiver, JDBCRiver

# CouchDB replication river
couchdb_river = CouchDBRiver(
    couchdb_db="mydb",
    couchdb_host="localhost",
    couchdb_port=5984,
    es_index="replicated_data",
    es_type="document"
)
es.create_river(couchdb_river, "couchdb_sync")

# Twitter streaming river
twitter_river = TwitterRiver(
    oauth_token="token",
    oauth_secret="secret",
    consumer_key="key",
    consumer_secret="secret",
    filter_tracks=["python", "elasticsearch"]
)
es.create_river(twitter_river, "twitter_stream")
```
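Rivers were an Elasticsearch 1.x plugin mechanism (deprecated in 1.5 and removed in 2.0), registered by writing a `_meta` document under the `_river` index. The sketch below shows roughly what such a registration looks like for the CouchDB river, following the layout documented by the CouchDB river plugin. `couchdb_river_request` is a hypothetical helper, and the exact document PyES sends may differ.

```python
import json


def couchdb_river_request(river_name, db, host, port, es_index, es_type):
    """Return (method, path, body) for registering a CouchDB river (hypothetical helper)."""
    body = {
        "type": "couchdb",
        "couchdb": {"host": host, "port": port, "db": db},
        "index": {"index": es_index, "type": es_type},
    }
    return "PUT", "/_river/%s/_meta" % river_name, body


method, path, body = couchdb_river_request(
    "couchdb_sync", "mydb", "localhost", 5984, "replicated_data", "document"
)
print(method, path)
print(json.dumps(body, indent=2))
```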

**[→ Complete Rivers Reference](rivers.md)**

### Bulk Operations for Performance

Handle large-scale data operations efficiently:

```python { .api }
# Configure bulk processing
es.bulk_size = 1000  # Process in batches of 1000

# Bulk indexing with automatic flushing
documents = [{"title": f"Doc {i}", "content": f"Content {i}"} for i in range(5000)]

for doc in documents:
    es.index(doc, "bulk_index", "doc", bulk=True)
    # Automatically flushes when bulk_size reached

# Manual bulk operations
es.force_bulk()  # Force immediate processing

# Bulk deletion
es.delete("index", "type", "id1", bulk=True)
es.delete("index", "type", "id2", bulk=True)
es.flush_bulk()
```
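The buffering behaviour described in the comments above can be pictured with a small stand-alone model. The sketch below is not the PyES implementation: `BulkBuffer` and its `send` callback are hypothetical, and the payload follows the newline-delimited format of the Elasticsearch `_bulk` endpoint (an action line followed by a source line, with a trailing newline).

```python
import json


class BulkBuffer:
    """Hypothetical client-side bulk buffer; a model of the idea, not PyES code."""

    def __init__(self, bulk_size, send):
        self.bulk_size = bulk_size
        self.send = send  # callable that receives one newline-delimited payload
        self.actions = []

    def add_index(self, doc, index, doc_type, doc_id=None):
        header = {"index": {"_index": index, "_type": doc_type}}
        if doc_id is not None:
            header["index"]["_id"] = doc_id
        self.actions.append((header, doc))
        if len(self.actions) >= self.bulk_size:
            self.flush()  # automatic flush once the batch is full

    def flush(self):
        if not self.actions:
            return
        lines = []
        for header, doc in self.actions:
            lines.append(json.dumps(header))
            lines.append(json.dumps(doc))
        self.send("\n".join(lines) + "\n")  # the bulk format needs a final newline
        self.actions = []


payloads = []
buffer = BulkBuffer(bulk_size=2, send=payloads.append)
for i in range(5):
    buffer.add_index({"title": "Doc %d" % i}, "bulk_index", "doc")
buffer.flush()  # send the final, partially filled batch
```

With five documents and a batch size of two, the model sends two full batches automatically and needs the explicit `flush()` for the last document, which is why a final flush after a bulk loop matters.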

**[→ Complete Bulk Operations Reference](bulk-operations.md)**

## Advanced Features

### Percolator Queries

Store queries and match documents against them:

```python { .api }
# Register percolator query
percolator_query = TermQuery("tags", "python")
es.create_percolator("blog", "python_posts", percolator_query)

# Test document against registered queries
doc = {"title": "Python Tutorial", "tags": ["python", "programming"]}
matches = es.percolate("blog", ["post"], doc)
```
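Percolation inverts the usual search direction: queries are stored, and a document is tested against them. The toy model below illustrates that idea in plain Python without contacting a cluster. `term_matcher` and `percolate` are hypothetical names, the `java_posts` query is added only for contrast, and real percolation applies analysis and full query semantics that this sketch ignores.

```python
def term_matcher(field, value):
    """Return a predicate that mimics a term query on an un-analyzed field."""
    def matches(doc):
        field_value = doc.get(field)
        if isinstance(field_value, (list, tuple)):
            return value in field_value
        return field_value == value
    return matches


# Stored queries, keyed by the name they were registered under.
registered = {
    "python_posts": term_matcher("tags", "python"),
    "java_posts": term_matcher("tags", "java"),
}


def percolate(doc, queries):
    """Return the names of all stored queries that match the document."""
    return sorted(name for name, matches in queries.items() if matches(doc))


doc = {"title": "Python Tutorial", "tags": ["python", "programming"]}
matched = percolate(doc, registered)
print(matched)
```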

### More Like This

Find similar documents:

```python { .api }
similar_docs = es.morelikethis(
    "blog", "post", "doc_id_1",
    fields=["title", "content"],
    min_term_freq=1,
    max_query_terms=12
)
```

### Suggestions and Auto-complete

Provide search suggestions:

```python { .api }
from pyes import Suggest

# Term suggestions
suggest = Suggest()
suggest.add_term("python programming", "title_suggest", "title")

suggestions = es.suggest_from_object(suggest, indices=["blog"])
```

### Geospatial Search

Search by geographic location:

```python { .api }
from pyes import GeoDistanceFilter, Search

# Find restaurants within 2km
geo_query = Search().filter(
    GeoDistanceFilter(
        distance="2km",
        location={"lat": 40.7128, "lon": -74.0060}
    )
)

nearby_restaurants = es.search(geo_query, indices=["restaurants"])
```
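Conceptually, a distance filter keeps documents whose location lies within a radius of a centre point. The sketch below illustrates that check with the haversine great-circle formula. It is an approximation for intuition only: Elasticsearch's own distance calculation may differ, the two test points are made up for illustration, and `haversine_km` and `within` are hypothetical helpers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within(center, point, radius_km):
    """True if `point` is no further than `radius_km` from `center`."""
    distance = haversine_km(center["lat"], center["lon"], point["lat"], point["lon"])
    return distance <= radius_km


center = {"lat": 40.7128, "lon": -74.0060}
near_point = {"lat": 40.7200, "lon": -74.0000}  # made-up point, roughly 1 km away
far_point = {"lat": 40.7580, "lon": -73.9855}   # made-up point, roughly 5 km away

print(within(center, near_point, 2.0), within(center, far_point, 2.0))
```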

## Connection and Configuration

PyES supports multiple connection protocols and extensive configuration:

```python { .api }
# HTTP connection (default)
es = ES(
    server=["host1:9200", "host2:9200"],  # Multiple hosts for failover
    timeout=30.0,
    max_retries=3,
    retry_time=60,
    basic_auth=("username", "password"),
    cert_reqs='CERT_REQUIRED'  # SSL certificate verification
)

# Thrift connection (optional)
from pyes import ES
es = ES(server="localhost:9500", connection_type="thrift")
```
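One way to picture failover across several hosts is a round-robin pool that sets a failed server aside for a while before trying it again. The sketch below models that idea under the assumption that `retry_time` is the number of seconds a failed host is skipped. `ServerPool` is hypothetical, and PyES's actual connection handling may work differently.

```python
import time


class ServerPool:
    """Hypothetical round-robin pool with a dead-server timeout; not PyES code."""

    def __init__(self, servers, retry_time=60, clock=time.monotonic):
        self.servers = list(servers)
        self.retry_time = retry_time
        self.clock = clock
        self.dead_until = {}  # server -> time at which it may be tried again
        self._next = 0

    def mark_dead(self, server):
        self.dead_until[server] = self.clock() + self.retry_time

    def get(self):
        """Return the next server that is not currently marked dead."""
        now = self.clock()
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if self.dead_until.get(server, float("-inf")) <= now:
                return server
        raise RuntimeError("no server available")


# A fake clock keeps the example deterministic.
fake_now = [0.0]
pool = ServerPool(["host1:9200", "host2:9200"], retry_time=60, clock=lambda: fake_now[0])

first = pool.get()
pool.mark_dead(first)    # pretend the request to the first host failed
second = pool.get()
third = pool.get()       # the dead host is skipped
fake_now[0] = 61.0       # retry_time has elapsed
fourth = pool.get()      # the first host is eligible again
```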

## Error Handling

PyES raises typed exceptions, so specific failures can be caught before the generic base class:

```python { .api }
from pyes import (
    ElasticSearchException, IndexMissingException,
    DocumentMissingException, BulkOperationException
)

try:
    result = es.get("missing_index", "doc_type", "doc_id")
except IndexMissingException:
    print("Index does not exist")
except DocumentMissingException:
    print("Document not found")
except ElasticSearchException as e:
    print(f"ElasticSearch error: {e}")
```
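Some failures, such as a temporarily unreachable node, are worth retrying rather than reporting straight away. The sketch below shows one common pattern, retrying with exponential backoff. It is generic Python rather than a PyES feature: `TransientError` stands in for whichever exception an application decides is retryable, and `call_with_retries` is a hypothetical helper.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for an error the application considers retryable."""


def call_with_retries(operation, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call `operation`, retrying on TransientError with exponential backoff."""
    attempt = 0
    while True:
        try:
            return operation()
        except TransientError:
            if attempt >= max_retries:
                raise  # give up after the configured number of retries
            sleep(base_delay * (2 ** attempt))
            attempt += 1


# Simulated operation that fails twice before succeeding.
calls = {"count": 0}


def flaky_get():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return {"_id": "doc_id"}


delays = []
result = call_with_retries(flaky_get, sleep=delays.append)  # record delays instead of sleeping
```

Only errors that are safe to repeat should be retried this way; an `IndexMissingException`, for example, will not resolve itself on a second attempt.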

## Performance Considerations

- Use bulk operations for high-throughput indexing
- Implement connection pooling for concurrent access
- Use filters instead of queries when scoring is not needed
- Configure appropriate `bulk_size` based on document size and memory
- Use scan & scroll for large result sets
- Implement proper error handling and retry logic

## Migration and Compatibility

PyES maintains compatibility with ElasticSearch versions up to 2.x. For newer ElasticSearch versions (5.x+), consider migrating to the official `elasticsearch-py` client. PyES supports both Python 2 and Python 3.

---

This documentation provides comprehensive coverage of the PyES Python ElasticSearch driver. Each linked section contains detailed API references, examples, and usage patterns for building robust search-enabled applications.