
tessl/pypi-wikipedia

Wikipedia API for Python that simplifies access to Wikipedia data through the MediaWiki API

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypi/wikipedia@1.4.x

To install, run:

```
npx @tessl/cli install tessl/pypi-wikipedia@1.4.0
```

# Wikipedia

A Python library that provides easy access to Wikipedia data through the MediaWiki API. Wikipedia simplifies search, content retrieval, and metadata extraction from Wikipedia pages without requiring direct knowledge of the MediaWiki API.

## Package Information

- **Package Name**: wikipedia
- **Language**: Python
- **Installation**: `pip install wikipedia`

## Core Imports

```python
import wikipedia
```

All functionality is available through the main module:

```python
from wikipedia import search, page, summary, set_lang
from wikipedia import WikipediaPage, PageError, DisambiguationError
from datetime import timedelta  # For set_rate_limiting
from decimal import Decimal     # For coordinate types
```

## Basic Usage

```python
import wikipedia

# Search for articles
results = wikipedia.search("Barack Obama")
print(results)  # ['Barack Obama', 'Barack Obama Sr.', ...]

# Get a page summary
summary = wikipedia.summary("Barack Obama", sentences=2)
print(summary)

# Get a full page with properties
page = wikipedia.page("Barack Obama")
print(page.title)
print(page.url)
print(page.content[:200])  # First 200 characters
print(page.images[:3])     # First 3 image URLs
print(page.links[:5])      # First 5 linked pages

# Geographic search
nearby = wikipedia.geosearch(40.7128, -74.0060, results=5)  # NYC coordinates
print(nearby)  # Articles near New York City

# Change language and search
wikipedia.set_lang("fr")
summary_fr = wikipedia.summary("Barack Obama", sentences=1)
print(summary_fr)

# Enable rate limiting for heavy usage
from datetime import timedelta
wikipedia.set_rate_limiting(True, min_wait=timedelta(milliseconds=100))
```

## Capabilities

### Search Functions

Search Wikipedia for articles and get suggestions.

```python { .api }
def search(query, results=10, suggestion=False):
    """
    Search Wikipedia for articles matching the query.

    Parameters:
    - query (str): Search term
    - results (int): Maximum number of results (default: 10)
    - suggestion (bool): Return search suggestion if True (default: False)

    Returns:
    - list: Article titles if suggestion=False
    - tuple: (titles_list, suggestion_string) if suggestion=True
    """

def geosearch(latitude, longitude, title=None, results=10, radius=1000):
    """
    Geographic search for articles near coordinates.

    Parameters:
    - latitude (float): Latitude coordinate
    - longitude (float): Longitude coordinate
    - title (str, optional): Specific article to search for
    - results (int): Maximum results (default: 10)
    - radius (int): Search radius in meters (10-10000, default: 1000)

    Returns:
    - list: Article titles near the coordinates

    Example:
        # Find articles near the Eiffel Tower
        eiffel_articles = geosearch(48.8584, 2.2945, radius=500)
        # Find specific landmark near coordinates
        landmarks = geosearch(40.7589, -73.9851, title="Central Park", radius=1000)
    """

def suggest(query):
    """
    Get search suggestion for a query.

    Parameters:
    - query (str): Search term

    Returns:
    - str or None: Suggested search term or None if no suggestion
    """

def random(pages=1):
    """
    Get random Wikipedia article titles.

    Parameters:
    - pages (int): Number of random articles (max 10, default: 1)

    Returns:
    - str: Single title if pages=1
    - list: Multiple titles if pages>1
    """
```
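Because `search()` changes its return shape when `suggestion=True`, callers often normalize the result before using it. A minimal helper sketch (`normalize_search` is an illustrative pattern, not part of the library):

```python
def normalize_search(result):
    """Accept either return shape of search() and give back (titles, suggestion)."""
    if isinstance(result, tuple):  # suggestion=True: (titles_list, suggestion_string)
        titles, suggestion = result
        return list(titles), suggestion
    return list(result), None      # suggestion=False: plain list of titles

# Works with both shapes:
assert normalize_search(["Barack Obama", "Barack Obama Sr."]) == (
    ["Barack Obama", "Barack Obama Sr."], None)
assert normalize_search((["Barack Obama"], "barack obama")) == (
    ["Barack Obama"], "barack obama")
```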

### Content Access

Retrieve article content and create page objects.

```python { .api }
def summary(title, sentences=0, chars=0, auto_suggest=True, redirect=True):
    """
    Get plain text summary of a Wikipedia page.

    Parameters:
    - title (str): Page title
    - sentences (int): Limit to first N sentences (max 10, default: 0 for intro)
    - chars (int): Limit to first N characters (default: 0 for intro)
    - auto_suggest (bool): Auto-correct page title (default: True)
    - redirect (bool): Follow redirects (default: True)

    Returns:
    - str: Plain text summary
    """

def page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False):
    """
    Get WikipediaPage object for a page.

    Parameters:
    - title (str, optional): Page title
    - pageid (int, optional): Numeric page ID (mutually exclusive with title)
    - auto_suggest (bool): Auto-correct page title (default: True)
    - redirect (bool): Follow redirects (default: True)
    - preload (bool): Load all properties during initialization (default: False)

    Returns:
    - WikipediaPage: Page object with lazy-loaded properties
    """
```
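`page()` treats `title` and `pageid` as mutually exclusive. The validation pattern behind that contract can be sketched in plain Python (`resolve_page` is a hypothetical stand-in for illustration, not the library's code):

```python
def resolve_page(title=None, pageid=None):
    """Require exactly one of title or pageid, as page() documents."""
    if (title is None) == (pageid is None):
        raise ValueError("specify either a title or a pageid, but not both")
    if title is not None:
        return ("title", title)
    return ("pageid", pageid)

assert resolve_page(title="Barack Obama") == ("title", "Barack Obama")
assert resolve_page(pageid=12345) == ("pageid", 12345)
```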

### Configuration

Configure library behavior for language, rate limiting, and user agent.

```python { .api }
def set_lang(prefix):
    """
    Change Wikipedia language edition.

    Parameters:
    - prefix (str): Two-letter language code ('en', 'fr', 'es', etc.)

    Note: Clears search, suggest, and summary caches
    """

def set_user_agent(user_agent_string):
    """
    Set custom User-Agent header for requests.

    Parameters:
    - user_agent_string (str): Custom User-Agent string
    """

def set_rate_limiting(rate_limit, min_wait=timedelta(milliseconds=50)):
    """
    Enable or disable rate limiting for API requests.

    Parameters:
    - rate_limit (bool): Enable rate limiting
    - min_wait (timedelta, optional): Minimum wait between requests
      (default: timedelta(milliseconds=50))
    """
```
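The effect of `min_wait` can be illustrated with a self-contained throttle. This sketch only mimics the documented behavior (a minimum gap between consecutive requests); it is not the library's internal implementation:

```python
import time
from datetime import timedelta

class MinWaitThrottle:
    """Block so that consecutive calls are at least min_wait apart."""

    def __init__(self, min_wait=timedelta(milliseconds=50)):
        self.min_wait = min_wait.total_seconds()
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_wait - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)  # enforce the minimum gap
        self._last = time.monotonic()

throttle = MinWaitThrottle(timedelta(milliseconds=20))
start = time.monotonic()
for _ in range(3):
    throttle.wait()  # simulate three back-to-back API calls
elapsed = time.monotonic() - start
# Two enforced gaps of >= 20 ms each, so elapsed is at least ~40 ms
```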

### Utility Functions

Additional utility functions for language support and donations.

```python { .api }
def languages():
    """
    Get all supported Wikipedia language prefixes.

    Returns:
    - dict: Language code to local name mapping
    """

def donate():
    """
    Open Wikimedia donation page in default browser.
    """
```
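A common pattern is validating a prefix against the `languages()` mapping before calling `set_lang()`. Sketched here with a hard-coded sample mapping standing in for a live `languages()` call (the subset shown is illustrative):

```python
# Illustrative subset of what languages() returns: code -> local name
SAMPLE_LANGUAGES = {"en": "English", "fr": "français", "es": "español"}

def checked_prefix(prefix, available=SAMPLE_LANGUAGES):
    """Return prefix unchanged if supported, else raise a descriptive error."""
    if prefix not in available:
        raise ValueError(f"unsupported language prefix: {prefix!r}")
    return prefix

assert checked_prefix("fr") == "fr"
```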

## WikipediaPage Class

Represents a Wikipedia page with lazy-loaded properties for content and metadata.

```python { .api }
class WikipediaPage:
    def __init__(self, title=None, pageid=None, redirect=True, preload=False, original_title=''):
        """
        Initialize WikipediaPage object.

        Parameters:
        - title (str, optional): Page title
        - pageid (int, optional): Numeric page ID
        - redirect (bool): Allow redirects (default: True)
        - preload (bool): Load all properties immediately (default: False)
        - original_title (str): Original search title
        """

    # Properties (lazy-loaded)
    title: str             # Page title
    url: str               # Full Wikipedia URL
    pageid: str            # Numeric page ID (stored as string)
    content: str           # Full plain text content
    summary: str           # Plain text summary (intro section)
    images: list[str]      # List of image URLs
    coordinates: tuple[Decimal, Decimal] | None  # (latitude, longitude) or None
    references: list[str]  # External link URLs
    links: list[str]       # Wikipedia page titles linked from this page
    categories: list[str]  # Wikipedia categories for this page
    sections: list[str]    # Section titles from table of contents
    revision_id: int       # Current revision ID
    parent_id: int         # Parent revision ID

    def html(self):
        """
        Get full page HTML content.

        Returns:
        - str: Complete HTML content

        Warning: Can be slow for long pages
        """

    def section(self, section_title):
        """
        Get plain text content of a specific section.

        Parameters:
        - section_title (str): Section title from self.sections

        Returns:
        - str or None: Section content or None if not found

        Warning: Only returns content between section and next subsection
        """
```
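The lazy-loading behavior described above (each property fetched on first access, then cached) can be sketched with `functools.cached_property`. `LazyPage` is a simplified stand-in for illustration, not the real `WikipediaPage`:

```python
from functools import cached_property

class LazyPage:
    """Stand-in: each property triggers one simulated fetch, then is cached."""

    def __init__(self, title):
        self.title = title
        self.fetch_count = 0   # counts simulated API requests

    @cached_property
    def content(self):
        self.fetch_count += 1  # pretend this hit the MediaWiki API
        return f"Full text of {self.title}"

page = LazyPage("Barack Obama")
assert page.fetch_count == 0   # nothing fetched at construction time
_ = page.content               # first access: the fetch happens now
_ = page.content               # second access: served from the cache
assert page.fetch_count == 1
```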

## Exception Classes

Custom exceptions for error handling.

```python { .api }
class WikipediaException(Exception):
    """Base exception class for all Wikipedia errors."""

    def __init__(self, error):
        self.error = error

class PageError(WikipediaException):
    """Raised when no Wikipedia page matches a query."""

    def __init__(self, pageid=None, *args):
        # Sets self.pageid or self.title based on parameters
        pass

class DisambiguationError(WikipediaException):
    """Raised when a page resolves to a disambiguation page."""

    def __init__(self, title, may_refer_to):
        self.title = title
        self.options = may_refer_to  # List of possible page titles

class RedirectError(WikipediaException):
    """Raised when a page redirects but redirect=False."""

    def __init__(self, title):
        self.title = title

class HTTPTimeoutError(WikipediaException):
    """Raised when MediaWiki API request times out."""

    def __init__(self, query):
        self.query = query
```

## Error Handling Examples

```python
import wikipedia

# Handle page not found
try:
    page = wikipedia.page("Nonexistent Page", auto_suggest=False)
except wikipedia.PageError as e:
    print(f"Page not found: {e}")

# Handle disambiguation pages
try:
    page = wikipedia.page("Python")  # Might be ambiguous
except wikipedia.DisambiguationError as e:
    print(f"Multiple pages found for '{e.title}':")
    for option in e.options[:5]:  # Show first 5 options
        print(f" - {option}")
    # Choose specific page
    page = wikipedia.page(e.options[0])

# Handle redirect pages
try:
    page = wikipedia.page("Redirect Page", redirect=False)
except wikipedia.RedirectError as e:
    print(f"Page '{e.title}' redirects. Set redirect=True to follow.")

# Handle API timeouts with retry logic
import time

def robust_search(query, max_retries=3):
    for attempt in range(max_retries):
        try:
            return wikipedia.search(query)
        except wikipedia.HTTPTimeoutError as e:
            if attempt < max_retries - 1:
                print(f"Timeout on attempt {attempt + 1}, retrying...")
                time.sleep(2 ** attempt)  # Exponential backoff
            else:
                print(f"Failed after {max_retries} attempts: {e}")
                return []

# Handle general Wikipedia exceptions
try:
    results = wikipedia.search("test query")
    page = wikipedia.page(results[0])
except wikipedia.WikipediaException as e:
    print(f"Wikipedia error: {e}")
except IndexError:
    print("No search results found")
```