# Wikipedia

A Python library that provides easy access to Wikipedia data through the MediaWiki API. The library simplifies search, content retrieval, and metadata extraction from Wikipedia pages without requiring direct knowledge of the API.

## Package Information

- **Package Name**: wikipedia
- **Language**: Python
- **Installation**: `pip install wikipedia`

## Core Imports

```python
import wikipedia
```

All functionality is available through the main module:

```python
from wikipedia import search, page, summary, set_lang
from wikipedia import WikipediaPage, PageError, DisambiguationError
from datetime import timedelta  # For set_rate_limiting
from decimal import Decimal  # For coordinate types
```

## Basic Usage

```python
from datetime import timedelta

import wikipedia

# Search for articles
results = wikipedia.search("Barack Obama")
print(results)  # ['Barack Obama', 'Barack Obama Sr.', ...]

# Get a page summary
summary = wikipedia.summary("Barack Obama", sentences=2)
print(summary)

# Get a full page with properties
page = wikipedia.page("Barack Obama")
print(page.title)
print(page.url)
print(page.content[:200])  # First 200 characters
print(page.images[:3])  # First 3 image URLs
print(page.links[:5])  # First 5 linked pages

# Geographic search
nearby = wikipedia.geosearch(40.7128, -74.0060, results=5)  # NYC coordinates
print(nearby)  # Articles near New York City

# Change language and search
wikipedia.set_lang("fr")
summary_fr = wikipedia.summary("Barack Obama", sentences=1)
print(summary_fr)

# Enable rate limiting for heavy usage
wikipedia.set_rate_limiting(True, min_wait=timedelta(milliseconds=100))
```

## Capabilities

### Search Functions

Search Wikipedia for articles and get suggestions.

```python { .api }
def search(query, results=10, suggestion=False):
    """
    Search Wikipedia for articles matching the query.

    Parameters:
    - query (str): Search term
    - results (int): Maximum number of results (default: 10)
    - suggestion (bool): Return search suggestion if True (default: False)

    Returns:
    - list: Article titles if suggestion=False
    - tuple: (titles_list, suggestion_string) if suggestion=True
    """

def geosearch(latitude, longitude, title=None, results=10, radius=1000):
    """
    Geographic search for articles near coordinates.

    Parameters:
    - latitude (float): Latitude coordinate
    - longitude (float): Longitude coordinate
    - title (str, optional): Specific article to search for
    - results (int): Maximum results (default: 10)
    - radius (int): Search radius in meters (10-10000, default: 1000)

    Returns:
    - list: Article titles near the coordinates

    Example:
        # Find articles near the Eiffel Tower
        eiffel_articles = geosearch(48.8584, 2.2945, radius=500)
        # Find specific landmark near coordinates
        landmarks = geosearch(40.7589, -73.9851, title="Central Park", radius=1000)
    """

def suggest(query):
    """
    Get search suggestion for a query.

    Parameters:
    - query (str): Search term

    Returns:
    - str or None: Suggested search term or None if no suggestion
    """

def random(pages=1):
    """
    Get random Wikipedia article titles.

    Parameters:
    - pages (int): Number of random articles (max 10, default: 1)

    Returns:
    - str: Single title if pages=1
    - list: Multiple titles if pages>1
    """
```
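
The two return shapes documented above (a tuple when `suggestion=True`, and a `str` versus a `list` from `random`) are easy to trip over. The following is a minimal sketch of how a caller might normalize them. The helper names `search_with_fallback` and `random_titles` and the `wiki` parameter are hypothetical, not part of the library; `wiki` is assumed to be the imported `wikipedia` module, passed in so the helpers can be exercised without network access.

```python
def search_with_fallback(wiki, query, results=10):
    """Search for `query`; if nothing matches, retry once with the suggestion.

    `wiki` is assumed to be the imported `wikipedia` module, or any object
    exposing a compatible `search` function (hypothetical parameter).
    """
    # With suggestion=True, search returns (titles_list, suggestion_string).
    titles, suggestion = wiki.search(query, results=results, suggestion=True)
    if not titles and suggestion:
        # No direct hits: retry with the suggested query instead.
        titles = wiki.search(suggestion, results=results)
    return titles


def random_titles(wiki, pages=1):
    """Return random titles as a list, whether random() gives a str or a list."""
    picked = wiki.random(pages=pages)
    return [picked] if isinstance(picked, str) else list(picked)
```

Typical usage would be `search_with_fallback(wikipedia, "barak obma")` after `import wikipedia`.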

### Content Access

Retrieve article content and create page objects.

```python { .api }
def summary(title, sentences=0, chars=0, auto_suggest=True, redirect=True):
    """
    Get plain text summary of a Wikipedia page.

    Parameters:
    - title (str): Page title
    - sentences (int): Limit to first N sentences (max 10, default: 0 for intro)
    - chars (int): Limit to first N characters (default: 0 for intro)
    - auto_suggest (bool): Auto-correct page title (default: True)
    - redirect (bool): Follow redirects (default: True)

    Returns:
    - str: Plain text summary
    """

def page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False):
    """
    Get WikipediaPage object for a page.

    Parameters:
    - title (str, optional): Page title
    - pageid (int, optional): Numeric page ID (mutually exclusive with title)
    - auto_suggest (bool): Auto-correct page title (default: True)
    - redirect (bool): Follow redirects (default: True)
    - preload (bool): Load all properties during initialization (default: False)

    Returns:
    - WikipediaPage: Page object with lazy-loaded properties
    """
```

### Configuration

Configure library behavior for language, rate limiting, and user agent.

```python { .api }
def set_lang(prefix):
    """
    Change Wikipedia language edition.

    Parameters:
    - prefix (str): Language prefix, typically a two-letter code ('en', 'fr', 'es', etc.)

    Note: Clears search, suggest, and summary caches
    """

def set_user_agent(user_agent_string):
    """
    Set custom User-Agent header for requests.

    Parameters:
    - user_agent_string (str): Custom User-Agent string
    """

def set_rate_limiting(rate_limit, min_wait=timedelta(milliseconds=50)):
    """
    Enable or disable rate limiting for API requests.

    Parameters:
    - rate_limit (bool): Enable rate limiting
    - min_wait (timedelta, optional): Minimum wait between requests
      (default: timedelta(milliseconds=50))
    """
```
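
As an illustration, the three configuration calls above could be grouped into a single setup step. This is a sketch under stated assumptions: `configure_client`, its `wiki` parameter, and `min_wait_ms` are hypothetical names introduced here, and `wiki` is assumed to be the imported `wikipedia` module.

```python
from datetime import timedelta


def configure_client(wiki, lang="en", user_agent=None, min_wait_ms=100):
    """Apply User-Agent, rate limiting, and language settings in one place.

    `wiki` is assumed to be the imported `wikipedia` module (hypothetical
    parameter, so the helper can be tested with a stand-in object).
    """
    if user_agent is not None:
        wiki.set_user_agent(user_agent)
    # Enable rate limiting with the requested minimum wait between requests.
    wiki.set_rate_limiting(True, min_wait=timedelta(milliseconds=min_wait_ms))
    wiki.set_lang(lang)
```

A call such as `configure_client(wikipedia, lang="fr", user_agent="my-app/1.0")` would then precede any searches.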

### Utility Functions

Additional utility functions for language support and donations.

```python { .api }
def languages():
    """
    Get all supported Wikipedia language prefixes.

    Returns:
    - dict: Language code to local name mapping
    """

def donate():
    """
    Open Wikimedia donation page in default browser.
    """
```
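
Because `languages()` returns a mapping keyed by language prefix, it can be used to check a prefix before switching editions. A minimal sketch follows; `set_lang_checked` and its `wiki` parameter are hypothetical, with `wiki` assumed to be the imported `wikipedia` module.

```python
def set_lang_checked(wiki, prefix):
    """Switch language only if `prefix` appears in wiki.languages().

    Returns the local language name for the prefix. `wiki` is assumed to be
    the imported `wikipedia` module (hypothetical parameter).
    """
    supported = wiki.languages()  # dict: prefix -> local language name
    if prefix not in supported:
        raise ValueError(f"Unsupported language prefix: {prefix!r}")
    wiki.set_lang(prefix)
    return supported[prefix]
```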

## WikipediaPage Class

Represents a Wikipedia page with lazy-loaded properties for content and metadata.

```python { .api }
class WikipediaPage:
    def __init__(self, title=None, pageid=None, redirect=True, preload=False, original_title=''):
        """
        Initialize WikipediaPage object.

        Parameters:
        - title (str, optional): Page title
        - pageid (int, optional): Numeric page ID
        - redirect (bool): Allow redirects (default: True)
        - preload (bool): Load all properties immediately (default: False)
        - original_title (str): Original search title
        """

    # Properties (lazy-loaded)
    title: str  # Page title
    url: str  # Full Wikipedia URL
    pageid: str  # Numeric page ID (stored as string)
    content: str  # Full plain text content
    summary: str  # Plain text summary (intro section)
    images: list[str]  # List of image URLs
    coordinates: tuple[Decimal, Decimal] | None  # (latitude, longitude) or None
    references: list[str]  # External link URLs
    links: list[str]  # Wikipedia page titles linked from this page
    categories: list[str]  # Wikipedia categories for this page
    sections: list[str]  # Section titles from table of contents
    revision_id: int  # Current revision ID
    parent_id: int  # Parent revision ID

    def html(self):
        """
        Get full page HTML content.

        Returns:
        - str: Complete HTML content

        Warning: Can be slow for long pages
        """

    def section(self, section_title):
        """
        Get plain text content of a specific section.

        Parameters:
        - section_title (str): Section title from self.sections

        Returns:
        - str or None: Section content or None if not found

        Warning: Only returns content between section and next subsection
        """
```
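
To show how `sections` and `section()` fit together, here is a small sketch that collects a page's section text into a dictionary. The helper `sections_to_dict` is hypothetical; it relies only on the `sections` property and `section()` method described above, and skips titles for which `section()` returns `None` or an empty string.

```python
def sections_to_dict(page):
    """Map each section title on `page` to its plain text.

    `page` is assumed to be a WikipediaPage (or any object with a compatible
    `sections` attribute and `section()` method).
    """
    result = {}
    for title in page.sections:
        text = page.section(title)
        if text:  # section() returns None when the title is not found
            result[title] = text
    return result
```

For example, `sections_to_dict(wikipedia.page("Barack Obama"))` would be one way to call it.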

## Exception Classes

Custom exceptions for error handling.

```python { .api }
class WikipediaException(Exception):
    """Base exception class for all Wikipedia errors."""

    def __init__(self, error):
        self.error = error

class PageError(WikipediaException):
    """Raised when no Wikipedia page matches a query."""

    def __init__(self, pageid=None, *args):
        # Sets self.pageid or self.title based on parameters
        pass

class DisambiguationError(WikipediaException):
    """Raised when a page resolves to a disambiguation page."""

    def __init__(self, title, may_refer_to):
        self.title = title
        self.options = may_refer_to  # List of possible page titles

class RedirectError(WikipediaException):
    """Raised when a page redirects but redirect=False."""

    def __init__(self, title):
        self.title = title

class HTTPTimeoutError(WikipediaException):
    """Raised when MediaWiki API request times out."""

    def __init__(self, query):
        self.query = query
```

## Error Handling Examples

```python
import time

import wikipedia

# Handle page not found
try:
    page = wikipedia.page("Nonexistent Page", auto_suggest=False)
except wikipedia.PageError as e:
    print(f"Page not found: {e}")

# Handle disambiguation pages
try:
    page = wikipedia.page("Python")  # Might be ambiguous
except wikipedia.DisambiguationError as e:
    print(f"Multiple pages found for '{e.title}':")
    for option in e.options[:5]:  # Show first 5 options
        print(f"  - {option}")
    # Choose specific page
    page = wikipedia.page(e.options[0])

# Handle redirect pages
try:
    page = wikipedia.page("Redirect Page", redirect=False)
except wikipedia.RedirectError as e:
    print(f"Page '{e.title}' redirects. Set redirect=True to follow.")

# Handle API timeouts with retry logic
def robust_search(query, max_retries=3):
    for attempt in range(max_retries):
        try:
            return wikipedia.search(query)
        except wikipedia.HTTPTimeoutError as e:
            if attempt < max_retries - 1:
                print(f"Timeout on attempt {attempt + 1}, retrying...")
                time.sleep(2 ** attempt)  # Exponential backoff
            else:
                print(f"Failed after {max_retries} attempts: {e}")
                return []

# Handle general Wikipedia exceptions
try:
    results = wikipedia.search("test query")
    page = wikipedia.page(results[0])
except wikipedia.WikipediaException as e:
    print(f"Wikipedia error: {e}")
except IndexError:
    print("No search results found")
```
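
The disambiguation example above picks `e.options[0]`, which may itself be missing or ambiguous. One possible refinement, sketched below, walks the options until one loads. `load_first_match` and its `wiki` parameter are hypothetical; `wiki` is assumed to be the imported `wikipedia` module, which exposes `page`, `PageError`, and `DisambiguationError`.

```python
def load_first_match(wiki, title):
    """Return the first page that loads for `title`, or None.

    On a disambiguation page, each listed option is tried in order. `wiki`
    is assumed to be the imported `wikipedia` module (hypothetical parameter).
    """
    try:
        return wiki.page(title, auto_suggest=False)
    except wiki.DisambiguationError as e:
        candidates = list(e.options)  # keep the options; `e` is unbound after the block
    except wiki.PageError:
        return None

    for option in candidates:
        try:
            return wiki.page(option, auto_suggest=False)
        except (wiki.DisambiguationError, wiki.PageError):
            continue  # this option is ambiguous or missing; try the next one
    return None
```

Calling `load_first_match(wikipedia, "Mercury")` would be a typical use; a `None` result means no option could be resolved.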