# Wikipedia API Wrapper

Core functionality for initializing Wikipedia API connections, configuring extraction formats and language settings, and creating page objects. The Wikipedia class serves as the main entry point for all Wikipedia data access.

## Capabilities

### Wikipedia Initialization

Create and configure a Wikipedia API wrapper instance with user agent, language, format settings, and connection parameters.

```python { .api }
class Wikipedia:
    def __init__(
        self,
        user_agent: str,
        language: str = "en",
        variant: Optional[str] = None,
        extract_format: ExtractFormat = ExtractFormat.WIKI,
        headers: Optional[dict[str, Any]] = None,
        extra_api_params: Optional[dict[str, Any]] = None,
        **request_kwargs
    ):
        """
        Initialize Wikipedia API wrapper.

        Parameters:
        - user_agent: HTTP User-Agent identifier (required, min 5 chars)
        - language: Wikipedia language edition (e.g., 'en', 'es', 'fr')
        - variant: Language variant for languages that support conversion
        - extract_format: Content extraction format (WIKI or HTML)
        - headers: Additional HTTP headers for requests
        - extra_api_params: Additional API parameters for all requests
        - request_kwargs: Additional parameters for requests library (timeout, proxies, etc.)

        Raises:
            AssertionError: If user_agent is too short or language is invalid
        """
```

#### Usage Examples

```python
import wikipediaapi

# Basic initialization
wiki = wikipediaapi.Wikipedia(
    user_agent='MyApp/1.0 (contact@example.com)',
    language='en'
)

# With custom settings
wiki = wikipediaapi.Wikipedia(
    user_agent='MyApp/1.0 (contact@example.com)',
    language='zh',
    variant='zh-cn',  # Simplified Chinese variant
    extract_format=wikipediaapi.ExtractFormat.HTML,
    headers={'Accept-Language': 'zh-CN,zh;q=0.9'},
    timeout=15.0,  # Custom timeout
    proxies={'http': 'http://proxy:8080'}  # Proxy support
)

# Multiple language instances
wiki_en = wikipediaapi.Wikipedia('MyApp/1.0', 'en')
wiki_es = wikipediaapi.Wikipedia('MyApp/1.0', 'es')
wiki_fr = wikipediaapi.Wikipedia('MyApp/1.0', 'fr')
```

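When several language editions share the same settings, the repeated constructor calls can be folded into a small helper. The sketch below is one possible approach and not part of the library: `make_clients` and its `factory` parameter are hypothetical names, and `factory` is assumed to accept the keyword arguments documented for `Wikipedia.__init__`.

```python
from typing import Any, Callable, Dict, Iterable


def make_clients(
    factory: Callable[..., Any],
    languages: Iterable[str],
    **common_settings: Any,
) -> Dict[str, Any]:
    """Build one client per language, reusing the same settings for each.

    `factory` is a hypothetical parameter: any callable that accepts a
    `language` keyword plus the shared settings (for example, the
    Wikipedia class itself).
    """
    return {
        language: factory(language=language, **common_settings)
        for language in languages
    }
```

Under these assumptions, `make_clients(wikipediaapi.Wikipedia, ['en', 'es', 'fr'], user_agent='MyApp/1.0 (contact@example.com)', timeout=15.0)` would return a dictionary of instances keyed by language code.
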
### Page Creation

Create WikipediaPage objects for accessing Wikipedia content. Pages are loaded lazily: content is fetched only when accessed.

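The lazy-loading idea can be pictured with a small stand-in class. This is only an illustration of the pattern, not the library's implementation: `LazyText`, its `fetch` callback, and `fetch_count` are hypothetical names, and no network request is made.

```python
from typing import Callable, Optional


class LazyText:
    """Stand-in for a lazily loaded page; not the library's WikipediaPage."""

    def __init__(self, title: str, fetch: Callable[[str], str]) -> None:
        self.title = title
        self._fetch = fetch                # called only on first access to .text
        self._text: Optional[str] = None   # nothing loaded yet
        self.fetch_count = 0

    @property
    def text(self) -> str:
        if self._text is None:             # first access triggers the fetch
            self._text = self._fetch(self.title)
            self.fetch_count += 1
        return self._text


page = LazyText("Python_(programming_language)", lambda t: f"content of {t}")
created_without_fetch = page.fetch_count == 0   # nothing fetched at creation
first = page.text                               # the fetch happens here
second = page.text                              # served from the stored value
```

Creating the object is cheap; the cost of fetching is paid only when the content is actually read.
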
```python { .api }
def page(
    self,
    title: str,
    ns: WikiNamespace = Namespace.MAIN,
    unquote: bool = False
) -> WikipediaPage:
    """
    Create a WikipediaPage object for the specified title.

    Parameters:
    - title: Page title as used in Wikipedia URL
    - ns: Wikipedia namespace (default: MAIN)
    - unquote: Whether to URL-unquote the title

    Returns:
        WikipediaPage object (content loaded lazily)
    """

def article(
    self,
    title: str,
    ns: WikiNamespace = Namespace.MAIN,
    unquote: bool = False
) -> WikipediaPage:
    """
    Alias for page() method.

    Parameters:
    - title: Page title as used in Wikipedia URL
    - ns: Wikipedia namespace (default: MAIN)
    - unquote: Whether to URL-unquote the title

    Returns:
        WikipediaPage object (content loaded lazily)
    """
```

#### Usage Examples

```python
# Basic page creation
page = wiki.page('Python_(programming_language)')

# Page in different namespace
category_page = wiki.page('Physics', ns=wikipediaapi.Namespace.CATEGORY)

# URL-encoded title on Hindi Wikipedia (the title decodes to 'पाइथन')
wiki_hi = wikipediaapi.Wikipedia('MyApp/1.0', 'hi')
hindi_page = wiki_hi.page('%E0%A4%AA%E0%A4%BE%E0%A4%87%E0%A4%A5%E0%A4%A8', unquote=True)

# Using article() alias
page = wiki.article('Machine_learning')
```

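To see what a URL-encoded title stands for, it can be decoded with the standard library. This sketch assumes that `unquote=True` performs an equivalent percent-decoding, which is not spelled out here; `urllib.parse.unquote` is used purely for illustration.

```python
from urllib.parse import unquote

encoded_title = "%E0%A4%AA%E0%A4%BE%E0%A4%87%E0%A4%A5%E0%A4%A8"

# Percent-decoding interprets the escaped bytes as UTF-8 by default.
decoded_title = unquote(encoded_title)

# Titles without percent escapes pass through unchanged.
plain_title = unquote("Machine_learning")
```

Here `decoded_title` holds the five-character Devanagari title, while `plain_title` is identical to its input.
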
### Direct API Methods

Low-level methods for direct Wikipedia API access. These methods are used internally by WikipediaPage properties but can be called directly for custom use cases.

```python { .api }
def extracts(self, page: WikipediaPage, **kwargs) -> str:
    """
    Get page content extracts with custom parameters.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters (exsentences, exchars, etc.)

    Returns:
        Extracted page content as string
    """

def info(self, page: WikipediaPage) -> WikipediaPage:
    """
    Get page metadata and information.

    Parameters:
    - page: WikipediaPage object

    Returns:
        Updated WikipediaPage with metadata populated
    """

def langlinks(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get language links for the page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping language codes to WikipediaPage objects
    """

def links(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get internal links from the page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping page titles to WikipediaPage objects
    """

def backlinks(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get pages that link to this page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping page titles to WikipediaPage objects
    """

def categories(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get categories for the page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping category names to WikipediaPage objects
    """

def categorymembers(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get pages in the category (for category pages).

    Parameters:
    - page: WikipediaPage object representing a category
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping page titles to WikipediaPage objects
    """
```

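As a rough sketch of how the direct methods might be combined, the helpers below wrap `extracts` and post-process a title-to-page dictionary. `fetch_intro` and `sorted_titles` are hypothetical names; `wiki` is assumed to be a configured Wikipedia instance, and `exsentences` is one of the extra API parameters named in the `extracts` docstring.

```python
from typing import Any, List, Mapping


def fetch_intro(wiki: Any, title: str, sentences: int = 2) -> str:
    """Return a short extract for `title` via the direct `extracts` method.

    `wiki` is assumed to expose page() and extracts() as documented above.
    """
    page = wiki.page(title)
    return wiki.extracts(page, exsentences=sentences)


def sorted_titles(pages: Mapping[str, Any]) -> List[str]:
    """List the keys of a title-to-page dictionary in alphabetical order."""
    return sorted(pages)
```

For instance, `sorted_titles(wiki.links(page))` would give an alphabetical list of linked titles, since `links` is documented as returning a dictionary keyed by page title.
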
### Properties

Access Wikipedia instance configuration after initialization.

```python { .api }
@property
def language(self) -> str:
    """Get the configured language."""

@property
def variant(self) -> Optional[str]:
    """Get the configured language variant."""

@property
def extract_format(self) -> ExtractFormat:
    """Get the configured extraction format."""
```

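One way these values might be used is to record the active configuration, for example in a log line. The sketch below assumes only the three attributes documented above; `describe_config` is a hypothetical helper, and a stand-in object replaces a real Wikipedia instance so the snippet runs offline.

```python
from types import SimpleNamespace
from typing import Any, Dict


def describe_config(wiki: Any) -> Dict[str, Any]:
    """Collect the three documented configuration values into a dict."""
    return {
        "language": wiki.language,
        "variant": wiki.variant,
        "extract_format": wiki.extract_format,
    }


# Stand-in with the same three attributes; a real Wikipedia instance
# would be passed here instead.
stand_in = SimpleNamespace(language="zh", variant="zh-cn", extract_format="HTML")
config = describe_config(stand_in)
```
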
### Session Management

The Wikipedia class automatically manages HTTP sessions and cleanup.

```python { .api }
def __del__(self) -> None:
    """Automatically closes the HTTP session when Wikipedia object is destroyed."""
```

#### Usage Examples

```python
# Session is automatically managed
wiki = wikipediaapi.Wikipedia('MyApp/1.0', 'en')
# ... use wiki object
# Session is closed when the wiki object is destroyed,
# for example after it goes out of scope

# For long-running applications, scope the instance to a function
# so it can be released once the work is done
def process_pages(page_titles):
    wiki = wikipediaapi.Wikipedia('MyApp/1.0', 'en')
    for title in page_titles:
        page = wiki.page(title)
        # Process page...
    # No explicit cleanup needed: the session is closed
    # when the wiki object is destroyed
```

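The cleanup mechanism can be pictured with stand-in classes. This is a sketch of the general `__del__` pattern, not the library's code: `FakeSession` and `Client` are hypothetical names. Python does not guarantee exactly when `__del__` runs; in CPython it normally runs as soon as the last reference is dropped.

```python
import gc


class FakeSession:
    """Stand-in for an HTTP session; records whether close() was called."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Client:
    """Stand-in owner that closes its session in __del__."""

    def __init__(self, session: FakeSession) -> None:
        self._session = session

    def __del__(self) -> None:
        self._session.close()


session = FakeSession()
client = Client(session)
open_while_referenced = not session.closed

del client      # drop the last reference to the owner
gc.collect()    # nudge collection on interpreters without reference counting
closed_after_release = session.closed
```
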
## Error Handling

The Wikipedia class validates parameters and raises AssertionError for invalid configurations; an overly long language code only logs a warning:

- **user_agent**: Must be at least 5 characters long
- **language**: Must be specified and non-empty
- **Long language codes**: Warning logged if language code exceeds 5 characters

```python
# These will raise AssertionError
try:
    wiki = wikipediaapi.Wikipedia("", "en")  # Empty user agent
except AssertionError as e:
    print(f"Error: {e}")

try:
    wiki = wikipediaapi.Wikipedia("MyApp/1.0", "")  # Empty language
except AssertionError as e:
    print(f"Error: {e}")
```

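Configuration coming from user input can also be checked before the constructor is called. The following is a minimal sketch that mirrors the three rules listed above; `check_config` and its constants are hypothetical names, and the library's own checks may differ in detail.

```python
from typing import List

MIN_USER_AGENT_LENGTH = 5   # from the user_agent rule listed above
LONG_LANGUAGE_LENGTH = 5    # longer codes are only reported as a warning


def check_config(user_agent: str, language: str) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if len(user_agent) < MIN_USER_AGENT_LENGTH:
        problems.append("error: user_agent must be at least 5 characters long")
    code = language.strip()
    if not code:
        problems.append("error: language must be non-empty")
    elif len(code) > LONG_LANGUAGE_LENGTH:
        problems.append("warning: language code is longer than 5 characters")
    return problems
```

An empty result suggests the constructor should accept the values; entries prefixed with `error:` correspond to the cases documented as raising AssertionError.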