# Wikipedia API Wrapper

Core functionality for initializing Wikipedia API connections, configuring extraction formats, language settings, and creating page objects. The Wikipedia class serves as the main entry point for all Wikipedia data access.

## Capabilities

### Wikipedia Initialization

Create and configure a Wikipedia API wrapper instance with user agent, language, format settings, and connection parameters.

```python { .api }
class Wikipedia:
    def __init__(
        self,
        user_agent: str,
        language: str = "en",
        variant: Optional[str] = None,
        extract_format: ExtractFormat = ExtractFormat.WIKI,
        headers: Optional[dict[str, Any]] = None,
        extra_api_params: Optional[dict[str, Any]] = None,
        **request_kwargs
    ):
        """
        Initialize Wikipedia API wrapper.

        Parameters:
        - user_agent: HTTP User-Agent identifier (required, min 5 chars)
        - language: Wikipedia language edition (e.g., 'en', 'es', 'fr')
        - variant: Language variant for languages that support conversion
        - extract_format: Content extraction format (WIKI or HTML)
        - headers: Additional HTTP headers for requests
        - extra_api_params: Additional API parameters for all requests
        - request_kwargs: Additional parameters for requests library (timeout, proxies, etc.)

        Raises:
            AssertionError: If user_agent is too short or language is invalid
        """
```

#### Usage Examples

```python
import wikipediaapi

# Basic initialization
wiki = wikipediaapi.Wikipedia(
    user_agent='MyApp/1.0 (contact@example.com)',
    language='en'
)

# With custom settings
wiki = wikipediaapi.Wikipedia(
    user_agent='MyApp/1.0 (contact@example.com)',
    language='zh',
    variant='zh-cn',  # Simplified Chinese variant
    extract_format=wikipediaapi.ExtractFormat.HTML,
    headers={'Accept-Language': 'zh-CN,zh;q=0.9'},
    timeout=15.0,  # Custom timeout
    proxies={'http': 'http://proxy:8080'}  # Proxy support
)

# Multiple language instances
wiki_en = wikipediaapi.Wikipedia('MyApp/1.0', 'en')
wiki_es = wikipediaapi.Wikipedia('MyApp/1.0', 'es')
wiki_fr = wikipediaapi.Wikipedia('MyApp/1.0', 'fr')
```

### Page Creation

Create WikipediaPage objects for accessing Wikipedia content. Pages are loaded lazily: creating a page makes no request, and content is fetched only when it is first accessed.

```python { .api }
def page(
    self,
    title: str,
    ns: WikiNamespace = Namespace.MAIN,
    unquote: bool = False
) -> WikipediaPage:
    """
    Create a WikipediaPage object for the specified title.

    Parameters:
    - title: Page title as used in Wikipedia URL
    - ns: Wikipedia namespace (default: MAIN)
    - unquote: Whether to URL-unquote the title

    Returns:
        WikipediaPage object (content loaded lazily)
    """

def article(
    self,
    title: str,
    ns: WikiNamespace = Namespace.MAIN,
    unquote: bool = False
) -> WikipediaPage:
    """
    Alias for page() method.

    Parameters:
    - title: Page title as used in Wikipedia URL
    - ns: Wikipedia namespace (default: MAIN)
    - unquote: Whether to URL-unquote the title

    Returns:
        WikipediaPage object (content loaded lazily)
    """
```
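
The lazy-loading behaviour described above can be illustrated with a small, self-contained sketch. This is not the library's implementation: `LazyPage` and `fake_fetch` are hypothetical names, and the "network call" is a stand-in function that records how often it is invoked. It only shows the general pattern of deferring a fetch until first access and caching the result.

```python
from typing import Callable, List, Optional


class LazyPage:
    """Hypothetical stand-in for a lazily loaded page (not the library's class)."""

    def __init__(self, title: str, fetch: Callable[[str], str]) -> None:
        self.title = title
        self._fetch = fetch                 # invoked only on first access to .text
        self._text: Optional[str] = None    # cache for the fetched content

    @property
    def text(self) -> str:
        if self._text is None:              # first access triggers the fetch
            self._text = self._fetch(self.title)
        return self._text


calls: List[str] = []


def fake_fetch(title: str) -> str:
    """Pretend network call that records each invocation."""
    calls.append(title)
    return f"Content of {title}"


page = LazyPage("Python_(programming_language)", fake_fetch)
assert calls == []      # creating the page fetched nothing
first = page.text       # first access performs the fetch
second = page.text      # served from the cache, no second fetch
assert calls == ["Python_(programming_language)"]
```

The practical consequence for callers is that constructing many page objects is cheap; the cost is paid per page when a property that needs data is first read.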

#### Usage Examples

```python
# Basic page creation
page = wiki.page('Python_(programming_language)')

# Page in different namespace
category_page = wiki.page('Physics', ns=wikipediaapi.Namespace.CATEGORY)

# URL-encoded title (Hindi Wikipedia example)
hindi_page = wiki.page('%E0%A4%AA%E0%A4%BE%E0%A4%87%E0%A4%A5%E0%A4%A8', unquote=True)

# Using article() alias
page = wiki.article('Machine_learning')
```
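
To see what a URL-encoded title such as the one in the Hindi example stands for, the standard library can decode it. The sketch below assumes that `unquote=True` applies ordinary percent-decoding comparable to `urllib.parse.unquote`; that equivalence is an assumption, not something this page states.

```python
from urllib.parse import unquote

# Percent-encoded UTF-8 bytes of the title used in the example above
encoded = "%E0%A4%AA%E0%A4%BE%E0%A4%87%E0%A4%A5%E0%A4%A8"

# Decode the percent escapes and interpret the bytes as UTF-8
decoded = unquote(encoded)
print(decoded)  # prints the Devanagari title: पाइथन
```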

### Direct API Methods

Low-level methods for direct Wikipedia API access. These methods are used internally by WikipediaPage properties but can be called directly for custom use cases.

```python { .api }
def extracts(self, page: WikipediaPage, **kwargs) -> str:
    """
    Get page content extracts with custom parameters.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters (exsentences, exchars, etc.)

    Returns:
        Extracted page content as string
    """

def info(self, page: WikipediaPage) -> WikipediaPage:
    """
    Get page metadata and information.

    Parameters:
    - page: WikipediaPage object

    Returns:
        Updated WikipediaPage with metadata populated
    """

def langlinks(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get language links for the page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping language codes to WikipediaPage objects
    """

def links(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get internal links from the page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping page titles to WikipediaPage objects
    """

def backlinks(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get pages that link to this page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping page titles to WikipediaPage objects
    """

def categories(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get categories for the page.

    Parameters:
    - page: WikipediaPage object
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping category names to WikipediaPage objects
    """

def categorymembers(self, page: WikipediaPage, **kwargs) -> dict[str, WikipediaPage]:
    """
    Get pages in the category (for category pages).

    Parameters:
    - page: WikipediaPage object representing a category
    - kwargs: Additional API parameters

    Returns:
        Dictionary mapping page titles to WikipediaPage objects
    """
```
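
As a rough sketch of calling one of these methods directly, the helper below forwards `exsentences` to `extracts()` to request a shortened extract. `first_sentences` is a hypothetical helper, not part of the library, and `wiki` is assumed to be a `wikipediaapi.Wikipedia` instance (it is typed as `Any` so the snippet stays self-contained).

```python
from typing import Any


def first_sentences(wiki: Any, title: str, count: int = 2) -> str:
    """Return an extract limited to about `count` sentences.

    `wiki` is assumed to expose page() and extracts() as documented above;
    `exsentences` is passed through as an additional API parameter.
    """
    page = wiki.page(title)
    return wiki.extracts(page, exsentences=count)
```

With a real instance this would be called as `first_sentences(wiki, 'Physics', 3)`. The exact shape of the result depends on the configured extract format and on how the API honours the extra parameter.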

### Properties

Access Wikipedia instance configuration after initialization.

```python { .api }
@property
def language(self) -> str:
    """Get the configured language."""

@property
def variant(self) -> Optional[str]:
    """Get the configured language variant."""

@property
def extract_format(self) -> ExtractFormat:
    """Get the configured extraction format."""
```

### Session Management

The Wikipedia class automatically manages HTTP sessions and cleanup.

```python { .api }
def __del__(self) -> None:
    """Automatically closes the HTTP session when Wikipedia object is destroyed."""
```

#### Usage Examples

```python
import wikipediaapi

# Session is automatically managed
wiki = wikipediaapi.Wikipedia('MyApp/1.0', 'en')
# ... use wiki object
# Session is closed automatically when the wiki object is destroyed

# For long-running applications, scope the instance to a function so the
# session is released once the function returns
def process_pages(page_titles):
    wiki = wikipediaapi.Wikipedia('MyApp/1.0', 'en')
    for title in page_titles:
        page = wiki.page(title)
        # Process page...
    # No explicit cleanup is needed here; __del__ closes the session
```

## Error Handling

The Wikipedia class validates its parameters. Invalid configurations raise AssertionError, while an unusually long language code only logs a warning:

- **user_agent**: Must be at least 5 characters long; otherwise AssertionError is raised
- **language**: Must be specified and non-empty; otherwise AssertionError is raised
- **Long language codes**: A warning is logged if the language code exceeds 5 characters

```python
# These will raise AssertionError
try:
    wiki = wikipediaapi.Wikipedia("", "en")  # Empty user agent
except AssertionError as e:
    print(f"Error: {e}")

try:
    wiki = wikipediaapi.Wikipedia("MyApp", "")  # Empty language
except AssertionError as e:
    print(f"Error: {e}")
```
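
The validation rules listed above can be restated as a standalone function, which may be useful for checking configuration before constructing an instance. This is a minimal sketch of the documented rules only: `check_wikipedia_args` is a hypothetical helper, and its error and warning messages are illustrative, not the library's actual wording.

```python
import logging

log = logging.getLogger(__name__)


def check_wikipedia_args(user_agent: str, language: str) -> str:
    """Hypothetical restatement of the documented checks; returns the language."""
    # Rule 1: user agent must be at least 5 characters long
    if not user_agent or len(user_agent) < 5:
        raise AssertionError(f"user_agent {user_agent!r} is too short (min 5 chars)")
    # Rule 2: language must be specified and non-empty
    if not language or not language.strip():
        raise AssertionError("language must be specified and non-empty")
    # Rule 3: long language codes are allowed but logged as a warning
    if len(language) > 5:
        log.warning("Language code %r is longer than 5 characters", language)
    return language


check_wikipedia_args("MyApp/1.0", "en")      # passes
try:
    check_wikipedia_args("", "en")           # fails rule 1
except AssertionError as e:
    print(f"Error: {e}")
```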