or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

tessl/pypi-pyexcel-xlsx

A wrapper library to read, manipulate and write data in xlsx and xlsm format

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
pypipkg:pypi/pyexcel-xlsx@0.6.x

To install, run

npx @tessl/cli install tessl/pypi-pyexcel-xlsx@0.6.0

0

# Pyexcel-xlsx

1

2

A Python wrapper library that provides Excel XLSX and XLSM file reading and writing capabilities using openpyxl as the backend. It offers both standalone functionality and seamless integration as a plugin for the pyexcel ecosystem, supporting memory-efficient operations, pagination, and advanced features like merged cell handling and hidden sheet processing.

3

4

## Package Information

5

6

- **Package Name**: pyexcel-xlsx

7

- **Language**: Python

8

- **Installation**: `pip install pyexcel-xlsx`

9

10

## Core Imports

11

12

```python

13

from pyexcel_xlsx import get_data, save_data

14

```

15

16

For advanced usage with reader/writer classes:

17

18

```python

19

from pyexcel_xlsx.xlsxr import XLSXBook, XLSXBookInContent, FastSheet, SlowSheet

20

from pyexcel_xlsx.xlsxw import XLSXWriter, XLSXSheetWriter

21

```

22

23

For module constants:

24

25

```python

26

from pyexcel_xlsx import __FILE_TYPE__

27

```

28

29

## Basic Usage

30

31

```python

32

from pyexcel_xlsx import get_data, save_data

33

from collections import OrderedDict

34

35

# Read an Excel file

36

data = get_data("input.xlsx")

37

print(data) # {'Sheet1': [[1, 2, 3], [4, 5, 6]], 'Sheet2': [['a', 'b', 'c']]}

38

39

# Write data to an Excel file

40

output_data = OrderedDict()

41

output_data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})

42

output_data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})

43

save_data("output.xlsx", output_data)

44

45

# Memory operations

46

from io import BytesIO

47

io_buffer = BytesIO()

48

save_data(io_buffer, output_data)

49

io_buffer.seek(0)

50

data_from_memory = get_data(io_buffer)

51

```

52

53

## Capabilities

54

55

### Reading Excel Files

56

57

Load data from XLSX and XLSM files with support for pagination, hidden content handling, and merged cell processing.

58

59

```python { .api }

60

def get_data(afile, file_type=None, **keywords):

61

"""

62

Read data from Excel files (xlsx, xlsm).

63

64

Parameters:

65

- afile: str or file-like object, path to Excel file or file object

66

- file_type: str, optional file type ('xlsx' or 'xlsm'), auto-detected if None

67

- library: str, optional library identifier for plugin selection

68

- start_row: int, starting row index for pagination (0-based)

69

- row_limit: int, maximum number of rows to read

70

- start_column: int, starting column index for pagination (0-based)

71

- column_limit: int, maximum number of columns to read

72

- skip_hidden_sheets: bool, whether to skip hidden sheets (default: True)

73

- detect_merged_cells: bool, whether to detect and handle merged cells (default: False)

74

- skip_hidden_row_and_column: bool, whether to skip hidden rows/columns (default: True)

75

76

Returns:

77

dict: Dictionary with sheet names as keys and list of lists as values

78

"""

79

```

80

81

### Writing Excel Files

82

83

Save data to XLSX and XLSM files with write-only optimization for memory efficiency.

84

85

```python { .api }

86

def save_data(afile, data, file_type=None, **keywords):

87

"""

88

Write data to Excel files (xlsx, xlsm).

89

90

Parameters:

91

- afile: str or file-like object, path to Excel file or file object

92

- data: dict, dictionary with sheet names as keys and data as list of lists

93

- file_type: str, optional file type ('xlsx' or 'xlsm'), auto-detected if None

94

- **keywords: additional options passed to underlying writer

95

96

Returns:

97

None

98

"""

99

```

100

101

### Module Constants

102

103

Core module constants and file type identifiers.

104

105

```python { .api }

106

__FILE_TYPE__ = "xlsx" # Default file type identifier

107

```

108

109

### Advanced Reading Classes

110

111

Low-level reader classes for custom Excel file processing with configurable options for performance and feature trade-offs.

112

113

```python { .api }

114

class XLSXBook:

115

"""

116

Excel file reader with configurable options for performance and features.

117

"""

118

119

def __init__(

120

self,

121

file_alike_object,

122

file_type,

123

skip_hidden_sheets=True,

124

detect_merged_cells=False,

125

skip_hidden_row_and_column=True,

126

**keywords

127

):

128

"""

129

Initialize Excel file reader.

130

131

Parameters:

132

- file_alike_object: str or file-like object, Excel file path or file object

133

- file_type: str, file type identifier

134

- skip_hidden_sheets: bool, whether to skip hidden sheets

135

- detect_merged_cells: bool, whether to detect merged cells (disables read-only mode)

136

- skip_hidden_row_and_column: bool, whether to skip hidden content (disables read-only mode)

137

- **keywords: additional options

138

"""

139

140

def read_sheet(self, sheet_index):

141

"""

142

Read a specific sheet by index.

143

144

Parameters:

145

- sheet_index: int, zero-based sheet index

146

147

Returns:

148

Sheet reader object (FastSheet or SlowSheet)

149

"""

150

151

def close(self):

152

"""Close the Excel workbook and free resources."""

153

154

class XLSXBookInContent(XLSXBook):

155

"""

156

Excel reader for binary content from memory.

157

"""

158

159

def __init__(self, file_content, file_type, **keywords):

160

"""

161

Initialize Excel reader from binary content.

162

163

Parameters:

164

- file_content: bytes, binary content of Excel file

165

- file_type: str, file type identifier

166

- **keywords: additional options

167

"""

168

169

class FastSheet:

170

"""

171

Fast sheet reader using openpyxl's read-only mode for performance.

172

"""

173

174

def __init__(self, sheet, **keywords):

175

"""

176

Initialize fast sheet reader.

177

178

Parameters:

179

- sheet: openpyxl worksheet object

180

- **keywords: additional options (unused)

181

"""

182

183

def row_iterator(self):

184

"""

185

Iterate through rows in the sheet.

186

187

Returns:

188

Generator yielding row objects

189

"""

190

191

def column_iterator(self, row):

192

"""

193

Iterate through columns in a row.

194

195

Parameters:

196

- row: openpyxl row object

197

198

Returns:

199

Generator yielding cell values

200

"""

201

202

class SlowSheet(FastSheet):

203

"""

204

Slower sheet reader that supports hidden content and merged cells.

205

"""

206

207

def __init__(self, sheet, **keywords):

208

"""

209

Initialize slower sheet reader with advanced features.

210

211

Parameters:

212

- sheet: openpyxl worksheet object

213

- **keywords: additional options

214

"""

215

216

def row_iterator(self):

217

"""

218

Iterate through rows, skipping hidden rows.

219

220

Returns:

221

Generator yielding (row, row_index) tuples

222

"""

223

224

def column_iterator(self, row_struct):

225

"""

226

Iterate through columns, skipping hidden columns and handling merged cells.

227

228

Parameters:

229

- row_struct: tuple of (row, row_index)

230

231

Returns:

232

Generator yielding cell values with merged cell handling

233

"""

234

```

235

236

### Advanced Writing Classes

237

238

Low-level writer classes for custom Excel file generation with write-only optimization.

239

240

```python { .api }

241

class XLSXWriter:

242

"""

243

Excel file writer using write-only mode for memory efficiency.

244

"""

245

246

def __init__(self, file_alike_object, file_type, **keywords):

247

"""

248

Initialize Excel file writer.

249

250

Parameters:

251

- file_alike_object: str or file-like object, output file path or file object

252

- file_type: str, file type identifier (unused placeholder parameter)

253

- **keywords: additional options

254

"""

255

256

def create_sheet(self, name):

257

"""

258

Create a new sheet in the workbook.

259

260

Parameters:

261

- name: str, sheet name

262

263

Returns:

264

XLSXSheetWriter: Sheet writer object

265

"""

266

267

def write(self, incoming_dict):

268

"""

269

Write dictionary data to Excel file with multiple sheets.

270

271

Parameters:

272

- incoming_dict: dict, dictionary with sheet names as keys and data as values

273

"""

274

275

def close(self):

276

"""Save the workbook to file and close."""

277

278

class XLSXSheetWriter:

279

"""

280

Individual sheet writer for Excel files.

281

"""

282

283

def __init__(self, xlsx_sheet, sheet_name="Sheet"):

284

"""

285

Initialize sheet writer.

286

287

Parameters:

288

- xlsx_sheet: openpyxl worksheet object

289

- sheet_name: str, name for the sheet

290

"""

291

292

def write_row(self, array):

293

"""

294

Write a row of data to the sheet.

295

296

Parameters:

297

- array: list, row data as list of values

298

"""

299

300

def close(self):

301

"""Close the sheet writer (no operation required)."""

302

```

303

304

## Types

305

306

```python { .api }

307

from typing import Any, Dict, List, Union

308

from io import BinaryIO, BytesIO

309

310

# Sheet data structure

311

SheetData = List[List[Any]] # List of rows, each row is a list of cell values

312

313

# Workbook data structure

314

WorkbookData = Dict[str, SheetData] # Dictionary mapping sheet names to sheet data

315

316

# File-like objects supported

317

FileTypes = Union[str, BinaryIO, BytesIO] # File paths, binary streams, or BytesIO objects

318

319

# Constants

320

FILE_TYPE = "xlsx" # Default file type identifier

321

```

322

323

## Supported Features

324

325

- **File Formats**: XLSX (Excel Open XML Spreadsheet), XLSM (Excel Open XML Macro-Enabled Spreadsheet)

326

- **Read Modes**: Read-only mode for performance, full mode for advanced features

327

- **Write Mode**: Write-only mode for memory efficiency

328

- **Pagination**: Reading subsets of data with start_row, row_limit, start_column, column_limit

329

- **Hidden Content**: Skip or include hidden sheets, rows, and columns

330

- **Merged Cells**: Detection and handling of merged cell ranges

331

- **Memory Operations**: Direct BytesIO and file-like object support

332

- **Integration**: Seamless pyexcel plugin integration

333

- **Data Types**: Automatic handling of dates, times, numbers, and text

334

- **Multiple Sheets**: Full support for multi-sheet workbooks

335

336

## Error Handling

337

338

The library integrates with pyexcel-io error handling. Common exceptions include:

339

340

- **FileNotFoundError**: When specified file doesn't exist

341

- **PermissionError**: When file is locked or access denied

342

- **ValueError**: When invalid parameters or data formats are provided

343

- **openpyxl exceptions**: Underlying Excel file format errors

344

345

## Performance Considerations

346

347

- **Read-only mode**: Enabled by default for better performance, disabled when `skip_hidden_row_and_column=True` or `detect_merged_cells=True`

348

- **Write-only mode**: Always used for writing to minimize memory usage

349

- **Pagination**: Use for large files to limit memory consumption

350

- **Hidden content**: Skipping hidden content improves performance

351

- **Merged cells**: Detection has performance overhead, only enable when needed