
# Input/Output Operations

Comprehensive file I/O operations for loading, saving, and formatting array data. CuPy provides NumPy-compatible I/O functions for various data formats, including binary files, compressed archives, and text files, plus custom formatting options, with seamless GPU memory management.

## Capabilities

### Binary File Operations

Efficient binary file I/O for preserving exact array data with metadata, supporting single arrays or multiple arrays in compressed archives.

```python { .api }
def save(file, arr, allow_pickle=True, fix_imports=True):
    """
    Save an array to a binary file in NumPy .npy format.

    Parameters:
    - file: str or file-like, output file path or file object
    - arr: array_like, array data to save
    - allow_pickle: bool, allow saving object arrays with pickle
    - fix_imports: bool, force pickle protocol 2 for Python 2 compatibility

    Notes:
    - Data is transferred to the CPU before saving
    - Preserves dtype, shape, and array metadata
    - Compatible with numpy.load()
    """

def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True, encoding='ASCII'):
    """
    Load arrays from binary .npy or .npz files, or from pickled files.

    Parameters:
    - file: str or file-like, input file path or file object
    - mmap_mode: {None, 'r+', 'r', 'w+', 'c'}, memory mapping mode
    - allow_pickle: bool, allow loading pickled object arrays
    - fix_imports: bool, assume pickle protocol 2 names for Python 2 compatibility
    - encoding: str, encoding for reading Python 2 strings

    Returns:
    - ndarray or NpzFile: loaded array data on the GPU

    Notes:
    - Automatically transfers loaded data to the GPU
    - Supports the .npy single-array and .npz archive formats
    - Compatible with numpy.save() output
    """

def savez(file, *args, **kwds):
    """
    Save multiple arrays to a single uncompressed .npz file.

    Parameters:
    - file: str or file-like, output file path
    - *args: arrays to save with automatic naming (arr_0, arr_1, ...)
    - **kwds: arrays to save under the given keyword names

    Notes:
    - Creates a .npz archive containing multiple arrays
    - Arrays are transferred to the CPU before saving
    - Useful for saving related datasets together
    """

def savez_compressed(file, *args, **kwds):
    """
    Save multiple arrays to a compressed .npz archive.

    Parameters:
    - file: str or file-like, output file path
    - *args: arrays to save with automatic naming
    - **kwds: arrays to save under the given keyword names

    Notes:
    - Same as savez(), but compresses the archive for smaller files
    - Slower to save, but reduces disk space usage
    - Recommended for long-term storage
    """
```
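Because `save()` and `load()` accept file objects as well as paths (per the API above), arrays can round-trip through memory without touching disk. A minimal sketch of this NumPy-compatible behaviour, shown here with NumPy itself:

```python
import io
import numpy as np

# Round-trip an array through an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
np.save(buf, np.array([1, 2, 3]))
buf.seek(0)  # rewind before reading the .npy header back
restored = np.load(buf)
print(restored)  # [1 2 3]
```

With CuPy, `cp.save`/`cp.load` take the same file-like arguments, adding the device-to-host transfer described in the notes above.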

### Text File Operations

Human-readable text-format I/O for data exchange, debugging, and integration with other tools and programming languages.

```python { .api }
def loadtxt(fname, dtype=float, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None):
    """
    Load data from a text file, with each row containing array elements.

    Parameters:
    - fname: str or file-like, input file path or file object
    - dtype: data type, optional (default: float)
    - comments: str or sequence, characters marking comment lines
    - delimiter: str, optional, field delimiter (default: whitespace)
    - converters: dict, optional, mapping of column number to conversion function
    - skiprows: int, lines to skip at the beginning of the file
    - usecols: int or sequence, columns to read
    - unpack: bool, return a separate array for each column
    - ndmin: int, minimum number of dimensions for the returned array
    - encoding: str, encoding used to decode the input file
    - max_rows: int, optional, maximum number of rows to read

    Returns:
    - ndarray: loaded data on the GPU

    Notes:
    - Data is loaded on the CPU and then transferred to the GPU
    - Compatible with CSV and whitespace-delimited formats
    - Handles various numeric formats and missing values
    """

def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None):
    """
    Save an array to a text file.

    Parameters:
    - fname: str or file-like, output file path or file object
    - X: 1D or 2D array_like, data to save
    - fmt: str or sequence, format string for elements
    - delimiter: str, string separating columns
    - newline: str, string separating lines
    - header: str, string written at the beginning of the file
    - footer: str, string written at the end of the file
    - comments: str, string prefixing the header and footer
    - encoding: str, encoding for the output file

    Notes:
    - The array is transferred to the CPU before saving
    - Supports custom formatting for each column
    - Human-readable output suitable for external tools
    """

def genfromtxt(fname, dtype=float, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=" !#$%&'()*+,-./:;<=>?@[\\]^{|}~", defaultfmt='f%i', autostrip=False, replace_space='_', case_sensitive=True, unpack=None, ndmin=0, encoding='bytes', max_rows=None):
    """
    Load data from a text file, with enhanced handling of missing values.

    Parameters:
    - fname: str or file-like, input file path
    - dtype: data type for the array
    - comments: str, characters marking comment lines
    - delimiter: str, field delimiter
    - skip_header: int, lines to skip at the start
    - skip_footer: int, lines to skip at the end
    - converters: dict, column converters
    - missing_values: set, strings representing missing data
    - filling_values: values to use for missing data
    - usecols: sequence, columns to read
    - names: bool or list, field names for structured arrays
    - excludelist: sequence, names to exclude
    - deletechars: str, characters to remove from field names
    - defaultfmt: str, default field-name format
    - autostrip: bool, automatically strip whitespace
    - replace_space: str, character to replace spaces in names
    - case_sensitive: bool, field-name case sensitivity
    - unpack: bool, return separate arrays
    - ndmin: int, minimum number of dimensions
    - encoding: str, file encoding
    - max_rows: int, maximum number of rows to read

    Returns:
    - ndarray: loaded data on the GPU

    Notes:
    - More robust than loadtxt for complex text formats
    - Handles missing values and structured data
    - Supports named fields and data validation
    """

def fromfile(file, dtype=float, count=-1, sep='', offset=0):
    """
    Construct an array from data in a text or binary file.

    Parameters:
    - file: str or file-like, input file
    - dtype: data type for reading
    - count: int, number of items to read (-1 for all)
    - sep: str, separator between items (empty for binary)
    - offset: int, offset from the start of the file

    Returns:
    - ndarray: 1D array constructed from the file data

    Notes:
    - Binary mode when sep is the empty string
    - Text mode when sep is specified
    - Data is transferred to the GPU after reading
    """
```

### Data Conversion and Transfer

Functions for seamless data transfer between CPU and GPU memory, with format-conversion capabilities.

```python { .api }
def frombuffer(buffer, dtype=float, count=-1, offset=0):
    """
    Interpret a buffer as a 1D array.

    Parameters:
    - buffer: buffer_like, object exposing the buffer interface
    - dtype: data type for interpretation
    - count: int, number of items to read (-1 for all)
    - offset: int, start reading from this byte offset

    Returns:
    - ndarray: 1D array of the buffer data on the GPU

    Notes:
    - Interprets the existing buffer contents without reformatting them
    - Data is copied to GPU memory
    - Useful for interfacing with other libraries
    """

def fromstring(string, dtype=float, count=-1, sep=''):
    """
    Create an array from string data.

    Parameters:
    - string: str, string containing array data
    - dtype: data type for parsing
    - count: int, number of items to read (-1 for all)
    - sep: str, separator between items

    Returns:
    - ndarray: 1D array parsed from the string, on the GPU

    Notes:
    - Binary mode when sep is the empty string (deprecated in NumPy)
    - Text mode with the given separator otherwise
    - Convenient for parsing string-formatted data
    """

def fromfunction(func, shape, dtype=float, **kwargs):
    """
    Construct an array by executing a function over coordinate arrays.

    Parameters:
    - func: callable, function to evaluate over the coordinate grids
    - shape: sequence of ints, shape of the output array
    - dtype: data type for the output
    - **kwargs: additional arguments passed to func

    Returns:
    - ndarray: array with values func(coordinates), on the GPU

    Notes:
    - The function is called with coordinate arrays as arguments
    - Useful for generating coordinate-based patterns
    - The function is executed on the GPU when possible
    """

def fromiter(iterable, dtype, count=-1):
    """
    Create an array from an iterable object.

    Parameters:
    - iterable: iterable, sequence of values
    - dtype: data type for the array elements
    - count: int, number of items to read (-1 for all)

    Returns:
    - ndarray: 1D array created from the iterable, on the GPU

    Notes:
    - Iterates through all items when count is -1
    - Efficient for converting Python sequences
    - Data is transferred to the GPU after creation
    """
```
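These conversion routines mirror their NumPy counterparts before the transfer to GPU memory, so their semantics can be sketched with NumPy itself (with CuPy, the same calls on `cp` land the result on the device):

```python
import numpy as np

# Reinterpret raw bytes as a typed 1D array (e.g. a buffer from another library)
raw = np.arange(5, dtype=np.float64).tobytes()
arr = np.frombuffer(raw, dtype=np.float64)

# Build an array from a Python generator
squares = np.fromiter((i * i for i in range(6)), dtype=np.int64)

# Evaluate a function over the coordinate grids of a 3x3 array
grid = np.fromfunction(lambda i, j: i + j, (3, 3), dtype=np.float64)

print(arr)      # [0. 1. 2. 3. 4.]
print(squares)  # [ 0  1  4  9 16 25]
print(grid[2])  # [2. 3. 4.]
```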

### Array Formatting and Display

Comprehensive formatting functions for array visualization, debugging, and custom string representations.

```python { .api }
def array_repr(arr, max_line_width=None, precision=None, suppress_small=None):
    """
    Return the string representation of an array.

    Parameters:
    - arr: ndarray, input array
    - max_line_width: int, maximum characters per line
    - precision: int, floating-point precision
    - suppress_small: bool, suppress small floating-point values

    Returns:
    - str: string representation suitable for eval()

    Notes:
    - Creates a repr() string that could recreate the array
    - Respects NumPy print options
    - Array data is transferred to the CPU for formatting
    """

def array_str(a, max_line_width=None, precision=None, suppress_small=None):
    """
    Return the string representation of an array's data.

    Parameters:
    - a: ndarray, input array
    - max_line_width: int, maximum characters per line
    - precision: int, floating-point precision
    - suppress_small: bool, suppress small values

    Returns:
    - str: string representation of the array contents

    Notes:
    - Creates the str() representation for display
    - Does not include array constructor syntax
    - Formatted for human readability
    """

def array2string(a, max_line_width=None, precision=None, suppress_small=None, separator=' ', prefix='', style=np._NoValue, formatter=None, threshold=None, edgeitems=None, sign=None, floatmode=None, suffix='', legacy=None):
    """
    Return a string representation with full formatting control.

    Parameters:
    - a: ndarray, input array
    - max_line_width: int, maximum line width
    - precision: int, floating-point precision
    - suppress_small: bool, suppress small values
    - separator: str, element separator
    - prefix: str, prefix used to align wrapped lines
    - style: deprecated, has no effect
    - formatter: dict, custom formatters for different types
    - threshold: int, total items before summarizing
    - edgeitems: int, items at each edge when summarizing
    - sign: str, control sign printing ('+', '-', ' ')
    - floatmode: str, floating-point format mode
    - suffix: str, suffix used to align wrapped lines
    - legacy: str, compatibility mode

    Returns:
    - str: formatted string representation

    Notes:
    - Most flexible formatting function
    - Supports custom formatters for different data types
    - Handles large arrays with summarization
    """

def format_float_positional(x, precision=None, unique=True, fractional=True, trim='k', sign=False, pad_left=None, pad_right=None):
    """
    Format a float in positional notation.

    Parameters:
    - x: float, value to format
    - precision: int, number of digits after the decimal point
    - unique: bool, use the minimum precision needed for a unique representation
    - fractional: bool, use fractional precision mode
    - trim: str, trailing-zero trimming mode ('k', '.', '0', '-')
    - sign: bool, always show the sign
    - pad_left: int, pad the left side to at least this many characters
    - pad_right: int, pad the right side to at least this many characters

    Returns:
    - str: formatted float string
    """

def format_float_scientific(x, precision=None, unique=True, trim='k', sign=False, pad_left=None, exp_digits=None):
    """
    Format a float in scientific notation.

    Parameters:
    - x: float, value to format
    - precision: int, number of digits after the decimal point
    - unique: bool, use the minimum precision needed for a unique representation
    - trim: str, trailing-zero trimming mode
    - sign: bool, always show the sign
    - pad_left: int, pad the left side to at least this many characters
    - exp_digits: int, minimum number of exponent digits

    Returns:
    - str: formatted float string in scientific notation
    """
```

### Usage Examples

#### Basic File I/O

```python
import cupy as cp

# Create sample data
data = cp.random.random((1000, 100))
labels = cp.random.randint(0, 10, 1000)

# Save arrays to files
cp.save('data.npy', data)
cp.savez('dataset.npz', features=data, labels=labels)
cp.savez_compressed('dataset_compressed.npz', features=data, labels=labels)

# Load arrays from files
loaded_data = cp.load('data.npy')
archive = cp.load('dataset.npz')
features = archive['features']
labels = archive['labels']

print(f"Original shape: {data.shape}, Loaded shape: {loaded_data.shape}")
print(f"Data matches: {cp.allclose(data, loaded_data)}")
```

#### Text File Operations

```python
import cupy as cp

# Save data to a text file
data = cp.array([[1.1, 2.2, 3.3],
                 [4.4, 5.5, 6.6],
                 [7.7, 8.8, 9.9]])

cp.savetxt('data.txt', data, delimiter=',', header='col1,col2,col3', fmt='%.2f')

# Load data from a text file
loaded = cp.loadtxt('data.txt', delimiter=',', skiprows=1)
print(f"Text data shape: {loaded.shape}")

# Handle CSV with mixed data types using genfromtxt,
# assuming a file with columns: name, age, score
mixed_data = cp.genfromtxt('mixed_data.csv',
                           delimiter=',',
                           names=True,
                           dtype=None,
                           encoding='utf-8')
```

#### Advanced Formatting

```python
import cupy as cp

# Create an array for the formatting examples
arr = cp.array([[1.23456789, 2.87654321],
                [0.00000012, 999999.999]])

# Different representation formats
print("Default repr:")
print(cp.array_repr(arr))

print("\nCustom precision:")
print(cp.array_str(arr, precision=2))

print("\nScientific notation:")
print(cp.array2string(arr, formatter={'float': '{:.2e}'.format}))

# Format individual floats
value = 123.456789
positional = cp.format_float_positional(value, precision=2)
scientific = cp.format_float_scientific(value, precision=2)
print(f"Positional: {positional}, Scientific: {scientific}")
```

#### Data Transfer Workflows

```python
import cupy as cp
import numpy as np

# CPU-to-GPU workflow
cpu_data = np.random.random((10000, 1000))

# Method 1: direct conversion
gpu_data = cp.asarray(cpu_data)

# Method 2: save/load (useful for large datasets)
np.save('temp_data.npy', cpu_data)
gpu_data = cp.load('temp_data.npy')

# GPU-to-CPU workflow
result = cp.random.random((1000, 1000))

# Method 1: direct conversion
cpu_result = cp.asnumpy(result)

# Method 2: save to file...
cp.save('gpu_result.npy', result)
# ...and later load on the CPU
cpu_result = np.load('gpu_result.npy')

# Verify data integrity
print(f"Data preserved: {np.allclose(cpu_data, cp.asnumpy(gpu_data))}")
```

#### Batch Processing

```python
import cupy as cp
import glob

# Process multiple files
file_pattern = 'data_batch_*.npy'
results = []

for filename in sorted(glob.glob(file_pattern)):
    # Load batch
    batch = cp.load(filename)

    # Process on GPU
    processed = cp.fft.fft2(batch)
    result = cp.abs(processed).mean(axis=(1, 2))

    results.append(result)

# Combine results and save
final_result = cp.concatenate(results)
cp.save('processed_results.npy', final_result)

# Save a one-row processing log as text
processing_info = cp.array([[len(results), final_result.shape[0], float(final_result.mean())]])
cp.savetxt('processing_log.txt', processing_info,
           delimiter=',',
           header='num_batches,total_samples,mean_value',
           fmt='%.6f')
```

## Notes

- All I/O operations automatically handle CPU/GPU memory transfers
- Binary formats (.npy, .npz) preserve exact precision and metadata
- Text formats are human-readable but may lose precision for floating-point data
- Compressed archives (.npz with compression) balance storage efficiency and loading speed
- File I/O operations are synchronous and block until completion
- Large datasets may benefit from chunked I/O to manage memory usage
- CuPy I/O functions are fully compatible with NumPy file formats
- For maximum performance, keep data on the GPU and minimize CPU/GPU transfers during processing
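The chunked-I/O note above can be sketched with memory mapping: open the .npy file with `mmap_mode` so nothing is read eagerly, then materialize one slice at a time. A minimal NumPy sketch (file name and chunk size are illustrative; with CuPy, `cp.asarray` on each slice stages it onto the GPU):

```python
import os
import tempfile
import numpy as np

# Write a sample array to disk, then process it back in fixed-size chunks.
path = os.path.join(tempfile.mkdtemp(), 'big.npy')
np.save(path, np.arange(10_000, dtype=np.float64).reshape(100, 100))

mm = np.load(path, mmap_mode='r')  # lazily mapped; no full read yet
total = 0.0
for start in range(0, mm.shape[0], 25):
    chunk = np.asarray(mm[start:start + 25])  # materialize 25 rows at a time
    # With CuPy: chunk = cp.asarray(mm[start:start + 25]) stages the slice on the GPU
    total += float(chunk.sum())

print(total)  # 49995000.0
```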