tessl/pypi-keras-preprocessing

Easy data preprocessing and data augmentation for deep learning models

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/keras-preprocessing@1.1.x

To install, run

npx @tessl/cli install tessl/pypi-keras-preprocessing@1.1.0


# Keras-Preprocessing


Easy data preprocessing and data augmentation for deep learning models. Keras-Preprocessing provides comprehensive utilities for text tokenization, sequence padding, and image augmentation specifically designed for training deep neural networks.


## Package Information

- **Package Name**: keras-preprocessing
- **Language**: Python
- **Installation**: `pip install keras-preprocessing`

## Core Imports

```python
import keras_preprocessing
```

Specific modules:

```python
from keras_preprocessing.text import Tokenizer, text_to_word_sequence
from keras_preprocessing.sequence import pad_sequences, TimeseriesGenerator
from keras_preprocessing.image import ImageDataGenerator, load_img, img_to_array
```

Legacy compatibility imports:

```python
from keras_preprocessing import image, text, sequence
```

## Basic Usage

```python
from keras_preprocessing.text import Tokenizer
from keras_preprocessing.sequence import pad_sequences
from keras_preprocessing.image import ImageDataGenerator

# Text preprocessing
tokenizer = Tokenizer(num_words=1000)
texts = ['hello world', 'deep learning', 'neural networks']
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

# Sequence padding
padded = pad_sequences(sequences, maxlen=10, padding='post')

# Image data augmentation
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)

# Load data from directory
train_generator = datagen.flow_from_directory(
    'train_data/',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
```

## Architecture


Keras-Preprocessing is organized into three main functional modules:

- **Text Module**: Tokenization, text-to-sequence conversion, and vocabulary management for NLP tasks
- **Sequence Module**: Padding, sampling, and temporal data generation for sequential models
- **Image Module**: Data generators, augmentation pipelines, and image transformations for computer vision

Each module provides both low-level utilities and high-level generators that integrate seamlessly with Keras training workflows.


## Capabilities


### Text Processing


Text tokenization, vocabulary management, and text-to-sequence conversion utilities for natural language processing. Includes the hashing trick, one-hot encoding, and tokenization with configurable character filtering, lowercasing, and word splitting.
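To make the hashing idea mentioned above concrete, here is a library-free sketch: each word is hashed into one of `n - 1` buckets, with index 0 left unused. The helper name `hashing_sketch` and the choice of MD5 as a stable hash are assumptions for illustration, not the package's own code, and distinct words can collide in the same bucket.

```python
import hashlib


def hashing_sketch(text, n):
    """Map each word of `text` to an integer in [1, n - 1] (hypothetical helper)."""
    words = text.lower().split()
    indices = []
    for word in words:
        # MD5 gives the same value on every run, unlike Python's salted built-in hash().
        digest = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)
        # Reserve 0, so buckets run from 1 to n - 1.
        indices.append(digest % (n - 1) + 1)
    return indices


encoded = hashing_sketch("deep learning with deep networks", n=50)
print(encoded)  # both occurrences of "deep" share one index
```

No vocabulary has to be stored with this approach, which is the usual reason to pick it over a fitted word index; the cost is that collisions cannot be undone.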

```python { .api }
class Tokenizer:
    def __init__(self, num_words=None, filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',
                 lower=True, split=' ', char_level=False, oov_token=None, **kwargs): ...
    def fit_on_texts(self, texts): ...
    def texts_to_sequences(self, texts): ...
    def texts_to_matrix(self, texts, mode='binary'): ...

def text_to_word_sequence(text, filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',
                          lower=True, split=" "): ...
def one_hot(text, n, filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',
            lower=True, split=' '): ...
```

[Text Processing](./text-processing.md)
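The fit-then-convert workflow of `Tokenizer` can be pictured with a small library-free sketch: count words, rank them by frequency, and replace each word with its rank. The function names `build_word_index` and `to_sequences` are hypothetical, and the sketch ignores `filters`, `num_words`, and `oov_token`; it only assumes that indices start at 1 and that more frequent words get smaller indices.

```python
from collections import Counter


def build_word_index(texts):
    """Rank words by frequency; index 0 is left unused (hypothetical helper)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # most_common() orders by descending count, so frequent words get small indices.
    return {word: rank for rank, (word, _) in enumerate(counts.most_common(), start=1)}


def to_sequences(texts, word_index):
    """Replace each known word with its index, dropping unknown words."""
    return [
        [word_index[word] for word in text.lower().split() if word in word_index]
        for text in texts
    ]


texts = ["deep learning", "deep neural networks"]
word_index = build_word_index(texts)
sequences = to_sequences(texts + ["deep unknown"], word_index)
print(word_index)  # {'deep': 1, 'learning': 2, 'neural': 3, 'networks': 4}
print(sequences)   # [[1, 2], [1, 3, 4], [1]]
```

Words that were never seen while building the index are silently dropped here, which is why the last sequence contains a single entry.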


### Sequence Processing


Sequence padding, temporal data generation, and utilities for preparing sequential data for neural networks. Includes padding sequences to uniform length, generating skipgrams for word2vec, and creating time series batches.
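Padding to a uniform length involves two choices: which end receives the fill values and which end is cut when a sequence is too long. The library-free sketch below mimics the `'pre'` and `'post'` options on plain lists; `pad_one` is a hypothetical helper rather than the package's implementation, which works on batches and returns a NumPy array.

```python
def pad_one(seq, maxlen, padding="pre", truncating="pre", value=0):
    """Pad or truncate a single list to exactly `maxlen` items (hypothetical helper)."""
    if len(seq) > maxlen:
        # 'pre' drops items from the front, 'post' drops them from the end.
        seq = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
    fill = [value] * (maxlen - len(seq))
    # 'pre' puts the fill values before the data, 'post' after it.
    return fill + seq if padding == "pre" else seq + fill


print(pad_one([1, 2, 3], maxlen=5))                  # [0, 0, 1, 2, 3]
print(pad_one([1, 2, 3], maxlen=5, padding="post"))  # [1, 2, 3, 0, 0]
print(pad_one([1, 2, 3, 4, 5, 6], maxlen=4))         # [3, 4, 5, 6]
print(pad_one([1, 2, 3, 4, 5, 6], maxlen=4, truncating="post"))  # [1, 2, 3, 4]
```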

```python { .api }
def pad_sequences(sequences, maxlen=None, dtype='int32', padding='pre',
                  truncating='pre', value=0.): ...

class TimeseriesGenerator:
    def __init__(self, data, targets, length, sampling_rate=1, stride=1,
                 start_index=0, end_index=None, shuffle=False, reverse=False,
                 batch_size=128): ...
    def __getitem__(self, index): ...

def skipgrams(sequence, vocabulary_size, window_size=4, negative_samples=1.,
              shuffle=True, categorical=False, sampling_table=None, seed=None): ...
```

[Sequence Processing](./sequence-processing.md)
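The `length`, `sampling_rate`, and `stride` parameters of `TimeseriesGenerator` describe a sliding window over the data. A rough library-free sketch of that windowing follows; `sliding_windows` is a hypothetical helper that leaves out batching, shuffling, and the `start_index`/`end_index` bounds, and it assumes each window of past values is paired with the target at the step right after the window.

```python
def sliding_windows(data, targets, length, sampling_rate=1, stride=1):
    """Pair each window of past values with the target that follows it (hypothetical helper)."""
    pairs = []
    for end in range(length, len(data), stride):
        # The window covers the `length` steps before `end`; the target sits at `end`.
        window = data[end - length:end:sampling_rate]
        pairs.append((window, targets[end]))
    return pairs


series = list(range(10))  # 0, 1, ..., 9
pairs = sliding_windows(series, series, length=3)
print(pairs[0])    # ([0, 1, 2], 3)
print(pairs[-1])   # ([6, 7, 8], 9)
print(len(pairs))  # 7
```

A larger `stride` thins out the windows, while a larger `sampling_rate` thins out the values inside each window.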


### Image Processing


Comprehensive image data augmentation, loading, and preprocessing utilities for computer vision models. Includes data generators, transformation functions, file utilities, and multiple data source iterators.

```python { .api }
class ImageDataGenerator:
    def __init__(self, rotation_range=0., width_shift_range=0.,
                 height_shift_range=0., horizontal_flip=False, **kwargs): ...
    def flow(self, x, y=None, batch_size=32, shuffle=True, **kwargs): ...
    def flow_from_directory(self, directory, target_size=(256, 256),
                            color_mode='rgb', batch_size=32, **kwargs): ...
    def flow_from_dataframe(self, dataframe, x_col="filename", y_col="class",
                            target_size=(256, 256), **kwargs): ...

def load_img(path, color_mode='rgb', target_size=None, interpolation='nearest'): ...
def img_to_array(img, data_format='channels_last', dtype='float32'): ...
def array_to_img(x, data_format='channels_last', scale=True, dtype='float32'): ...
```

[Image Processing](./image-processing.md)
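`flow_from_directory` is commonly pointed at a folder that holds one subdirectory per class. The sketch below builds such a layout in a temporary directory using only the standard library and derives a class-to-index mapping from the sorted subdirectory names. The class names, file names, and the `class_indices_sketch` helper are made up for illustration, and the empty placeholder files are not valid images.

```python
import os
import tempfile


def class_indices_sketch(directory):
    """Map each class subdirectory name to an integer, in sorted order (hypothetical helper)."""
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    # Assumed layout: <root>/<class_name>/<image files>
    for class_name, files in {"dogs": ["d1.jpg"], "cats": ["c1.jpg", "c2.jpg"]}.items():
        class_dir = os.path.join(root, class_name)
        os.makedirs(class_dir)
        for file_name in files:
            # Empty placeholder files; real data would be actual images.
            open(os.path.join(class_dir, file_name), "w").close()

    mapping = class_indices_sketch(root)

print(mapping)  # {'cats': 0, 'dogs': 1}
```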


## Types

```python { .api }
import typing
import numpy
import PIL.Image

# Common types used across modules
NDArray = numpy.ndarray
PILImage = PIL.Image.Image
Generator = typing.Generator
Iterator = typing.Iterator
```

## Compatibility

```python { .api }
def set_keras_submodules(backend, utils):
    """Set Keras backend and utils submodules (deprecated)."""

def get_keras_submodule(name):
    """Retrieve Keras submodule by name (deprecated)."""

__version__ = '1.1.2'
```