Tessl Tile for pypi/mlxtend@0.23.0

# Data Preprocessing

Data transformation utilities for scaling, encoding, and array manipulation. The transformer classes follow the scikit-learn fit/transform interface, so they can be used in scikit-learn pipelines.

## Capabilities

### Mean Centering

Center each feature column by subtracting its mean.

```python { .api }
class MeanCenterer:
    def __init__(self):
        """Column-wise mean centering transformer"""

    def fit(self, X, y=None):
        """Compute the column means used for centering"""

    def transform(self, X):
        """Subtract the stored column means from X"""

    def fit_transform(self, X, y=None):
        """Fit to X, then return the centered X"""

    col_means: "numpy.ndarray"  # Column means computed by fit()
```
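
To make the centering step concrete, here is a minimal plain-Python sketch of the arithmetic: compute one mean per column, then subtract it from every row. The helper names `column_means` and `center` are hypothetical and are not part of mlxtend; the library itself works on NumPy arrays.

```python
def column_means(rows):
    """Return the mean of each column (hypothetical helper, not mlxtend API)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def center(rows, means):
    """Subtract the per-column mean from every value."""
    return [[v - m for v, m in zip(row, means)] for row in rows]


X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means = column_means(X)         # [2.0, 20.0]
X_centered = center(X, means)   # [[-1.0, -10.0], [0.0, 0.0], [1.0, 10.0]]
```

Keeping the means separate from the centering step mirrors the fit/transform split: means learned on training data can be reused to center new data.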

### Transaction Encoding

Encode a list of transactions as a boolean item matrix, the input format expected by frequent pattern mining algorithms.

```python { .api }
class TransactionEncoder:
    def __init__(self):
        """Encode transaction data as a boolean matrix"""

    def fit(self, X):
        """Learn the unique items in the transaction dataset"""

    def transform(self, X):
        """Transform transactions to a boolean matrix (one row per transaction, one column per item)"""

    def fit_transform(self, X):
        """Fit and transform transactions"""

    columns_: list  # Unique item names, one per output column
```
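
As an illustration of what the encoder produces, the following plain-Python sketch builds the same kind of boolean matrix by hand. It assumes the columns are the sorted unique items; `fit_columns` and `encode` are hypothetical helper names, not mlxtend functions.

```python
def fit_columns(transactions):
    """Collect the unique items across all transactions, in sorted order."""
    return sorted({item for transaction in transactions for item in transaction})


def encode(transactions, columns):
    """Return one boolean row per transaction, one entry per column item."""
    return [[col in items for col in columns] for items in map(set, transactions)]


transactions = [["bread", "milk"], ["bread", "beer"], ["milk", "beer"]]
columns = fit_columns(transactions)   # ['beer', 'bread', 'milk']
encoded = encode(transactions, columns)
# [[False, True, True], [True, True, False], [True, False, True]]
```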

### Scaling Functions

Standardization and min-max scaling functions for feature normalization.

```python { .api }
def standardize(array, columns=None, ddof=0):
    """
    Z-score standardization of features (zero mean, unit standard deviation).

    Parameters:
    - array: array-like, input data
    - columns: list, column names or indices to standardize (all if None)
    - ddof: int, delta degrees of freedom used for the standard deviation

    Returns:
    - standardized_array: array-like, standardized data
    """

def minmax_scaling(array, columns, min_val=0, max_val=1):
    """
    Min-max feature scaling to a specified range.

    Parameters:
    - array: array-like, input data
    - columns: list, column names or indices to scale
    - min_val: float, minimum value of scaled range
    - max_val: float, maximum value of scaled range

    Returns:
    - scaled_array: array-like, scaled data
    """
```
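
The two formulas behind these functions can be sketched for a single column in plain Python. This is an illustration of the arithmetic only, under the assumption that the column is not constant (the sketch does not guard against a zero standard deviation or a zero range); `standardize_column` and `minmax_column` are hypothetical names, not mlxtend functions.

```python
import math


def standardize_column(values, ddof=0):
    """z = (x - mean) / std, with the variance divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - ddof)
    std = math.sqrt(variance)
    return [(v - mean) / std for v in values]


def minmax_column(values, min_val=0.0, max_val=1.0):
    """Rescale linearly so min(values) maps to min_val and max(values) to max_val."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (max_val - min_val) + min_val for v in values]


column = [2.0, 4.0, 6.0]
z_sample = standardize_column(column, ddof=1)   # [-1.0, 0.0, 1.0]
scaled_01 = minmax_column(column)               # [0.0, 0.5, 1.0]
scaled_pm1 = minmax_column(column, -1.0, 1.0)   # [-1.0, 0.0, 1.0]
```

With `ddof=0` the population standard deviation is used, so the same column yields values of roughly ±1.2247 instead of ±1.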

### Additional Transformers

Utility transformers and helper functions for data pipeline integration.

```python { .api }
class CopyTransformer:
    def __init__(self):
        """Identity transformer that copies input data"""

    def fit(self, X, y=None):
        """Fit transformer (no-op)"""

    def transform(self, X):
        """Return copy of input data"""

class DenseTransformer:
    def __init__(self):
        """Convert sparse matrices to dense format"""

    def fit(self, X, y=None):
        """Fit transformer (no-op)"""

    def transform(self, X):
        """Convert sparse matrix to dense"""

def one_hot(y, num_labels="auto", dtype="float"):
    """
    One-hot encode class labels.

    Parameters:
    - y: array-like, class labels
    - num_labels: int or "auto", number of unique labels (inferred from y if "auto")
    - dtype: data type for output array

    Returns:
    - encoded: array, one-hot encoded matrix
    """

def shuffle_arrays_unison(arrays, random_seed=None):
    """
    Shuffle multiple arrays in unison.

    Parameters:
    - arrays: list of array-like objects of equal length to shuffle together
    - random_seed: int, random seed for reproducibility

    Returns:
    - shuffled_arrays: tuple of shuffled arrays
    """
```
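
The two helper functions can be pictured with a short plain-Python sketch. It assumes integer class labels starting at 0 and equal-length sequences; `one_hot_rows` and `shuffle_in_unison` are hypothetical names used for illustration, not the mlxtend implementations.

```python
import random


def one_hot_rows(labels):
    """One row per label, with a single 1 in the column matching the label."""
    num_labels = max(labels) + 1  # assumes labels are integers 0..k-1
    return [[1 if j == y else 0 for j in range(num_labels)] for y in labels]


def shuffle_in_unison(arrays, seed=None):
    """Apply one shared random permutation to every sequence."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same length")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return tuple([a[i] for i in indices] for a in arrays)


encoded = one_hot_rows([0, 1, 2, 1])
# [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]

X = [10, 11, 12, 13]
y = [0, 1, 2, 3]
X_shuf, y_shuf = shuffle_in_unison([X, y], seed=3)
# Row i of X_shuf still corresponds to row i of y_shuf.
```

Using a single permutation for all arrays is what keeps features and labels aligned after shuffling.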

## Usage Examples

```python
from mlxtend.preprocessing import TransactionEncoder, MeanCenterer, standardize
import pandas as pd
import numpy as np

# Transaction encoding example
transactions = [['bread', 'milk'], ['bread', 'beer'], ['milk', 'beer']]
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)

# Mean centering example
X = np.random.randn(100, 5)
mc = MeanCenterer()
X_centered = mc.fit_transform(X)

# Standardization example
X_std = standardize(X)
```
