# General Utilities

General-purpose utilities for testing, data validation, and parameter handling in machine learning workflows.

## Capabilities

### Counter Utility

Counting utility for collections and iterables.

```python { .api }
class Counter:
    def __init__(self, iterable=None):
        """
        Counter for counting hashable objects.

        Parameters:
        - iterable: iterable, initial data to count
        """

    def update(self, iterable):
        """Update counts from iterable"""

    def most_common(self, n=None):
        """
        Return list of (element, count) tuples for most common elements.

        Parameters:
        - n: int, number of most common elements to return

        Returns:
        - common_elements: list, tuples of (element, count)
        """

    def keys(self):
        """Return iterator over counter keys"""

    def values(self):
        """Return iterator over counter values"""

    def items(self):
        """Return iterator over (key, value) pairs"""
```
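
The method names listed above match those of Python's standard-library `collections.Counter`. As a rough illustration of the counting semantics they describe, the sketch below uses that stdlib class as a stand-in. Treating it as equivalent to `mlxtend.utils.Counter` is an assumption, so confirm the behavior against the installed mlxtend version.

```python
# Stand-in only: this is the standard-library class, not mlxtend's.
from collections import Counter as StdlibCounter

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = StdlibCounter(words)

# most_common(n) returns the n highest tallies as (element, count) tuples,
# ordered from most to least frequent.
top_two = counts.most_common(2)

# update() adds the tallies of a second iterable to the existing counts.
counts.update(["banana", "date"])
all_counts = dict(counts.items())

print(top_two)
print(all_counts)
```

Calling `most_common()` without an argument returns every element, which is how the usage example later in this document lists all counts after an update.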

### Data Validation

Function for validating the format and consistency of input data before it is passed to a learning algorithm.

```python { .api }
def check_Xy(X, y, y_int=True):
    """
    Validate input data format for machine learning algorithms.

    Parameters:
    - X: array-like, feature matrix
    - y: array-like, target labels/values
    - y_int: bool, whether y should contain integers

    Returns:
    - X_validated: array, validated feature matrix
    - y_validated: array, validated target array

    Raises:
    - ValueError: if data format is invalid
    """
```
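
To make "invalid data format" more concrete, here is a minimal pure-Python sketch of the kinds of checks such a validator might run. The function `validate_xy_sketch` and its error messages are hypothetical and are not mlxtend code; the actual checks performed by `check_Xy` may differ.

```python
from numbers import Integral


def validate_xy_sketch(X, y, y_int=True):
    """Hypothetical stand-in showing checks a check_Xy-style validator might run."""
    try:
        rows = [list(row) for row in X]
    except TypeError:
        raise ValueError("X must be 2D (a sequence of rows)") from None
    targets = list(y)

    # X must be a non-empty, rectangular 2D structure.
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("X must be a non-empty 2D structure with equal-length rows")

    # X and y must describe the same number of samples.
    if len(rows) != len(targets):
        raise ValueError(f"X has {len(rows)} samples but y has {len(targets)}")

    # Optionally require integer labels (bool is deliberately excluded).
    if y_int and not all(
        isinstance(t, Integral) and not isinstance(t, bool) for t in targets
    ):
        raise ValueError("y must contain integers when y_int=True")

    return rows, targets


# Valid input passes through unchanged.
X_ok, y_ok = validate_xy_sketch([[0.1, 0.2], [0.3, 0.4]], [0, 1])

# A sample-count mismatch is reported as a ValueError.
try:
    validate_xy_sketch([[0.1, 0.2], [0.3, 0.4]], [0])
except ValueError as err:
    mismatch_message = str(err)
```

Raising `ValueError` for every failure mode keeps the caller's error handling to a single `except` clause, which matches the `Raises` entry in the signature above.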

### Testing Utilities

Test helper for verifying that a callable raises the expected exception.

```python { .api }
def assert_raises(exception_type, callable_obj, *args, **kwargs):
    """
    Test utility for verifying that a function raises expected exception.

    Parameters:
    - exception_type: Exception class, expected exception type
    - callable_obj: callable, function to test
    - args: arguments to pass to callable
    - kwargs: keyword arguments to pass to callable

    Raises:
    - AssertionError: if expected exception is not raised
    """
```
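
The behavior described in this docstring can be sketched in a few lines. The helper `assert_raises_sketch` below is a hypothetical re-implementation written for illustration, not the library's source, and it follows the signature exactly as documented above.

```python
def assert_raises_sketch(exception_type, callable_obj, *args, **kwargs):
    """Hypothetical helper: pass only if callable_obj raises exception_type."""
    try:
        callable_obj(*args, **kwargs)
    except exception_type:
        return  # expected exception: the check passes
    # Reached only if the call returned normally; exceptions of other
    # types have already propagated to the caller.
    raise AssertionError(f"{exception_type.__name__} was not raised")


# int("not a number") raises ValueError, so this passes silently.
assert_raises_sketch(ValueError, int, "not a number")

# int("42") succeeds, so the helper reports the missing exception.
try:
    assert_raises_sketch(ValueError, int, "42")
except AssertionError as err:
    failure_message = str(err)
```

Passing the callable and its arguments separately, rather than calling it inline, is what lets the helper run the call inside its own `try` block.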

### Parameter Formatting

Utility for formatting keyword-argument dictionaries for display or logging.

```python { .api }
def format_kwarg_dictionaries(**kwargs):
    """
    Format keyword argument dictionaries for display or logging.

    Parameters:
    - kwargs: keyword arguments to format

    Returns:
    - formatted_dict: dict, formatted parameter dictionary
    """
```

## Usage Examples

### Counter Example

```python
from mlxtend.utils import Counter

# Count elements in a list
data = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']
counter = Counter(data)

print("Element counts:")
for item, count in counter.items():
    print(f"  {item}: {count}")

print(f"\nMost common: {counter.most_common(2)}")

# Update with more data
counter.update(['banana', 'date', 'apple'])
print(f"After update: {counter.most_common()}")
```

### Data Validation Example

```python
from mlxtend.utils import check_Xy
import numpy as np

# Valid data
X = np.random.randn(100, 5)
y = np.random.randint(0, 3, 100)

try:
    X_val, y_val = check_Xy(X, y, y_int=True)
    print("Data validation passed")
    print(f"X shape: {X_val.shape}, y shape: {y_val.shape}")
except ValueError as e:
    print(f"Validation error: {e}")

# Invalid data (mismatched samples)
X_invalid = np.random.randn(100, 5)
y_invalid = np.random.randint(0, 3, 90)  # Wrong number of samples

try:
    X_val, y_val = check_Xy(X_invalid, y_invalid)
except ValueError as e:
    print(f"Expected validation error: {e}")
```

### Testing Utilities Example

```python
from mlxtend.utils import assert_raises

def divide_by_zero():
    return 1 / 0

def safe_divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# Test that functions raise expected exceptions
try:
    assert_raises(ZeroDivisionError, divide_by_zero)
    print("✓ ZeroDivisionError assertion passed")
except AssertionError:
    print("✗ Expected ZeroDivisionError not raised")

try:
    assert_raises(ValueError, safe_divide, 10, 0)
    print("✓ ValueError assertion passed")
except AssertionError:
    print("✗ Expected ValueError not raised")
```

### Parameter Formatting Example

```python
from mlxtend.utils import format_kwarg_dictionaries

# Format parameters for logging
params = {
    'learning_rate': 0.01,
    'epochs': 100,
    'batch_size': 32,
    'optimizer': 'adam'
}

formatted = format_kwarg_dictionaries(**params)
print("Formatted parameters:")
for key, value in formatted.items():
    print(f"  {key}: {value}")
```