# Data Analysis

Tools for understanding dataset characteristics and feature distributions to inform model selection and feature engineering decisions.

## Capabilities

### Class Distribution Analysis

Analyzes class distributions in classification datasets to identify imbalances and understand target variable characteristics.

```python { .api }
class ClassHistogram:
    def __init__(self, feature_names=None, **kwargs):
        """
        Class distribution analyzer.

        Parameters:
            feature_names (list, optional): Names for features
            **kwargs: Additional arguments
        """

    def explain_data(self, X, y, name=None):
        """
        Analyze class distributions in the dataset.

        Parameters:
            X (array-like): Feature data
            y (array-like): Target labels
            name (str, optional): Name for explanation

        Returns:
            Explanation object with class distribution analysis
        """
```
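To make the idea of class imbalance concrete, the following is a minimal standard-library sketch of the kind of summary a class distribution analysis works from. It does not use `ClassHistogram` and is not how the library computes anything; the helper names `class_balance` and `imbalance_ratio` and the toy labels are hypothetical, introduced only for illustration.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: (count, fraction)} for an iterable of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in sorted(counts.items())}


def imbalance_ratio(labels):
    """Ratio of the most common class count to the least common one."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


# Hypothetical toy labels, not taken from any real dataset
y_toy = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
balance = class_balance(y_toy)
ratio = imbalance_ratio(y_toy)

for label, (count, fraction) in balance.items():
    print(f"class {label}: {count} samples ({fraction:.0%})")
print(f"imbalance ratio: {ratio:.1f}")
```

A ratio well above 1 suggests that plain accuracy may be a misleading metric and that stratified splitting or resampling is worth considering.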

### Marginal Distribution Analysis

Analyzes marginal distributions of features to understand data characteristics and identify potential issues.

```python { .api }
class Marginal:
    def __init__(self, feature_names=None, feature_types=None, **kwargs):
        """
        Marginal distribution analyzer.

        Parameters:
            feature_names (list, optional): Names for features
            feature_types (list, optional): Types for features
            **kwargs: Additional arguments
        """

    def explain_data(self, X, y=None, name=None):
        """
        Analyze marginal feature distributions.

        Parameters:
            X (array-like): Feature data
            y (array-like, optional): Target labels
            name (str, optional): Name for explanation

        Returns:
            Explanation object with marginal distribution analysis
        """
```
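As a rough illustration of what a marginal (per-feature) distribution summary contains, the sketch below bins one numeric column into equal-width bins and reports basic statistics. It is a standard-library approximation under stated assumptions, not the `Marginal` implementation; the helper `marginal_summary`, its `n_bins` parameter, and the toy values are hypothetical.

```python
import statistics


def marginal_summary(values, n_bins=4):
    """Summarize one numeric feature: mean, stdev, and equal-width bin counts."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 so constant features do not divide by zero
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # requires at least two values
        "bin_counts": counts,
    }


# Hypothetical toy feature column
feature = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0, 9.0]
summary = marginal_summary(feature, n_bins=4)
print(summary["bin_counts"])
```

Bin counts concentrated at one end, as with the single large value here, point to skew or outliers that may call for a transform before modeling.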

## Usage Examples

```python
from interpret.data import ClassHistogram, Marginal
from interpret import show
from sklearn.datasets import load_wine

# Load dataset
data = load_wine()
X, y = data.data, data.target

# Analyze class distribution
class_hist = ClassHistogram(feature_names=data.feature_names)
class_exp = class_hist.explain_data(X, y, name="Class Distribution")
show(class_exp)

# Analyze feature distributions
marginal = Marginal(feature_names=data.feature_names)
marginal_exp = marginal.explain_data(X, y, name="Feature Distributions")
show(marginal_exp)
```