# Sweetviz

A pandas-based library that generates beautiful, high-density visualizations for exploratory data analysis (EDA) with minimal code. Sweetviz specializes in target analysis, dataset comparison, and feature analysis, offering unified mixed-type associations that integrate numerical correlations, categorical associations, and categorical-numerical relationships seamlessly.

## Package Information

- **Package Name**: sweetviz
- **Language**: Python
- **Installation**: `pip install sweetviz`

## Core Imports

```python
import sweetviz as sv
```

## Basic Usage

```python
import sweetviz as sv
import pandas as pd

# Load your dataset
df = pd.read_csv('your_dataset.csv')

# Create a report analyzing the entire dataset
my_report = sv.analyze(df)
my_report.show_html()  # Writes SWEETVIZ_REPORT.html and opens it in the browser

# Analyze with a target feature
my_report = sv.analyze(df, target_feat='target_column')
my_report.show_html()

# Compare two datasets (e.g., training vs. test);
# train_df and test_df are DataFrames you have already loaded
train_report = sv.compare([train_df, "Training"], [test_df, "Test"])
train_report.show_html()

# Compare subsets within the same dataset:
# rows where the condition is True are labelled "Male", the rest "Female"
my_report = sv.compare_intra(df, df["gender"] == "male", ["Male", "Female"])
my_report.show_html()
```

## Architecture

Sweetviz operates through a three-step process:

1. **Analysis Functions**: `analyze()`, `compare()`, or `compare_intra()` create `DataframeReport` objects
2. **Report Processing**: The library analyzes feature types, calculates statistics, and generates associations
3. **Output Generation**: Reports are rendered as self-contained HTML files or embedded in notebooks

Key components:

- **Analysis Engine**: Automatic type detection and statistical analysis
- **Association Matrix**: Unified correlation analysis across numerical, categorical, and mixed data types
- **Visualization Generator**: High-density charts and interactive HTML reports
- **Configuration System**: Customizable settings via INI files and `FeatureConfig` objects
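
To make the "mixed data types" part of the association matrix concrete, here is a minimal sketch of one such measure. The Sweetviz project documentation names the correlation ratio as its categorical-numerical measure (alongside Pearson's correlation for numerical pairs and the uncertainty coefficient for categorical pairs); the function below is a textbook version written for illustration, not the library's code, and the name `correlation_ratio` and the sample data are assumptions.

```python
from collections import defaultdict
from math import sqrt


def correlation_ratio(categories, values):
    """Return eta in [0, 1]: how much of the spread in `values` is explained by `categories`."""
    groups = defaultdict(list)
    for cat, val in zip(categories, values):
        groups[cat].append(val)

    overall_mean = sum(values) / len(values)
    # Between-group sum of squares: group size times the squared distance
    # of each group mean from the overall mean.
    between = sum(
        len(vals) * (sum(vals) / len(vals) - overall_mean) ** 2
        for vals in groups.values()
    )
    # Total sum of squares around the overall mean.
    total = sum((val - overall_mean) ** 2 for val in values)
    return sqrt(between / total) if total > 0 else 0.0


# Illustrative data (not from Sweetviz): the category fully determines the value...
species = ["cat", "cat", "dog", "dog"]
weight_separated = [4.0, 4.0, 20.0, 20.0]
# ...versus a case where both groups have the same mean.
weight_mixed = [4.0, 20.0, 4.0, 20.0]

eta_separated = correlation_ratio(species, weight_separated)
eta_mixed = correlation_ratio(species, weight_mixed)
```

A value near 1 means the category explains almost all of the numeric variation; a value near 0 means it explains almost none.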

## Capabilities

### Core Analysis Functions

Primary functions for creating exploratory data analysis reports. These functions analyze dataframes and return DataframeReport objects containing comprehensive statistics, visualizations, and association matrices.

```python { .api }
def analyze(source, target_feat=None, feat_cfg=None, pairwise_analysis='auto'): ...
def compare(source, compare, target_feat=None, feat_cfg=None, pairwise_analysis='auto'): ...
def compare_intra(source_df, condition_series, names, target_feat=None, feat_cfg=None, pairwise_analysis='auto'): ...
```

[Core Analysis](./core-analysis.md)
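
The `condition_series` and `names` arguments of `compare_intra()` can be pictured as a boolean split of one dataset into two named subsets. The sketch below is a plain-Python analogy on a list of dicts rather than a DataFrame; `split_by_condition` is a hypothetical helper, not part of Sweetviz, and it assumes that rows where the condition is true receive the first name and the remaining rows the second.

```python
def split_by_condition(rows, condition, names):
    """Partition rows into two named subsets, mirroring the compare_intra arguments."""
    if len(rows) != len(condition):
        raise ValueError("condition must have one boolean per row")
    true_name, false_name = names
    return {
        # Rows where the condition holds get the first name...
        true_name: [row for row, keep in zip(rows, condition) if keep],
        # ...and all remaining rows get the second name.
        false_name: [row for row, keep in zip(rows, condition) if not keep],
    }


# Illustrative records (made up for this sketch)
rows = [
    {"gender": "male", "age": 31},
    {"gender": "female", "age": 27},
    {"gender": "male", "age": 45},
]
is_male = [row["gender"] == "male" for row in rows]
subsets = split_by_condition(rows, is_male, ["Male", "Female"])
```

With the real library, the equivalent split is expressed by passing a boolean pandas Series, as in the `compare_intra(df, df["gender"] == "male", ["Male", "Female"])` call shown under Basic Usage.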

### Report Generation and Display

Methods for rendering and outputting analysis reports in various formats. DataframeReport objects provide multiple output options including HTML files, notebook embedding, and experiment tracking integration.

```python { .api }
class DataframeReport:
    def show_html(self, filepath='SWEETVIZ_REPORT.html', open_browser=True, layout='widescreen', scale=None): ...
    def show_notebook(self, w=None, h=None, scale=None, layout=None, filepath=None, file_layout=None, file_scale=None): ...
    def log_comet(self, experiment): ...
```

[Report Display](./report-display.md)
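
In scripts or CI jobs it is often preferable to write the HTML file without launching a browser. The sketch below wraps `show_html()` using only the parameters listed in the signature above. `save_report` and `FakeReport` are hypothetical names introduced for illustration; `FakeReport` is a stand-in so the example runs without Sweetviz installed, and the `"vertical"` layout value comes from the Sweetviz project documentation rather than from this page.

```python
import tempfile
from pathlib import Path


def save_report(report, out_dir, name="eda_report.html"):
    """Write a report into out_dir without opening a browser; return the file path."""
    out_path = Path(out_dir) / name
    out_path.parent.mkdir(parents=True, exist_ok=True)
    report.show_html(
        filepath=str(out_path),
        open_browser=False,  # keep scripts and CI jobs headless
        layout="vertical",   # assumed alternative to the default 'widescreen'
        scale=0.9,
    )
    return out_path


class FakeReport:
    """Stand-in with the same show_html signature; it only records the arguments."""

    def __init__(self):
        self.calls = []

    def show_html(self, filepath="SWEETVIZ_REPORT.html", open_browser=True,
                  layout="widescreen", scale=None):
        self.calls.append(
            {"filepath": filepath, "open_browser": open_browser,
             "layout": layout, "scale": scale}
        )


demo_report = FakeReport()
saved_path = save_report(demo_report, tempfile.mkdtemp())
```

With Sweetviz installed, the same helper would presumably be called as `save_report(sv.analyze(df), "reports")`.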

### Feature Configuration

Configuration system for controlling feature type detection, analysis parameters, and report customization. Enables fine-tuned control over which features to analyze and how they should be interpreted.

```python { .api }
class FeatureConfig:
    def __init__(self, skip=None, force_cat=None, force_text=None, force_num=None): ...
    def get_predetermined_type(self, feature_name): ...
    def get_all_mentioned_features(self): ...
```

[Configuration](./configuration.md)
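
To illustrate how per-feature overrides of this kind can be resolved, here is a simplified stand-in that mirrors the `FeatureConfig` method names listed above. It is a sketch, not the library's implementation: the class name `FeatureConfigSketch`, the helper `_as_list`, the trimmed-down enum, and the rule that `skip` takes precedence over the `force_*` lists are all assumptions.

```python
from enum import Enum


class FeatureType(Enum):
    # Subset of the enum shown under Types, repeated so the sketch is self-contained
    TYPE_CAT = "CATEGORICAL"
    TYPE_NUM = "NUMERIC"
    TYPE_TEXT = "TEXT"
    TYPE_UNKNOWN = "UNKNOWN"
    TYPE_SKIPPED = "SKIPPED"


def _as_list(value):
    """Accept None, a single feature name, or an iterable of names."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


class FeatureConfigSketch:
    def __init__(self, skip=None, force_cat=None, force_text=None, force_num=None):
        self.overrides = {}
        # Assumption: the first mention wins, so skip beats any force_* entry.
        for names, feature_type in (
            (skip, FeatureType.TYPE_SKIPPED),
            (force_cat, FeatureType.TYPE_CAT),
            (force_text, FeatureType.TYPE_TEXT),
            (force_num, FeatureType.TYPE_NUM),
        ):
            for name in _as_list(names):
                self.overrides.setdefault(name, feature_type)

    def get_predetermined_type(self, feature_name):
        # Features without an override are left to automatic detection.
        return self.overrides.get(feature_name, FeatureType.TYPE_UNKNOWN)

    def get_all_mentioned_features(self):
        return list(self.overrides)


# Hypothetical column names, for illustration only
cfg = FeatureConfigSketch(skip="PassengerId", force_cat=["Pclass"])
```

With the real library, a configuration built as `sv.FeatureConfig(skip=..., force_cat=[...])` is passed to the analysis functions through their `feat_cfg` parameter.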

## Types

```python { .api }
from typing import Union, Tuple, List
import pandas as pd
from enum import Enum

# Core type aliases
DataFrameInput = Union[pd.DataFrame, Tuple[pd.DataFrame, str]]

class FeatureType(Enum):
    TYPE_CAT = "CATEGORICAL"
    TYPE_BOOL = "BOOL"
    TYPE_NUM = "NUMERIC"
    TYPE_TEXT = "TEXT"
    TYPE_UNSUPPORTED = "UNSUPPORTED"
    TYPE_ALL_NAN = "ALL_NAN"
    TYPE_UNKNOWN = "UNKNOWN"
    TYPE_SKIPPED = "SKIPPED"
    def __str__(self): ...

class NumWithPercent:
    def __init__(self, number, total_for_percentage): ...
    def __int__(self): ...
    def __float__(self): ...
    def __repr__(self): ...
```
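
The `NumWithPercent` signature suggests a count that carries its share of a total, so it can be shown as both a number and a percentage. The sketch below shows one way such a type can behave. It is an illustration under stated assumptions, not the library's code: the class name `NumWithPercentSketch`, the `percent` attribute, the zero-total handling, and the output format of `__repr__` are all hypothetical.

```python
class NumWithPercentSketch:
    def __init__(self, number, total_for_percentage):
        self.number = number
        # Assumption: an empty total is reported as 0% instead of raising ZeroDivisionError.
        if total_for_percentage:
            self.percent = 100.0 * number / total_for_percentage
        else:
            self.percent = 0.0

    def __int__(self):
        return int(self.number)

    def __float__(self):
        return float(self.number)

    def __repr__(self):
        # Hypothetical display format: count followed by its share of the total
        return f"{self.number} ({self.percent:.1f}%)"


# Example: 25 missing values out of 200 rows
missing = NumWithPercentSketch(25, 200)
```

Because `__int__` and `__float__` are defined, such an object can be passed to `int()` or `float()` wherever only the raw count is needed.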