# Data Management

Data handling and transformation capabilities for Vincent visualizations. This module converts Python data structures into Vega data specifications, with support for pandas objects and Vega data transforms.

## Capabilities

### Data Container

The core `Data` class represents a data source in a Vincent visualization. It supports both embedded data and references to external data.

```python { .api }
class Data(GrammarClass):
    """Data container for visualization"""

    def __init__(self, name=None, **kwargs):
        """
        Initialize a Data object

        Parameters:
        - name: Name of the data set (str or None, defaults to 'table')
        - **kwargs: Additional attributes to set on initialization
        """

    @classmethod
    def from_pandas(cls, data, columns=None, key_on='idx', name=None,
                    series_key='data', grouped=False, records=False, **kwargs):
        """
        Create Data object from pandas DataFrame or Series

        Parameters:
        - data: pandas DataFrame or Series
        - columns: DataFrame columns to convert (list or None for all)
        - key_on: Value to key on for x-axis data (str, default 'idx')
        - name: Name for the data set (str or None)
        - series_key: Key name for Series data (str, default 'data')
        - grouped: Whether to treat data as grouped (bool, default False)
        - records: Whether to output records format (bool, default False)
        - **kwargs: Additional attributes for initialization

        Returns:
        Data: Vincent Data object with converted pandas data
        """

    @classmethod
    def from_iter(cls, data, name=None):
        """
        Create Data object from Python iterables

        Parameters:
        - data: Python list, tuple, or dictionary iterable
        - name: Name of the data set (str or None)

        Returns:
        Data: Vincent Data object with converted iterable data
        """

    @classmethod
    def from_mult_iters(cls, name=None, idx=None, **kwargs):
        """
        Create Data object from multiple iterables

        Parameters:
        - name: Name of the data set (str or None, defaults to 'table')
        - idx: Name of the keyword iterable to use as the data index (str, default None)
        - **kwargs: Named iterables to combine into data structure

        Returns:
        Data: Vincent Data object with combined iterator data
        """
```

**Usage Examples:**

```python
import vincent
import pandas as pd

# From pandas DataFrame
df = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})
data = vincent.Data.from_pandas(df)

# From Python list
list_data = [('A', 10), ('B', 20), ('C', 30)]
data = vincent.Data.from_iter(list_data)

# From multiple iterables; idx names the iterable to use as the index
x_vals = [1, 2, 3, 4]
y_vals = [10, 20, 15, 25]
data = vincent.Data.from_mult_iters(idx='x', x=x_vals, y=y_vals)
```

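For intuition about what combining multiple iterables involves, the sketch below merges named iterables into long-format records in plain Python. The helper name `combine_iters` and the `idx`/`col`/`val` record layout are assumptions made for illustration and are not specified on this page; inspect the output of `Data.from_mult_iters` in your Vincent version before relying on the exact shape.

```python
def combine_iters(idx, **series):
    """Hypothetical helper: merge equal-length iterables into long-format records."""
    lengths = {len(values) for values in series.values()}
    if len(lengths) != 1:
        raise ValueError('Iterables must all be the same length')
    if idx not in series:
        raise ValueError('idx must name one of the iterables')

    # The iterable named by idx supplies the key for every record.
    index = series.pop(idx)
    records = []
    for col, values in sorted(series.items()):
        for key, val in zip(index, values):
            records.append({'idx': key, 'col': col, 'val': val})
    return records


records = combine_iters(idx='x', x=[1, 2, 3, 4], y=[10, 20, 15, 25])
print(records[0])
```

Each remaining iterable contributes one record per index entry, so two value series of length four would yield eight records.
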
### Data Transformations

The `Transform` class describes a Vega data transform used to process or reshape a data set. Its `type` property selects which transform to apply.

```python { .api }
class Transform(GrammarClass):
    """Container for data transformation operations"""

    @property
    def type(self):
        """
        Transform type specification

        Valid transform types:
        - Data manipulation: 'array', 'copy', 'cross', 'facet', 'filter',
          'flatten', 'fold', 'formula', 'slice', 'sort', 'stats', 'truncate',
          'unique', 'window', 'zip'
        - Layout algorithms: 'force', 'treemap', 'wordcloud'
        - Geographic: 'geo', 'geopath'
        - Specialized: 'link', 'pie', 'stack'

        Returns:
        str: Transform type name
        """
```

**Common Transform Types:**

- **filter**: Filter data based on conditions
- **formula**: Compute new data fields using expressions
- **sort**: Sort data by specified fields
- **stats**: Calculate statistical summaries
- **pie**: Apply pie layout for pie charts
- **stack**: Stack data for stacked charts
- **wordcloud**: Generate word cloud layout

**Usage Example:**

```python
import vincent

# Create transform for filtering data
transform = vincent.Transform()
transform.type = 'filter'
transform.test = 'datum.value > 10'

# Create transform for computing new fields
formula_transform = vincent.Transform()
formula_transform.type = 'formula'
formula_transform.field = 'scaled_value'
formula_transform.expr = 'datum.value * 2'
```

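Vincent only records these transforms in the generated specification; Vega evaluates them when the chart renders. As a rough mental model, the plain-Python sketch below mimics what a `filter` followed by a `formula` does to a list of rows. The sample `rows` and the helper names `apply_filter` and `apply_formula` are hypothetical and are not part of Vincent.

```python
rows = [{'value': 4}, {'value': 12}, {'value': 25}]


def apply_filter(rows, test):
    """Keep only the rows for which test(datum) is true (mirrors type='filter')."""
    return [datum for datum in rows if test(datum)]


def apply_formula(rows, field, expr):
    """Return row copies with `field` set to expr(datum) (mirrors type='formula')."""
    return [dict(datum, **{field: expr(datum)}) for datum in rows]


# Equivalent in spirit to test='datum.value > 10' and expr='datum.value * 2'.
filtered = apply_filter(rows, lambda datum: datum['value'] > 10)
scaled = apply_formula(filtered, 'scaled_value', lambda datum: datum['value'] * 2)
print(scaled)
```

The order matters: applying the formula after the filter means `scaled_value` is only computed for rows that survive the test.
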
### Data References

The `DataRef` class identifies the data set and field(s) that a scale or other visualization component draws its values from.

```python { .api }
class DataRef(GrammarClass):
    """Definitions for how data is referenced by scales"""

    @property
    def data(self):
        """
        Name of data-set containing the domain values

        Returns:
        str: Data set name
        """

    @property
    def field(self):
        """
        Reference to desired data field(s)

        Returns:
        str or list: Field name(s) in dot-notation (e.g., 'data.x')

        If multiple fields are given, values from all fields are included.
        """
```

**Usage Example:**

```python
import vincent

# Create data reference for a scale domain
data_ref = vincent.DataRef()
data_ref.data = 'table'
data_ref.field = 'data.x'  # Reference the 'x' field

# Multiple field reference
multi_ref = vincent.DataRef()
multi_ref.data = 'table'
multi_ref.field = ['data.x', 'data.y']  # Reference both x and y fields
```

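To illustrate what "values from all fields are included" means for a multi-field reference, here is a minimal plain-Python sketch that resolves dot-notation field names against nested records. The record shape and the helpers `resolve_field` and `collect_values` are assumptions for illustration only; the actual lookup is performed by Vega, not by Vincent.

```python
def resolve_field(record, path):
    """Follow a dot-notation path such as 'data.x' through nested dicts."""
    value = record
    for key in path.split('.'):
        value = value[key]
    return value


def collect_values(records, field):
    """Gather values for one field name or for a list of field names."""
    fields = [field] if isinstance(field, str) else list(field)
    return [resolve_field(record, path) for path in fields for record in records]


records = [{'data': {'x': 1, 'y': 10}}, {'data': {'x': 2, 'y': 20}}]
single = collect_values(records, 'data.x')
combined = collect_values(records, ['data.x', 'data.y'])
print(min(combined), max(combined))
```

With both fields referenced, the pooled values span the range of `x` and `y` together, which is why a scale built on a multi-field reference can cover several series at once.
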
## Data Format Conversion

Vincent converts the following input types to Vega data values:

### Pandas Integration

- **DataFrame**: Converts to row-oriented JSON format with column names as keys
- **Series**: Converts to array format with index preservation
- **Index handling**: Maintains the pandas index as the `idx` field or as a custom `key_on` field
- **Grouping support**: Handles grouped DataFrames for multi-series visualizations

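The row-oriented layout described above can be pictured with plain-Python stand-ins for a DataFrame: a dict of columns plus a list of index labels. The helper name `to_rows` and the exact keys it emits are assumptions for illustration; the real output of `Data.from_pandas` may be shaped differently, so treat this as a sketch of the idea rather than of Vincent's implementation.

```python
import json


def to_rows(columns, index, key_on='idx'):
    """Hypothetical helper: turn column-oriented data into one dict per row."""
    rows = []
    for position, label in enumerate(index):
        row = {key_on: label}  # carry the index label under the key_on name
        for name, values in columns.items():
            row[name] = values[position]
        rows.append(row)
    return rows


columns = {'x': [1, 2, 3], 'y': [10, 20, 30]}  # stand-in for DataFrame columns
index = ['a', 'b', 'c']                         # stand-in for the DataFrame index
rows = to_rows(columns, index)
print(json.dumps(rows[0]))
```

Passing a different `key_on` value only changes the name under which the index label is stored in each row.
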
### Python Native Types

- **Lists**: Converts to Vega-compatible array format
- **Tuples**: Treated as ordered data points
- **Dictionaries**: Preserves key-value structure in Vega format
- **Nested structures**: Flattens appropriately for visualization

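One way to think about the native-type handling is as a normalization step that reduces every supported input to key-value pairs. The sketch below assumes that lists and tuples are keyed by position and that dictionaries keep their own keys in sorted order; both the helper name `to_pairs` and these rules are illustrative assumptions, not documented Vincent behavior.

```python
def to_pairs(data):
    """Hypothetical helper: normalize a list, tuple, or dict into (key, value) pairs."""
    if isinstance(data, dict):
        return sorted(data.items())
    if isinstance(data, (list, tuple)):
        return list(enumerate(data))
    raise TypeError('Unsupported data type: %s' % type(data).__name__)


list_pairs = to_pairs([10, 20, 30])
dict_pairs = to_pairs({'b': 2, 'a': 1})
print(list_pairs)
print(dict_pairs)
```

Normalizing first keeps the later conversion code independent of which container the caller happened to pass.
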
### Data Validation

All data operations include validation to ensure:
- Data types are supported by Vincent
- Field references are valid
- Transform parameters are within acceptable ranges
- Data structure matches visualization requirements
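
This page does not describe how these checks are implemented. If you want to fail early in your own code before handing data to Vincent, a pre-flight check along the following lines is one option. The helper name `missing_fields` and the sample rows are hypothetical.

```python
def missing_fields(rows, fields):
    """Hypothetical check: return the requested fields that at least one row lacks."""
    return [name for name in fields if any(name not in row for row in rows)]


rows = [{'idx': 0, 'x': 1, 'y': 10}, {'idx': 1, 'x': 2}]
problems = missing_fields(rows, ['x', 'y', 'z'])
if problems:
    print('Fields missing from some rows:', problems)
```

Here `y` is reported because the second row omits it, and `z` because no row contains it.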