0
# SQL Analysis and Lineage Runner
1
2
The LineageRunner class provides the main programmatic interface for analyzing SQL statements and extracting lineage information. It supports multiple SQL dialects, metadata integration, and various output formats including visualization and data export.
3
4
## Capabilities
5
6
### LineageRunner Class
7
8
The primary entry point for SQL lineage analysis, offering both table-level and column-level lineage extraction with configurable parsing and metadata options.
9
10
```python { .api }
11
class LineageRunner:
12
def __init__(
13
self,
14
sql: str,
15
dialect: str = "ansi",
16
metadata_provider: MetaDataProvider = DummyMetaDataProvider(),
17
verbose: bool = False,
18
silent_mode: bool = False,
19
draw_options: Optional[Dict[str, Any]] = None,
20
file_path: str = "."
21
):
22
"""
23
The entry point of SQLLineage after command line options are parsed.
24
25
Parameters:
26
- sql: a string representation of SQL statements
27
- dialect: SQL dialect name (default: "ansi")
28
- metadata_provider: provider for table/column metadata
29
- verbose: enable verbose output for debugging
30
- silent_mode: skip unsupported SQL statements silently
31
- draw_options: options for visualization (optional)
32
- file_path: file path context for relative references
33
"""
34
```
35
36
### Statement Analysis
37
38
Methods for accessing the parsed SQL statements and their components.
39
40
```python { .api }
41
def statements(self) -> List[str]:
42
"""Get list of individual SQL statements"""
43
44
@staticmethod
45
def supported_dialects() -> Dict[str, List[str]]:
46
"""Get mapping of parser names to supported SQL dialects"""
47
```
48
49
### Table-Level Lineage
50
51
Properties for accessing table-level lineage information including source, target, and intermediate tables identified in the SQL analysis.
52
53
```python { .api }
54
@property
55
def source_tables(self) -> List[Table]:
56
"""Get list of source tables (tables being read from)"""
57
58
@property
59
def target_tables(self) -> List[Table]:
60
"""Get list of target tables (tables being written to)"""
61
62
@property
63
def intermediate_tables(self) -> List[Table]:
64
"""Get list of intermediate tables (CTEs, temp tables, etc.)"""
65
```
66
67
### Column-Level Lineage
68
69
Methods for extracting column-level dependencies and relationships between source and target columns.
70
71
```python { .api }
72
def get_column_lineage(
73
self,
74
exclude_path_ending_in_subquery: bool = True,
75
exclude_subquery_columns: bool = False
76
) -> List[Tuple[Column, Column]]:
77
"""
78
Get column-level lineage as list of (source_column, target_column) tuples.
79
80
Parameters:
81
- exclude_path_ending_in_subquery: exclude paths ending in subqueries
82
- exclude_subquery_columns: exclude columns from subqueries
83
84
Returns:
85
List of tuples where each tuple contains:
86
- source_column: Column object representing the source
87
- target_column: Column object representing the target
88
"""
89
```
90
91
### Output and Visualization
92
93
Methods for generating various output formats and launching interactive visualizations.
94
95
```python { .api }
96
def print_table_lineage(self) -> None:
97
"""Print table-level lineage to stdout in human-readable format"""
98
99
def print_column_lineage(self) -> None:
100
"""Print column-level lineage to stdout in human-readable format"""
101
102
def to_cytoscape(self, level: LineageLevel = LineageLevel.TABLE) -> List[Dict[str, Dict[str, str]]]:
103
"""
104
Export lineage graph to Cytoscape.js format for web visualization.
105
106
Parameters:
107
- level: lineage level (LineageLevel.TABLE or LineageLevel.COLUMN)
108
109
Returns:
110
List of dictionaries representing nodes and edges in Cytoscape format
111
"""
112
113
def draw(self) -> None:
114
"""
115
Launch interactive web visualization of the lineage graph.
116
Opens a web browser with an interactive graph interface.
117
"""
118
```
119
120
## Usage Examples
121
122
### Basic Table Lineage
123
124
```python
125
from sqllineage.runner import LineageRunner
126
127
sql = """
128
CREATE TABLE staging.customer_summary AS
129
SELECT
130
c.customer_id,
131
c.customer_name,
132
COUNT(o.order_id) as order_count,
133
SUM(o.total_amount) as total_spent
134
FROM raw.customers c
135
LEFT JOIN raw.orders o ON c.customer_id = o.customer_id
136
GROUP BY c.customer_id, c.customer_name
137
"""
138
139
runner = LineageRunner(sql)
140
print("Sources:", [str(t) for t in runner.source_tables])
141
print("Targets:", [str(t) for t in runner.target_tables])
142
```
143
144
### Column-Level Lineage with Metadata
145
146
```python
147
from sqllineage.runner import LineageRunner
148
from sqllineage.core.metadata.dummy import DummyMetaDataProvider
149
150
# Provide schema metadata for more accurate column lineage
151
metadata = {
152
"raw.customers": ["customer_id", "customer_name", "email", "created_date"],
153
"raw.orders": ["order_id", "customer_id", "total_amount", "order_date"]
154
}
155
metadata_provider = DummyMetaDataProvider(metadata)
156
157
runner = LineageRunner(sql, metadata_provider=metadata_provider)
158
159
# Get detailed column lineage
160
for src_col, tgt_col in runner.get_column_lineage():
161
print(f"{src_col} -> {tgt_col}")
162
```
163
164
### Multi-Dialect Support
165
166
```python
167
# Analyze Snowflake SQL
168
snowflake_sql = """
169
CREATE OR REPLACE TABLE analytics.daily_sales AS
170
SELECT
171
DATE_TRUNC('DAY', order_timestamp) as sale_date,
172
SUM(amount) as daily_revenue
173
FROM raw.transactions
174
WHERE order_timestamp >= CURRENT_DATE - INTERVAL '30 days'
175
GROUP BY DATE_TRUNC('DAY', order_timestamp)
176
"""
177
178
runner = LineageRunner(snowflake_sql, dialect="snowflake")
179
print("Supported dialects:", runner.supported_dialects())
180
```
181
182
### Visualization and Export
183
184
```python
185
# Launch interactive web visualization
186
runner.draw()
187
188
# Export to Cytoscape format for custom visualization
189
cyto_data = runner.to_cytoscape()
190
print("Graph data:", cyto_data)
191
192
# Print human-readable lineage reports
193
runner.print_table_lineage()
194
runner.print_column_lineage()
195
```
196
197
### Error Handling and Silent Mode
198
199
```python
200
complex_sql = """
201
-- Mix of supported and unsupported statements
202
CREATE VIEW sales_view AS SELECT * FROM sales;
203
GRANT SELECT ON sales_view TO analyst_role; -- Unsupported
204
DROP TABLE old_temp_table;
205
"""
206
207
# Silent mode skips unsupported statements
208
runner = LineageRunner(complex_sql, silent_mode=True)
209
print("Analysis completed, unsupported statements skipped")
210
211
# Verbose mode for debugging
212
runner_verbose = LineageRunner(complex_sql, verbose=True)
213
```