# Microsoft Power BI

Power BI integration for managing datasets, triggering refreshes, and monitoring workspace operations. The provider automates dataset refreshes and workspace management through the Microsoft Graph API.

## Capabilities

### Power BI Hook

Async hook for connecting to and interacting with Power BI services through the Microsoft Graph API.

```python { .api }
class PowerBIHook(KiotaRequestAdapterHook):
    """
    Hook for Power BI operations via Microsoft Graph API.

    Provides async methods for dataset management, refresh operations,
    and workspace interactions.
    """

    async def get_refresh_history(
        self,
        dataset_id: str,
        group_id: str,
        top: int | None = None
    ) -> list[dict[str, Any]]: ...

    async def get_refresh_details_by_refresh_id(
        self,
        dataset_id: str,
        group_id: str,
        dataset_refresh_id: str
    ) -> dict[str, str]: ...

    async def trigger_dataset_refresh(
        self,
        dataset_id: str,
        group_id: str,
        request_body: dict[str, Any] | None = None
    ) -> str: ...

    async def get_workspace_list(self) -> list[str]: ...

    async def get_dataset_list(self, *, group_id: str) -> list[str]: ...

    async def cancel_dataset_refresh(
        self,
        dataset_id: str,
        group_id: str,
        dataset_refresh_id: str
    ) -> None: ...
```
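
The hook methods above can be combined into a trigger-then-poll loop. The following is a minimal sketch, not the provider's implementation: `FakePowerBIHook`, `refresh_and_wait`, the refresh ID `"refresh-001"`, and the `max_checks` parameter are hypothetical names introduced for illustration. The fake hook only mimics two of the documented method signatures so that the example runs without Airflow installed.

```python
import asyncio
from typing import Optional

# Status strings copied from PowerBIDatasetRefreshStatus in this document.
TERMINAL_STATUSES = {"Failed", "Completed"}


class FakePowerBIHook:
    """Hypothetical stand-in that mimics two PowerBIHook method signatures."""

    def __init__(self, statuses: list) -> None:
        # Each status check consumes the next scripted status.
        self._statuses = iter(statuses)

    async def trigger_dataset_refresh(
        self, dataset_id: str, group_id: str, request_body: Optional[dict] = None
    ) -> str:
        return "refresh-001"  # made-up refresh ID for illustration

    async def get_refresh_details_by_refresh_id(
        self, dataset_id: str, group_id: str, dataset_refresh_id: str
    ) -> dict:
        return {"request_id": dataset_refresh_id, "status": next(self._statuses), "error": ""}


async def refresh_and_wait(
    hook, dataset_id: str, group_id: str, check_interval: float = 0.0, max_checks: int = 10
) -> str:
    """Trigger a refresh, then poll until the status is terminal or checks run out."""
    refresh_id = await hook.trigger_dataset_refresh(dataset_id=dataset_id, group_id=group_id)
    for _ in range(max_checks):
        details = await hook.get_refresh_details_by_refresh_id(
            dataset_id=dataset_id, group_id=group_id, dataset_refresh_id=refresh_id
        )
        if details["status"] in TERMINAL_STATUSES:
            return details["status"]
        await asyncio.sleep(check_interval)
    raise TimeoutError(f"Refresh {refresh_id} did not finish after {max_checks} checks")


final_status = asyncio.run(
    refresh_and_wait(
        FakePowerBIHook(["In Progress", "In Progress", "Completed"]),
        dataset_id="dataset-id",
        group_id="group-id",
    )
)
print(final_status)
```

With the real hook, the same loop would presumably run inside an async context such as a trigger, with `check_interval` set to a value that respects API rate limits.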

### Dataset Refresh Operations

Operators for triggering and monitoring Power BI dataset refresh operations with support for asynchronous waits and status monitoring.

```python { .api }
class PowerBIDatasetRefreshOperator(BaseOperator):
    """
    Refreshes a Power BI dataset.

    Parameters:
    - dataset_id: The dataset id
    - group_id: The workspace id
    - conn_id: Connection ID for Power BI authentication
    - timeout: Time in seconds to wait for terminal status
    - check_interval: Seconds between refresh status checks
    - request_body: Additional refresh parameters
    """

    def __init__(
        self,
        *,
        dataset_id: str,
        group_id: str,
        conn_id: str = "powerbi_default",
        timeout: float = 60 * 60 * 24 * 7,
        check_interval: int = 60,
        request_body: dict[str, Any] | None = None,
        **kwargs,
    ): ...

    def execute(self, context: Context) -> None: ...
```

### Power BI Sensors

Sensors for monitoring Power BI dataset refresh status and workspace operations.

```python { .api }
class PowerBIDatasetRefreshSensor(BaseSensorOperator):
    """
    Sensor for monitoring Power BI dataset refresh completion.

    Monitors dataset refresh status until it reaches a terminal state
    (completed, failed, or cancelled).
    """

    def __init__(
        self,
        *,
        dataset_id: str,
        group_id: str,
        dataset_refresh_id: str,
        conn_id: str = "powerbi_default",
        **kwargs,
    ): ...

    def poke(self, context: Context) -> bool: ...
```

### Power BI Triggers

Deferrable triggers for async monitoring of Power BI operations.

```python { .api }
class PowerBITrigger(BaseTrigger):
    """Base trigger for Power BI async operations."""

    def __init__(
        self,
        conn_id: str,
        timeout: float,
        proxies: dict | None = None,
        api_version: str | None = None,
        **kwargs,
    ): ...

class PowerBIDatasetListTrigger(PowerBITrigger):
    """Trigger for monitoring dataset list operations."""

class PowerBIWorkspaceListTrigger(PowerBITrigger):
    """Trigger for monitoring workspace list operations."""
```

## Usage Examples

### Basic Dataset Refresh

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.microsoft.azure.operators.powerbi import PowerBIDatasetRefreshOperator

dag = DAG(
    'powerbi_refresh_example',
    default_args={'owner': 'data-team'},
    description='Refresh Power BI dataset',
    # `schedule` replaces the deprecated `schedule_interval` (Airflow 2.4+)
    schedule=timedelta(days=1),
    start_date=datetime(2024, 1, 1),
    catchup=False
)

# Trigger dataset refresh
refresh_dataset = PowerBIDatasetRefreshOperator(
    task_id='refresh_sales_dataset',
    dataset_id='12345678-1234-1234-1234-123456789012',
    group_id='87654321-4321-4321-4321-210987654321',
    conn_id='powerbi_connection',
    timeout=3600,  # 1 hour timeout
    check_interval=300,  # Check every 5 minutes
    dag=dag
)
```

### Advanced Refresh with Custom Parameters

```python
# Reuses `dag` and the PowerBIDatasetRefreshOperator import from the basic example.
# Refresh with specific tables and enhanced refresh type
advanced_refresh = PowerBIDatasetRefreshOperator(
    task_id='enhanced_dataset_refresh',
    dataset_id='12345678-1234-1234-1234-123456789012',
    group_id='87654321-4321-4321-4321-210987654321',
    request_body={
        "type": "full",
        "commitMode": "transactional",
        "objects": [
            {
                "table": "SalesData"
            },
            {
                "table": "CustomerData"
            }
        ]
    },
    conn_id='powerbi_production',
    timeout=7200,  # 2 hour timeout
    dag=dag
)
```
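
When several DAGs refresh different table subsets, the `request_body` dictionary shown above can be generated rather than written by hand. This is a small sketch under that assumption: `build_refresh_body` is a hypothetical helper, not part of the provider, and it only reproduces the keys used in the example (`type`, `commitMode`, `objects`).

```python
from typing import Any, Dict, List, Optional


def build_refresh_body(
    tables: Optional[List[str]] = None,
    refresh_type: str = "full",
    commit_mode: str = "transactional",
) -> Dict[str, Any]:
    """Build a request_body dict; `objects` is only added when tables are given."""
    body: Dict[str, Any] = {"type": refresh_type, "commitMode": commit_mode}
    if tables:
        body["objects"] = [{"table": name} for name in tables]
    return body


body = build_refresh_body(["SalesData", "CustomerData"])
print(body)
```

The result can be passed directly as `request_body=body` to `PowerBIDatasetRefreshOperator`. Calling the helper without tables omits `objects`, leaving only the refresh type and commit mode.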

### Monitoring Dataset Refresh Status

```python
from airflow.providers.microsoft.azure.sensors.powerbi import PowerBIDatasetRefreshSensor

# Reuses `dag` and `refresh_dataset` from the basic example.
# Wait for dataset refresh completion
wait_for_refresh = PowerBIDatasetRefreshSensor(
    task_id='wait_for_refresh_completion',
    dataset_id='12345678-1234-1234-1234-123456789012',
    group_id='87654321-4321-4321-4321-210987654321',
    # Assumes the upstream task exposes the refresh ID as its XCom return value;
    # adjust the pull (for example, its key) if the ID is stored elsewhere.
    dataset_refresh_id='{{ ti.xcom_pull(task_ids="refresh_sales_dataset") }}',
    conn_id='powerbi_connection',
    timeout=3600,  # Give up after 1 hour
    poke_interval=300,  # Check every 5 minutes
    dag=dag
)

refresh_dataset >> wait_for_refresh
```

## Authentication and Connection

The Power BI integration supports the following Microsoft Graph API authentication methods:

- **Service Principal**: Client ID, client secret, and tenant ID
- **Managed Identity**: For Azure-hosted environments
- **Interactive Authentication**: For development scenarios

The connection must be granted the Microsoft Graph API permissions required for Power BI operations, including `Dataset.ReadWrite.All` and `Workspace.Read.All`.
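
For the service principal method, one way to supply the credentials is Airflow's JSON connection format in an `AIRFLOW_CONN_<CONN_ID>` environment variable. The sketch below rests on several assumptions that should be checked against your provider version: that the connection type is `powerbi`, that the client ID and secret map to `login` and `password`, and that the tenant ID is read from an extra field named `tenant_id`. The angle-bracket values are placeholders.

```python
import json
import os

# Assumed field mapping for a service principal connection (verify before use).
connection = {
    "conn_type": "powerbi",                 # assumption: connection type name
    "login": "<client-id>",                 # assumption: client ID stored as login
    "password": "<client-secret>",          # assumption: client secret stored as password
    "extra": {"tenant_id": "<tenant-id>"},  # assumption: tenant ID read from extras
}

# Airflow resolves the connection ID `powerbi_default` from this variable name.
os.environ["AIRFLOW_CONN_POWERBI_DEFAULT"] = json.dumps(connection)
print(os.environ["AIRFLOW_CONN_POWERBI_DEFAULT"])
```

In production, the secret would normally come from a secrets backend rather than being set in code.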

## Status Constants and Error Handling

```python { .api }
class PowerBIDatasetRefreshStatus:
    """Power BI dataset refresh status constants."""
    IN_PROGRESS: str = "In Progress"
    FAILED: str = "Failed"
    COMPLETED: str = "Completed"
    DISABLED: str = "Disabled"

    TERMINAL_STATUSES: set[str] = {FAILED, COMPLETED}
    FAILURE_STATUSES: set[str] = {FAILED, DISABLED}

class PowerBIDatasetRefreshFields:
    """Power BI refresh dataset detail field names."""
    REQUEST_ID: str = "request_id"
    STATUS: str = "status"
    ERROR: str = "error"

class PowerBIDatasetRefreshException(AirflowException):
    """Exception raised when Power BI dataset refresh fails."""
```
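
The two status sets support a simple decision rule: raise on a failure status, report completion on a terminal status, and keep waiting otherwise. The sketch below illustrates that rule with local copies of the constants so it runs without Airflow. `RefreshFailed` is a hypothetical stand-in for `PowerBIDatasetRefreshException`, and `is_refresh_done` is an illustrative helper, not a provider function.

```python
class PowerBIDatasetRefreshStatus:
    """Local copy of the status constants documented above."""
    IN_PROGRESS = "In Progress"
    FAILED = "Failed"
    COMPLETED = "Completed"
    DISABLED = "Disabled"

    TERMINAL_STATUSES = {FAILED, COMPLETED}
    FAILURE_STATUSES = {FAILED, DISABLED}


class RefreshFailed(Exception):
    """Hypothetical stand-in for PowerBIDatasetRefreshException."""


def is_refresh_done(details: dict) -> bool:
    """Return True when the refresh completed; raise if it failed or is disabled."""
    status = details["status"]
    if status in PowerBIDatasetRefreshStatus.FAILURE_STATUSES:
        raise RefreshFailed(f"Refresh ended with status {status!r}: {details.get('error')}")
    return status in PowerBIDatasetRefreshStatus.TERMINAL_STATUSES


print(is_refresh_done({"status": "In Progress"}))
print(is_refresh_done({"status": "Completed"}))
```

Checking the failure set first matters because `Failed` belongs to both sets, while `Disabled` is a failure that is not listed as terminal.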