# Table Operations

Core table management: creation, reading, and metadata access. The `DeltaTable` class is the primary interface for working with Delta Lake tables across storage backends.

## Capabilities

### DeltaTable Creation and Initialization

```python { .api }
class DeltaTable:
    def __init__(
        self,
        table_uri: str | Path | os.PathLike[str],
        version: int | None = None,
        storage_options: dict[str, str] | None = None,
        without_files: bool = False,
        log_buffer_size: int | None = None,
    ) -> None: ...
```

**Parameters:**

- `table_uri`: Location of the Delta table (local path, or an S3, Azure, or GCS URI)
- `version`: Specific version to load (`None` loads the latest)
- `storage_options`: Backend-specific configuration (credentials, endpoints)
- `without_files`: Load metadata only and skip file tracking, which reduces memory use
- `log_buffer_size`: Number of files to buffer when reading the transaction log

### Table Creation

```python { .api }
@classmethod
def create(
    cls,
    table_uri: str | Path,
    schema: Schema | ArrowSchemaExportable,
    mode: Literal["error", "append", "overwrite", "ignore"] = "error",
    partition_by: list[str] | str | None = None,
    name: str | None = None,
    description: str | None = None,
    configuration: Mapping[str, str | None] | None = None,
    storage_options: dict[str, str] | None = None
) -> DeltaTable: ...
```

Creates a new Delta table with the specified schema and configuration. `mode` controls what happens when a table already exists at `table_uri`; the default, `"error"`, raises.

### Table Detection

```python { .api }
@staticmethod
def is_deltatable(
    table_uri: str,
    storage_options: dict[str, str] | None = None
) -> bool: ...
```

Checks whether a Delta table exists at the specified location.
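
Detection and creation are often combined into an "open if present, otherwise create" step. The sketch below is one way to write that, not a library feature: `open_or_create` and its `table_cls` parameter are hypothetical names, and `table_cls` exists only so the logic can be exercised with a stand-in class. When it is left as `None`, the sketch assumes the `deltalake` package is installed.

```python
from typing import Any


def open_or_create(table_uri: str, schema: Any, storage_options=None, table_cls=None):
    """Open the Delta table at table_uri, creating it first if none exists.

    Hypothetical helper: the check and the create are two separate calls,
    so this is not atomic if several processes race on the same location.
    """
    if table_cls is None:
        # Imported lazily so the helper can be tried with a stand-in class.
        from deltalake import DeltaTable as table_cls

    if table_cls.is_deltatable(table_uri, storage_options):
        return table_cls(table_uri, storage_options=storage_options)
    return table_cls.create(table_uri, schema=schema, storage_options=storage_options)
```

Because the check and the create are separate calls, concurrent callers can still collide; the default `mode="error"` on `create` then surfaces the conflict instead of hiding it.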

### Table Properties and Metadata

```python { .api }
def version(self) -> int: ...

@property
def table_uri(self) -> str: ...

@property
def table_config(self) -> DeltaTableConfig: ...

def schema(self) -> Schema: ...

def metadata(self) -> Metadata: ...

def protocol(self) -> ProtocolVersions: ...

def files(self, partition_filters: list[tuple[str, str, str | list[str]]] | None = None) -> list[str]: ...

def partitions(self, partition_filters: list[tuple[str, str, Any]] | None = None) -> list[dict[str, str]]: ...

def history(self, limit: int | None = None) -> list[dict[str, Any]]: ...
```
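
The `partition_filters` argument of `files()` and `partitions()` takes a list of `(column, operator, value)` tuples. The sketch below builds such a list from keyword arguments. `equals_filters` is a hypothetical helper, not part of the library; it assumes the `=` and `in` operators, that values are passed as strings (as the `str | list[str]` annotation on `files()` suggests), and that the tuples in one list are combined with AND.

```python
def equals_filters(**partition_values) -> list:
    """Build partition filters from column=value keyword arguments.

    Hypothetical helper: a scalar becomes an "=" filter, a list or tuple
    becomes an "in" filter, and every value is converted to a string.
    """
    filters = []
    for column, value in partition_values.items():
        if isinstance(value, (list, tuple)):
            filters.append((column, "in", [str(v) for v in value]))
        else:
            filters.append((column, "=", str(value)))
    return filters


# Possible usage with a loaded table `dt` partitioned by "age":
#   dt.files(partition_filters=equals_filters(age=30))
#   dt.partitions(partition_filters=equals_filters(age=[30, 31]))
```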

### Version Management

```python { .api }
def load_as_version(self, version: int | str | datetime) -> None: ...

def get_latest_version(self) -> int: ...

def transaction_version(self, app_id: str) -> int | None: ...

def update_incremental(self) -> None: ...
```

These methods load and move between table versions for time-travel queries. `load_as_version` accepts a version number, a datetime string, or a `datetime` object.
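
Since `load_as_version` changes the table object in place, it is easy to leave a table pinned to an old version by accident. A context manager is one way to guard against that. The sketch below is an illustration under assumptions: `at_version` and `loaded_version` are hypothetical names, and `loaded_version` accepts `version` exposed either as a method or as a property so the sketch does not depend on which form is in use.

```python
from contextlib import contextmanager


def loaded_version(dt) -> int:
    """Return the currently loaded version (works for a method or a property)."""
    value = dt.version
    return value() if callable(value) else value


@contextmanager
def at_version(dt, version):
    """Temporarily load `version`, restoring the previously loaded one on exit."""
    previous = loaded_version(dt)
    dt.load_as_version(version)
    try:
        yield dt
    finally:
        # Runs even if the body raises, so the table is never left pinned.
        dt.load_as_version(previous)


# Possible usage with a loaded table `dt`:
#   with at_version(dt, 0) as old:
#       first_snapshot = old.to_pandas()
```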

### Metadata Classes

```python { .api }
@dataclass
class Metadata:
    def __init__(self, table: RawDeltaTable) -> None: ...

    @property
    def id(self) -> int: ...

    @property
    def name(self) -> str: ...

    @property
    def description(self) -> str: ...

    @property
    def partition_columns(self) -> list[str]: ...

    @property
    def created_time(self) -> int: ...

    @property
    def configuration(self) -> dict[str, str]: ...

class ProtocolVersions(NamedTuple):
    min_reader_version: int
    min_writer_version: int
    writer_features: list[str] | None
    reader_features: list[str] | None

class DeltaTableConfig(NamedTuple):
    without_files: bool
    log_buffer_size: int
```
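
The fields of `ProtocolVersions` can be used to decide whether a given reader is able to open a table. The sketch below shows one such check under stated assumptions: `reader_can_open` is a hypothetical helper, `ProtocolInfo` is a stand-in with the same fields as `ProtocolVersions` so the example runs on its own, and the feature name used in the usage lines is illustrative.

```python
from typing import NamedTuple, Optional


class ProtocolInfo(NamedTuple):
    """Stand-in mirroring the fields of ProtocolVersions (for illustration only)."""
    min_reader_version: int
    min_writer_version: int
    writer_features: Optional[list]
    reader_features: Optional[list]


def reader_can_open(protocol, max_reader_version: int, supported_features=()) -> bool:
    """Return True if a reader with the given capabilities meets the table's needs."""
    if protocol.min_reader_version > max_reader_version:
        return False
    # reader_features may be None when the table lists no explicit features.
    required = set(protocol.reader_features or [])
    return required.issubset(supported_features)


# Possible usage with a loaded table `dt`:
#   reader_can_open(dt.protocol(), max_reader_version=3,
#                   supported_features={"deletionVectors"})
```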

## Usage Examples

### Basic Table Operations

```python
from deltalake import DeltaTable, Schema, Field
from deltalake.schema import PrimitiveType

# Define the table schema
schema = Schema([
    Field("id", PrimitiveType("integer"), nullable=False),
    Field("name", PrimitiveType("string"), nullable=True),
    Field("age", PrimitiveType("integer"), nullable=True)
])

# Create the table, partitioned by "age"; mode="error" fails if one already exists
dt = DeltaTable.create(
    "path/to/table",
    schema=schema,
    mode="error",
    partition_by=["age"]
)

# Load an existing table
dt = DeltaTable("path/to/existing-table")

# Check table properties
print(f"Table version: {dt.version()}")
print(f"Table URI: {dt.table_uri}")
print(f"Schema: {dt.schema()}")
print(f"Files: {len(dt.files())}")

# Get metadata
metadata = dt.metadata()
print(f"Table ID: {metadata.id}")
print(f"Partition columns: {metadata.partition_columns}")
```

### Working with Versions

```python
# `dt` is a DeltaTable loaded as in the previous example.
# Remember the version that is currently loaded
current_version = dt.version()

# Load a specific version for time travel
dt.load_as_version(0)  # First version
historical_data = dt.to_pandas()

# Return to the version that was loaded before
dt.load_as_version(current_version)

# View history
history = dt.history(limit=10)
for commit in history:
    print(f"Version {commit['version']}: {commit.get('operation', 'unknown')}")
```
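
`history()` returns plain dictionaries, so standard-library tools are enough to summarize it. As a small illustration, the sketch below counts commits per operation. `operations_summary` is a hypothetical helper, and the `"operation"` key is the same one read with `.get()` in the loop above.

```python
from collections import Counter


def operations_summary(history: list) -> Counter:
    """Count commits per operation name; entries without one count as 'unknown'."""
    return Counter(commit.get("operation", "unknown") for commit in history)


# Possible usage with a loaded table `dt`:
#   for operation, count in operations_summary(dt.history(limit=100)).most_common():
#       print(f"{operation}: {count}")
```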

### Storage Backend Configuration

```python
# S3 configuration
s3_options = {
    "AWS_REGION": "us-west-2",
    "AWS_ACCESS_KEY_ID": "your-key",
    "AWS_SECRET_ACCESS_KEY": "your-secret"
}

dt = DeltaTable("s3://bucket/path/to/table", storage_options=s3_options)

# Azure configuration
azure_options = {
    "AZURE_STORAGE_ACCOUNT_NAME": "account",
    "AZURE_STORAGE_ACCESS_KEY": "key"
}

dt = DeltaTable(
    "abfss://container@account.dfs.core.windows.net/path",
    storage_options=azure_options,
)
```
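
Hard-coded credentials like the placeholders above are fine for a demonstration but should not appear in real code. One option is to read the same keys from the environment and pass only those that are set. The sketch below assumes the credentials are exported under the same names used as `storage_options` keys; `storage_options_from_env` is a hypothetical helper, not part of the library.

```python
import os


def storage_options_from_env(keys, environ=None) -> dict:
    """Return a storage_options dict holding only the requested keys that are set.

    `environ` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    source = os.environ if environ is None else environ
    return {key: source[key] for key in keys if key in source}


# Possible usage:
#   s3_options = storage_options_from_env(
#       ["AWS_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
#   )
#   dt = DeltaTable("s3://bucket/path/to/table", storage_options=s3_options)
```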

## Advanced Classes

### TableFeatures

```python { .api }
class TableFeatures(Enum):
    ColumnMapping = "ColumnMapping"
    DeletionVectors = "DeletionVectors"
    TimestampWithoutTimezone = "TimestampWithoutTimezone"
    V2Checkpoint = "V2Checkpoint"
    AppendOnly = "AppendOnly"
    Invariants = "Invariants"
    CheckConstraints = "CheckConstraints"
    ChangeDataFeed = "ChangeDataFeed"
    GeneratedColumns = "GeneratedColumns"
    IdentityColumns = "IdentityColumns"
    RowTracking = "RowTracking"
    DomainMetadata = "DomainMetadata"
    IcebergCompatV1 = "IcebergCompatV1"
```

Enumeration of Delta Lake table features that can be enabled on tables to extend the base Delta protocol.

### Transaction

```python { .api }
class Transaction:
    def __init__(
        self,
        app_id: str,
        version: int,
        last_updated: int | None = None
    ) -> None: ...

    app_id: str
    version: int
    last_updated: int | None
```

Represents an application transaction on a Delta Lake table: an application identifier (`app_id`), an application-defined `version`, and an optional `last_updated` value. Writers use it to coordinate repeated or concurrent operations, for example by checking `transaction_version(app_id)` before re-applying work.
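
One possible use of these values is a guard against applying the same batch twice: the application numbers its batches, and before writing it compares the batch number with what `transaction_version(app_id)` reports. The sketch below covers only that comparison. `already_applied` is a hypothetical helper, and it assumes that `transaction_version` returns `None` when nothing has been recorded for the given `app_id`. How a `Transaction` is attached to a commit is outside this section.

```python
def already_applied(dt, app_id: str, batch_version: int) -> bool:
    """Return True if the table already records batch_version (or later) for app_id."""
    recorded = dt.transaction_version(app_id)  # assumed None if nothing is recorded
    return recorded is not None and recorded >= batch_version


# Possible usage with a loaded table `dt`:
#   if not already_applied(dt, "nightly-ingest", batch_version=42):
#       ...  # write the batch and record the transaction with the commit
```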