Tessl Tile for pypi/dataset@1.6.0

or run

npx @tessl/cli init

Version

Tile

Overview

Evals

Files

docs

chunked-operations.md database-operations.md index.md table-operations.md

table-operations.mddocs/

0
# Table Operations
1

2
Complete table manipulation including CRUD operations, schema management, indexing, and advanced querying capabilities. The Table class provides automatic column creation, type inference, and flexible data operations.
3

4
## Capabilities
5

6
### Data Insertion
7

8
Insert single or multiple rows with automatic schema creation and type inference.
9

10
```python { .api }
11
class Table:
12
    def insert(self, row, ensure=None, types=None):
13
        """
14
        Add a row dict by inserting it into the table.
15
        
16
        Parameters:
17
        - row: dict, data to insert
18
        - ensure: bool, create missing columns automatically (default: db.ensure_schema)
19
        - types: dict, SQLAlchemy types for columns {column_name: type}
20
        
21
        Returns:
22
        Primary key of inserted row or True
23
        """
24
        
25
    def insert_ignore(self, row, keys, ensure=None, types=None):
26
        """
27
        Add a row dict into the table if the row does not exist.
28
        
29
        Parameters:
30
        - row: dict, data to insert
31
        - keys: list, columns to check for existing rows
32
        - ensure: bool, create missing columns automatically
33
        - types: dict, SQLAlchemy types for columns
34
        
35
        Returns:
36
        Primary key of inserted row or False if exists
37
        """
38
        
39
    def insert_many(self, rows, chunk_size=1000, ensure=None, types=None):
40
        """
41
        Add many rows at a time.
42
        
43
        Parameters:
44
        - rows: list of dict, data to insert
45
        - chunk_size: int, number of rows per batch (default 1000)
46
        - ensure: bool, create missing columns automatically
47
        - types: dict, SQLAlchemy types for columns
48
        """
49
```
50

51
### Data Updates
52

53
Update existing records with flexible filtering and bulk operations.
54

55
```python { .api }
56
class Table:
57
    def update(self, row, keys, ensure=None, types=None, return_count=False):
58
        """
59
        Update rows in the table.
60
        
61
        Parameters:
62
        - row: dict, data values to update
63
        - keys: list, column names to use as filters
64
        - ensure: bool, create missing columns automatically
65
        - types: dict, SQLAlchemy types for columns
66
        - return_count: bool, return count of affected rows
67
        
68
        Returns:
69
        Number of updated rows (if supported by database)
70
        """
71
        
72
    def update_many(self, rows, keys, chunk_size=1000, ensure=None, types=None):
73
        """
74
        Update many rows in the table at a time.
75
        
76
        Parameters:
77
        - rows: list of dict, data to update
78
        - keys: list, column names to use as filters
79
        - chunk_size: int, number of rows per batch (default 1000)
80
        - ensure: bool, create missing columns automatically
81
        - types: dict, SQLAlchemy types for columns
82
        """
83
```
84

85
### Upsert Operations
86

87
Insert or update records based on key matching.
88

89
```python { .api }
90
class Table:
91
    def upsert(self, row, keys, ensure=None, types=None):
92
        """
93
        An UPSERT is a smart combination of insert and update.
94
        
95
        Parameters:
96
        - row: dict, data to insert or update
97
        - keys: list, columns to check for existing records
98
        - ensure: bool, create missing columns automatically
99
        - types: dict, SQLAlchemy types for columns
100
        
101
        Returns:
102
        Primary key of inserted row or True for updates
103
        """
104
        
105
    def upsert_many(self, rows, keys, chunk_size=1000, ensure=None, types=None):
106
        """
107
        Multiple upsert operations.
108
        
109
        Parameters:
110
        - rows: list of dict, data to upsert
111
        - keys: list, columns to check for existing records
112
        - chunk_size: int, processing batch size (default 1000)
113
        - ensure: bool, create missing columns automatically
114
        - types: dict, SQLAlchemy types for columns
115
        """
116
```
117

118
### Data Querying
119

120
Query and retrieve data with flexible filtering, ordering, and pagination.
121

122
```python { .api }
123
class Table:
124
    def find(self, *_clauses, **kwargs):
125
        """
126
        Perform a simple search on the table.
127
        
128
        Parameters:
129
        - **kwargs: column filters for equality matching
130
        - _limit: int, maximum number of rows to return
131
        - _offset: int, number of rows to skip
132
        - order_by: str or list, columns to sort by (prefix with '-' for desc)
133
        - _streamed: bool, use streaming for large result sets
134
        - _step: int, fetch step size (default 1000)
135
        
136
        Additional filter operators via dict values:
137
        - {'gt': value} or {'>': value}: greater than
138
        - {'lt': value} or {'<': value}: less than
139
        - {'gte': value} or {'>=': value}: greater than or equal
140
        - {'lte': value} or {'<=': value}: less than or equal
141
        - {'like': value}: SQL LIKE pattern matching
142
        - {'ilike': value}: case-insensitive LIKE
143
        - {'in': [values]}: value in list
144
        - {'between': [start, end]}: value between range
145
        
146
        Returns:
147
        ResultIter: Iterator over matching rows
148
        """
149
        
150
    def find_one(self, *args, **kwargs):
151
        """
152
        Get a single result from the table.
153
        
154
        Parameters: Same as find()
155
        
156
        Returns:
157
        dict-like row object or None
158
        """
159
        
160
    def count(self, *_clauses, **kwargs):
161
        """
162
        Return the count of results for the given filter set.
163
        
164
        Parameters: Same as find() but excludes _limit/_offset
165
        
166
        Returns:
167
        int: Number of matching rows
168
        """
169
        
170
    def distinct(self, *args, **_filter):
171
        """
172
        Return all unique (distinct) values for the given columns.
173
        
174
        Parameters:
175
        - *args: column names to get distinct values for
176
        - **_filter: additional filters to apply
177
        
178
        Returns:
179
        Iterator over unique combinations
180
        """
181
        
182
    # Legacy alias
183
    all = find
184
```
185

186
### Data Deletion
187

188
Remove records from the table with flexible filtering.
189

190
```python { .api }
191
class Table:
192
    def delete(self, *clauses, **filters):
193
        """
194
        Delete rows from the table.
195
        
196
        Parameters:
197
        - **filters: column-based equality filters
198
        - *clauses: additional SQLAlchemy clauses
199
        
200
        Returns:
201
        bool: True if any rows were deleted
202
        """
203
```
204

205
### Schema Management
206

207
Manage table structure with column operations and type handling.
208

209
```python { .api }
210
class Table:
211
    def has_column(self, column):
212
        """
213
        Check if a column with the given name exists on this table.
214
        
215
        Parameters:
216
        - column: str, column name
217
        
218
        Returns:
219
        bool: True if column exists
220
        """
221
        
222
    def create_column(self, name, type, **kwargs):
223
        """
224
        Create a new column of a specified type.
225
        
226
        Parameters:
227
        - name: str, column name
228
        - type: SQLAlchemy type, column data type
229
        - **kwargs: additional Column constructor arguments
230
        """
231
        
232
    def create_column_by_example(self, name, value):
233
        """
234
        Create a new column with a type appropriate for the example value.
235
        
236
        Parameters:
237
        - name: str, column name
238
        - value: sample value for type inference
239
        """
240
        
241
    def drop_column(self, name):
242
        """
243
        Drop the column (not supported on SQLite).
244
        
245
        Parameters:
246
        - name: str, column name to drop
247
        """
248
        
249
    def drop(self):
250
        """Drop the table from the database."""
251
```
252

253
### Index Management
254

255
Create and manage database indexes for query performance.
256

257
```python { .api }
258
class Table:
259
    def has_index(self, columns):
260
        """
261
        Check if an index exists to cover the given columns.
262
        
263
        Parameters:
264
        - columns: str or list, column names
265
        
266
        Returns:
267
        bool: True if suitable index exists
268
        """
269
        
270
    def create_index(self, columns, name=None, **kw):
271
        """
272
        Create an index to speed up queries on a table.
273
        
274
        Parameters:
275
        - columns: str or list, column names to index
276
        - name: str, index name (auto-generated if None)
277
        - **kw: additional Index constructor arguments
278
        """
279
```
280

281
### Table Properties
282

283
Access table metadata and structure information.
284

285
```python { .api }
286
class Table:
287
    @property
288
    def exists(self):
289
        """Check if the table currently exists in the database."""
290
        
291
    @property
292
    def columns(self):
293
        """Get a list of all column names that exist in the table."""
294
        
295
    @property
296
    def table(self):
297
        """Get a reference to the SQLAlchemy table object."""
298
        
299
    def __len__(self):
300
        """Return the number of rows in the table."""
301
        
302
    def __iter__(self):
303
        """Return all rows of the table as simple dictionaries."""
304
```
305

306
## Usage Examples
307

308
### Basic CRUD Operations
309

310
```python
311
import dataset
312

313
db = dataset.connect('sqlite:///example.db')
314
table = db['users']
315

316
# Insert data
317
user_id = table.insert({'name': 'John Doe', 'email': 'john@example.com', 'age': 30})
318
print(f"Inserted user with ID: {user_id}")
319

320
# Insert multiple records
321
users = [
322
    {'name': 'Jane Smith', 'email': 'jane@example.com', 'age': 25},
323
    {'name': 'Bob Johnson', 'email': 'bob@example.com', 'age': 35}
324
]
325
table.insert_many(users)
326

327
# Find records
328
for user in table.find(age={'gt': 25}):
329
    print(f"{user['name']} is {user['age']} years old")
330

331
# Find single record
332
user = table.find_one(name='John Doe')
333
if user:
334
    print(f"Found: {user['email']}")
335

336
# Update records
337
table.update({'age': 31}, ['name'], name='John Doe')
338

339
# Upsert (insert or update)
340
table.upsert({'name': 'Alice Wilson', 'email': 'alice@example.com', 'age': 28}, ['name'])
341

342
# Delete records
343
table.delete(age={'lt': 25})
344

345
# Count records
346
total_users = table.count()
347
adult_users = table.count(age={'gte': 18})
348
```
349

350
### Advanced Filtering
351

352
```python
353
# Multiple conditions
354
results = table.find(age={'gte': 18}, city='New York')
355

356
# Range queries
357
results = table.find(age={'between': [25, 35]})
358

359
# Pattern matching
360
results = table.find(email={'like': '%@gmail.com'})
361

362
# List membership
363
results = table.find(city={'in': ['New York', 'Boston', 'Chicago']})
364

365
# Complex conditions
366
results = table.find(
367
    age={'gte': 18},
368
    salary={'gt': 50000},
369
    department={'in': ['Engineering', 'Sales']}
370
)
371

372
# Ordering results
373
results = table.find(order_by='name')  # Ascending
374
results = table.find(order_by='-age')  # Descending
375
results = table.find(order_by=['department', '-salary'])  # Multiple columns
376

377
# Pagination
378
page_1 = table.find(_limit=10, _offset=0)
379
page_2 = table.find(_limit=10, _offset=10)
380
```
381

382
### Schema Operations
383

384
```python
385
# Check table structure
386
print(f"Table exists: {table.exists}")
387
print(f"Columns: {table.columns}")
388
print(f"Row count: {len(table)}")
389

390
# Column management
391
if not table.has_column('created_at'):
392
    table.create_column('created_at', db.types.datetime)
393

394
# Create column by example
395
table.create_column_by_example('score', 95.5)  # Creates FLOAT column
396

397
# Custom column types with constraints
398
table.create_column('status', db.types.string(20), nullable=False, default='active')
399

400
# Index creation
401
table.create_index(['email'])  # Single column
402
table.create_index(['department', 'salary'])  # Composite index
403

404
# Check for index
405
if table.has_index(['email']):
406
    print("Email column is indexed")
407
```
408

409
### Type Handling
410

411
```python
412
# Explicit type specification
413
table.insert(
414
    {'name': 'Test User', 'data': {'key': 'value'}},
415
    types={'data': db.types.json}  # Force JSON type
416
)
417

418
# Automatic type inference handles:
419
# - int/float -> appropriate numeric types
420
# - str -> text type  
421
# - dict/list -> JSON type
422
# - datetime/date objects -> datetime/date types
423
# - bool -> boolean type
424

425
# Custom row types
426
import dataset
427
from collections import namedtuple
428

429
UserRow = namedtuple('User', ['id', 'name', 'email'])
430
db = dataset.connect(row_type=UserRow)
431
```
432

433
### Performance Optimization
434

435
```python
436
# Bulk operations for better performance
437
with db:  # Use transaction
438
    for i in range(10000):
439
        table.insert({'name': f'User {i}', 'value': i})
440

441
# Chunked operations (alternative approach)
442
rows = [{'name': f'User {i}', 'value': i} for i in range(10000)]
443
table.insert_many(rows, chunk_size=1000)
444

445
# Streaming for large result sets
446
for row in table.find(_streamed=True, _step=100):
447
    process_row(row)
448

449
# Index frequently queried columns
450
table.create_index(['email'])
451
table.create_index(['created_at'])
452
```

Version

Tile

Files

table-operations.md.css-3qkkll{font-size:var(--chakra-font-sizes-sm);font-weight:var(--chakra-font-weights-normal);color:var(--chakra-colors-gray-300);}docs/

table-operations.mddocs/