CtrlK
BlogDocsLog inGet started
Tessl Logo

tessl/pypi-clickhouse-driver

Python driver with native interface for ClickHouse database providing high-performance connectivity and comprehensive data type support.

Pending

Quality

Pending

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Overview
Eval results
Files

data-types.mddocs/

Data Types and Columns

Comprehensive support for all ClickHouse data types including integers, floats, strings, dates, arrays, tuples, maps, and specialized types. The driver handles automatic conversion between ClickHouse types and Python objects while preserving precision and supporting complex nested data structures.

Capabilities

Integer Types

Full range of signed and unsigned integer types supporting small values to large precision integers.

# Signed integers
Int8Column     # -128 to 127
Int16Column    # -32,768 to 32,767  
Int32Column    # -2^31 to 2^31-1
Int64Column    # -2^63 to 2^63-1
Int128Column   # -2^127 to 2^127-1
Int256Column   # -2^255 to 2^255-1

# Unsigned integers  
UInt8Column    # 0 to 255
UInt16Column   # 0 to 65,535
UInt32Column   # 0 to 2^32-1
UInt64Column   # 0 to 2^64-1
UInt128Column  # 0 to 2^128-1
UInt256Column  # 0 to 2^256-1

# Python mapping
# Int8-Int64 → Python int
# UInt8-UInt64 → Python int  
# Int128, Int256, UInt128, UInt256 → Python int (arbitrary precision)

Floating Point Types

IEEE 754 floating point numbers with single and double precision.

Float32Column  # 32-bit IEEE 754 floating point
Float64Column  # 64-bit IEEE 754 floating point

# Python mapping
# Float32 → Python float (may lose precision)
# Float64 → Python float

String Types

Variable and fixed-length string types with configurable encoding support.

StringColumn      # Variable-length UTF-8 strings
FixedStringColumn # Fixed-length byte arrays

# Python mapping
# String → Python str (UTF-8 by default, configurable encoding)
# FixedString → Python str (padded with null bytes)

# Encoding configuration (client settings)
settings = {
    'strings_encoding': 'utf-8',    # String encoding (default: utf-8)
    'strings_as_bytes': False,      # Return bytes instead of str
}

Date and DateTime Types

Date and timestamp types with timezone support and extended ranges.

DateColumn       # Date from 1900-01-01 to 2299-12-31
Date32Column     # Extended date range from 1900-01-01 to 2283-11-11
DateTimeColumn   # Unix timestamp with optional timezone

# Constructor for timezone-aware datetime
DateTimeColumn(timezone_name='UTC')           # UTC timezone
DateTimeColumn(timezone_name='Europe/London') # Named timezone

# Python mapping
# Date → datetime.date
# Date32 → datetime.date  
# DateTime → datetime.datetime (timezone-aware if specified)

Boolean Type

Native boolean type support with true/false values.

BoolColumn  # Boolean true/false values

# Python mapping
# Bool → Python bool

UUID Type

Universally Unique Identifier type with automatic format handling.

UUIDColumn  # 128-bit UUID values

# Python mapping  
# UUID → uuid.UUID object
# Accepts: UUID objects, hex strings with/without hyphens

IP Address Types

Specialized types for IPv4 and IPv6 addresses with automatic validation.

IPv4Column  # 32-bit IPv4 addresses
IPv6Column  # 128-bit IPv6 addresses

# Python mapping
# IPv4 → ipaddress.IPv4Address
# IPv6 → ipaddress.IPv6Address
# Accepts: IP address objects, string representations

Decimal Types

High-precision decimal numbers with configurable precision and scale.

DecimalColumn(precision, scale)  # Fixed-point decimal numbers

# Parameters
# precision: Total number of digits (1-76)
# scale: Number of digits after decimal point (0-precision)

# Examples
DecimalColumn(10, 2)  # Up to 10 digits, 2 after decimal (money)
DecimalColumn(18, 6)  # Up to 18 digits, 6 after decimal (scientific)

# Python mapping
# Decimal → decimal.Decimal (exact precision)

Enum Types

Enumeration types mapping string values to integer codes.

EnumColumn(enum_values)  # String-to-integer enumeration

# Parameters
# enum_values: Dict mapping string names to integer values
# Example: {'red': 1, 'green': 2, 'blue': 3}

# Python mapping
# Enum → str (enum name) or int (enum value)

Array Types

Dynamic arrays containing elements of a single type with arbitrary nesting.

ArrayColumn(inner_column)  # Arrays of any ClickHouse type

# Examples
ArrayColumn(Int32Column())           # Array of integers
ArrayColumn(StringColumn())          # Array of strings  
ArrayColumn(ArrayColumn(Float64Column()))  # Nested arrays

# Python mapping
# Array → Python list
# Supports arbitrary nesting: [[1, 2], [3, 4], [5, 6]]

Tuple Types

Fixed-size heterogeneous tuples with named or unnamed elements.

TupleColumn(inner_columns)  # Fixed-size heterogeneous tuples

# Parameters
# inner_columns: List of column types for tuple elements

# Examples
TupleColumn([Int32Column(), StringColumn()])           # (int, str)
TupleColumn([Float64Column(), Float64Column()])        # (float, float) 
TupleColumn([StringColumn(), ArrayColumn(Int32Column())])  # (str, [int])

# Python mapping
# Tuple → Python tuple
# Example: (42, 'hello') for Tuple(Int32, String)

Map Types

Key-value associative containers with typed keys and values.

MapColumn(key_column, value_column)  # Key-value maps

# Parameters  
# key_column: Column type for map keys
# value_column: Column type for map values

# Examples
MapColumn(StringColumn(), Int32Column())      # Map[str, int]
MapColumn(Int32Column(), ArrayColumn(StringColumn()))  # Map[int, List[str]]

# Python mapping
# Map → Python dict
# Example: {'key1': 123, 'key2': 456}

Nested Types

Complex nested structures for hierarchical data representation.

NestedColumn(inner_columns)  # Nested table structures

# Parameters
# inner_columns: Dict mapping column names to column types

# Example
NestedColumn({
    'user_id': UInt32Column(),
    'user_name': StringColumn(),
    'user_tags': ArrayColumn(StringColumn())
})

# Python mapping
# Nested → List of dicts with consistent structure
# Example: [
#     {'user_id': 1, 'user_name': 'Alice', 'user_tags': ['admin', 'user']},
#     {'user_id': 2, 'user_name': 'Bob', 'user_tags': ['user']}
# ]

Nullable Types

Wrapper allowing NULL values for any ClickHouse type.

NullableColumn(inner_column)  # NULL-aware wrapper for any type

# Examples
NullableColumn(Int32Column())     # Nullable integer
NullableColumn(StringColumn())    # Nullable string
NullableColumn(ArrayColumn(Int32Column()))  # Nullable array

# Python mapping
# Nullable → Python value or None
# Example: 42 or None for Nullable(Int32)

Low Cardinality Types

Optimization for columns with low unique value count using dictionary encoding.

LowCardinalityColumn(inner_column)  # Dictionary-encoded optimization

# Best for columns with < 10,000 unique values
# Examples
LowCardinalityColumn(StringColumn())  # Country names, categories
LowCardinalityColumn(EnumColumn({'A': 1, 'B': 2}))  # Small enums

# Python mapping
# LowCardinality → Same as inner column type
# Transparent optimization, no API differences

JSON Type

Semi-structured JSON data with dynamic schema support.

JSONColumn  # JSON objects and arrays

# Python mapping
# JSON → Python dict, list, or primitive types
# Supports nested structures: {'key': [1, 2, {'nested': True}]}

Interval Types

Time interval types for duration calculations.

IntervalYearColumn     # Year intervals
IntervalMonthColumn    # Month intervals  
IntervalWeekColumn     # Week intervals
IntervalDayColumn      # Day intervals
IntervalHourColumn     # Hour intervals
IntervalMinuteColumn   # Minute intervals
IntervalSecondColumn   # Second intervals

# Python mapping
# Interval → datetime.timedelta (where applicable)
# Year/Month intervals → custom interval objects (no standard Python equivalent)

Special Types

Additional specialized types for specific use cases.

NothingColumn  # Represents absence of data
NullColumn     # Always NULL values

# Aggregate function storage
SimpleAggregateFunctionColumn(function_name, inner_columns)

# Python mapping
# Nothing → None
# Null → None
# SimpleAggregateFunction → Depends on function type

Geographic Type Aliases

Convenient aliases for geographic coordinate systems.

# Type aliases using Tuple compositions
Point = TupleColumn([Float64Column(), Float64Column()])           # (x, y)
Ring = ArrayColumn(Point)                                        # Array of points
Polygon = ArrayColumn(Ring)                                      # Array of rings  
MultiPolygon = ArrayColumn(Polygon)                             # Array of polygons

# Python mapping
# Point → (float, float)
# Ring → [(x1, y1), (x2, y2), ...]
# Polygon → [[(x1, y1), (x2, y2), ...], [...]]  # outer ring + holes
# MultiPolygon → [[ring1, ring2], [ring3, ring4], ...]

Type Conversion Examples

Basic Type Usage

from clickhouse_driver import Client

client = Client('localhost')

# Integer types
client.execute('CREATE TABLE test_ints (i8 Int8, u64 UInt64) ENGINE = Memory')
client.execute('INSERT INTO test_ints VALUES', [(-128, 18446744073709551615)])

# String and UUID
client.execute('''
    CREATE TABLE test_strings (
        name String,
        id UUID,
        fixed FixedString(10)
    ) ENGINE = Memory
''')

import uuid
data = [
    ('Alice', uuid.uuid4(), 'fixed_data'),
    ('Bob', uuid.uuid4(), 'test12345\x00')  # Null-padded to 10 bytes
]
client.execute('INSERT INTO test_strings VALUES', data)

Date and DateTime Usage

from datetime import date, datetime
import pytz

# Date types
client.execute('''
    CREATE TABLE test_dates (
        simple_date Date,
        extended_date Date32,
        timestamp DateTime,
        utc_time DateTime('UTC'),
        london_time DateTime('Europe/London')
    ) ENGINE = Memory
''')

utc = pytz.UTC
london = pytz.timezone('Europe/London')

data = [(
    date(2023, 1, 15),
    date(2100, 12, 31),
    datetime(2023, 1, 15, 14, 30, 0),
    datetime(2023, 1, 15, 14, 30, 0, tzinfo=utc),
    datetime(2023, 1, 15, 14, 30, 0, tzinfo=london)
)]

client.execute('INSERT INTO test_dates VALUES', data)

Complex Types Usage

# Arrays and tuples
client.execute('''
    CREATE TABLE test_complex (
        numbers Array(Int32),
        coordinates Tuple(Float64, Float64),
        nested_arrays Array(Array(String)),
        key_values Map(String, Int32)
    ) ENGINE = Memory
''')

data = [(
    [1, 2, 3, 4, 5],                                    # Array
    (40.7128, -74.0060),                               # Tuple (NYC coordinates)
    [['a', 'b'], ['c', 'd'], ['e']],                   # Nested array
    {'count': 100, 'sum': 500, 'max': 50}             # Map
)]

client.execute('INSERT INTO test_complex VALUES', data)

Nullable Types Usage

# Nullable columns
client.execute('''
    CREATE TABLE test_nullable (
        optional_number Nullable(Int32),
        optional_text Nullable(String),
        optional_array Nullable(Array(String))
    ) ENGINE = Memory
''')

data = [
    (42, 'hello', ['a', 'b']),        # All values present
    (None, 'world', None),            # Some nulls
    (100, None, ['x', 'y', 'z'])      # Mixed nulls
]

client.execute('INSERT INTO test_nullable VALUES', data)

Decimal and IP Types Usage

from decimal import Decimal
import ipaddress

client.execute('''
    CREATE TABLE test_special (
        price Decimal(10, 2),
        server_ip IPv4,
        client_ip IPv6
    ) ENGINE = Memory
''')

data = [(
    Decimal('99.99'),
    ipaddress.IPv4Address('192.168.1.1'),
    ipaddress.IPv6Address('2001:db8::1')
)]

client.execute('INSERT INTO test_special VALUES', data)

Type Introspection

# Get column types from query results
columns, rows = client.execute(
    'SELECT * FROM test_complex LIMIT 1',
    with_column_types=True
)

for name, clickhouse_type in columns:
    print(f"{name}: {clickhouse_type}")

# Output:
# numbers: Array(Int32)
# coordinates: Tuple(Float64, Float64)  
# nested_arrays: Array(Array(String))
# key_values: Map(String, Int32)

Install with Tessl CLI

npx tessl i tessl/pypi-clickhouse-driver

docs

client-interface.md

compression.md

data-types.md

dbapi-interface.md

error-handling.md

index.md

results-processing.md

tile.json