Apache Flink SQL connector that enables seamless integration with HBase 2.2.x databases through Flink's Table API and SQL interface. This shaded JAR bundles all necessary HBase client dependencies with proper relocation to avoid classpath conflicts in Flink deployments.

npx @tessl/cli install tessl/maven-org-apache-flink--flink-sql-connector-hbase-2-2-2-11@1.14.0

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-sql-connector-hbase-2.2_2.11</artifactId>
  <version>1.14.6</version>
</dependency>

For runtime usage (when running Flink applications):

# Copy to Flink lib directory
cp flink-sql-connector-hbase-2.2_2.11-1.14.6.jar "$FLINK_HOME/lib/"

The connector integrates with Flink SQL through DDL statements. No explicit Java imports are needed for SQL usage.

-- Create HBase table mapping
-- (`value` and `timestamp` are reserved words in Flink SQL, so they are quoted with backticks)
CREATE TABLE hbase_table (
  rowkey STRING,
  info ROW<name STRING, age INT>,
  data ROW<`value` DOUBLE, `timestamp` BIGINT>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',
  'table-name' = 'my_hbase_table',
  'zookeeper.quorum' = 'localhost:2181'
);

-- Insert data (UPSERT operation)
INSERT INTO hbase_table VALUES (
  'user_001',
  ROW('Alice', 25),
  ROW(95.5, 1234567890)
);

-- Query data
SELECT rowkey, info.name, info.age, data.`value`
FROM hbase_table
WHERE info.age > 18;

-- Lookup join with another table
SELECT o.order_id, u.info.name, u.info.age
FROM orders o
JOIN hbase_table FOR SYSTEM_TIME AS OF o.proc_time AS u
  ON o.user_id = u.rowkey;

The connector architecture consists of several key components:
HBase2DynamicTableFactory provides the main entry point for Flink's connector discovery mechanism.

Core SQL operations for creating HBase table mappings, defining column families and qualifiers, and managing table schemas with proper data type mappings.

CREATE TABLE table_name (
  rowkey STRING,
  family_name ROW<qualifier_name DATA_TYPE, ...>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',
  'table-name' = 'hbase_table_name',
  -- connection and configuration options
);

Comprehensive configuration options for HBase connections, ZooKeeper settings, performance tuning, and operational parameters.
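
As a rough illustration of how the connection options in this section fit into a table definition, the sketch below declares a table against a three-node ZooKeeper ensemble. The table name `users_hbase`, the HBase table `users`, and the hosts `zk1` to `zk3` are placeholders, not values from the connector; only the option keys are taken from this document.

```sql
-- Hypothetical names and hosts; adjust to the actual cluster.
CREATE TABLE users_hbase (
  rowkey STRING,
  info ROW<name STRING, age INT>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',                          -- required
  'table-name' = 'users',                             -- required
  'zookeeper.quorum' = 'zk1:2181,zk2:2181,zk3:2181',  -- comma-separated host:port list
  'zookeeper.znode.parent' = '/hbase',                -- same as the default, shown for clarity
  'null-string-literal' = 'null'                      -- how NULL strings are represented
);
```

Options that are left out fall back to the defaults noted in the listing for this section.
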
-- Key configuration options available via the WITH clause
'connector' = 'hbase-2.2'                     -- Required: connector identifier
'table-name' = 'hbase_table_name'             -- Required: HBase table name
'zookeeper.quorum' = 'host1:2181,host2:2181'  -- HBase ZooKeeper quorum
'zookeeper.znode.parent' = '/hbase'           -- ZooKeeper root znode (default: '/hbase')
'null-string-literal' = 'null'                -- Null representation for strings (default: 'null')

Temporal table join functionality with both synchronous and asynchronous lookup modes, including caching strategies for performance optimization.
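
One way a lookup (dimension) table could be declared with the lookup options from this section is sketched below. The names `user_dim`, `users`, `orders`, and `zk1`, the use of the `datagen` connector for the probe side, and the specific cache values are all assumptions chosen for illustration.

```sql
-- Probe side: a hypothetical stream with a processing-time attribute.
CREATE TABLE orders (
  order_id STRING,
  user_id STRING,
  proc_time AS PROCTIME()
) WITH (
  'connector' = 'datagen'
);

-- Lookup side: HBase table with async lookups and a bounded cache.
CREATE TABLE user_dim (
  rowkey STRING,
  info ROW<name STRING, age INT>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',
  'table-name' = 'users',
  'zookeeper.quorum' = 'zk1:2181',
  'lookup.async' = 'true',
  'lookup.cache.max-rows' = '10000',
  'lookup.cache.ttl' = '10 min',
  'lookup.max-retries' = '3'
);

-- Each order is enriched with the row whose row key matches user_id.
SELECT o.order_id, u.info.name
FROM orders AS o
JOIN user_dim FOR SYSTEM_TIME AS OF o.proc_time AS u
  ON o.user_id = u.rowkey;
```

With the cache enabled, a joined row may be stale for up to the configured TTL, so the TTL is a trade-off between HBase load and freshness.
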
-- Lookup configuration options
'lookup.async' = 'false'          -- Enable async lookup (default: false)
'lookup.cache.max-rows' = '1000'  -- Maximum cached rows (default: -1, cache disabled)
'lookup.cache.ttl' = '60s'        -- Cache TTL (default: 0)
'lookup.max-retries' = '3'        -- Max retry attempts (default: 3)

UPSERT sink operations with configurable buffering, buffered writes flushed on checkpoint (at-least-once delivery, with upserts by row key making replays idempotent), and error handling.
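
The following sketch shows how the sink options from this section might be combined in practice. The table `user_sink`, the source table `staged_users` with its columns, the host `zk1`, and the chosen thresholds are hypothetical.

```sql
-- Hypothetical sink with larger buffers than the defaults.
CREATE TABLE user_sink (
  rowkey STRING,
  info ROW<name STRING, age INT>,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',
  'table-name' = 'users',
  'zookeeper.quorum' = 'zk1:2181',
  'sink.buffer-flush.max-size' = '4mb',
  'sink.buffer-flush.max-rows' = '5000',
  'sink.buffer-flush.interval' = '2s',
  'sink.parallelism' = '2'
);

-- staged_users(user_id, name, age) is assumed to exist already.
-- Qualifiers are grouped into their column family with ROW(...).
INSERT INTO user_sink
SELECT user_id, ROW(name, age)
FROM staged_users;
```

Larger buffers generally reduce the number of HBase round trips at the cost of higher write latency and more data to replay after a failure.
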
-- Sink configuration options
'sink.buffer-flush.max-size' = '2mb'   -- Buffer size threshold (default: 2mb)
'sink.buffer-flush.max-rows' = '1000'  -- Buffer row count threshold (default: 1000)
'sink.buffer-flush.interval' = '1s'    -- Flush interval (default: 1s)
'sink.parallelism' = '1'               -- Sink parallelism

Complete data type mapping between Flink's type system and HBase storage format, including support for complex nested types and proper serialization handling.
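
To make the mapping more concrete, here is a sketch of a table that mixes several of the types listed in this section across two column families. The table, family, and qualifier names are invented for illustration, and the precisions (`DECIMAL(12, 2)`, `TIMESTAMP(3)`, `VARBINARY(1024)`) are assumptions rather than connector requirements.

```sql
-- Hypothetical schema: each ROW is one column family, each field one qualifier.
CREATE TABLE typed_hbase (
  rowkey STRING,
  metrics ROW<
    active BOOLEAN,
    visits BIGINT,
    score DOUBLE,
    balance DECIMAL(12, 2)
  >,
  events ROW<
    first_seen DATE,
    last_seen TIMESTAMP(3),
    payload VARBINARY(1024)
  >,
  PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
  'connector' = 'hbase-2.2',
  'table-name' = 'typed_table',
  'zookeeper.quorum' = 'localhost:2181'
);
```

Because HBase stores every cell as raw bytes, the types declared here determine how the connector encodes and decodes each qualifier, so they should match whatever other writers of the same table use.
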
-- Supported Flink data types mapping to HBase bytes
BOOLEAN                               -- Boolean type
TINYINT, SMALLINT, INTEGER, BIGINT    -- Integer types
FLOAT, DOUBLE, DECIMAL                -- Floating-point and fixed-point numeric types
CHAR, VARCHAR                         -- String types
BINARY, VARBINARY                     -- Binary types
DATE, TIME, TIMESTAMP                 -- Temporal types
ROW<field_name field_type, ...>       -- Nested row types for column families

The connector provides robust error handling with: