
tessl/maven-org-apache-flink--flink-sql-connector-hbase-2-2-2-11

Apache Flink SQL connector that enables seamless integration with HBase 2.2.x databases through Flink's Table API and SQL interface

Workspace: tessl
Visibility: Public
Describes: pkg:maven/org.apache.flink/flink-sql-connector-hbase-2.2_2.11@1.14.x (Maven)

To install, run

npx @tessl/cli install tessl/maven-org-apache-flink--flink-sql-connector-hbase-2-2-2-11@1.14.0


Apache Flink HBase 2.2 SQL Connector

Apache Flink SQL connector that enables seamless integration with HBase 2.2.x databases through Flink's Table API and SQL interface. This shaded JAR bundles all necessary HBase client dependencies with proper relocation to avoid classpath conflicts in Flink deployments.

Package Information

  • Package Name: flink-sql-connector-hbase-2.2_2.11
  • Group ID: org.apache.flink
  • Language: Java
  • Maven Installation: Add to your project dependencies:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-sql-connector-hbase-2.2_2.11</artifactId>
    <version>1.14.6</version>
</dependency>

For runtime usage (when running Flink applications):

# Copy to Flink lib directory
cp flink-sql-connector-hbase-2.2_2.11-1.14.6.jar $FLINK_HOME/lib/

Core Usage

The connector integrates with Flink SQL through DDL statements. No explicit Java imports are needed for SQL usage.

Basic Usage

-- Create HBase table mapping
CREATE TABLE hbase_table (
    rowkey STRING,
    info ROW<name STRING, age INT>,
    `data` ROW<`value` DOUBLE, `timestamp` BIGINT>,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'my_hbase_table',
    'zookeeper.quorum' = 'localhost:2181'
);

-- Insert data (UPSERT operation)
INSERT INTO hbase_table VALUES (
    'user_001',
    ROW('Alice', 25),
    ROW(95.5, 1234567890)
);

-- Query data
SELECT rowkey, info.name, info.age, `data`.`value`
FROM hbase_table 
WHERE info.age > 18;

-- Lookup join with another table
SELECT o.order_id, u.info.name, u.info.age
FROM orders o
JOIN hbase_table FOR SYSTEM_TIME AS OF o.proc_time AS u
ON o.user_id = u.rowkey;

Architecture

The connector architecture consists of several key components:

  • Factory System: HBase2DynamicTableFactory provides the main entry point for Flink's connector discovery mechanism
  • Table Sources/Sinks: Support both batch scanning and streaming lookup operations with UPSERT capabilities (INSERT, UPDATE_AFTER, DELETE); see the sketch after this list
  • Serialization Layer: Handles type mapping between Flink's type system and HBase's byte storage format
  • Configuration System: Comprehensive options for connection, caching, buffering, and performance tuning
  • Shaded Dependencies: All HBase client libraries are relocated to prevent classpath conflicts
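
To make the UPSERT behavior concrete, here is a minimal sketch reusing the hbase_table definition from Basic Usage: a second write with the same rowkey is applied as an HBase Put that overwrites the stored cells, and a retracted (DELETE) row is applied as an HBase Delete.

-- Reuses hbase_table from the Basic Usage example above
INSERT INTO hbase_table VALUES (
    'user_001',
    ROW('Alice', 25),
    ROW(95.5, 1234567890)
);

-- A second write with the same rowkey overwrites the existing cells
-- rather than creating a duplicate row
INSERT INTO hbase_table VALUES (
    'user_001',
    ROW('Alice', 26),
    ROW(97.0, 1234567999)
);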

Capabilities

SQL DDL and Table Operations

Core SQL operations for creating HBase table mappings, defining column families and qualifiers, and managing table schemas with proper data type mappings.

CREATE TABLE table_name (
    rowkey STRING,
    family_name ROW<qualifier_name DATA_TYPE, ...>,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'hbase_table_name'
    -- ... additional connection and configuration options
);
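
To make the family/qualifier mapping concrete, here is a minimal sketch (table and column names are illustrative): each top-level ROW column maps to an HBase column family, and each field inside that ROW maps to a qualifier within the family.

-- Maps to HBase table 'profiles' with column families 'info' and 'stats'
CREATE TABLE profiles (
    rowkey STRING,                                     -- HBase row key
    info ROW<name STRING, email STRING>,               -- cells info:name, info:email
    stats ROW<logins BIGINT, last_seen TIMESTAMP(3)>,  -- cells stats:logins, stats:last_seen
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'profiles',
    'zookeeper.quorum' = 'localhost:2181'
);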

See: SQL DDL and Operations (docs/sql-ddl.md)

Connection and Configuration

Comprehensive configuration options for HBase connections, Zookeeper settings, performance tuning, and operational parameters.

// Key configuration options available via WITH clause
'connector' = 'hbase-2.2'                           // Required: Connector identifier
'table-name' = 'hbase_table_name'                   // Required: HBase table name
'zookeeper.quorum' = 'host1:2181,host2:2181'       // Required: HBase Zookeeper quorum
'zookeeper.znode.parent' = '/hbase'                 // Zookeeper root (default: '/hbase')
'null-string-literal' = 'null'                     // Null representation (default: 'null')
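
A hedged example pulling these options together (host names, table name, and the znode path are placeholders):

CREATE TABLE hbase_orders (
    rowkey STRING,
    details ROW<amount DECIMAL(10, 2), order_status STRING>,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'orders',
    'zookeeper.quorum' = 'zk1:2181,zk2:2181,zk3:2181',
    'zookeeper.znode.parent' = '/hbase',
    'null-string-literal' = 'NULL'     -- string fields equal to 'NULL' are read/written as SQL NULL
);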

See: Connection and Configuration (docs/connection-config.md)

Lookup Operations and Caching

Temporal table join functionality with both synchronous and asynchronous lookup modes, including comprehensive caching strategies for performance optimization.

// Lookup configuration options
'lookup.async' = 'false'                           // Enable async lookup (default: false)
'lookup.cache.max-rows' = '1000'                   // Cache size (default: -1, disabled)
'lookup.cache.ttl' = '60s'                         // Cache TTL (default: 0)
'lookup.max-retries' = '3'                         // Max retry attempts (default: 3)
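
A sketch of an asynchronous, cached lookup table built from these options (the orders stream and its proc_time processing-time attribute are assumed to be defined elsewhere):

CREATE TABLE user_profiles (
    rowkey STRING,
    info ROW<name STRING, tier STRING>,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'user_profiles',
    'zookeeper.quorum' = 'localhost:2181',
    'lookup.async' = 'true',              -- use asynchronous lookups
    'lookup.cache.max-rows' = '10000',    -- cache up to 10,000 looked-up rows
    'lookup.cache.ttl' = '10min',         -- expire cached rows after 10 minutes
    'lookup.max-retries' = '3'
);

-- Enrich the orders stream against the cached HBase table
SELECT o.order_id, p.info.name, p.info.tier
FROM orders AS o
JOIN user_profiles FOR SYSTEM_TIME AS OF o.proc_time AS p
ON o.user_id = p.rowkey;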

See: Lookup Operations (docs/lookup-operations.md)

Sink Operations and Buffering

UPSERT sink operations with intelligent buffering strategies, exactly-once semantics through checkpointing, and comprehensive error handling.

// Sink configuration options
'sink.buffer-flush.max-size' = '2mb'               // Buffer size threshold (default: 2MB)
'sink.buffer-flush.max-rows' = '1000'              // Buffer row count (default: 1000)
'sink.buffer-flush.interval' = '1s'                // Flush interval (default: 1s)
'sink.parallelism' = '1'                           // Sink parallelism
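
A sketch of a buffered UPSERT sink using these options (source_events, with columns user_id STRING, event_count BIGINT, and event_time TIMESTAMP(3), is an assumed upstream table):

CREATE TABLE hbase_metrics (
    rowkey STRING,
    metrics ROW<total BIGINT, updated_at TIMESTAMP(3)>,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'metrics',
    'zookeeper.quorum' = 'localhost:2181',
    'sink.buffer-flush.max-size' = '2mb',    -- flush when buffered mutations reach 2 MB
    'sink.buffer-flush.max-rows' = '1000',   -- ...or when 1000 rows are buffered
    'sink.buffer-flush.interval' = '1s',     -- ...or at least once per second
    'sink.parallelism' = '2'
);

-- Rows are written as UPSERTs keyed by rowkey
INSERT INTO hbase_metrics
SELECT user_id, ROW(event_count, event_time)
FROM source_events;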

See: Sink Operations (docs/sink-operations.md)

Data Types and Schema Mapping

Complete data type mapping between Flink's type system and HBase storage format, including support for complex nested types and proper serialization handling.

// Supported Flink data types mapping to HBase bytes
BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT        // Boolean and integer types
FLOAT, DOUBLE, DECIMAL                             // Floating point and exact decimal types
CHAR, VARCHAR                                      // String types
BINARY, VARBINARY                                  // Binary types
DATE, TIME, TIMESTAMP                              // Temporal types
ROW<field_name field_type, ...>                   // Nested row types for column families
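
A sketch exercising several of these mappings in one definition (names are illustrative; each field is serialized to HBase bytes by the connector's type-specific encoders):

CREATE TABLE typed_table (
    rowkey STRING,
    cf ROW<
        flag BOOLEAN,
        small_num SMALLINT,
        big_num BIGINT,
        ratio DOUBLE,
        price DECIMAL(10, 2),
        text_val VARCHAR(64),
        payload VARBINARY(1024),
        created_date DATE,
        created_ts TIMESTAMP(3)
    >,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-2.2',
    'table-name' = 'typed_table',
    'zookeeper.quorum' = 'localhost:2181'
);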

See: Data Types and Schema (docs/data-types.md)

Error Handling

The connector provides robust error handling with:

  • Connection Retry Logic: Configurable retry attempts for HBase connection failures
  • Graceful Degradation: Proper handling of table not found and region server unavailability
  • Buffer Management: Overflow protection and timeout handling for sink operations
  • Checkpointing Support: Exactly-once processing guarantees through Flink's checkpoint mechanism (see the sketch after this list)
  • Validation Errors: Clear error messages for schema mismatches and configuration issues
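
As a minimal sketch of the checkpointing point above (Flink SQL client syntax; the interval is illustrative), enabling periodic checkpoints makes the HBase sink flush its buffered mutations on every checkpoint, which, combined with idempotent upserts keyed by rowkey, underpins the delivery guarantee described above:

-- Enable periodic checkpoints for the session (SQL client)
SET 'execution.checkpointing.interval' = '30s';

-- Buffered Puts/Deletes in the HBase sink are flushed on each checkpoint,
-- in addition to the size, row-count, and interval thresholds shown earlier.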