or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

docs

index.mdlookup-options.mdsink-operations.mdsource-operations.mdtable-factory.mdwrite-options.md
tile.json

tessl/maven-org-apache-flink--flink-connector-hbase-1-4-2-11

Apache Flink HBase 1.4 Connector provides bidirectional data integration between Flink streaming applications and HBase NoSQL database.

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
mavenpkg:maven/org.apache.flink/flink-connector-hbase-1.4_2.11@1.14.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-flink--flink-connector-hbase-1-4-2-11@1.14.0

index.mddocs/

Apache Flink HBase 1.4 Connector

Apache Flink HBase 1.4 Connector provides comprehensive bidirectional data integration between Apache Flink stream processing framework and HBase 1.4 NoSQL database. The connector enables reading from and writing to HBase tables through Flink's Table API and SQL, with support for exactly-once processing guarantees and high-performance stream processing applications.

Package Information

  • Package Name: flink-connector-hbase-1.4_2.11
  • Package Type: maven
  • Language: Java
  • Group ID: org.apache.flink
  • Artifact ID: flink-connector-hbase-1.4_2.11
  • Version: 1.14.6
  • Installation: Add to Maven dependencies:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-hbase-1.4_2.11</artifactId>
    <version>1.14.6</version>
</dependency>

Core Imports

import org.apache.flink.connector.hbase1.HBase1DynamicTableFactory;
import org.apache.flink.connector.hbase1.source.HBaseDynamicTableSource;
import org.apache.flink.connector.hbase1.sink.HBaseDynamicTableSink;
import org.apache.flink.connector.hbase.options.HBaseWriteOptions;
import org.apache.flink.connector.hbase.options.HBaseLookupOptions;
import org.apache.flink.connector.hbase.table.HBaseConnectorOptions;

Basic Usage

Creating HBase Table in Flink SQL

CREATE TABLE hbase_table (
    rowkey STRING,
    family1 ROW<col1 STRING, col2 BIGINT>,
    family2 ROW<col1 STRING, col2 BOOLEAN>,
    PRIMARY KEY (rowkey) NOT ENFORCED
) WITH (
    'connector' = 'hbase-1.4',
    'table-name' = 'my_hbase_table',
    'zookeeper.quorum' = 'localhost:2181'
);

Reading from HBase

SELECT rowkey, family1.col1, family2.col2
FROM hbase_table
WHERE family1.col2 > 100;

Writing to HBase

INSERT INTO hbase_table
SELECT rowkey, family1, family2
FROM source_table;

Architecture

The Apache Flink HBase 1.4 Connector is built around several key components:

  • Dynamic Table Factory: HBase1DynamicTableFactory serves as the main entry point for creating HBase table sources and sinks through Flink's Table API
  • Source Integration: HBaseDynamicTableSource and HBaseRowDataInputFormat handle reading data from HBase tables with support for region-aware splitting
  • Sink Integration: HBaseDynamicTableSink provides buffered write operations with configurable flushing strategies
  • Configuration System: Comprehensive configuration options for connection settings, performance tuning, and caching
  • Type System: Full integration with Flink's type system and automatic serialization/deserialization of HBase data

Capabilities

Table Factory and Configuration

Core factory class for creating HBase table sources and sinks with comprehensive configuration options. Handles connector registration and table creation.

public class HBase1DynamicTableFactory 
    implements DynamicTableSourceFactory, DynamicTableSinkFactory {
    
    public DynamicTableSource createDynamicTableSource(Context context);
    public DynamicTableSink createDynamicTableSink(Context context);
    public String factoryIdentifier(); // Returns "hbase-1.4"
    public Set<ConfigOption<?>> requiredOptions();
    public Set<ConfigOption<?>> optionalOptions();
}

Table Factory and Configuration

Source Operations

Reading data from HBase tables with support for batch and lookup operations, region-aware splitting, and configurable caching.

public class HBaseDynamicTableSource extends AbstractHBaseDynamicTableSource {
    public HBaseDynamicTableSource(
        Configuration conf,
        String tableName,
        HBaseTableSchema hbaseSchema,
        String nullStringLiteral,
        HBaseLookupOptions lookupOptions
    );
    
    public DynamicTableSource copy();
    public InputFormat<RowData, ?> getInputFormat();
    public HBaseLookupOptions getLookupOptions();
}

Source Operations

Sink Operations

Writing data to HBase tables with configurable buffering, batching, and exactly-once processing guarantees.

public class HBaseDynamicTableSink implements DynamicTableSink {
    public HBaseDynamicTableSink(
        String tableName,
        HBaseTableSchema hbaseTableSchema,
        Configuration hbaseConf,
        HBaseWriteOptions writeOptions,
        String nullStringLiteral
    );
    
    public SinkRuntimeProvider getSinkRuntimeProvider(Context context);
    public ChangelogMode getChangelogMode(ChangelogMode requestedMode);
    public DynamicTableSink copy();
}

Sink Operations

Write Options and Performance Tuning

Configuration options for optimizing write performance through buffering, batching, and parallelism control.

public class HBaseWriteOptions implements Serializable {
    public static Builder builder();
    
    public long getBufferFlushMaxSizeInBytes();
    public long getBufferFlushMaxRows();
    public long getBufferFlushIntervalMillis();
    public Integer getParallelism();
}

public static class Builder {
    public Builder setBufferFlushMaxSizeInBytes(long bufferFlushMaxSizeInBytes);
    public Builder setBufferFlushMaxRows(long bufferFlushMaxRows);
    public Builder setBufferFlushIntervalMillis(long bufferFlushIntervalMillis);
    public Builder setParallelism(Integer parallelism);
    public HBaseWriteOptions build();
}

Write Options and Performance

Lookup Options and Caching

Configuration for lookup join operations with caching, retry mechanisms, and async processing options.

public class HBaseLookupOptions implements Serializable {
    public static Builder builder();
    
    public long getCacheMaxSize();
    public long getCacheExpireMs();
    public int getMaxRetryTimes();
    public boolean getLookupAsync();
}

public static class Builder {
    public Builder setCacheMaxSize(long cacheMaxSize);
    public Builder setCacheExpireMs(long cacheExpireMs);
    public Builder setMaxRetryTimes(int maxRetryTimes);
    public Builder setLookupAsync(boolean lookupAsync);
    public HBaseLookupOptions build();
}

Lookup Options and Caching

Types

// Core configuration options
public class HBaseConnectorOptions {
    public static final ConfigOption<String> TABLE_NAME;
    public static final ConfigOption<String> ZOOKEEPER_QUORUM;
    public static final ConfigOption<String> ZOOKEEPER_ZNODE_PARENT;
    public static final ConfigOption<String> NULL_STRING_LITERAL;
    public static final ConfigOption<MemorySize> SINK_BUFFER_FLUSH_MAX_SIZE;
    public static final ConfigOption<Integer> SINK_BUFFER_FLUSH_MAX_ROWS;
    public static final ConfigOption<Duration> SINK_BUFFER_FLUSH_INTERVAL;
    public static final ConfigOption<Boolean> LOOKUP_ASYNC;
    public static final ConfigOption<Long> LOOKUP_CACHE_MAX_ROWS;
    public static final ConfigOption<Duration> LOOKUP_CACHE_TTL;
    public static final ConfigOption<Integer> LOOKUP_MAX_RETRIES;
    public static final ConfigOption<Integer> SINK_PARALLELISM;
}

// Input format for reading HBase data
public class HBaseRowDataInputFormat extends AbstractTableInputFormat<RowData> {
    public HBaseRowDataInputFormat(
        Configuration conf,
        String tableName,
        HBaseTableSchema schema,
        String nullStringLiteral
    );
}

// Abstract base class for HBase input formats
public abstract class AbstractTableInputFormat<T> 
    extends RichInputFormat<T, TableInputSplit> {
    
    protected abstract void initTable() throws IOException;
    protected abstract Scan getScanner();
    protected abstract String getTableName();
    protected abstract T mapResultToOutType(Result r);
}