or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

docs

deserialization.mdindex.mdsink.mdsource.md
tile.json

tessl/maven-org-apache-flink--flink-connector-gcp-pubsub-2-11

Apache Flink connector for Google Cloud Pub/Sub that provides streaming data integration capabilities

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
mavenpkg:maven/org.apache.flink/flink-connector-gcp-pubsub_2.11@1.14.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-flink--flink-connector-gcp-pubsub-2-11@1.14.0

index.mddocs/

Apache Flink GCP Pub/Sub Connector

The Apache Flink GCP Pub/Sub Connector provides streaming data integration between Apache Flink applications and Google Cloud Pub/Sub messaging service. It includes both source and sink implementations with exactly-once and at-least-once processing guarantees, authentication support, and comprehensive error handling.

Package Information

  • Package Name: flink-connector-gcp-pubsub_2.11
  • Package Type: maven
  • Language: Java
  • Group ID: org.apache.flink
  • Artifact ID: flink-connector-gcp-pubsub_2.11
  • Installation: Add dependency to your pom.xml:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-gcp-pubsub_2.11</artifactId>
    <version>1.14.6</version>
</dependency>

Core Imports

import org.apache.flink.streaming.connectors.gcp.pubsub.PubSubSource;
import org.apache.flink.streaming.connectors.gcp.pubsub.PubSubSink;
import org.apache.flink.streaming.connectors.gcp.pubsub.common.PubSubDeserializationSchema;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.serialization.SerializationSchema;
import com.google.auth.Credentials;

Basic Usage

Source (Consumer) Example

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.gcp.pubsub.PubSubSource;
import org.apache.flink.api.common.serialization.SimpleStringSchema;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Enable checkpointing (required for exactly-once processing)
env.enableCheckpointing(30000);

PubSubSource<String> source = PubSubSource.newBuilder()
    .withDeserializationSchema(new SimpleStringSchema())
    .withProjectName("my-gcp-project")
    .withSubscriptionName("my-subscription")
    .build();

env.addSource(source)
    .print();

env.execute("Pub/Sub Consumer");

Sink (Producer) Example

import org.apache.flink.streaming.connectors.gcp.pubsub.PubSubSink;
import org.apache.flink.api.common.serialization.SimpleStringSchema;

PubSubSink<String> sink = PubSubSink.newBuilder()
    .withSerializationSchema(new SimpleStringSchema())
    .withProjectName("my-gcp-project")
    .withTopicName("my-topic")
    .build();

env.fromElements("Hello", "World", "Pub/Sub")
    .addSink(sink);

env.execute("Pub/Sub Producer");

Architecture

The connector is built around several key components:

  • PubSubSource: Source function implementing exactly-once processing through Flink's checkpointing mechanism
  • PubSubSink: Sink function providing at-least-once delivery guarantees with automatic retries
  • Deserialization Schema: Pluggable schema system supporting both standard Flink serialization and Pub/Sub-specific deserialization with metadata access
  • Subscriber Factory: Configurable factory pattern for customizing connection parameters, timeouts, and retry policies
  • Authentication: Google Cloud credentials integration with automatic credential discovery and manual credential configuration
  • Emulator Support: Built-in support for local Pub/Sub emulator for testing scenarios

Capabilities

Message Consumption (Source)

High-performance message consumption from Pub/Sub subscriptions with exactly-once processing guarantees through Flink's checkpointing mechanism. Supports rate limiting, custom deserialization, and flexible subscriber configuration.

public static DeserializationSchemaBuilder newBuilder();

public static class PubSubSourceBuilder<OUT> {
    public PubSubSourceBuilder<OUT> withCredentials(Credentials credentials);
    public PubSubSourceBuilder<OUT> withPubSubSubscriberFactory(PubSubSubscriberFactory factory);
    public PubSubSourceBuilder<OUT> withPubSubSubscriberFactory(int maxMessagesPerPull, Duration timeout, int retries);
    public PubSubSourceBuilder<OUT> withMessageRateLimit(int messagePerSecondRateLimit);
    public PubSubSource<OUT> build() throws IOException;
}

Message Consumption

Message Publishing (Sink)

Reliable message publishing to Pub/Sub topics with at-least-once delivery semantics, automatic batching, and configurable retry policies. Includes emulator support for testing scenarios.

public static SerializationSchemaBuilder newBuilder();

public static class PubSubSinkBuilder<IN> {
    public PubSubSinkBuilder<IN> withCredentials(Credentials credentials);
    public PubSubSinkBuilder<IN> withHostAndPortForEmulator(String hostAndPort);
    public PubSubSink<IN> build() throws IOException;
}

Message Publishing

Custom Deserialization

Advanced deserialization system providing access to Pub/Sub message metadata including attributes, message ID, and publish time. Essential for applications requiring message metadata or custom deserialization logic.

public interface PubSubDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> {
    void open(InitializationContext context) throws Exception;
    boolean isEndOfStream(T nextElement);
    T deserialize(PubsubMessage message) throws Exception;
    void deserialize(PubsubMessage message, Collector<T> out) throws Exception;
    TypeInformation<T> getProducedType();
}

Custom Deserialization

Types

// Core source class
public class PubSubSource<OUT> extends RichSourceFunction<OUT> 
    implements ResultTypeQueryable<OUT>, ParallelSourceFunction<OUT>, 
               CheckpointListener, ListCheckpointed<AcknowledgeIdsForCheckpoint<String>>

// Core sink class  
public class PubSubSink<IN> extends RichSinkFunction<IN> implements CheckpointedFunction

// Builder interfaces for source
public interface ProjectNameBuilder<OUT> {
    SubscriptionNameBuilder<OUT> withProjectName(String projectName);
}

public interface SubscriptionNameBuilder<OUT> {
    PubSubSourceBuilder<OUT> withSubscriptionName(String subscriptionName);
}

// Builder interfaces for sink
public interface ProjectNameBuilder<IN> {
    TopicNameBuilder<IN> withProjectName(String projectName);
}

public interface TopicNameBuilder<IN> {
    PubSubSinkBuilder<IN> withTopicName(String topicName);
}

// Subscriber factory interface
public interface PubSubSubscriberFactory extends Serializable {
    PubSubSubscriber getSubscriber(Credentials credentials) throws IOException;
}

// Subscriber interface
public interface PubSubSubscriber extends Acknowledger<String> {
    List<ReceivedMessage> pull();
    void close() throws Exception;
}

// Acknowledger interface
public interface Acknowledger<AcknowledgeId> {
    void acknowledge(List<AcknowledgeId> ids);
}

// Emulator subscriber factory
public class PubSubSubscriberFactoryForEmulator implements PubSubSubscriberFactory {
    public PubSubSubscriberFactoryForEmulator(String hostAndPort, String project, String subscription, 
                                            int retries, Duration timeout, int maxMessagesPerPull);
}