tessl/airbyte-airbyte-source-hubspot

HubSpot source connector for Airbyte that syncs CRM data including contacts, companies, deals, and marketing activities with support for OAuth and Private App authentication

Workspace: tessl
Visibility: Public
Describes: pkg:airbyte/source-hubspot@6.0.x

To install, run

npx @tessl/cli install tessl/airbyte-airbyte-source-hubspot@6.0.0

Airbyte Source HubSpot

A manifest-only declarative source connector for Airbyte that syncs data from HubSpot CRM to data warehouses and other destinations. The connector supports both OAuth and Private App authentication and exposes 30+ data streams, including contacts, companies, deals, marketing activities, and engagement data.

Package Information

  • Package Name: source-hubspot
  • Package Type: airbyte
  • Language: YAML (declarative manifest)
  • Installation: Available in Airbyte Cloud and Open Source as connector source-hubspot version 6.0.0

Core Configuration

source-type: airbyte/source-hubspot
config:
  credentials:
    # Option 1: OAuth credentials
    credentials_title: "OAuth Credentials"
    client_id: "${HUBSPOT_CLIENT_ID}"
    client_secret: "${HUBSPOT_CLIENT_SECRET}"
    refresh_token: "${HUBSPOT_REFRESH_TOKEN}"
    # Option 2: Private App credentials (supply these instead of the OAuth fields)
    # credentials_title: "Private App Credentials"
    # access_token: "${HUBSPOT_ACCESS_TOKEN}"
  start_date: "2023-01-01T00:00:00Z"

Basic Usage

source:
  type: airbyte/source-hubspot
  config:
    credentials:
      credentials_title: "Private App Credentials"
      access_token: "${HUBSPOT_ACCESS_TOKEN}"
    start_date: "2023-01-01T00:00:00Z"
    
destination:
  type: airbyte/destination-postgres
  config:
    # destination configuration
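
The `${...}` placeholders above have to be resolved before the configuration is used. The following is a minimal Python sketch of one way to do that, assuming the values come from environment variables; the `expand_placeholders` helper and the example token are hypothetical and are not part of the connector.

```python
import os
from string import Template


def expand_placeholders(value, env=None):
    """Recursively replace ${NAME} placeholders in strings with values from env."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # substitute() raises KeyError when a referenced variable is missing.
        return Template(value).substitute(env)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    return value


config = {
    "credentials": {
        "credentials_title": "Private App Credentials",
        "access_token": "${HUBSPOT_ACCESS_TOKEN}",
    },
    "start_date": "2023-01-01T00:00:00Z",
}
# An explicit mapping is passed here so the example does not depend on the shell.
resolved = expand_placeholders(config, env={"HUBSPOT_ACCESS_TOKEN": "example-token"})
```

Failing with `KeyError` on a missing variable is a deliberate choice in this sketch: an empty access token would otherwise only surface later as an authentication error.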

Architecture

The HubSpot source connector is built on Airbyte's declarative manifest framework:

  • Manifest-Driven: Configuration-based connector defined in a YAML manifest rather than a full Python implementation
  • Authentication Layer: Selects between OAuth 2.0 and Private App token authentication based on the configured credentials
  • Stream Definitions: 30+ predefined streams with incremental sync and pagination support
  • Custom Components: Specialized Python components for complex data transformations and API interactions
  • Dynamic Schemas: Runtime schema discovery for custom properties and objects
  • Rate Limiting: Built-in rate limiting and retry logic for HubSpot's API constraints

Capabilities

Authentication & Configuration

The connector authenticates with either OAuth 2.0 credentials or a Private App access token. Additional options control the sync start date, experimental streams, the number of concurrent workers, and the lookback window for incremental syncs.

credentials:
  type: object
  oneOf:
    - type: object  # OAuth Credentials
      properties:
        credentials_title:
          const: "OAuth Credentials"
        client_id: string
        client_secret: string
        refresh_token: string
    - type: object  # Private App Credentials
      properties:
        credentials_title:
          const: "Private App Credentials"
        access_token: string

start_date:
  type: string
  format: date-time
  pattern: "^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$"
  default: "2006-06-01T00:00:00Z"

enable_experimental_streams:
  type: boolean
  default: false

num_worker:
  type: integer
  minimum: 1
  maximum: 40
  default: 3
  description: "Number of concurrent workers for data synchronization"

lookback_window:
  type: integer
  default: 0
  description: "Number of days to look back from cursor position for incremental sync"
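
The constraints in this specification can be checked before a sync is started. The sketch below is one possible pre-flight check in Python. It assumes that every credential field listed for a given `credentials_title` is required, which the specification above does not state explicitly, and the `validate_config` function is hypothetical.

```python
import re

START_DATE_PATTERN = re.compile(
    r"^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$"
)
# Assumption: all fields listed per credential type are required.
REQUIRED_FIELDS = {
    "OAuth Credentials": ("client_id", "client_secret", "refresh_token"),
    "Private App Credentials": ("access_token",),
}


def validate_config(config):
    """Return a list of problems; an empty list means the config passed."""
    problems = []
    credentials = config.get("credentials", {})
    title = credentials.get("credentials_title")
    if title not in REQUIRED_FIELDS:
        problems.append(f"unknown credentials_title: {title!r}")
    else:
        for field in REQUIRED_FIELDS[title]:
            if not credentials.get(field):
                problems.append(f"missing credentials.{field}")
    # Defaults mirror the specification above.
    start_date = config.get("start_date", "2006-06-01T00:00:00Z")
    if not START_DATE_PATTERN.match(start_date):
        problems.append(f"start_date has an unexpected format: {start_date!r}")
    num_worker = config.get("num_worker", 3)
    if not 1 <= num_worker <= 40:
        problems.append(f"num_worker must be between 1 and 40, got {num_worker}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue in a single pass.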

CRM Data Streams

Core CRM objects and their relationships, including contacts, companies, deals, and tickets. Each stream below supports incremental sync, using updatedAt as its cursor field.

# Core entity streams
contacts:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

companies: 
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

deals:
  primary_key: ["id"] 
  cursor_field: "updatedAt"
  sync_mode: incremental

tickets:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

goals:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

leads:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

contact_lists:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

deal_splits:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental
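
To make the role of `cursor_field` concrete, the following Python sketch shows one incremental pass over a batch of records. It illustrates cursor-based syncing in general rather than the connector's code: it assumes second-precision UTC timestamps in the same format as `start_date`, a lookback expressed in days, and the `incremental_read` helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_ts(value):
    """Parse a UTC timestamp such as 2023-01-01T00:00:00Z."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)


def incremental_read(records, state, cursor_field="updatedAt", lookback_days=0):
    """Return (selected_records, new_state) for one incremental pass."""
    # Records newer than the saved cursor, minus the lookback, are re-read.
    threshold = parse_ts(state) - timedelta(days=lookback_days)
    selected = [r for r in records if parse_ts(r[cursor_field]) > threshold]
    # The cursor never moves backwards, even when nothing new was selected.
    latest = max([parse_ts(state)] + [parse_ts(r[cursor_field]) for r in selected])
    return selected, latest.strftime(TS_FORMAT)


records = [
    {"id": "1", "updatedAt": "2023-01-01T00:00:00Z"},
    {"id": "2", "updatedAt": "2023-01-03T00:00:00Z"},
]
selected, new_state = incremental_read(records, state="2023-01-02T00:00:00Z")
```

With a non-zero lookback the same call re-reads records slightly older than the saved cursor, which trades some duplicate reads for a lower chance of missing late updates.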

Property History Streams

Property history streams track changes to property values on contacts, companies, and deals, which provides an audit trail. Each record is identified by the object ID, the property name, and the change timestamp.

contacts_property_history:
  primary_key: ["contactId", "property", "timestamp"]
  cursor_field: "timestamp"
  sync_mode: incremental

companies_property_history:
  primary_key: ["companyId", "property", "timestamp"] 
  cursor_field: "timestamp"
  sync_mode: incremental

deals_property_history:
  primary_key: ["objectId", "property", "timestamp"]
  cursor_field: "timestamp" 
  sync_mode: incremental
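
Because these streams use a composite primary key, a destination has to compare all three fields to detect duplicates. A minimal Python sketch of that deduplication follows; the `dedupe_by_primary_key` helper and the sample records are illustrative only.

```python
def dedupe_by_primary_key(records, primary_key):
    """Keep the last record seen for each composite primary key value."""
    unique = {}
    for record in records:
        key = tuple(record[field] for field in primary_key)
        unique[key] = record
    return list(unique.values())


history = [
    {"contactId": "7", "property": "email", "timestamp": 100, "value": "a@example.com"},
    {"contactId": "7", "property": "email", "timestamp": 100, "value": "b@example.com"},
    {"contactId": "7", "property": "email", "timestamp": 200, "value": "c@example.com"},
]
deduped = dedupe_by_primary_key(history, ["contactId", "property", "timestamp"])
```

Here the first two records share the same key, so only the later of the two survives alongside the record with the newer timestamp.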

Engagement Streams

Activity and interaction data, including calls, emails, meetings, notes, and tasks, with associations to CRM entities. Each engagement type has its own incremental stream.

engagements_calls:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

engagements_emails:
  primary_key: ["id"]
  cursor_field: "updatedAt" 
  sync_mode: incremental

engagements_meetings:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

engagements_notes:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

engagements_tasks:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

Marketing Streams

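Marketing automation data, including campaigns, marketing emails, forms, workflows, and email events. Cursor fields vary by stream: campaigns uses lastUpdatedTime and email_events uses created, while the other streams use updatedAt.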

marketing_emails:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

campaigns:
  primary_key: ["id"] 
  cursor_field: "lastUpdatedTime"
  sync_mode: incremental

email_events:
  primary_key: ["id"]
  cursor_field: "created"
  sync_mode: incremental

forms:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

workflows:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

Custom Objects & Dynamic Schemas

HubSpot custom objects are exposed as dynamically generated streams. The connector discovers their schemas at runtime from /crm/v3/schemas and maps their properties automatically.

dynamic_streams:
  - type: DynamicDeclarativeStream
    stream_template:
      type: StateDelegatingStream
      name: "custom_object_stream_name"  # Generated dynamically
    components_resolver:
      type: HttpComponentsResolver
      retriever:
        path: "/crm/v3/schemas"
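
The resolver above turns each schema returned by `/crm/v3/schemas` into a stream. The Python sketch below illustrates that mapping under stated assumptions: the response is assumed to contain a `results` list whose items carry `name`, `objectTypeId`, and `properties`, and the `custom_object_streams` helper and sample payload are hypothetical.

```python
def custom_object_streams(schemas_response):
    """Derive one stream description per custom object schema."""
    streams = []
    for schema in schemas_response.get("results", []):
        streams.append({
            "name": schema["name"],
            "object_type_id": schema["objectTypeId"],
            # Property names become the columns of the generated stream.
            "properties": [prop["name"] for prop in schema.get("properties", [])],
        })
    return streams


# Assumed response shape, trimmed to the fields the sketch reads.
sample_response = {
    "results": [
        {
            "name": "cars",
            "objectTypeId": "2-123456",
            "properties": [{"name": "make"}, {"name": "model"}],
        }
    ]
}
streams = custom_object_streams(sample_response)
```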

Additional Streams

Supplementary streams, including form submissions, archived owners, and pipeline configurations.

form_submissions:
  primary_key: ["id"]
  cursor_field: "submittedAt"
  sync_mode: incremental

owners_archived:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

ticket_pipelines:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental

Types

Authentication Types

OAuthCredentials:
  type: object
  properties:
    credentials_title:
      type: string
      const: "OAuth Credentials"
    client_id:
      type: string
      description: "HubSpot application client ID"
    client_secret:
      type: string
      description: "HubSpot application client secret"
    refresh_token:
      type: string
      description: "OAuth refresh token for token renewal"

PrivateAppCredentials:
  type: object
  properties:
    credentials_title:
      type: string
      const: "Private App Credentials"
    access_token:
      type: string
      description: "HubSpot Private App access token"
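
The two credential types lead to different request handling: a Private App token can be sent directly, while OAuth credentials must first be exchanged for an access token. The Python sketch below shows how a client might branch on `credentials_title`. It makes no network calls, the `build_auth` helper is hypothetical, and the token URL and form fields are assumptions based on HubSpot's standard OAuth refresh flow rather than anything stated in this document.

```python
# Assumption: HubSpot's standard OAuth token endpoint.
TOKEN_URL = "https://api.hubapi.com/oauth/v1/token"


def build_auth(credentials):
    """Describe how to authenticate for the given credentials object."""
    title = credentials.get("credentials_title")
    if title == "Private App Credentials":
        # The access token is used as-is in a bearer header.
        token = credentials["access_token"]
        return {"kind": "bearer", "headers": {"Authorization": f"Bearer {token}"}}
    if title == "OAuth Credentials":
        # The refresh token must be exchanged for a short-lived access token.
        return {
            "kind": "oauth_refresh",
            "token_url": TOKEN_URL,
            "form": {
                "grant_type": "refresh_token",
                "client_id": credentials["client_id"],
                "client_secret": credentials["client_secret"],
                "refresh_token": credentials["refresh_token"],
            },
        }
    raise ValueError(f"unsupported credentials_title: {title!r}")
```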

Common Stream Properties

StreamConfig:
  type: object
  properties:
    primary_key:
      type: array
      items:
        type: string
      description: "Fields that uniquely identify records"
    cursor_field:
      type: string
      description: "Field used for incremental sync"
    sync_mode:
      type: string
      enum: ["full_refresh", "incremental"]
      description: "Synchronization mode"

IncrementalSync:
  type: object
  properties:
    cursor_field:
      type: string
    start_datetime:
      type: string
      format: date-time
    datetime_format:
      type: string
    lookback_window:
      type: string
      pattern: "^P\\d+D$"
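
The `lookback_window` pattern above accepts day-based ISO 8601 durations such as `P3D`. A short Python sketch of parsing such a value into a `timedelta` follows; the `parse_lookback_window` helper is hypothetical.

```python
import re
from datetime import timedelta

LOOKBACK_PATTERN = re.compile(r"^P(\d+)D$")


def parse_lookback_window(value):
    """Convert a duration such as 'P3D' into a timedelta of that many days."""
    match = LOOKBACK_PATTERN.match(value)
    if match is None:
        raise ValueError(f"lookback_window must look like 'P<days>D', got {value!r}")
    return timedelta(days=int(match.group(1)))
```

For example, `parse_lookback_window("P3D")` yields a three-day `timedelta`, while a value such as `PT1H` is rejected because the pattern only allows whole days.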