tessl/airbyte-airbyte-source-hubspot

HubSpot source connector for Airbyte that syncs CRM data including contacts, companies, deals, and marketing activities with support for OAuth and Private App authentication

Custom Objects & Dynamic Schemas

The connector supports HubSpot custom objects through runtime schema discovery, automatic property type mapping, and dynamic stream generation, which lets it sync extensible data models.

Capabilities

Dynamic Stream Generation

Custom object streams are generated dynamically based on HubSpot schema definitions discovered at runtime.

custom_object_streams:
  type: DynamicDeclarativeStream
  stream_template:
    type: StateDelegatingStream
    name: "{{ name }}"  # Generated from custom object schema
    json_schema:
      $schema: "http://json-schema.org/draft-07/schema#"
      type: ["null", "object"]
      additionalProperties: true
      properties:
        id:
          type: ["null", "string"]
        createdAt:
          type: ["null", "string"]
          format: "date-time"
        updatedAt:
          type: ["null", "string"]
          format: "date-time"
        archived:
          type: ["null", "boolean"]
        properties:
          type: ["null", "object"]
          # Dynamic properties based on custom object schema
        
  components_resolver:
    type: HttpComponentsResolver
    retriever:
      path: "/crm/v3/schemas"

Generated Stream Names:

  • The custom object schema name becomes the stream name
  • Stream names follow the pattern {custom_object_name} (e.g., vehicles, projects, invoices)

Custom Object Schema Discovery

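Custom object schemas are discovered at runtime in four steps: fetch the schemas, convert their properties, generate one stream per object, and apply the sync configuration.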

schema_discovery_process:
  1_fetch_schemas:
    path: "/crm/v3/schemas"
    description: "Retrieve all custom object schemas"
    
  2_process_properties:
    source: "properties array from schema response"
    transformation: "Convert HubSpot property definitions to JSON schema"
    
  3_generate_streams:
    process: "Create stream definition for each custom object"
    naming: "Use custom object name as stream name"
    
  4_apply_configuration:
    incremental_sync: true
    cursor_field: "updatedAt"
    primary_key: ["id"]
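
The discovery steps above can be sketched in Python. This is a minimal illustration, not the connector's actual code: the function name `build_stream_definitions`, the `results` wrapper key, and the sample payload are assumptions made for the example.

```python
from typing import Any


def build_stream_definitions(schema_response: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a /crm/v3/schemas-style payload into one stream definition per custom object."""
    streams = []
    # Assumption: the schemas are listed under a top-level "results" key.
    for schema in schema_response.get("results", []):
        streams.append(
            {
                "name": schema["name"],  # stream name = custom object name
                "schema_properties": schema.get("properties", []),
                "incremental_sync": True,
                "cursor_field": "updatedAt",
                "primary_key": ["id"],
            }
        )
    return streams


# Hypothetical payload; a real response carries many more fields per schema.
sample_response = {
    "results": [
        {"id": "2-111", "name": "vehicles",
         "properties": [{"name": "vehicle_make", "type": "string"}]},
        {"id": "2-222", "name": "projects", "properties": []},
    ]
}

streams = build_stream_definitions(sample_response)
print([s["name"] for s in streams])
```

Each definition carries the raw property list forward so that a later step can convert it to JSON schema.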

Custom Object Schema Structure

CustomObjectSchema:
  type: object
  properties:
    id:
      type: string
      description: "Custom object schema identifier"
    name:
      type: string
      description: "Custom object name (used as stream name)"
    labels:
      type: object
      properties:
        singular:
          type: string
          description: "Singular label"
        plural:
          type: string
          description: "Plural label"
    primaryDisplayProperty:
      type: string
      description: "Primary display property name"
    secondaryDisplayProperties:
      type: array
      items:
        type: string
      description: "Secondary display properties"
    searchableProperties:
      type: array
      items:
        type: string
      description: "Properties that can be searched"
    properties:
      type: array
      items:
        $ref: "#/definitions/CustomObjectProperty"
      description: "Custom object property definitions"
    associatedObjects:
      type: array
      items:
        type: string
      description: "Object types this custom object can associate with"

CustomObjectProperty:
  type: object
  properties:
    name:
      type: string
      description: "Property name"
    label:
      type: string
      description: "Property display label"
    type:
      type: string
      enum: ["string", "number", "boolean", "datetime", "date", "enumeration"]
      description: "Property data type"
    fieldType:
      type: string
      description: "UI field type"
    description:
      type: string
      description: "Property description"
    options:
      type: array
      items:
        type: object
        properties:
          label:
            type: string
          value:
            type: string
      description: "Options for enumeration properties"
    hasUniqueValue:
      type: boolean
      description: "Whether property values must be unique"
    hidden:
      type: boolean
      description: "Whether property is hidden in UI"
    displayOrder:
      type: integer
      description: "Property display order"

Schema Loader Implementation

HubspotCustomObjectsSchemaLoader:
  type: SchemaLoader
  
  schema_generation_process:
    1_extract_properties:
      source: "parameters.schema_properties"
      description: "Property definitions injected by HttpComponentsResolver"
      
    2_map_types:
      string_types: ["string", "enumeration", "phone_number", "object_coordinates", "json"]
      datetime_types: ["datetime", "date-time"]
      date_types: ["date"]
      number_types: ["number"]
      boolean_types: ["boolean", "bool"]
      
    3_generate_schema:
      base_properties:
        - id: string identifier
        - createdAt: datetime
        - updatedAt: datetime  
        - archived: boolean
      dynamic_properties:
        - properties: nested object with custom properties
        - properties_{name}: flattened custom properties
        
    4_apply_nullability:
      all_types: ["null", "{actual_type}"]
      description: "All properties nullable for flexibility"

Property Type Mapping

type_mapping:
  hubspot_to_json_schema:
    string: 
      json_type: ["null", "string"]
      examples: ["text", "textarea", "select"]
      
    enumeration:
      json_type: ["null", "string"] 
      description: "Dropdown/select options as string values"
      
    phone_number:
      json_type: ["null", "string"]
      description: "Phone number as string"
      
    object_coordinates:
      json_type: ["null", "string"]
      description: "Geographic coordinates as string"
      
    json:
      json_type: ["null", "string"]
      description: "JSON data stored as string"
      
    datetime:
      json_type: ["null", "string"]
      format: "date-time"
      
    date:
      json_type: ["null", "string"]
      format: "date"
      
    number:
      json_type: ["null", "number"]
      description: "Numeric values"
      
    boolean:
      json_type: ["null", "boolean"]
      description: "True/false values"
      
    unknown_types:
      json_type: ["null", "string"]
      fallback: "Cast unknown types to string with warning"
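
The loader steps and the type mapping above can be sketched together in Python. This is a simplified illustration under stated assumptions, not the real `HubspotCustomObjectsSchemaLoader`: the names `TYPE_MAP`, `field_schema`, and `generate_schema` are hypothetical, and property definitions are assumed to be dicts with `name` and `type` keys.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("custom_object_schema")

# HubSpot property type -> (JSON type, optional format), following the mapping above.
TYPE_MAP: dict[str, tuple[str, Optional[str]]] = {
    "string": ("string", None),
    "enumeration": ("string", None),
    "phone_number": ("string", None),
    "object_coordinates": ("string", None),
    "json": ("string", None),
    "datetime": ("string", "date-time"),
    "date-time": ("string", "date-time"),
    "date": ("string", "date"),
    "number": ("number", None),
    "boolean": ("boolean", None),
    "bool": ("boolean", None),
}


def field_schema(hubspot_type: str) -> dict[str, Any]:
    """Map one HubSpot type to a nullable JSON schema fragment."""
    if hubspot_type not in TYPE_MAP:
        logger.warning("Unrecognized property type %r, falling back to string", hubspot_type)
    json_type, fmt = TYPE_MAP.get(hubspot_type, ("string", None))
    schema: dict[str, Any] = {"type": ["null", json_type]}
    if fmt:
        schema["format"] = fmt
    return schema


def generate_schema(schema_properties: list[dict[str, Any]]) -> dict[str, Any]:
    """Combine the base fields with nested and flattened custom properties."""
    custom = {p["name"]: field_schema(p.get("type", "string")) for p in schema_properties}
    properties: dict[str, Any] = {
        "id": {"type": ["null", "string"]},
        "createdAt": {"type": ["null", "string"], "format": "date-time"},
        "updatedAt": {"type": ["null", "string"], "format": "date-time"},
        "archived": {"type": ["null", "boolean"]},
        "properties": {"type": ["null", "object"], "properties": custom},
    }
    for name, fragment in custom.items():
        properties[f"properties_{name}"] = fragment  # flattened copy
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": ["null", "object"],
        "additionalProperties": True,
        "properties": properties,
    }


schema = generate_schema([
    {"name": "price", "type": "number"},
    {"name": "built_on", "type": "date"},
])
```

Every field is emitted as `["null", <type>]`, so a missing or empty HubSpot value does not fail schema validation.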

Custom Object Record Structure

CustomObjectRecord:
  type: object
  properties:
    # Standard fields (all custom objects)
    id:
      type: string
      description: "Unique custom object record identifier"
    createdAt:
      type: string
      format: date-time
      description: "Record creation timestamp"
    updatedAt:
      type: string
      format: date-time
      description: "Last update timestamp"
    archived:
      type: boolean
      description: "Whether record is archived"
      
    # Dynamic custom properties (nested)
    properties:
      type: object
      additionalProperties: true
      description: "Custom object properties as nested object"
      
    # Dynamic custom properties (flattened)
    # Format: properties_{property_name}
    # Example: properties_vehicle_make, properties_project_status

Example Custom Object Record:

{
  "id": "12345",
  "createdAt": "2024-01-15T10:30:00Z",
  "updatedAt": "2024-01-20T14:45:00Z", 
  "archived": false,
  "properties": {
    "vehicle_make": "Toyota",
    "vehicle_model": "Camry",
    "year": "2023",
    "color": "Blue",
    "price": "25000"
  },
  "properties_vehicle_make": "Toyota",
  "properties_vehicle_model": "Camry",
  "properties_year": "2023",
  "properties_color": "Blue",
  "properties_price": "25000"
}
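
The flattening shown in the example record can be sketched as a small Python helper. The function name `flatten_properties` is hypothetical; the sketch only assumes that each record keeps its custom values in a nested `properties` object.

```python
from typing import Any


def flatten_properties(record: dict[str, Any]) -> dict[str, Any]:
    """Copy each nested custom property to a top-level properties_{name} field."""
    flattened = dict(record)  # keep the original nested object as well
    for name, value in (record.get("properties") or {}).items():
        flattened[f"properties_{name}"] = value
    return flattened


raw_record = {
    "id": "12345",
    "archived": False,
    "properties": {"vehicle_make": "Toyota", "year": "2023"},
}

flat_record = flatten_properties(raw_record)
print(flat_record["properties_vehicle_make"])
```

Keeping both the nested object and the flattened copies lets downstream queries select individual columns without parsing JSON.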

Stream Configuration Templates

custom_object_stream_template:
  primary_key: ["id"]
  cursor_field: "updatedAt"
  sync_mode: incremental
  
  retriever:
    url_base: "https://api.hubapi.com"
    path: "/crm/v3/objects/{{ name }}"
    http_method: "GET"
    
  paginator:
    type: "DefaultPaginator"
    pagination_strategy:
      type: "CursorPagination"
      page_size: 100  # page size is set once, on the pagination strategy
      cursor_value: "{{ response.paging.next.after }}"
      
  incremental:
    type: "DatetimeBasedCursor"
    cursor_field: "updatedAt"
    datetime_format: "%Y-%m-%dT%H:%M:%S.%fZ"
    start_datetime:
      datetime: "{{ config['start_date'] }}"
      datetime_format: "%Y-%m-%dT%H:%M:%SZ"
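
The cursor pagination in the template above can be sketched as a plain Python loop. This is an illustration of the technique, not the declarative framework's implementation: `read_all_records`, `fake_fetch`, and the in-memory pages are hypothetical stand-ins for real HTTP requests.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def read_all_records(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict[str, Any]]:
    """Yield records page by page, following paging.next.after until it is absent."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page.get("results", [])
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            break


# Hypothetical in-memory pages, keyed by the cursor value that requests them.
FAKE_PAGES: dict[Optional[str], Page] = {
    None: {"results": [{"id": "1"}, {"id": "2"}], "paging": {"next": {"after": "2"}}},
    "2": {"results": [{"id": "3"}]},
}


def fake_fetch(after: Optional[str]) -> Page:
    return FAKE_PAGES[after]


record_ids = [record["id"] for record in read_all_records(fake_fetch)]
print(record_ids)
```

The loop stops as soon as a page has no `paging.next.after` value, which matches the `cursor_value` expression in the template.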

Association Support

Custom objects can be associated with standard objects and with other custom objects.

custom_object_associations:
  supported_associations:
    - contacts
    - companies  
    - deals
    - tickets
    - other_custom_objects
    
  association_retrieval:
    path: "/crm/v4/associations/{{ custom_object_name }}/{{ association_type }}/batch/read"
    method: "POST"
    body:
      inputs: "{{ record_ids }}"
      
  association_flattening:
    format: "Array of associated object IDs"
    field_naming: "{{ association_type }}" 
    example:
      contacts: ["101", "102"]
      companies: ["201"]
      deals: ["301", "302"]
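
The association flattening above can be sketched in Python. The response shape used here (`results`, `from.id`, `to[].toObjectId`) is an assumption about the batch read payload, and the helper names are hypothetical.

```python
from typing import Any


def associated_ids_by_record(batch_response: dict[str, Any]) -> dict[str, list[str]]:
    """Map each source record ID to the list of IDs it is associated with."""
    ids: dict[str, list[str]] = {}
    # Assumed shape: {"results": [{"from": {"id": ...}, "to": [{"toObjectId": ...}]}]}
    for result in batch_response.get("results", []):
        from_id = str(result["from"]["id"])
        ids[from_id] = [str(target["toObjectId"]) for target in result.get("to", [])]
    return ids


def attach_associations(
    record: dict[str, Any], association_type: str, ids_by_record: dict[str, list[str]]
) -> dict[str, Any]:
    """Add the associated IDs to the record under a field named after the association type."""
    enriched = dict(record)
    enriched[association_type] = ids_by_record.get(record["id"], [])
    return enriched


# Hypothetical batch read response for contacts associated with one custom object record.
batch_response = {
    "results": [
        {"from": {"id": "12345"}, "to": [{"toObjectId": 101}, {"toObjectId": 102}]}
    ]
}

ids = associated_ids_by_record(batch_response)
enriched = attach_associations({"id": "12345"}, "contacts", ids)
print(enriched)
```

Records with no associations receive an empty list, so the field is always present in the output.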

Runtime Stream Registration

dynamic_stream_registration:
  discovery_phase:
    1_fetch_schemas: "GET /crm/v3/schemas"
    2_filter_active: "Only include active custom object schemas"
    3_generate_streams: "Create stream definition for each schema"
    
  stream_registration:
    stream_name: "{{ schema.name }}"
    stream_class: "DynamicDeclarativeStream"
    schema_loader: "HubspotCustomObjectsSchemaLoader"
    
  configuration_injection:
    schema_properties: "Injected via HttpComponentsResolver"
    stream_parameters: "Include custom object schema metadata"

Error Handling

error_handling:
  schema_discovery_failures:
    - Log warning if schema endpoint unavailable
    - Continue with standard streams only
    - Retry schema discovery on subsequent syncs
    
  property_type_errors:
    - Cast unknown property types to string
    - Log warning for unrecognized types
    - Continue processing with fallback type
    
  association_failures:
    - Log warning if associations unavailable
    - Continue sync without association data
    - Preserve primary custom object data
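
The schema discovery fallback described above can be sketched in Python. This is only an illustration of the log-and-continue pattern: `discover_stream_names`, the `STANDARD_STREAMS` subset, and the failing fetch function are hypothetical.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("custom_object_discovery")

# Illustrative subset; the connector defines many more standard streams.
STANDARD_STREAMS = ["contacts", "companies", "deals"]


def discover_stream_names(fetch_schemas: Callable[[], list[dict[str, Any]]]) -> list[str]:
    """Return standard streams plus custom object streams, tolerating discovery failures."""
    try:
        schemas = fetch_schemas()
    except Exception as exc:  # broad on purpose: any discovery failure is non-fatal here
        logger.warning("Custom object schema discovery failed: %s", exc)
        return list(STANDARD_STREAMS)
    return STANDARD_STREAMS + [schema["name"] for schema in schemas]


def failing_fetch() -> list[dict[str, Any]]:
    raise ConnectionError("schema endpoint unavailable")


print(discover_stream_names(failing_fetch))
print(discover_stream_names(lambda: [{"name": "vehicles"}]))
```

Because nothing is cached on failure, the next sync calls the fetch function again, which corresponds to retrying discovery on subsequent syncs.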

Performance Considerations

performance_optimization:
  schema_caching:
    - Cache custom object schemas between syncs
    - Refresh schema cache on stream discovery
    - Minimize API calls during runtime
    
  property_chunking:
    - Handle large custom property lists
    - Chunk properties to stay within URL limits
    - Maintain property request consistency
    
  concurrent_processing:
    - Process multiple custom object streams in parallel
    - Respect rate limits across all streams
    - Balance throughput with API constraints
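
Property chunking can be sketched as follows. The function name `chunk_properties` and the `max_chars` default are assumptions for illustration; the source does not state the actual URL limit the connector uses.

```python
def chunk_properties(names: list[str], max_chars: int = 15000) -> list[list[str]]:
    """Split property names into chunks whose comma-joined length stays within max_chars."""
    chunks: list[list[str]] = []
    current: list[str] = []
    length = 0
    for name in names:
        added = len(name) + (1 if current else 0)  # +1 for the comma separator
        if current and length + added > max_chars:
            chunks.append(current)
            current, length = [], 0
            added = len(name)  # first name in a new chunk has no separator
        current.append(name)
        length += added
    if current:
        chunks.append(current)
    return chunks


chunks = chunk_properties(["aa", "bb", "cc"], max_chars=5)
print(chunks)
```

Each chunk would be sent as a separate request for the same records, and the partial results merged by record `id` so that every record ends up with the full property set.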

Usage Examples

Custom Object Stream Configuration:

source:
  type: airbyte/source-hubspot
  config:
    credentials:
      credentials_title: "Private App Credentials"
      access_token: "${HUBSPOT_ACCESS_TOKEN}"
    start_date: "2023-01-01T00:00:00Z"

# Custom object streams are automatically discovered and available
# Stream names match custom object schema names in HubSpot

Querying Custom Object Data:

-- Query custom vehicle object
SELECT 
  id,
  properties_vehicle_make,
  properties_vehicle_model,
  properties_year,
  properties_price,
  createdAt,
  updatedAt
FROM vehicles
WHERE properties_year >= '2020'
ORDER BY updatedAt DESC;

-- Query custom project object with associations
SELECT 
  p.id,
  p.properties_project_name,
  p.properties_status,
  p.contacts,  -- Associated contact IDs
  p.companies  -- Associated company IDs  
FROM projects p
WHERE p.properties_status = 'active';

Install with Tessl CLI

npx tessl i tessl/airbyte-airbyte-source-hubspot
