tessl/pypi-airbyte-source-hubspot

Airbyte source connector for HubSpot that enables data synchronization from HubSpot's CRM and marketing platform to various destinations.


Web Analytics (Experimental)

Experimental web analytics streams provide engagement analytics data for HubSpot objects. They describe web interactions and engagement patterns for contacts, companies, deals, tickets, engagements, products, and related objects.

Capabilities

Base Web Analytics Stream

Foundation class for all web analytics functionality.

class WebAnalyticsStream(HttpSubStream, BaseStream):
    """
    Base class for experimental web analytics streams.
    
    Provides web analytics data for HubSpot objects including:
    - Page views and session data
    - Engagement metrics and timing
    - Traffic sources and referrers
    - Conversion tracking
    - Custom event tracking
    """

Contact Web Analytics

Web analytics data specifically for contact interactions.

class ContactsWebAnalytics(WebAnalyticsStream):
    """
    Web analytics stream for contact engagement data.
    
    Provides analytics for contact interactions including:
    - Website page views by contacts
    - Session duration and behavior
    - Form interactions and conversions
    - Email engagement tracking
    - Contact journey analytics
    """

Company Web Analytics

Web analytics data for company-level interactions.

class CompaniesWebAnalytics(WebAnalyticsStream):
    """
    Web analytics stream for company engagement data.
    
    Provides analytics for company interactions including:
    - Company domain traffic analysis
    - Employee engagement tracking
    - Account-based marketing metrics
    - Company journey analytics
    - Multi-contact attribution
    """

Deal Web Analytics

Web analytics data related to deal progression and engagement.

class DealsWebAnalytics(WebAnalyticsStream):
    """
    Web analytics stream for deal-related engagement data.
    
    Provides analytics for deal interactions including:
    - Deal-related page views and content engagement
    - Prospect behavior during sales cycle
    - Content consumption by deal stage
    - Sales process analytics
    - Deal influence tracking
    """

Ticket Web Analytics

Web analytics data for support ticket interactions.

class TicketsWebAnalytics(WebAnalyticsStream):
    """
    Web analytics stream for ticket-related engagement data.
    
    Provides analytics for support interactions including:
    - Knowledge base usage patterns
    - Self-service behavior analytics
    - Support portal engagement
    - Ticket resolution journey tracking
    - Customer effort analysis
    """

Engagement Web Analytics

Web analytics for specific engagement types.

class EngagementsCallsWebAnalytics(WebAnalyticsStream):
    """Web analytics for call engagement interactions."""

class EngagementsEmailsWebAnalytics(WebAnalyticsStream):
    """Web analytics for email engagement interactions."""

class EngagementsMeetingsWebAnalytics(WebAnalyticsStream):
    """Web analytics for meeting engagement interactions."""

class EngagementsNotesWebAnalytics(WebAnalyticsStream):
    """Web analytics for note engagement interactions."""

class EngagementsTasksWebAnalytics(WebAnalyticsStream):
    """Web analytics for task engagement interactions."""

Product & Sales Web Analytics

Web analytics for product and sales-related interactions.

class ProductsWebAnalytics(WebAnalyticsStream):
    """
    Web analytics for product-related interactions.
    
    Provides analytics including:
    - Product page engagement
    - E-commerce behavior tracking
    - Product discovery patterns
    - Purchase journey analytics
    """

class LineItemsWebAnalytics(WebAnalyticsStream):
    """Web analytics for line item interactions."""

class GoalsWebAnalytics(WebAnalyticsStream):
    """Web analytics for goal-related interactions."""

class FeedbackSubmissionsWebAnalytics(WebAnalyticsStream):
    """Web analytics for feedback submission interactions."""

Usage Examples

Enabling Experimental Streams

from source_hubspot import SourceHubspot

# Enable experimental streams in configuration
config = {
    "credentials": {
        "credentials_title": "OAuth Credentials",
        "client_id": "your_client_id",
        "client_secret": "your_client_secret",
        "refresh_token": "your_refresh_token"
    },
    "start_date": "2023-01-01T00:00:00Z",
    "enable_experimental_streams": True  # This enables web analytics streams
}

source = SourceHubspot(catalog=None, config=config, state=None)
streams = source.streams(config)

# Filter for web analytics streams
web_analytics_streams = [
    stream for stream in streams 
    if "WebAnalytics" in stream.__class__.__name__
]

print(f"Found {len(web_analytics_streams)} web analytics streams:")
for stream in web_analytics_streams:
    print(f"  - {stream.name}")

Contact Web Analytics Analysis

from source_hubspot.streams import ContactsWebAnalytics, API

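# Placeholder OAuth credentials, matching config["credentials"] in the previous example
credentials = {
    "credentials_title": "OAuth Credentials",
    "client_id": "your_client_id",
    "client_secret": "your_client_secret",
    "refresh_token": "your_refresh_token"
}

api = API(credentials)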

# Create contacts web analytics stream
contacts_analytics = ContactsWebAnalytics(
    api=api,
    start_date="2023-01-01T00:00:00Z",
    credentials=credentials
)

# Analyze contact engagement patterns
engagement_patterns = {}
total_sessions = 0

for record in contacts_analytics.read_records(sync_mode="full_refresh"):
    contact_id = record.get('contactId')
    session_data = record.get('sessionData', {})
    page_views = session_data.get('pageViews', 0)
    session_duration = session_data.get('sessionDurationMs', 0)
    
    if contact_id:
        if contact_id not in engagement_patterns:
            engagement_patterns[contact_id] = {
                'total_page_views': 0,
                'total_session_time': 0,
                'session_count': 0,
                'avg_session_duration': 0
            }
        
        engagement_patterns[contact_id]['total_page_views'] += page_views
        engagement_patterns[contact_id]['total_session_time'] += session_duration
        engagement_patterns[contact_id]['session_count'] += 1
        total_sessions += 1

# Calculate averages and display insights
print("Contact Web Analytics Summary:")
print(f"Total sessions analyzed: {total_sessions}")

high_engagement_contacts = []
for contact_id, data in engagement_patterns.items():
    if data['session_count'] > 0:
        data['avg_session_duration'] = data['total_session_time'] / data['session_count']
        data['avg_pages_per_session'] = data['total_page_views'] / data['session_count']
        
        # Identify high-engagement contacts
        if data['avg_pages_per_session'] > 5 or data['avg_session_duration'] > 300000:  # more than 5 pages or more than 5 minutes (300,000 ms)
            high_engagement_contacts.append((contact_id, data))

print(f"High-engagement contacts: {len(high_engagement_contacts)}")

Company Account Analytics

from source_hubspot.streams import CompaniesWebAnalytics

companies_analytics = CompaniesWebAnalytics(
    api=api,
    start_date="2023-01-01T00:00:00Z",
    credentials=credentials
)

# Track company-level engagement
company_engagement = {}

for record in companies_analytics.read_records(sync_mode="full_refresh"):
    company_id = record.get('companyId')
    domain = record.get('domain')
    engagement_score = record.get('engagementScore', 0)
    unique_visitors = record.get('uniqueVisitors', 0)
    
    if company_id:
        company_engagement[company_id] = {
            'domain': domain,
            'engagement_score': engagement_score,
            'unique_visitors': unique_visitors,
            'total_interactions': record.get('totalInteractions', 0)
        }

# Sort companies by engagement
sorted_companies = sorted(
    company_engagement.items(),
    key=lambda x: x[1]['engagement_score'],
    reverse=True
)

print("Top Engaged Companies:")
for company_id, data in sorted_companies[:10]:
    print(f"Company {company_id} ({data['domain']})")
    print(f"  Engagement Score: {data['engagement_score']}")
    print(f"  Unique Visitors: {data['unique_visitors']}")
    print(f"  Total Interactions: {data['total_interactions']}")

Deal Journey Analytics

from source_hubspot.streams import DealsWebAnalytics

deals_analytics = DealsWebAnalytics(
    api=api,
    start_date="2023-01-01T00:00:00Z",
    credentials=credentials
)

# Analyze content engagement by deal stage
deal_stage_engagement = {}

for record in deals_analytics.read_records(sync_mode="full_refresh"):
    deal_id = record.get('dealId')
    deal_stage = record.get('dealStage', 'Unknown')
    content_views = record.get('contentViews', [])
    
    if deal_stage not in deal_stage_engagement:
        deal_stage_engagement[deal_stage] = {
            'deals': set(),
            'total_content_views': 0,
            'content_types': {}
        }
    
    deal_stage_engagement[deal_stage]['deals'].add(deal_id)
    deal_stage_engagement[deal_stage]['total_content_views'] += len(content_views)
    
    # Categorize content types
    for content in content_views:
        content_type = content.get('contentType', 'Unknown')
        if content_type not in deal_stage_engagement[deal_stage]['content_types']:
            deal_stage_engagement[deal_stage]['content_types'][content_type] = 0
        deal_stage_engagement[deal_stage]['content_types'][content_type] += 1

print("Content Engagement by Deal Stage:")
for stage, data in deal_stage_engagement.items():
    unique_deals = len(data['deals'])
    avg_content_views = data['total_content_views'] / max(unique_deals, 1)
    print(f"\n{stage}:")
    print(f"  Deals: {unique_deals}")
    print(f"  Avg content views per deal: {avg_content_views:.1f}")
    print(f"  Popular content types: {dict(sorted(data['content_types'].items(), key=lambda x: x[1], reverse=True)[:3])}")

Product Analytics

from source_hubspot.streams import ProductsWebAnalytics

products_analytics = ProductsWebAnalytics(
    api=api,
    start_date="2023-01-01T00:00:00Z",
    credentials=credentials
)

# Track product page engagement
product_engagement = {}

for record in products_analytics.read_records(sync_mode="full_refresh"):
    product_id = record.get('productId')
    page_views = record.get('pageViews', 0)
    unique_visitors = record.get('uniqueVisitors', 0)
    conversion_rate = record.get('conversionRate', 0)
    
    if product_id:
        product_engagement[product_id] = {
            'page_views': page_views,
            'unique_visitors': unique_visitors,
            'conversion_rate': conversion_rate * 100,  # Convert to percentage
            'engagement_rate': (unique_visitors / max(page_views, 1)) * 100
        }

# Identify high-performing products
print("Product Performance Analytics:")
sorted_products = sorted(
    product_engagement.items(),
    key=lambda x: x[1]['conversion_rate'],
    reverse=True
)

for product_id, data in sorted_products[:5]:
    print(f"Product {product_id}:")
    print(f"  Page Views: {data['page_views']}")
    print(f"  Unique Visitors: {data['unique_visitors']}")
    print(f"  Conversion Rate: {data['conversion_rate']:.2f}%")
    print(f"  Engagement Rate: {data['engagement_rate']:.2f}%")

Configuration

Enabling Experimental Streams

Web analytics streams are only available when experimental streams are enabled:

config = {
    # ... other configuration
    "enable_experimental_streams": True
}
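
A minimal sketch of guarding downstream code on this flag. The helper name `experimental_streams_enabled` is hypothetical and not part of the connector; it only inspects the plain configuration dictionary, and treating a missing key as disabled is an assumption about the default.

```python
from typing import Any, Mapping


def experimental_streams_enabled(config: Mapping[str, Any]) -> bool:
    """Return True only when the flag is present and explicitly set to True.

    Hypothetical helper: a missing key is treated as disabled (assumed default).
    """
    return config.get("enable_experimental_streams", False) is True


# Illustrative configurations (placeholder values only)
enabled_config = {
    "start_date": "2023-01-01T00:00:00Z",
    "enable_experimental_streams": True,
}
disabled_config = {"start_date": "2023-01-01T00:00:00Z"}

print(experimental_streams_enabled(enabled_config))   # True
print(experimental_streams_enabled(disabled_config))  # False
```

A check like this can run before filtering for web analytics streams, so an empty result is easier to explain when the flag was simply left out.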

Stream Dependencies

Web analytics streams depend on their parent streams:

  • ContactsWebAnalytics requires Contacts stream data
  • CompaniesWebAnalytics requires Companies stream data
  • DealsWebAnalytics requires Deals stream data
  • The remaining web analytics streams depend on their corresponding object streams in the same way
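
The pairing above can be sketched as a naming rule: strip the `WebAnalytics` suffix from the class name to get the parent stream name. This is an illustration only. The helper `parent_stream_name` is hypothetical, and it assumes every web analytics class follows the same naming convention as the three pairs listed; the connector itself may wire parents differently.

```python
WEB_ANALYTICS_SUFFIX = "WebAnalytics"


def parent_stream_name(web_analytics_class_name: str) -> str:
    """Derive the assumed parent stream name by stripping the suffix.

    Hypothetical helper based on the naming pattern in the list above.
    """
    has_suffix = web_analytics_class_name.endswith(WEB_ANALYTICS_SUFFIX)
    if not has_suffix or web_analytics_class_name == WEB_ANALYTICS_SUFFIX:
        raise ValueError(
            f"not a web analytics stream class name: {web_analytics_class_name!r}"
        )
    return web_analytics_class_name[: -len(WEB_ANALYTICS_SUFFIX)]


for name in ("ContactsWebAnalytics", "CompaniesWebAnalytics", "DealsWebAnalytics"):
    print(f"{name} -> {parent_stream_name(name)}")
```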

Data Structure

Web analytics streams typically return records with the following structure:

from typing import Dict, List

# Shape of a web analytics record (values shown are the expected Python types)
{
    "objectId": str,          # ID of the associated object (contact, company, etc.)
    "objectType": str,        # Type of associated object
    "sessionData": {
        "sessionId": str,
        "pageViews": int,
        "sessionDurationMs": int,
        "bounceRate": float,
        "entryPage": str,
        "exitPage": str
    },
    "engagementMetrics": {
        "engagementScore": float,
        "interactionCount": int,
        "contentViews": List[Dict],
        "formSubmissions": int
    },
    "trafficSource": {
        "source": str,
        "medium": str,
        "campaign": str,
        "referrer": str
    },
    "timestamp": int,         # Unix timestamp
    "customEvents": List[Dict] # Custom event tracking data
}
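
Nested records like this are often flattened before loading into a tabular destination. The following is a minimal sketch, not part of the connector: `flatten_record` is a hypothetical helper, and the sample record uses made-up values that only follow the field names shown above.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], separator: str = "_") -> Dict[str, Any]:
    """Flatten nested dictionaries into a single level of keys.

    Lists and scalar values are copied as-is; only dict values are expanded.
    """
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for child_key, child_value in flatten_record(value, separator).items():
                flat[f"{key}{separator}{child_key}"] = child_value
        else:
            flat[key] = value
    return flat


# Made-up sample values that follow the structure described above
sample_record = {
    "objectId": "101",
    "objectType": "CONTACT",
    "sessionData": {"sessionId": "s-1", "pageViews": 4, "sessionDurationMs": 90000},
    "trafficSource": {"source": "organic", "medium": "search"},
    "timestamp": 1672531200,
    "customEvents": [],
}

flat = flatten_record(sample_record)
for column, value in flat.items():
    print(f"{column}: {value}")
```

Each nested key becomes a prefixed column such as `sessionData_pageViews`, while list fields like `customEvents` stay intact for separate handling.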

Limitations

  • Experimental Status: These streams are experimental and may change or be removed
  • Data Availability: Analytics data availability depends on HubSpot tracking implementation
  • Performance: Web analytics streams may perform differently from the standard CRM streams
  • OAuth Scopes: May require additional scopes beyond standard CRM permissions
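
Because these streams are experimental and their record shape may change, downstream code may want to read nested fields defensively rather than assume a fixed schema. A minimal sketch follows; `get_nested` is a hypothetical helper and the field names in the example calls are illustrative.

```python
from typing import Any, Mapping, Sequence


def get_nested(record: Any, path: Sequence[str], default: Any = None) -> Any:
    """Walk `path` through nested mappings, returning `default` on any miss.

    A miss is a missing key or a non-mapping value found before the path ends.
    """
    current = record
    for key in path:
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


record = {"sessionData": {"pageViews": 3}, "trafficSource": None}

print(get_nested(record, ["sessionData", "pageViews"], 0))       # 3
print(get_nested(record, ["sessionData", "bounceRate"], 0.0))    # 0.0
print(get_nested(record, ["trafficSource", "source"], "unknown"))  # unknown
```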

Install with Tessl CLI

npx tessl i tessl/pypi-airbyte-source-hubspot
