An Airbyte source connector for extracting data from Webflow CMS collections
npx @tessl/cli install tessl/pypi-airbyte-source-webflow@0.1.00
# Airbyte Source Webflow

An Airbyte source connector that extracts data from Webflow CMS collections. Webflow is a content management system for hosting websites; the connector dynamically discovers the available collections and builds each stream's schema from Webflow's API field definitions.

## Package Information

- **Package Name**: airbyte-source-webflow
- **Language**: Python
- **Installation**: `pip install airbyte-source-webflow` or `poetry add airbyte-source-webflow`
- **CLI Command**: `source-webflow` (requires installation first)
- **Platform Context**: Typically used within Airbyte data integration platform

## Core Imports

```python
from source_webflow import SourceWebflow
```

For running the connector:

```python
from source_webflow.run import run
```

## Basic Usage

```python
import logging

from source_webflow import SourceWebflow

# Configuration for the connector
config = {
    "api_key": "your_webflow_api_token",
    "site_id": "your_webflow_site_id",
    "accept_version": "1.0.0",  # Optional, no default in spec
}

# Create source instance
source = SourceWebflow()

# Check connection
logger = logging.getLogger(__name__)
is_connected, error = source.check_connection(logger, config)

if is_connected:
    # Get available streams (one per Webflow collection)
    streams = source.streams(config)
    for stream in streams:
        print(f"Available collection: {stream.name}")
else:
    print(f"Connection failed: {error}")
```

Command-line usage (Airbyte protocol):

```bash
# Install the connector first
pip install airbyte-source-webflow

# Run the connector with Airbyte protocol
source-webflow check --config config.json
source-webflow discover --config config.json
source-webflow read --config config.json --catalog catalog.json
```

**Note**: This connector is designed for use within the Airbyte platform but can be run standalone for testing and development purposes.

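The commands above read their settings from `config.json`. As a minimal sketch of producing that file, the snippet below writes the three documented keys to disk. `write_config` is a hypothetical helper introduced here for illustration; it is not part of the connector.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def write_config(
    path: Path,
    api_key: str,
    site_id: str,
    accept_version: Optional[str] = None,
) -> Dict[str, str]:
    """Write a connector config file and return the dict that was written.

    Hypothetical helper for illustration only; not part of the connector.
    """
    config = {"api_key": api_key, "site_id": site_id}
    if accept_version is not None:
        # Optional key: leave it out to let the connector choose the version.
        config["accept_version"] = accept_version
    path.write_text(json.dumps(config, indent=2))
    return config
```

For example, `write_config(Path("config.json"), "your_webflow_api_token", "your_webflow_site_id")` creates a file that can be passed to `--config`. Because `api_key` is a secret, keep the generated file out of version control.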
## Architecture

The connector follows Airbyte's CDK (Connector Development Kit) patterns and discovers its streams dynamically:

- **SourceWebflow**: Main connector class implementing Airbyte's AbstractSource interface
- **Stream Classes**: Dynamic stream generation based on discovered Webflow collections
- **Authentication**: Token-based authentication with versioned API headers
- **Schema Mapping**: Automatic conversion from Webflow field types to Airbyte-compatible JSON schemas

The connector performs full-refresh synchronization only: each run downloads all available data from the selected collections. Incremental sync is not supported, and Webflow data volumes are typically small.

## Capabilities

### Source Configuration and Connection

Main source connector class with configuration validation, connection testing, and stream discovery functionality.

```python { .api }
class SourceWebflow(AbstractSource):
    def check_connection(self, logger: logging.Logger, config: Mapping[str, Any]) -> Tuple[bool, Any]: ...
    def streams(self, config: Mapping[str, Any]) -> List[Stream]: ...
    @staticmethod
    def get_authenticator(config): ...
```

[Source Configuration](./source-configuration.md)

### Stream Operations

Dynamic stream classes for handling Webflow collections, schemas, and data extraction with automatic pagination and type conversion.

```python { .api }
class CollectionContents(WebflowStream):
    def __init__(self, site_id: str = None, collection_id: str = None, collection_name: str = None, **kwargs): ...
    def get_json_schema(self) -> Mapping[str, Any]: ...

class CollectionsList(WebflowStream):
    def __init__(self, site_id: str = None, **kwargs): ...

class CollectionSchema(WebflowStream):
    def __init__(self, collection_id: str = None, **kwargs): ...
```

[Stream Operations](./stream-operations.md)

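The "automatic pagination" mentioned above can be pictured as offset-based paging. The sketch below is an illustration, not the connector's code: it assumes each response page carries `items`, `count`, `offset`, and `total` fields, and the helper names `next_page_token` and `read_all` are chosen here to echo the CDK method name rather than copied from the package.

```python
from typing import Any, Callable, Dict, Iterator, Mapping, Optional


def next_page_token(page: Mapping[str, Any]) -> Optional[Dict[str, int]]:
    """Return the offset of the next page, or None once the last page was read.

    Assumes the page reports `offset`, `count`, and `total` (an assumption
    about the response shape, not something this document confirms).
    """
    next_offset = page["offset"] + page["count"]
    if page["count"] == 0 or next_offset >= page["total"]:
        return None
    return {"offset": next_offset}


def read_all(fetch_page: Callable[[int], Mapping[str, Any]]) -> Iterator[Mapping[str, Any]]:
    """Full refresh: yield every item by following offsets until exhausted."""
    offset = 0
    while True:
        page = fetch_page(offset)
        yield from page["items"]
        token = next_page_token(page)
        if token is None:
            break
        offset = token["offset"]
```

Passing the page fetcher in as a callable keeps the paging logic testable without network access; in a real stream the HTTP layer would play that role.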
### Authentication and Configuration

Token-based authentication with Webflow API version headers and configuration schema validation.

```python { .api }
class WebflowTokenAuthenticator(WebflowAuthMixin, TokenAuthenticator): ...

class WebflowAuthMixin:
    def __init__(self, *, accept_version_header: str = "accept-version", accept_version: str, **kwargs): ...
    def get_auth_header(self) -> Mapping[str, Any]: ...
```

[Authentication](./authentication.md)

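To make the header handling concrete, here is a standalone sketch of the headers such an authenticator would plausibly produce. It does not import the connector. The `accept-version` header name comes from the signature above; the `Bearer` scheme, the fallback to a `"1.0.0"` version constant, and the function name `build_request_headers` are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping

# Mirrors the WEBFLOW_ACCEPT_VERSION constant documented for the connector.
WEBFLOW_ACCEPT_VERSION = "1.0.0"


def build_request_headers(config: Mapping[str, Any]) -> Dict[str, str]:
    """Combine the token header with the API version header.

    Hypothetical helper: the real connector builds these headers inside
    WebflowTokenAuthenticator.
    """
    api_key = config.get("api_key")
    if not api_key:
        raise ValueError("'api_key' is a required property")
    # Assumed behavior: fall back to the constant when the key is absent.
    accept_version = config.get("accept_version", WEBFLOW_ACCEPT_VERSION)
    return {
        "Authorization": f"Bearer {api_key}",  # assumed bearer-token scheme
        "accept-version": accept_version,
    }
```

Keeping the version in a header rather than in the URL means every stream can share one authenticator and pick up a version change from the config alone.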
## Configuration Schema

Configuration parameters accepted by the connector:

```python { .api }
# Connector configuration parameters
config = {
    "api_key": str,        # Webflow API token (required, secret)
    "site_id": str,        # Webflow site identifier (required)
    "accept_version": str  # API version (optional, no default)
}
```

### Configuration Specification

Complete JSON schema specification for the connector configuration:

```python { .api }
# Configuration schema from spec.yaml
SPEC = {
    "documentationUrl": "https://docs.airbyte.com/integrations/sources/webflow",
    "connectionSpecification": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "title": "Webflow Spec",
        "type": "object",
        "required": ["api_key", "site_id"],
        "additionalProperties": True,
        "properties": {
            "site_id": {
                "title": "Site id",
                "type": "string",
                "description": "The id of the Webflow site you are requesting data from. See https://developers.webflow.com/#sites",
                "example": "a relatively long hex sequence",
                "order": 0
            },
            "api_key": {
                "title": "API token",
                "type": "string",
                "description": "The API token for authenticating to Webflow. See https://university.webflow.com/lesson/intro-to-the-webflow-api",
                "example": "a very long hex sequence",
                "order": 1,
                "airbyte_secret": True
            },
            "accept_version": {
                "title": "Accept Version",
                "type": "string",
                "description": "The version of the Webflow API to use. See https://developers.webflow.com/#versioning",
                "example": "1.0.0",
                "order": 2
            }
        }
    }
}
```

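The specification above boils down to two rules: `api_key` and `site_id` must be present, and all three documented properties must be strings, while extra keys are allowed. A minimal standard-library sketch of that check follows; `find_config_errors` is a hypothetical helper, and within Airbyte the platform performs this validation against the spec itself.

```python
from typing import Any, List, Mapping

REQUIRED_KEYS = ("api_key", "site_id")
STRING_KEYS = ("api_key", "site_id", "accept_version")


def find_config_errors(config: Mapping[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the config passes.

    Hypothetical helper covering only the `required` and `type` rules.
    Unknown keys are accepted because the spec sets additionalProperties.
    """
    errors = [
        f"missing required property: {key}"
        for key in REQUIRED_KEYS
        if key not in config
    ]
    errors += [
        f"property must be a string: {key}"
        for key in STRING_KEYS
        if key in config and not isinstance(config[key], str)
    ]
    return errors
```

Returning a list instead of raising on the first problem lets a caller report every issue in one pass.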
### Constants

```python { .api }
# Webflow API version constant
WEBFLOW_ACCEPT_VERSION = "1.0.0"  # Default API version used by connector
```

## Entry Point

```python { .api }
def run():
    """Main entry point for the connector CLI."""
```

### Type Mapping Utilities

Utilities for converting Webflow field types to Airbyte-compatible JSON schema types.

```python { .api }
class WebflowToAirbyteMapping:
    """Utility class for mapping Webflow field types to JSON schema types."""

    webflow_to_airbyte_mapping = {
        "Bool": {"type": ["null", "boolean"]},
        "Date": {"type": ["null", "string"], "format": "date-time"},
        "Email": {"type": ["null", "string"]},
        "ImageRef": {"type": ["null", "object"], "additionalProperties": True},
        "ItemRef": {"type": ["null", "string"]},
        "ItemRefSet": {"type": ["null", "array"]},
        "Link": {"type": ["null", "string"]},
        "Number": {"type": ["null", "number"]},
        "Option": {"type": ["null", "string"]},
        "PlainText": {"type": ["null", "string"]},
        "RichText": {"type": ["null", "string"]},
        "User": {"type": ["null", "string"]},
        "Video": {"type": ["null", "string"]},
        "FileRef": {"type": ["null", "object"]}
    }
```

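As an illustration of how such a mapping can turn a collection's field definitions into a stream schema, the sketch below builds a JSON schema from a list of fields. It assumes each field definition carries a `slug` (used as the property name) and a `type`; those key names and the function `build_json_schema` are assumptions for this example, not the connector's confirmed internals.

```python
from typing import Any, Dict, List, Mapping

# Subset of the mapping table above, repeated so the sketch runs on its own.
WEBFLOW_TO_AIRBYTE = {
    "Bool": {"type": ["null", "boolean"]},
    "Date": {"type": ["null", "string"], "format": "date-time"},
    "Number": {"type": ["null", "number"]},
    "PlainText": {"type": ["null", "string"]},
}


def build_json_schema(fields: List[Mapping[str, str]]) -> Dict[str, Any]:
    """Build a JSON schema with one property per Webflow field.

    Hypothetical helper; assumes each field has "slug" and "type" keys.
    """
    properties = {}
    for field in fields:
        webflow_type = field["type"]
        if webflow_type not in WEBFLOW_TO_AIRBYTE:
            # Fail loudly rather than guess a type for an unmapped field.
            raise KeyError(f"unmapped Webflow field type: {webflow_type}")
        properties[field["slug"]] = dict(WEBFLOW_TO_AIRBYTE[webflow_type])
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "additionalProperties": True,
        "properties": properties,
    }
```

Every mapped type includes `"null"`, so a record that omits an optional field still validates against the generated schema.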
## Types

```python { .api }
# Standard-library and third-party imports used in the signatures below
from typing import Any, Iterable, List, Mapping, MutableMapping, Optional, Tuple
import logging
import requests

# Airbyte CDK base classes
class AbstractSource:
    """Base class for Airbyte source connectors."""
    def check_connection(self, logger: logging.Logger, config: Mapping[str, Any]) -> Tuple[bool, Any]: ...
    def streams(self, config: Mapping[str, Any]) -> List['Stream']: ...

class Stream:
    """Base class for data streams."""
    def read_records(self, sync_mode: str) -> Iterable[Mapping]: ...

class HttpStream(Stream):
    """Base class for HTTP-based streams."""
    url_base: str
    def request_params(self, stream_state: Mapping[str, Any], stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None) -> MutableMapping[str, Any]: ...
    def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]: ...
    def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]: ...

class TokenAuthenticator:
    """Base token authenticator from airbyte_cdk."""
    def __init__(self, token: str, **kwargs): ...
    def get_auth_header(self) -> Mapping[str, Any]: ...

# Configuration schema type
ConfigSpec = {
    "api_key": str,        # Webflow API token (required, secret)
    "site_id": str,        # Webflow site identifier (required)
    "accept_version": str  # API version (optional, no default)
}
```