Airbyte source connector for extracting data from Microsoft OneDrive cloud storage with OAuth authentication and file-based streaming capabilities.
Core Airbyte source connector functionality providing the main interface for data extraction from Microsoft OneDrive. Implements the standard Airbyte protocol for specification, validation, discovery, and reading operations.
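As a rough illustration of the four protocol operations named above, the sketch below maps each command to the file arguments it takes and builds the matching argument list. The helper `build_argv` and the `REQUIRED_ARGS` table are hypothetical names introduced here, not part of the connector; the argument names (`--config`, `--catalog`, `--state`) follow the standard Airbyte command-line convention.

```python
from typing import Dict, List, Optional

# Hypothetical table: file arguments each standard Airbyte command requires.
# "--state" is optional and only meaningful for "read".
REQUIRED_ARGS: Dict[str, List[str]] = {
    "spec": [],
    "check": ["config"],
    "discover": ["config"],
    "read": ["config", "catalog"],
}


def build_argv(
    command: str,
    config: Optional[str] = None,
    catalog: Optional[str] = None,
    state: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one Airbyte operation (illustrative helper)."""
    if command not in REQUIRED_ARGS:
        raise ValueError(f"unknown command: {command!r}")

    provided = {"config": config, "catalog": catalog}
    missing = [name for name in REQUIRED_ARGS[command] if provided[name] is None]
    if missing:
        raise ValueError(f"{command} requires: {', '.join(missing)}")

    argv = [command]
    for name in REQUIRED_ARGS[command]:
        argv += [f"--{name}", provided[name]]
    # State is only passed through for incremental reads.
    if command == "read" and state is not None:
        argv += ["--state", state]
    return argv


if __name__ == "__main__":
    print(build_argv("read", config="secrets/config.json", catalog="catalog.json"))
```

A list built this way could be handed to the connector's entry point or appended to the executable name on the command line; the sketch only checks that the required paths are present, not that the files exist.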
The primary connector class that orchestrates OneDrive data extraction using Airbyte's file-based framework.
class SourceMicrosoftOneDrive(FileBasedSource):
    def __init__(
        self,
        catalog: Optional[ConfiguredAirbyteCatalog],
        config: Optional[Mapping[str, Any]],
        state: Optional[TState]
    ):
        """
        Initialize the Microsoft OneDrive source connector.

        Parameters:
        - catalog: Optional[ConfiguredAirbyteCatalog] - Airbyte catalog configuration
        - config: Optional[Mapping[str, Any]] - Connector configuration including authentication
        - state: Optional[TState] - Connector state for incremental syncs
        """

    def spec(self, *args: Any, **kwargs: Any) -> ConnectorSpecification:
        """
        Returns the specification describing what fields can be configured by a user.
        Includes OAuth 2.0 configuration for Microsoft Graph API authentication.

        Returns:
        ConnectorSpecification: Complete specification including authentication flows
        """

Main function for command-line execution supporting all standard Airbyte operations.
def run():
    """
    Main CLI entry point that processes command-line arguments and launches the connector.
    Supports spec, check, discover, and read operations with proper argument parsing.

    Handles:
    - Command-line argument extraction (config, catalog, state paths)
    - Source connector initialization
    - Airbyte entrypoint launch with parsed arguments
    """

from source_microsoft_onedrive import SourceMicrosoftOneDrive
from airbyte_cdk import launch

# OAuth configuration
config = {
    "credentials": {
        "auth_type": "Client",
        "tenant_id": "your-tenant-id",
        "client_id": "your-client-id",
        "client_secret": "your-client-secret",
        "refresh_token": "your-refresh-token"
    },
    "drive_name": "OneDrive",
    "search_scope": "ACCESSIBLE_DRIVES",
    "folder_path": "Documents",
    "streams": [{
        "name": "documents",
        "globs": ["*.pdf", "*.docx"],
        "validation_policy": "Emit Record",
        "format": {"filetype": "unstructured"}
    }]
}

# Initialize connector
source = SourceMicrosoftOneDrive(None, config, None)

# Launch with read operation
launch(source, ["read", "--config", "config.json", "--catalog", "catalog.json"])

# Service principal configuration
service_config = {
    "credentials": {
        "auth_type": "Service",
        "tenant_id": "your-tenant-id",
        "user_principal_name": "user@yourdomain.com",
        "client_id": "your-app-id",
        "client_secret": "your-app-secret"
    },
    "drive_name": "OneDrive",
    "search_scope": "ALL",
    "folder_path": "."
}

source = SourceMicrosoftOneDrive(None, service_config, None)

# Generate connector specification
source-microsoft-onedrive spec

# Validate configuration
source-microsoft-onedrive check --config secrets/config.json

# Discover available streams
source-microsoft-onedrive discover --config secrets/config.json

# Read data from configured streams
source-microsoft-onedrive read --config secrets/config.json --catalog catalog.json

The connector implements comprehensive error handling:
The spec() method returns a complete ConnectorSpecification including:
Install with Tessl CLI
npx tessl i tessl/pypi-source-microsoft-onedrive
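
The two credential shapes shown in the configuration examples ("Client" for OAuth, "Service" for a service principal) carry different keys, and the CLI commands expect the configuration as a JSON file passed with --config. The sketch below is one possible pre-flight check before writing that file. The key sets are inferred only from the examples on this page, and `missing_credential_keys` and `write_config` are hypothetical helpers, not functions provided by the connector or the Airbyte CDK.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Mapping, Set

# Assumption: required keys per auth_type, taken from the examples above,
# not from the connector's actual specification.
REQUIRED_CREDENTIAL_KEYS: Dict[str, Set[str]] = {
    "Client": {"tenant_id", "client_id", "client_secret", "refresh_token"},
    "Service": {"tenant_id", "user_principal_name", "client_id", "client_secret"},
}


def missing_credential_keys(config: Mapping[str, Any]) -> List[str]:
    """Return the credential keys absent for the configured auth_type."""
    credentials = config.get("credentials", {})
    auth_type = credentials.get("auth_type")
    if auth_type not in REQUIRED_CREDENTIAL_KEYS:
        raise ValueError(f"unknown auth_type: {auth_type!r}")
    return sorted(REQUIRED_CREDENTIAL_KEYS[auth_type] - set(credentials))


def write_config(config: Mapping[str, Any], path: Path) -> Path:
    """Write the config as JSON after the key check passes."""
    missing = missing_credential_keys(config)
    if missing:
        raise ValueError(f"missing credential keys: {', '.join(missing)}")
    path.write_text(json.dumps(config, indent=2))
    return path


if __name__ == "__main__":
    example = {
        "credentials": {
            "auth_type": "Service",
            "tenant_id": "your-tenant-id",
            "user_principal_name": "user@yourdomain.com",
            "client_id": "your-app-id",
            "client_secret": "your-app-secret",
        },
        "drive_name": "OneDrive",
        "search_scope": "ALL",
        "folder_path": ".",
    }
    with tempfile.TemporaryDirectory() as tmp:
        written = write_config(example, Path(tmp) / "config.json")
        print(written.name)
```

The connector's own check operation remains the authoritative validation; a check like this only catches a mismatched credential block before the file is handed to the CLI.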