tessl/pypi-source-microsoft-onedrive

Airbyte source connector for extracting data from Microsoft OneDrive cloud storage with OAuth authentication and file-based streaming capabilities.

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/source-microsoft-onedrive@0.2.x

To install, run `npx @tessl/cli install tessl/pypi-source-microsoft-onedrive@0.2.0`

# Microsoft OneDrive Source Connector

An Airbyte source connector that extracts and synchronizes data from Microsoft OneDrive cloud storage. It is built on the Airbyte CDK file-based framework and provides OAuth 2.0 authentication, automated file discovery, and configuration management for data integration workflows.

## Package Information

- **Package Name**: source-microsoft-onedrive
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install source-microsoft-onedrive`
- **Version**: 0.2.44

## Core Imports

```python
from source_microsoft_onedrive import SourceMicrosoftOneDrive
```

For CLI usage:

```python
from source_microsoft_onedrive.run import run
```

Internal imports (for advanced usage):

```python
from source_microsoft_onedrive.spec import SourceMicrosoftOneDriveSpec
from source_microsoft_onedrive.stream_reader import SourceMicrosoftOneDriveStreamReader, SourceMicrosoftOneDriveClient
```

## Basic Usage

### As Airbyte Source Connector

```python
from source_microsoft_onedrive import SourceMicrosoftOneDrive
from airbyte_cdk import launch

# Configuration with OAuth credentials
config = {
    "credentials": {
        "auth_type": "Client",
        "tenant_id": "your-tenant-id",
        "client_id": "your-client-id",
        "client_secret": "your-client-secret",
        "refresh_token": "your-refresh-token"
    },
    "drive_name": "OneDrive",
    "search_scope": "ALL",
    "folder_path": ".",
    "streams": [{
        "name": "files",
        "globs": ["*.csv", "*.json"],
        "validation_policy": "Emit Record",
        "format": {"filetype": "csv"}
    }]
}

# Initialize and run connector.
# launch() parses CLI-style arguments, so config.json and catalog.json
# must exist on disk; config.json should hold the same settings as `config`.
source = SourceMicrosoftOneDrive(None, config, None)
launch(source, ["read", "--config", "config.json", "--catalog", "catalog.json"])
```
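
Because `launch` receives file paths rather than the dictionary itself, the in-memory configuration has to be written to disk before the call. The following is a minimal sketch using only the standard library; the helper name `write_config` and the temporary directory are illustrative assumptions and are not part of the connector.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Mapping


def write_config(config: Mapping[str, Any], directory: Path) -> Path:
    """Serialize a connector config to <directory>/config.json and return the path.

    Hypothetical helper for illustration; the connector does not ship it.
    """
    path = directory / "config.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Trimmed-down config with placeholder values only.
example_config = {
    "credentials": {"auth_type": "Client", "tenant_id": "your-tenant-id"},
    "folder_path": ".",
    "streams": [{"name": "files", "globs": ["*.csv"], "format": {"filetype": "csv"}}],
}

with tempfile.TemporaryDirectory() as tmp:
    config_path = write_config(example_config, Path(tmp))
    # Read the file back to confirm the JSON round-trips unchanged.
    round_tripped = json.loads(config_path.read_text(encoding="utf-8"))
```

In real use the file would be written next to `catalog.json` rather than into a temporary directory, and it should be kept out of version control because it contains secrets.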

### CLI Usage

```bash
# Install via poetry
poetry install

# Run connector commands
source-microsoft-onedrive spec
source-microsoft-onedrive check --config config.json
source-microsoft-onedrive discover --config config.json
source-microsoft-onedrive read --config config.json --catalog catalog.json
```
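
The `read` command needs a configured catalog in addition to the config file. As a rough sketch, a minimal catalog for a stream named `files` could be generated as shown here. The layout follows the Airbyte protocol's configured-catalog structure; the helper `build_catalog`, the empty `json_schema`, and the chosen sync modes are assumptions made for illustration.

```python
import json
from typing import Any, Dict, List


def build_catalog(stream_names: List[str]) -> Dict[str, Any]:
    """Build a minimal configured catalog (hypothetical helper)."""
    return {
        "streams": [
            {
                "stream": {
                    "name": name,
                    # Placeholder schema; a real one would describe the records.
                    "json_schema": {},
                    "supported_sync_modes": ["full_refresh", "incremental"],
                },
                # Assumed sync settings for the example.
                "sync_mode": "full_refresh",
                "destination_sync_mode": "overwrite",
            }
            for name in stream_names
        ]
    }


catalog = build_catalog(["files"])
catalog_json = json.dumps(catalog, indent=2)
```

Saving `catalog_json` as `catalog.json` produces the file referenced by `--catalog`. In practice the stream schema is usually taken from the output of `discover` instead of being left empty.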

## Architecture

The connector is built on Airbyte's file-based framework with these key components:

- **SourceMicrosoftOneDrive**: Main connector class extending FileBasedSource
- **SourceMicrosoftOneDriveStreamReader**: Handles file discovery and reading from OneDrive
- **SourceMicrosoftOneDriveClient**: Microsoft Graph API client with MSAL authentication
- **Configuration Models**: Pydantic models for OAuth and service authentication

The connector supports both OAuth (user delegation) and service principal authentication, can search across accessible drives and shared items, handles nested folder structures, and integrates with smart-open for efficient file reading across various formats.

## Capabilities

### Source Connector

Core Airbyte source connector functionality including specification generation, configuration validation, stream discovery, and data reading with OAuth authentication support.

```python { .api }
class SourceMicrosoftOneDrive(FileBasedSource):
    def __init__(self, catalog: Optional[ConfiguredAirbyteCatalog], config: Optional[Mapping[str, Any]], state: Optional[TState]): ...
    def spec(self, *args: Any, **kwargs: Any) -> ConnectorSpecification: ...
```

[Source Connector](./source-connector.md)

### Configuration Management

Configuration models supporting OAuth and service authentication with validation, schema generation, and documentation URL management.

```python { .api }
class SourceMicrosoftOneDriveSpec(AbstractFileBasedSpec, BaseModel):
    credentials: Union[OAuthCredentials, ServiceCredentials]
    drive_name: Optional[str]
    search_scope: str
    folder_path: str

    @classmethod
    def documentation_url(cls) -> str: ...
    @classmethod
    def schema(cls, *args: Any, **kwargs: Any) -> Dict[str, Any]: ...
```

[Configuration](./configuration.md)
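
Before handing a configuration to the connector, it can help to run a quick structural check. The sketch below uses only the standard library and is not the Pydantic validation the spec class performs. It assumes that the fields typed as non-optional above (plus `streams` from the usage example) are required, and that `"Service"` is the label for the service-principal variant; only `"Client"` appears in this document. The function name `find_config_problems` is hypothetical.

```python
from typing import Any, List, Mapping

# "Client" appears in the usage example; "Service" is an assumed label for
# the service-principal variant.
ASSUMED_AUTH_TYPES = {"Client", "Service"}
# Assumed to be required; the real spec may supply defaults for some of these.
REQUIRED_TOP_LEVEL = ("credentials", "search_scope", "folder_path", "streams")


def find_config_problems(config: Mapping[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing field: {key}" for key in REQUIRED_TOP_LEVEL if key not in config]
    credentials = config.get("credentials")
    if isinstance(credentials, Mapping):
        auth_type = credentials.get("auth_type")
        if auth_type not in ASSUMED_AUTH_TYPES:
            problems.append(f"unexpected auth_type: {auth_type!r}")
    elif credentials is not None:
        problems.append("credentials must be a mapping")
    return problems


ok = find_config_problems({
    "credentials": {"auth_type": "Client"},
    "search_scope": "ALL",
    "folder_path": ".",
    "streams": [],
})
bad = find_config_problems({"credentials": {"auth_type": "Password"}})
```

Here `ok` is an empty list, while `bad` reports the three missing top-level fields and the unrecognized `auth_type`.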

### File Operations

File discovery, enumeration, and reading capabilities across OneDrive drives and shared items with glob pattern matching and metadata extraction.

```python { .api }
class SourceMicrosoftOneDriveStreamReader(AbstractFileBasedStreamReader):
    def get_matching_files(self, globs: List[str], prefix: Optional[str], logger: logging.Logger) -> Iterable[RemoteFile]: ...
    def open_file(self, file: RemoteFile, mode: FileReadMode, encoding: Optional[str], logger: logging.Logger) -> IOBase: ...
    def get_all_files(self): ...
```

[File Operations](./file-operations.md)
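
To illustrate the idea behind glob-based file selection, the sketch below filters a list of paths with the standard library's `fnmatchcase`. It is an approximation, not the matcher the stream reader uses: with `fnmatchcase`, `*` also matches across `/`, whereas stricter glob implementations may treat directory separators differently. The function name `filter_by_globs` and the sample paths are made up for the example.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Iterator, List


def filter_by_globs(paths: Iterable[str], globs: List[str]) -> Iterator[str]:
    """Yield each path that matches at least one glob pattern."""
    for path in paths:
        if any(fnmatchcase(path, pattern) for pattern in globs):
            yield path


# Hypothetical remote file paths, including one in a nested folder.
remote_paths = ["sales.csv", "reports/2024/q1.csv", "notes.txt", "events.json"]
matched = list(filter_by_globs(remote_paths, ["*.csv", "*.json"]))
# matched == ["sales.csv", "reports/2024/q1.csv", "events.json"]
```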

### Authentication

Microsoft Graph API authentication using MSAL with support for OAuth refresh tokens and service principal credentials.

```python { .api }
class SourceMicrosoftOneDriveClient:
    def __init__(self, config: SourceMicrosoftOneDriveSpec): ...
    @property
    def client(self): ...
    def _get_access_token(self): ...
```

[Authentication](./authentication.md)
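
The body of `_get_access_token` is not shown in this document. One plausible way to choose between the two credential types is sketched below, assuming an object that behaves like `msal.ConfidentialClientApplication`: the refresh-token flow is used when a refresh token is present, and the client-credentials flow otherwise. The function `acquire_access_token`, the Graph `.default` scope, and the `FakeApp` stand-in (used so the example runs offline) are all assumptions.

```python
from typing import Any, Mapping

# Assumed scope for Microsoft Graph; adjust to the permissions actually granted.
GRAPH_SCOPE = ["https://graph.microsoft.com/.default"]


def acquire_access_token(app: Any, credentials: Mapping[str, Any]) -> str:
    """Pick an MSAL flow from the credentials and return the access token.

    `app` is expected to behave like msal.ConfidentialClientApplication.
    """
    refresh_token = credentials.get("refresh_token")
    if refresh_token:
        # Delegated (user) access via a stored refresh token.
        result = app.acquire_token_by_refresh_token(refresh_token, scopes=GRAPH_SCOPE)
    else:
        # App-only access via the client-credentials flow.
        result = app.acquire_token_for_client(scopes=GRAPH_SCOPE)
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token request failed"))
    return result["access_token"]


class FakeApp:
    """Stand-in for the MSAL application so the sketch runs offline."""

    def acquire_token_by_refresh_token(self, refresh_token, scopes):
        return {"access_token": f"delegated:{refresh_token}"}

    def acquire_token_for_client(self, scopes):
        return {"access_token": "app-only"}


delegated = acquire_access_token(FakeApp(), {"refresh_token": "r1"})
app_only = acquire_access_token(FakeApp(), {"auth_type": "Service"})
```

With a real MSAL application in place of `FakeApp`, a failed request returns a dictionary without `access_token`, which this sketch turns into a `RuntimeError`.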

## CLI Entry Points

```python { .api }
def run():
    """Main CLI entry point that processes command-line arguments and launches the connector."""
```

## Types

```python { .api }
from typing import Any, Dict, List, Mapping, Optional, Union, Iterable
from datetime import datetime
from io import IOBase

# Airbyte CDK imports
from airbyte_cdk import ConfiguredAirbyteCatalog, ConnectorSpecification, TState
from airbyte_cdk.sources.file_based.file_based_source import FileBasedSource
from airbyte_cdk.sources.file_based.stream.cursor.default_file_based_cursor import DefaultFileBasedCursor
from airbyte_cdk.sources.file_based.file_based_stream_reader import AbstractFileBasedStreamReader, FileReadMode
from airbyte_cdk.sources.file_based.remote_file import RemoteFile
from airbyte_cdk.sources.file_based.config.abstract_file_based_spec import AbstractFileBasedSpec

# Pydantic for configuration models
from pydantic import BaseModel, Field

# Microsoft authentication
from msal import ConfidentialClientApplication
from office365.graph_client import GraphClient

# Additional imports for error handling and web requests
from airbyte_cdk import AirbyteTracedException, FailureType
import requests
import smart_open
import logging
```