# GitHub Operators

`GithubOperator` runs GitHub API operations as Airflow tasks. It calls PyGithub SDK methods dynamically by name and supports templated parameters and result processing.

## Capabilities

### GithubOperator

Generic operator for executing any GitHub API method through the PyGithub client.

```python { .api }
class GithubOperator(BaseOperator):
    """
    Interact and perform actions on GitHub API.

    This operator is designed to use GitHub's Python SDK: https://github.com/PyGithub/PyGithub
    Executes any method available on the PyGithub client with provided arguments.
    """

    # Template fields for dynamic argument substitution
    template_fields = ("github_method_args",)

    def __init__(
        self,
        *,
        github_method: str,
        github_conn_id: str = "github_default",
        github_method_args: dict | None = None,
        result_processor: Callable | None = None,
        **kwargs,
    ) -> None:
        """
        Initialize GitHub operator.

        Parameters:
        - github_method: Method name from PyGithub client to be called
        - github_conn_id: Reference to pre-defined GitHub Connection
        - github_method_args: Method parameters for the github_method (templated)
        - result_processor: Function to further process the response from GitHub API
        - **kwargs: Additional BaseOperator parameters
        """

    def execute(self, context: Context) -> Any:
        """
        Execute GitHub method with provided arguments.

        Creates GithubHook, gets client, and calls specified method.
        Optionally processes results through result_processor function.

        Parameters:
        - context: Airflow task execution context

        Returns:
        Any: Result from GitHub API method, optionally processed

        Raises:
        AirflowException: If GitHub operation fails or method doesn't exist
        """
```
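
The flow described in the `execute` docstring (look up a method by name, call it with keyword arguments, optionally post-process the result) can be sketched without Airflow or PyGithub installed. This is a minimal illustration, not the provider's source: `FakeGithubClient` and `call_client_method` are hypothetical names standing in for the PyGithub client and the operator's dispatch logic, and `RuntimeError` stands in for `AirflowException`.

```python
from __future__ import annotations

from typing import Any, Callable


class FakeGithubClient:
    """Hypothetical stand-in for the PyGithub client (not the real class)."""

    def get_repo(self, full_name_or_id: str) -> dict:
        return {"full_name": full_name_or_id, "stargazers_count": 42}


def call_client_method(
    client: Any,
    github_method: str,
    github_method_args: dict | None = None,
    result_processor: Callable | None = None,
) -> Any:
    """Look up a method by name, call it with keyword args, post-process the result."""
    try:
        method = getattr(client, github_method)  # AttributeError if the name is wrong
        result = method(**(github_method_args or {}))
    except Exception as err:
        # RuntimeError stands in for AirflowException in this sketch.
        raise RuntimeError(f"GitHub operator error: {err}") from err
    return result_processor(result) if result_processor else result


stars = call_client_method(
    FakeGithubClient(),
    "get_repo",
    {"full_name_or_id": "apache/airflow"},
    result_processor=lambda repo: repo["stargazers_count"],
)
print(stars)
```

Dispatching by name is what lets a single operator cover the whole client surface, at the cost that a misspelled method name only fails when the task runs.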

## Usage Examples

### Basic API Calls

```python
from airflow.providers.github.operators.github import GithubOperator

# Get user information
get_user = GithubOperator(
    task_id='get_github_user',
    github_method='get_user',
    dag=dag
)

# Get specific repository
get_repo = GithubOperator(
    task_id='get_repository',
    github_method='get_repo',
    github_method_args={'full_name_or_id': 'apache/airflow'},
    dag=dag
)
```

### Repository Operations

```python
# List user repositories
list_repos = GithubOperator(
    task_id='list_repositories',
    github_method='get_user',
    result_processor=lambda user: [repo.name for repo in user.get_repos()],
    dag=dag
)

# Get repository issues
get_issues = GithubOperator(
    task_id='get_repo_issues',
    github_method='get_repo',
    github_method_args={'full_name_or_id': 'apache/airflow'},
    result_processor=lambda repo: list(repo.get_issues(state='open')),
    dag=dag
)

# Get repository tags
list_tags = GithubOperator(
    task_id='list_repo_tags',
    github_method='get_repo',
    github_method_args={'full_name_or_id': 'apache/airflow'},
    result_processor=lambda repo: [tag.name for tag in repo.get_tags()],
    dag=dag
)
```

### Organization Operations

```python
# Get organization
get_org = GithubOperator(
    task_id='get_organization',
    github_method='get_organization',
    github_method_args={'login': 'apache'},
    dag=dag
)

# List organization repositories
org_repos = GithubOperator(
    task_id='list_org_repos',
    github_method='get_organization',
    github_method_args={'login': 'apache'},
    result_processor=lambda org: [repo.name for repo in org.get_repos()],
    dag=dag
)
```

### Templated Parameters

```python
# Use templated arguments with Airflow context
templated_operation = GithubOperator(
    task_id='templated_github_call',
    github_method='get_repo',
    github_method_args={
        'full_name_or_id': '{{ dag_run.conf["repo_name"] }}'  # Templated
    },
    result_processor=lambda repo: repo.stargazers_count,
    dag=dag
)
```
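
Because `github_method_args` is listed in `template_fields`, Airflow renders the string values inside the dictionary before `execute` runs. The sketch below imitates that behaviour with a deliberately tiny renderer so that it runs without Airflow or Jinja. `render_value` and `render_args` are hypothetical helpers; the real engine is Jinja, which supports far more than this single-variable lookup.

```python
import re
from typing import Any

# Matches only the simplest placeholder form: {{ name }}
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_value(value: str, context: dict) -> str:
    """Toy stand-in for Jinja: substitute {{ name }} with context[name]."""
    return _PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), value)


def render_args(args: Any, context: dict) -> Any:
    """Recursively render strings nested in dicts and lists; leave other values untouched."""
    if isinstance(args, str):
        return render_value(args, context)
    if isinstance(args, dict):
        return {key: render_args(val, context) for key, val in args.items()}
    if isinstance(args, list):
        return [render_args(item, context) for item in args]
    return args


rendered = render_args(
    {"full_name_or_id": "{{ repo_name }}", "lazy": False},
    {"repo_name": "apache/airflow"},
)
print(rendered)
```

With the real operator, the `repo_name` value in the example above would come from the configuration passed when the DAG run is triggered.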

### Custom Result Processing

```python
import logging

def process_repo_info(repo):
    """Custom processor to extract and log repository information."""
    info = {
        'name': repo.name,
        'stars': repo.stargazers_count,
        'forks': repo.forks_count,
        'language': repo.language,
        'open_issues': repo.open_issues_count
    }
    logging.info(f"Repository info: {info}")
    return info

analyze_repo = GithubOperator(
    task_id='analyze_repository',
    github_method='get_repo',
    github_method_args={'full_name_or_id': 'apache/airflow'},
    result_processor=process_repo_info,
    dag=dag
)
```
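
Since a result processor is an ordinary function, it can be exercised without calling the GitHub API. The sketch below uses `types.SimpleNamespace` as a stand-in for a PyGithub repository object. The attribute values are invented for illustration, and `summarize_repo` is a hypothetical, shortened variant of the processor above.

```python
from types import SimpleNamespace


def summarize_repo(repo) -> dict:
    """Extract a few plain-Python fields from a repository-like object."""
    return {
        "name": repo.name,
        "stars": repo.stargazers_count,
        "forks": repo.forks_count,
    }


# Fake repository exposing only the attributes the processor reads (values are invented).
fake_repo = SimpleNamespace(name="airflow", stargazers_count=10, forks_count=3)

summary = summarize_repo(fake_repo)
assert summary == {"name": "airflow", "stars": 10, "forks": 3}
```

Keeping processors as named, importable functions rather than lambdas makes this kind of check straightforward.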

### Complex Workflows

```python
def get_recent_releases(repo):
    """Get releases from the last 30 days."""
    from datetime import datetime, timedelta, timezone

    # Compare in UTC; a naive local datetime.now() would skew the cutoff.
    cutoff_date = datetime.now(timezone.utc) - timedelta(days=30)
    recent_releases = []

    for release in repo.get_releases():
        created = release.created_at
        # Some PyGithub versions return naive UTC datetimes; normalize before comparing.
        if created.tzinfo is None:
            created = created.replace(tzinfo=timezone.utc)
        if created >= cutoff_date:
            recent_releases.append({
                'tag': release.tag_name,
                'name': release.name,
                'created': created.isoformat()
            })

    return recent_releases

recent_releases = GithubOperator(
    task_id='get_recent_releases',
    github_method='get_repo',
    github_method_args={'full_name_or_id': 'apache/airflow'},
    result_processor=get_recent_releases,
    dag=dag
)
```

## Available GitHub Methods

The operator can call any method available on the PyGithub `Github` client. Common methods include:

### User/Authentication Methods

- `get_user()`: Get authenticated user
- `get_user(login)`: Get specific user by login

### Repository Methods

- `get_repo(full_name_or_id)`: Get specific repository
- `search_repositories(query)`: Search repositories

### Organization Methods

- `get_organization(login)`: Get organization
- `search_users(query)`: Search users

Many more methods are available, as provided by the PyGithub SDK.

## Error Handling

The operator wraps GitHub API exceptions:

```python
from airflow.exceptions import AirflowException

# GitHub API errors are caught and re-raised as AirflowException.
# `operator` is a GithubOperator instance; `context` is the task execution context.
try:
    result = operator.execute(context)
except AirflowException as e:
    # Handle GitHub API failures
    if "404" in str(e):
        print("Resource not found")
    elif "403" in str(e):
        print("Access forbidden - check token permissions")
```
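
Substring checks such as `"404" in str(e)` can misfire when the same digits appear elsewhere in the message, for example inside a numeric repository ID. One alternative is to match the status code as a standalone token. The helper below is a sketch: `classify_github_error` is a hypothetical name, and it assumes the HTTP status code appears in the exception message as an isolated three-digit number, which should be verified against the messages your provider version produces.

```python
import re

# Human-readable hints for a few common HTTP status codes.
_HINTS = {
    401: "Bad credentials - check the connection's token",
    403: "Access forbidden - check token permissions or rate limits",
    404: "Resource not found",
}


def classify_github_error(message: str) -> str:
    """Return a hint based on the first standalone 3-digit status code in the message."""
    match = re.search(r"(?<!\d)([1-5]\d{2})(?!\d)", message)
    if match is None:
        return "Unknown error"
    return _HINTS.get(int(match.group(1)), f"HTTP {match.group(1)}")


print(classify_github_error('error: 404 {"message": "Not Found"}'))
print(classify_github_error("error: repository 140400 is unavailable"))
```

The second call shows the point of the lookarounds: `404` embedded in a longer number is not treated as a status code.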

## Return Values

- **Without result_processor**: Returns raw PyGithub object (Repository, User, etc.)
- **With result_processor**: Returns processed result from the processor function
- **On error**: Raises `AirflowException` with GitHub error details
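
By default Airflow pushes an operator's return value to XCom, so a raw PyGithub object may not survive serialization, depending on the XCom backend in use. A common precaution is to have the `result_processor` return only plain Python data. The check below is a sketch under that assumption: `is_json_serializable` is a hypothetical helper, `FakeRepository` is a stand-in class, and JSON is used only as a proxy for what a given XCom backend accepts.

```python
import json
from typing import Any


def is_json_serializable(value: Any) -> bool:
    """Return True if json.dumps accepts the value, False otherwise."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


class FakeRepository:
    """Stand-in for a rich SDK object that JSON cannot encode."""

    name = "airflow"


print(is_json_serializable(FakeRepository()))
print(is_json_serializable({"name": FakeRepository.name}))
```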