
# Data Type Utilities

Utility functions for parsing and converting Xero's custom data formats and JSON structures. These utilities ensure proper data type conversion, date parsing, and RFC3339 compliance for all Xero API responses.

## Core Imports

```python
import decimal
import re
from abc import ABC
from datetime import date, datetime, time, timedelta, timezone
from typing import Any, Iterable, Mapping, MutableMapping, Optional

import pendulum
import requests
```

## Capabilities

### Date Parsing Functions

Xero uses multiple date formats that require specialized parsing to ensure compatibility with downstream systems.

#### Date Parser

```python { .api }
def parse_date(value: str) -> Optional[datetime]:
    """
    Parse Xero date strings in various formats to datetime objects.

    Supports multiple date formats used by Xero API including:
    - .NET JSON format: "/Date(1419937200000+0000)/"
    - ISO 8601 format: "2023-08-15T14:30:25Z"
    - Partial ISO format: "2023-08-15T14:30:25"

    Parameters:
    - value: String containing date in any supported format

    Returns:
    datetime object in UTC timezone, or None if parsing fails

    Examples:
    - parse_date("/Date(1419937200000+0000)/") -> datetime(2014, 12, 30, 11, 0)
    - parse_date("2023-08-15T14:30:25Z") -> datetime(2023, 8, 15, 14, 30, 25)
    - parse_date("invalid-date") -> None
    """
```
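
As a rough illustration of the behaviour the docstring describes, the sketch below parses both formats using only the standard library. The name `parse_xero_date` and everything inside it are hypothetical and not the connector's implementation. In particular, subtracting a non-zero offset from the timestamp is an assumption based on how this document describes offset adjustment.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical pattern: epoch milliseconds plus an optional +HHMM/-HHMM offset.
_NET_DATE = re.compile(r"/Date\((\d+)([+-]\d{4})?\)/")


def parse_xero_date(value: object) -> Optional[datetime]:
    """Return a timezone-aware UTC datetime, or None if the input is unparseable."""
    if not isinstance(value, str):
        return None

    match = _NET_DATE.fullmatch(value.strip())
    if match:
        millis, offset = match.groups()
        moment = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
        if offset:
            sign = -1 if offset[0] == "-" else 1
            delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
            # Assumption: the offset is subtracted to reach UTC.
            moment -= sign * delta
        return moment

    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onwards.
    iso = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(iso)
    except ValueError:
        return None
    # Assumption: values without a timezone are treated as UTC.
    if parsed.tzinfo is None:
        return parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

With this sketch, `parse_xero_date("/Date(1419937200000+0000)/")` yields 2014-12-30 11:00 UTC, and `parse_xero_date("invalid-date")` yields `None`.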

### JSON Processing Functions

Custom JSON processing to handle Xero's data structures and ensure RFC3339 compliance.

#### JSON Object Hook

```python { .api }
def _json_load_object_hook(_dict: dict) -> dict:
    """
    JSON parse hook to convert Xero date formats to RFC3339 strings.

    Automatically processes dictionary objects during JSON parsing
    to identify and convert date fields from Xero's formats to
    standardized RFC3339 format for downstream compatibility.

    Parameters:
    - _dict: Dictionary object from JSON parsing containing potential date fields

    Returns:
    Modified dictionary with converted date strings in RFC3339 format

    Date Field Patterns:
    - Fields ending in "Date", "DateUTC", or containing "Date" substring
    - Common fields: UpdatedDateUTC, CreatedDateUTC, DueDateString, etc.

    Conversion Examples:
    - "/Date(1419937200000+0000)/" -> "2014-12-30T11:00:00+00:00"
    - "2023-08-15T14:30:25" -> "2023-08-15T14:30:25+00:00"
    """
```
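
To make the hook's effect concrete, here is a minimal self-contained sketch of a hook with the same shape. `date_object_hook` and `_to_rfc3339` are hypothetical names, and the logic is an assumption rather than the connector's code. For brevity it only converts .NET dates that carry no offset or a `+0000` offset, and leaves anything else unchanged.

```python
import json
import re
from datetime import datetime, timezone
from typing import Any, Dict

# Hypothetical pattern: epoch milliseconds with no offset or a +0000 offset.
_NET_DATE_UTC = re.compile(r"/Date\((\d+)(?:\+0000)?\)/")


def _to_rfc3339(value: str) -> str:
    """Convert a recognised date string to RFC3339; return other input unchanged."""
    match = _NET_DATE_UTC.fullmatch(value)
    if match:
        moment = datetime.fromtimestamp(int(match.group(1)) / 1000, tz=timezone.utc)
        return moment.isoformat()
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onwards.
    iso = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(iso)
    except ValueError:
        return value
    if parsed.tzinfo is None:
        # Assumption: values without a timezone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.isoformat()


def date_object_hook(obj: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite string values whose key contains "Date"."""
    for key, value in obj.items():
        if "Date" in key and isinstance(value, str):
            obj[key] = _to_rfc3339(value)
    return obj


record = json.loads(
    '{"Name": "Sample", "UpdatedDateUTC": "/Date(1419937200000+0000)/",'
    ' "Date": "2023-08-15T14:30:25"}',
    object_hook=date_object_hook,
)
print(record["UpdatedDateUTC"])  # 2014-12-30T11:00:00+00:00
print(record["Date"])            # 2023-08-15T14:30:25+00:00
```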

## Date Format Support

### .NET JSON Date Format

Xero's legacy .NET JSON date format requires special parsing:

```python
# .NET JSON date format pattern
NET_JSON_PATTERN = r"/Date\((\d+)([\+\-]\d{4})?\)/"

# Examples of .NET JSON dates from Xero:
NET_JSON_EXAMPLES = [
    "/Date(1419937200000+0000)/",  # UTC timestamp with +0000 offset
    "/Date(1419937200000)/",       # UTC timestamp without offset
    "/Date(1419937200000-0500)/",  # Timestamp with negative timezone offset
]

# Parsed results (all converted to UTC):
PARSED_RESULTS = [
    "2014-12-30T11:00:00+00:00",  # December 30, 2014 11:00 AM UTC
    "2014-12-30T11:00:00+00:00",  # Same timestamp, assumed UTC
    "2014-12-30T16:00:00+00:00",  # Adjusted for -0500 timezone offset
]
```

### ISO 8601 Date Format

Standard ISO date formats are also supported:

```python
# ISO 8601 format examples
ISO_8601_EXAMPLES = [
    "2023-08-15T14:30:25Z",       # Full UTC format with Z suffix
    "2023-08-15T14:30:25+00:00",  # Full UTC format with +00:00 offset
    "2023-08-15T14:30:25",        # Local time without timezone (assumed UTC)
    "2023-08-15T14:30:25.123Z",   # With milliseconds
]
```

## Usage Examples

### Manual Date Parsing

```python
from source_xero.streams import parse_date
from datetime import datetime

# Parse various date formats
net_date = parse_date("/Date(1419937200000+0000)/")
iso_date = parse_date("2023-08-15T14:30:25Z")
partial_date = parse_date("2023-08-15T14:30:25")

print(f".NET date: {net_date}")         # 2014-12-30 11:00:00
print(f"ISO date: {iso_date}")          # 2023-08-15 14:30:25
print(f"Partial date: {partial_date}")  # 2023-08-15 14:30:25

# Handle invalid dates
invalid_date = parse_date("not-a-date")
print(f"Invalid date: {invalid_date}")  # None
```

### JSON Processing with Date Conversion

```python
import json
from source_xero.streams import _json_load_object_hook

# Raw JSON response from Xero API
xero_json = '''
{
    "ContactID": "12345678-1234-1234-1234-123456789012",
    "Name": "Sample Customer",
    "UpdatedDateUTC": "/Date(1419937200000+0000)/",
    "CreatedDateUTC": "2023-08-15T14:30:25Z",
    "EmailAddress": "customer@example.com"
}
'''

# Parse with automatic date conversion
parsed_data = json.loads(xero_json, object_hook=_json_load_object_hook)

print(f"Contact: {parsed_data['Name']}")
print(f"Updated: {parsed_data['UpdatedDateUTC']}")  # Converted to RFC3339
print(f"Created: {parsed_data['CreatedDateUTC']}")  # Already RFC3339
```

### Stream Response Processing

```python
# This processing happens automatically in all Xero streams
def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
    """Example of how streams use the date utilities internally."""

    response_data = response.json(object_hook=_json_load_object_hook)
    data_field = self.data_field()

    # Extract records from response
    if data_field in response_data:
        records = response_data[data_field]
        # All date fields are now in RFC3339 format
        return records
    else:
        return []
```

## Date Field Identification

### Common Date Fields

The JSON object hook automatically processes these common date fields:

```python
COMMON_DATE_FIELDS = [
    "UpdatedDateUTC",       # Most common cursor field
    "CreatedDateUTC",       # Alternative cursor field
    "DueDateString",        # Invoice due dates
    "DateString",           # Transaction dates
    "FullyPaidOnDate",      # Payment completion dates
    "ExpectedArrivalDate",  # Purchase order dates
    "DeliveryDate",         # Delivery scheduling
    "PaymentDueDate",       # Payment deadlines
    "InvoiceDate",          # Invoice issue dates
    "LastLoginDate",        # User activity tracking
]
```

### Field Detection Logic

```python
# Date field detection patterns
def is_date_field(field_name: str) -> bool:
    """
    Determine if a field name likely contains date data.

    Detection criteria:
    - Field name ends with "Date" or "DateUTC"
    - Field name contains "Date" substring
    - Known date field patterns from Xero API
    """

    date_patterns = [
        field_name.endswith('Date'),
        field_name.endswith('DateUTC'),
        'Date' in field_name,
        field_name.endswith('DateString')
    ]

    return any(date_patterns)
```
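
Because the substring test `'Date' in field_name` is true whenever any of the three `endswith` checks is true, the function's result is decided by that one condition. The short sketch below redefines the function in that reduced form so it runs on its own. `DateFormat` and `updated_date` are made-up keys, included only to show that the check is case-sensitive and also flags keys that merely contain `Date`.

```python
def is_date_field(field_name: str) -> bool:
    """Reduced form of the detection logic: a case-sensitive substring check."""
    return "Date" in field_name


samples = ["UpdatedDateUTC", "DueDateString", "Name", "DateFormat", "updated_date"]
flags = {name: is_date_field(name) for name in samples}
print(flags)
# {'UpdatedDateUTC': True, 'DueDateString': True, 'Name': False,
#  'DateFormat': True, 'updated_date': False}
```

A consumer that needs stricter matching could check the value's shape as well as the key, since a key such as `DateFormat` would otherwise be handed to the date parser.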

## Error Handling

### Date Parsing Errors

The date parser returns `None` for input it cannot parse instead of raising an exception:

```python
from source_xero.streams import parse_date

# Error handling examples
ERROR_CASES = {
    "Invalid .NET format": "/Date(invalid)/",
    "Malformed timestamp": "/Date(abc123+0000)/",
    "Invalid ISO format": "2023-13-45T25:70:99Z",
    "Empty string": "",
    "None value": None,
    "Non-string input": 12345
}

# All error cases return None without raising exceptions
for case, value in ERROR_CASES.items():
    result = parse_date(value)
    assert result is None, f"{case} should return None"
```
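
If the parser in use does not itself guard against `None` or non-string input, a thin wrapper can provide the behaviour asserted above. The sketch below is hypothetical: `safe_parse` is not part of the connector, and `datetime.fromisoformat` merely stands in for the real parser so that the example runs on its own.

```python
from datetime import datetime
from typing import Any, Callable, Optional


def safe_parse(
    value: Any, parser: Callable[[str], Optional[datetime]]
) -> Optional[datetime]:
    """Call parser(value); return None for non-strings, blanks, and parser errors."""
    if not isinstance(value, str) or not value.strip():
        return None
    try:
        return parser(value)
    except (ValueError, TypeError, OverflowError):
        return None


# datetime.fromisoformat stands in for the real parser in this example.
for bad in ["/Date(invalid)/", "2023-13-45T25:70:99", "", None, 12345]:
    assert safe_parse(bad, datetime.fromisoformat) is None

print(safe_parse("2023-08-15T14:30:25", datetime.fromisoformat))  # 2023-08-15 14:30:25
```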

### JSON Processing Errors

The JSON object hook handles processing errors:

- **Non-string values**: Skips non-string values in date fields
- **Missing fields**: Gracefully handles missing date fields
- **Nested objects**: Recursively processes nested date fields
- **Array processing**: Handles date fields within array elements
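
The nested-object and array cases follow from how Python's `json` module works: `object_hook` is called for every JSON object that is decoded, innermost first, so a hook that handles one flat dictionary also reaches dictionaries inside lists and inside other dictionaries. The sketch below uses a hypothetical recording hook and made-up field values to show the call order.

```python
import json

seen = []


def recording_hook(obj: dict) -> dict:
    """Record the sorted keys of each decoded object, then return it unchanged."""
    seen.append(sorted(obj))
    return obj


payload = '{"Invoices": [{"DueDate": "a", "Contact": {"UpdatedDateUTC": "b"}}]}'
json.loads(payload, object_hook=recording_hook)
print(seen)
# [['UpdatedDateUTC'], ['Contact', 'DueDate'], ['Invoices']]
```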

## Performance Considerations

### Regex Compilation

Date parsing uses a regex pattern that is compiled once at module level:

```python
import re

# Pre-compiled regex for .NET JSON dates
NET_DATE_REGEX = re.compile(r"/Date\((\d+)([\+\-]\d{4})?\)/")

# Compiled once and reused for all parsing operations.
# Python's re module caches compiled patterns, so the saving over calling
# re.search(pattern, ...) each time is the per-call cache lookup, not a recompile.
```

### Caching Strategy

Date parsing could benefit from caching when the same values recur:

```python
# Potential optimization for repeated date values
from datetime import datetime
from functools import lru_cache
from typing import Optional

from source_xero.streams import parse_date


@lru_cache(maxsize=1000)
def cached_parse_date(value: str) -> Optional[datetime]:
    """Cached version of parse_date for performance optimization."""
    return parse_date(value)
```
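
Whether caching pays off depends on how often values repeat, which `cache_info()` makes easy to check. The following sketch is self-contained, so `datetime.fromisoformat` stands in for `parse_date`; the name `cached_parse` and the sample values are illustrative only.

```python
from datetime import datetime
from functools import lru_cache
from typing import Optional


@lru_cache(maxsize=1000)
def cached_parse(value: str) -> Optional[datetime]:
    """Cache results per input string; fromisoformat stands in for parse_date."""
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


values = ["2023-08-15T14:30:25"] * 4 + ["2023-08-16T09:00:00", "not-a-date"]
parsed = [cached_parse(v) for v in values]

info = cached_parse.cache_info()
print(info.hits, info.misses)  # 3 3
```

Two details are worth keeping in mind: `None` results are cached like any other value, and `lru_cache` requires hashable arguments, so passing a list or dict raises `TypeError` before the wrapped function runs.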

### Memory Usage

The utilities are designed for minimal memory overhead:

- **Stream Processing**: Processes one record at a time
- **No Global State**: Functions are stateless and thread-safe
- **Garbage Collection**: Temporary objects are quickly released
- **Efficient Patterns**: Uses efficient regex and string operations

## Integration Notes

### Airbyte CDK Compatibility

The utilities integrate seamlessly with Airbyte CDK:

- **Stream Interface**: Used automatically by all stream classes
- **Type Consistency**: Ensures consistent datetime handling
- **Error Handling**: Follows Airbyte error handling patterns
- **Logging**: Compatible with Airbyte's logging framework

### Downstream Compatibility

Converted dates work with common data processing tools:

- **Data Warehouses**: RFC3339 format is widely supported
- **Analytics Tools**: Standard datetime format for analysis
- **ETL Pipelines**: Consistent format reduces transformation overhead
- **JSON Serialization**: Compatible with standard JSON libraries