
# Document Processing

AutoRest Core provides comprehensive document type identification and format detection capabilities for OpenAPI specifications and configuration files. This enables automatic processing of various document formats and types in the code generation pipeline.

## Capabilities

### Document Type Identification

Identifies the type and format of documents for proper processing.

```typescript { .api }
/**
 * Identifies the type of a document based on its content
 * @param content - The document content to analyze
 * @returns Promise resolving to the detected document type
 */
function IdentifyDocument(content: string): Promise<DocumentType>;

/**
 * Converts literate configuration content to JSON format
 * @param content - The literate content to convert
 * @returns Promise resolving to JSON string
 */
function LiterateToJson(content: string): Promise<string>;

/**
 * Determines if content represents a configuration document
 * @param content - The document content to check
 * @returns Promise resolving to true if it's a configuration document
 */
function IsConfigurationDocument(content: string): Promise<boolean>;

/**
 * Determines if content represents an OpenAPI document
 * @param content - The document content to check
 * @returns Promise resolving to true if it's an OpenAPI document
 */
function IsOpenApiDocument(content: string): Promise<boolean>;

/**
 * Checks if a file extension indicates a configuration file
 * @param extension - The file extension to check (including dot)
 * @returns Promise resolving to true if it's a configuration extension
 */
function IsConfigurationExtension(extension: string): Promise<boolean>;

/**
 * Checks if a file extension indicates an OpenAPI specification file
 * @param extension - The file extension to check (including dot)
 * @returns Promise resolving to true if it's an OpenAPI extension
 */
function IsOpenApiExtension(extension: string): Promise<boolean>;
```

**Usage Examples:**

```typescript
import {
  IdentifyDocument,
  IsOpenApiDocument,
  IsConfigurationDocument,
  IsOpenApiExtension,
  IsConfigurationExtension,
  LiterateToJson
} from "@microsoft.azure/autorest-core";

// Identify document type from content
const swaggerContent = `{
  "swagger": "2.0",
  "info": { "title": "My API", "version": "1.0" }
}`;

const docType = await IdentifyDocument(swaggerContent);
console.log("Document type:", docType); // DocumentType.OpenAPI2

// Check if content is OpenAPI
const isOpenApi = await IsOpenApiDocument(swaggerContent);
console.log("Is OpenAPI:", isOpenApi); // true

// Check configuration document
const configContent = `{
  "input-file": "swagger.json",
  "output-folder": "./generated"
}`;

const isConfig = await IsConfigurationDocument(configContent);
console.log("Is configuration:", isConfig); // true

// Check file extensions
const isSwaggerExt = await IsOpenApiExtension(".json");
console.log("Is OpenAPI extension:", isSwaggerExt); // true

const isConfigExt = await IsConfigurationExtension(".autorest.json");
console.log("Is config extension:", isConfigExt); // true

// Convert literate config to JSON
const literateConfig = `
# AutoRest Configuration

> Input file
input-file: swagger.json

> Output settings
output-folder: ./generated
namespace: MyClient
`;

const jsonConfig = await LiterateToJson(literateConfig);
console.log("JSON config:", jsonConfig);
```

### Document Types

Enumeration of supported document types that AutoRest can process.

```typescript { .api }
/**
 * Supported document types
 */
enum DocumentType {
  /** OpenAPI 2.0 (Swagger) specification */
  OpenAPI2 = "OpenAPI2",
  /** OpenAPI 3.0+ specification */
  OpenAPI3 = "OpenAPI3",
  /** Literate configuration document */
  LiterateConfiguration = "LiterateConfiguration",
  /** Unknown or unsupported document type */
  Unknown = "Unknown"
}
```

### Document Formats

Enumeration of supported document formats for parsing and processing.

```typescript { .api }
/**
 * Supported document formats
 */
enum DocumentFormat {
  /** Markdown format */
  Markdown = "Markdown",
  /** YAML format */
  Yaml = "Yaml",
  /** JSON format */
  Json = "Json",
  /** Unknown or unsupported format */
  Unknown = "Unknown"
}
```

### File Extensions and Patterns

Constants defining file extension mappings and patterns for document recognition.

```typescript { .api }
/**
 * File extension mappings for different document types
 */
const DocumentExtension: {
  [key: string]: DocumentFormat;
};

/**
 * File pattern mappings for document recognition
 */
const DocumentPatterns: {
  [key: string]: RegExp;
};
```

**Usage Examples:**

```typescript
import {
  IdentifyDocument,
  DocumentType,
  DocumentFormat,
  DocumentExtension,
  DocumentPatterns
} from "@microsoft.azure/autorest-core";

// Sample content so the snippet is self-contained
const content = `{ "swagger": "2.0", "info": { "title": "My API", "version": "1.0" } }`;
const docType = await IdentifyDocument(content);

// Check document types
switch (docType) {
  case DocumentType.OpenAPI2:
    console.log("Processing Swagger 2.0 specification");
    break;
  case DocumentType.OpenAPI3:
    console.log("Processing OpenAPI 3.0+ specification");
    break;
  case DocumentType.LiterateConfiguration:
    console.log("Processing literate configuration");
    break;
  case DocumentType.Unknown:
    console.log("Unknown document type");
    break;
}

// Check document formats
const format = DocumentExtension[".yaml"];
if (format === DocumentFormat.Yaml) {
  console.log("YAML format detected");
}

// Use patterns for content detection (guard against a missing key)
const jsonPattern = DocumentPatterns["json"];
if (jsonPattern?.test(content)) {
  console.log("Content matches JSON pattern");
}
```

## Document Processing Pipeline

The document processing system integrates with the main AutoRest pipeline to handle various input formats:

1. **File Discovery**: Use file extensions to identify potential specification files
2. **Content Analysis**: Read file content and identify document type
3. **Format Detection**: Determine if the document is JSON, YAML, or Markdown
4. **Type Classification**: Classify as OpenAPI specification or configuration
5. **Content Transformation**: Convert literate configurations to JSON if needed
6. **Validation**: Validate document structure and content
7. **Pipeline Integration**: Pass processed documents to the generation pipeline
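
The ordering of these steps can be sketched as a small driver function. This is only an illustration under stated assumptions: `DocKind`, `Steps`, `ProcessedDocument`, and `processDocuments` are hypothetical names, not part of AutoRest Core, and the steps are injected so that adapters around functions such as `IdentifyDocument` and `LiterateToJson` could be supplied by the caller. Validation is reduced to skipping documents of unknown type.

```typescript
// Hypothetical types for illustration; not part of the AutoRest Core API.
type DocKind = "OpenAPI2" | "OpenAPI3" | "LiterateConfiguration" | "Unknown";

interface Steps {
  // Adapter around a type identifier (for example, IdentifyDocument)
  identify(content: string): Promise<DocKind>;
  // Adapter around a literate-to-JSON converter (for example, LiterateToJson)
  literateToJson(content: string): Promise<string>;
}

interface ProcessedDocument {
  path: string;
  kind: DocKind;
  content: string;
}

async function processDocuments(
  files: Record<string, string>, // path -> content, standing in for file reads
  steps: Steps
): Promise<ProcessedDocument[]> {
  // Step 1: file discovery by extension
  const candidates = Object.keys(files).filter((p) => /\.(json|ya?ml|md)$/i.test(p));

  const processed: ProcessedDocument[] = [];
  for (const path of candidates) {
    const raw = files[path];

    // Steps 2-4: content analysis and type classification (delegated)
    const kind = await steps.identify(raw);

    // Step 6 (minimal): drop documents that could not be identified
    if (kind === "Unknown") {
      continue;
    }

    // Step 5: convert literate configuration to JSON when needed
    const content =
      kind === "LiterateConfiguration" ? await steps.literateToJson(raw) : raw;

    // Step 7: hand the result to whatever consumes processed documents
    processed.push({ path, kind, content });
  }
  return processed;
}
```

Injecting the steps keeps the sketch independent of any particular package version and makes the ordering easy to test with stubs.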

### Supported File Extensions

**OpenAPI Specifications:**

- `.json` - JSON format OpenAPI specs
- `.yaml`, `.yml` - YAML format OpenAPI specs
- `.swagger.json` - Swagger-specific JSON files
- `.swagger.yaml` - Swagger-specific YAML files

**Configuration Files:**

- `.autorest.json` - AutoRest JSON configuration
- `.autorest.yaml` - AutoRest YAML configuration
- `.autorest.md` - AutoRest literate configuration
- `README.md` - Markdown configuration files
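
Because `.autorest.json` also ends in `.json`, the order in which suffixes are checked matters. The following is a minimal sketch of file-name classification based only on the lists above; `FileKind` and `classifyByFileName` are hypothetical names, and case-insensitive matching is an assumption rather than documented behavior.

```typescript
type FileKind = "configuration" | "openapi" | "unknown";

// Suffixes copied from the lists above.
const CONFIG_SUFFIXES = [".autorest.json", ".autorest.yaml", ".autorest.md"];
const OPENAPI_SUFFIXES = [".swagger.json", ".swagger.yaml", ".json", ".yaml", ".yml"];

function classifyByFileName(fileName: string): FileKind {
  const lower = fileName.toLowerCase(); // assumption: matching ignores case
  const baseName = lower.split(/[\\/]/).pop() || lower;

  // Configuration suffixes are checked first: ".autorest.json" would
  // otherwise be caught by the more general ".json" rule.
  if (baseName === "readme.md" || CONFIG_SUFFIXES.some((s) => lower.endsWith(s))) {
    return "configuration";
  }
  if (OPENAPI_SUFFIXES.some((s) => lower.endsWith(s))) {
    return "openapi";
  }
  return "unknown";
}

console.log(classifyByFileName("petstore.swagger.json")); // openapi
console.log(classifyByFileName("client.autorest.json")); // configuration
console.log(classifyByFileName("docs/README.md")); // configuration
console.log(classifyByFileName("notes.txt")); // unknown
```

An extension match is only a hint; the file content still has to be inspected to confirm the document type.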

### Content Detection

The document processing system uses multiple techniques for accurate identification:

- **JSON Schema Validation**: Checks for OpenAPI schema markers
- **Property Detection**: Looks for specific properties like `swagger`, `openapi`, `info`
- **Structure Analysis**: Analyzes document structure patterns
- **Extension Mapping**: Uses file extensions as hints for document type
- **Content Parsing**: Parses YAML/JSON content for type-specific properties
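
Property detection can be illustrated with a short sketch that looks only at the top-level version markers of a JSON document. It is not AutoRest's actual detection logic: `DetectedType` and `detectJsonDocumentType` are hypothetical names, YAML and Markdown input is out of scope, and the version-prefix checks (`"2."`, `"3."`) are an assumption based on how the `swagger` and `openapi` fields are normally written.

```typescript
type DetectedType = "OpenAPI2" | "OpenAPI3" | "Unknown";

function detectJsonDocumentType(content: string): DetectedType {
  let doc: unknown;
  try {
    doc = JSON.parse(content);
  } catch {
    // Not valid JSON; this sketch does not attempt YAML or Markdown.
    return "Unknown";
  }

  // Only plain objects can carry the top-level markers.
  if (typeof doc !== "object" || doc === null || Array.isArray(doc)) {
    return "Unknown";
  }

  const obj = doc as Record<string, unknown>;
  const swagger = obj["swagger"];
  const openapi = obj["openapi"];

  // Swagger / OpenAPI 2.0 documents declare "swagger": "2.0".
  if (typeof swagger === "string" && swagger.startsWith("2.")) {
    return "OpenAPI2";
  }
  // OpenAPI 3.x documents declare "openapi": "3.x.y".
  if (typeof openapi === "string" && openapi.startsWith("3.")) {
    return "OpenAPI3";
  }
  return "Unknown";
}

console.log(detectJsonDocumentType('{"swagger":"2.0","info":{"title":"My API","version":"1.0"}}')); // OpenAPI2
console.log(detectJsonDocumentType('{"openapi":"3.0.1","info":{"title":"My API","version":"1.0"}}')); // OpenAPI3
console.log(detectJsonDocumentType("not json")); // Unknown
```

A fuller implementation would combine this with the other techniques listed above, such as structure analysis and extension hints.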