# Document Processing

AutoRest Core identifies the type and format of OpenAPI specifications and configuration files, so the code generation pipeline can route each input to the appropriate processing step.

## Capabilities

### Document Type Identification

These functions classify a document by its content or by its file extension.

```typescript { .api }
/**
 * Identifies the type of a document based on its content
 * @param content - The document content to analyze
 * @returns Promise resolving to the detected document type
 */
function IdentifyDocument(content: string): Promise<DocumentType>;

/**
 * Converts literate configuration content to JSON format
 * @param content - The literate content to convert
 * @returns Promise resolving to JSON string
 */
function LiterateToJson(content: string): Promise<string>;

/**
 * Determines if content represents a configuration document
 * @param content - The document content to check
 * @returns Promise resolving to true if it's a configuration document
 */
function IsConfigurationDocument(content: string): Promise<boolean>;

/**
 * Determines if content represents an OpenAPI document
 * @param content - The document content to check
 * @returns Promise resolving to true if it's an OpenAPI document
 */
function IsOpenApiDocument(content: string): Promise<boolean>;

/**
 * Checks if a file extension indicates a configuration file
 * @param extension - The file extension to check (including dot)
 * @returns Promise resolving to true if it's a configuration extension
 */
function IsConfigurationExtension(extension: string): Promise<boolean>;

/**
 * Checks if a file extension indicates an OpenAPI specification file
 * @param extension - The file extension to check (including dot)
 * @returns Promise resolving to true if it's an OpenAPI extension
 */
function IsOpenApiExtension(extension: string): Promise<boolean>;
```

**Usage Examples:**

```typescript
import {
  IdentifyDocument,
  IsOpenApiDocument,
  IsConfigurationDocument,
  IsOpenApiExtension,
  IsConfigurationExtension,
  LiterateToJson
} from "@microsoft.azure/autorest-core";

// Identify document type from content
const swaggerContent = `{
  "swagger": "2.0",
  "info": { "title": "My API", "version": "1.0" }
}`;

const docType = await IdentifyDocument(swaggerContent);
console.log("Document type:", docType); // DocumentType.OpenAPI2

// Check if content is OpenAPI
const isOpenApi = await IsOpenApiDocument(swaggerContent);
console.log("Is OpenAPI:", isOpenApi); // true

// Check configuration document
const configContent = `{
  "input-file": "swagger.json",
  "output-folder": "./generated"
}`;

const isConfig = await IsConfigurationDocument(configContent);
console.log("Is configuration:", isConfig); // true

// Check file extensions
const isSwaggerExt = await IsOpenApiExtension(".json");
console.log("Is OpenAPI extension:", isSwaggerExt); // true

const isConfigExt = await IsConfigurationExtension(".autorest.json");
console.log("Is config extension:", isConfigExt); // true

// Convert literate config to JSON
const literateConfig = `
# AutoRest Configuration

> Input file
input-file: swagger.json

> Output settings
output-folder: ./generated
namespace: MyClient
`;

const jsonConfig = await LiterateToJson(literateConfig);
console.log("JSON config:", jsonConfig);
```
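
The extension and content checks can be combined into a single routing helper. The sketch below is illustrative and not part of AutoRest: `DetectionApi`, `FileKind`, and `classifyInput` are hypothetical names, and the interface only mirrors the signatures listed in this section so the example compiles without the package installed. Checking configuration before OpenAPI is an assumption made here, since an extension may plausibly satisfy both extension checks.

```typescript
// Hypothetical interface mirroring the signatures documented in this section.
interface DetectionApi {
  IsOpenApiExtension(extension: string): Promise<boolean>;
  IsConfigurationExtension(extension: string): Promise<boolean>;
  IsOpenApiDocument(content: string): Promise<boolean>;
  IsConfigurationDocument(content: string): Promise<boolean>;
}

type FileKind = "openapi" | "configuration" | "unknown";

// Uses the extension as a cheap pre-filter, then confirms with the content check.
// The configuration-first order is an assumption of this sketch.
async function classifyInput(
  extension: string,
  content: string,
  api: DetectionApi
): Promise<FileKind> {
  if (
    (await api.IsConfigurationExtension(extension)) &&
    (await api.IsConfigurationDocument(content))
  ) {
    return "configuration";
  }
  if (
    (await api.IsOpenApiExtension(extension)) &&
    (await api.IsOpenApiDocument(content))
  ) {
    return "openapi";
  }
  return "unknown";
}
```

In real use, the functions imported from `@microsoft.azure/autorest-core` would be passed as `api`. Taking them as a parameter also lets the helper be tested with stubs.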

### Document Types

Enumeration of supported document types that AutoRest can process.

```typescript { .api }
/**
 * Supported document types
 */
enum DocumentType {
  /** OpenAPI 2.0 (Swagger) specification */
  OpenAPI2 = "OpenAPI2",
  /** OpenAPI 3.0+ specification */
  OpenAPI3 = "OpenAPI3",
  /** Literate configuration document */
  LiterateConfiguration = "LiterateConfiguration",
  /** Unknown or unsupported document type */
  Unknown = "Unknown"
}
```

### Document Formats

Enumeration of supported document formats for parsing and processing.

```typescript { .api }
/**
 * Supported document formats
 */
enum DocumentFormat {
  /** Markdown format */
  Markdown = "Markdown",
  /** YAML format */
  Yaml = "Yaml",
  /** JSON format */
  Json = "Json",
  /** Unknown or unsupported format */
  Unknown = "Unknown"
}
```
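
To make the format categories concrete, here is a minimal sketch of how a format could be guessed from raw text. It is a heuristic written for illustration, not AutoRest's detection logic: `guessFormat` is a hypothetical helper, the enum is redeclared locally so the block compiles on its own, and the YAML check is a simple pattern match rather than a real parse.

```typescript
// Redeclared from the listing above so the sketch compiles on its own.
enum DocumentFormat {
  Markdown = "Markdown",
  Yaml = "Yaml",
  Json = "Json",
  Unknown = "Unknown"
}

// Hypothetical helper: guesses the format from the text alone.
function guessFormat(content: string): DocumentFormat {
  const text = content.trim();
  if (text.length === 0) {
    return DocumentFormat.Unknown;
  }
  // JSON: starts like an object or array and parses cleanly.
  if (text[0] === "{" || text[0] === "[") {
    try {
      JSON.parse(text);
      return DocumentFormat.Json;
    } catch {
      // Not valid JSON; fall through to the other checks.
    }
  }
  // Markdown: a heading line or a code fence at the start of any line.
  if (/^(#{1,6} |`{3})/m.test(text)) {
    return DocumentFormat.Markdown;
  }
  // YAML: at least one unindented "key:" line.
  if (/^[A-Za-z_][\w-]*:(\s|$)/m.test(text)) {
    return DocumentFormat.Yaml;
  }
  return DocumentFormat.Unknown;
}
```

One known limitation of this ordering: a YAML file that begins a line with `# comment` is reported as Markdown, because the heading check runs first.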

### File Extensions and Patterns

Constants defining file extension mappings and patterns for document recognition.

```typescript { .api }
/**
 * File extension mappings for different document types
 */
const DocumentExtension: {
  [key: string]: DocumentFormat;
};

/**
 * File pattern mappings for document recognition
 */
const DocumentPatterns: {
  [key: string]: RegExp;
};
```

**Usage Examples:**

```typescript
import {
  IdentifyDocument,
  DocumentType,
  DocumentFormat,
  DocumentExtension,
  DocumentPatterns
} from "@microsoft.azure/autorest-core";

// Sample input; in practice this is the text of the file being processed
const content = `{ "swagger": "2.0", "info": { "title": "My API", "version": "1.0" } }`;
const docType = await IdentifyDocument(content);

// Check document types
switch (docType) {
  case DocumentType.OpenAPI2:
    console.log("Processing Swagger 2.0 specification");
    break;
  case DocumentType.OpenAPI3:
    console.log("Processing OpenAPI 3.0+ specification");
    break;
  case DocumentType.LiterateConfiguration:
    console.log("Processing literate configuration");
    break;
  case DocumentType.Unknown:
    console.log("Unknown document type");
    break;
}

// Check document formats
const format = DocumentExtension[".yaml"];
if (format === DocumentFormat.Yaml) {
  console.log("YAML format detected");
}

// Use patterns for content detection
const jsonPattern = DocumentPatterns["json"];
if (jsonPattern.test(content)) {
  console.log("Content matches JSON pattern");
}
```

## Document Processing Pipeline

The document processing system integrates with the main AutoRest pipeline to handle various input formats:

1. **File Discovery**: Use file extensions to identify potential specification files
2. **Content Analysis**: Read file content and identify document type
3. **Format Detection**: Determine if the document is JSON, YAML, or Markdown
4. **Type Classification**: Classify as OpenAPI specification or configuration
5. **Content Transformation**: Convert literate configurations to JSON if needed
6. **Validation**: Validate document structure and content
7. **Pipeline Integration**: Pass processed documents to the generation pipeline
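
The steps above can be sketched as a chain of small functions. This is a simplified illustration under stated assumptions, not AutoRest's pipeline: every name (`InputFile`, `ProcessedDocument`, `discover`, `parse`, `classify`, `isValid`, `processFiles`) is hypothetical, only JSON input is parsed, and the validation rule is deliberately minimal.

```typescript
type DocumentKind = "openapi" | "configuration" | "unknown";

interface InputFile {
  path: string;
  content: string;
}

interface ProcessedDocument {
  path: string;
  kind: DocumentKind;
  data: Record<string, unknown>;
}

// Step 1 (file discovery): keep files whose extension suggests a usable input.
function discover(files: InputFile[]): InputFile[] {
  return files.filter((file) => /\.(json|ya?ml|md)$/i.test(file.path));
}

// Steps 2-3 (content analysis, format detection): this sketch only parses JSON.
function parse(file: InputFile): Record<string, unknown> | undefined {
  try {
    const value: unknown = JSON.parse(file.content);
    return typeof value === "object" && value !== null && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : undefined;
  } catch {
    return undefined; // YAML and Markdown handling is omitted here
  }
}

// Step 4 (type classification): look for marker properties.
function classify(data: Record<string, unknown>): DocumentKind {
  if ("swagger" in data || "openapi" in data) return "openapi";
  if ("input-file" in data) return "configuration";
  return "unknown";
}

// Step 6 (validation): a deliberately small structural check.
function isValid(kind: DocumentKind, data: Record<string, unknown>): boolean {
  if (kind === "openapi") {
    return typeof data["info"] === "object" && data["info"] !== null;
  }
  return kind === "configuration";
}

// Steps 1-7 chained. Step 5 (literate-to-JSON conversion) is not needed for JSON input.
function processFiles(files: InputFile[]): ProcessedDocument[] {
  const result: ProcessedDocument[] = [];
  for (const file of discover(files)) {
    const data = parse(file);
    if (data === undefined) continue;
    const kind = classify(data);
    if (isValid(kind, data)) {
      result.push({ path: file.path, kind, data });
    }
  }
  return result; // step 7: these documents would be handed to the generator
}
```

Files that fail any step are dropped silently in this sketch; a real pipeline would more likely report why an input was rejected.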

### Supported File Extensions

**OpenAPI Specifications:**
- `.json` - JSON format OpenAPI specs
- `.yaml`, `.yml` - YAML format OpenAPI specs
- `.swagger.json` - Swagger-specific JSON files
- `.swagger.yaml` - Swagger-specific YAML files

**Configuration Files:**
- `.autorest.json` - AutoRest JSON configuration
- `.autorest.yaml` - AutoRest YAML configuration
- `.autorest.md` - AutoRest literate configuration
- `README.md` - Markdown configuration files
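
Because `.autorest.json` also ends in `.json`, a lookup over these lists has to try the more specific suffixes first. The following sketch shows one way to do that. `ExtensionKind`, `SUFFIX_KINDS`, and `kindFromFileName` are hypothetical names, and the case-insensitive matching is an assumption rather than documented behavior.

```typescript
type ExtensionKind = "openapi" | "configuration" | "unknown";

// Suffixes from the lists above, ordered so that compound suffixes win
// over the plain ".json" / ".yaml" entries they also end with.
const SUFFIX_KINDS: Array<[string, ExtensionKind]> = [
  [".autorest.json", "configuration"],
  [".autorest.yaml", "configuration"],
  [".autorest.md", "configuration"],
  [".swagger.json", "openapi"],
  [".swagger.yaml", "openapi"],
  [".json", "openapi"],
  [".yaml", "openapi"],
  [".yml", "openapi"]
];

function kindFromFileName(fileName: string): ExtensionKind {
  const lower = fileName.toLowerCase(); // case-insensitive matching is an assumption
  const lastSeparator = Math.max(lower.lastIndexOf("/"), lower.lastIndexOf("\\"));
  const baseName = lower.slice(lastSeparator + 1);
  // README.md is matched by its full name, not by suffix.
  if (baseName === "readme.md") {
    return "configuration";
  }
  for (const [suffix, kind] of SUFFIX_KINDS) {
    // Equivalent to endsWith(suffix); suffixes are never empty.
    if (lower.slice(-suffix.length) === suffix) {
      return kind;
    }
  }
  return "unknown";
}
```

The result is only a hint. A plain `.json` file may still turn out to be a configuration document, which is why the content checks run afterwards.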

### Content Detection

The document processing system uses multiple techniques for accurate identification:

- **JSON Schema Validation**: Checks for OpenAPI schema markers
- **Property Detection**: Looks for specific properties like `swagger`, `openapi`, `info`
- **Structure Analysis**: Analyzes document structure patterns
- **Extension Mapping**: Uses file extensions as hints for document type
- **Content Parsing**: Parses YAML/JSON content for type-specific properties
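
As an illustration of the property detection technique listed above, the sketch below classifies JSON text by its `swagger`, `openapi`, and `info` properties. It is an assumption-laden example, not AutoRest's implementation: `detectOpenApiType` is a hypothetical helper, the enum is redeclared locally so the block compiles on its own, and YAML and literate configuration input are out of scope.

```typescript
// Redeclared from the Document Types listing so the sketch compiles on its own.
enum DocumentType {
  OpenAPI2 = "OpenAPI2",
  OpenAPI3 = "OpenAPI3",
  LiterateConfiguration = "LiterateConfiguration",
  Unknown = "Unknown"
}

// Hypothetical helper: classifies JSON text by its marker properties.
function detectOpenApiType(content: string): DocumentType {
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return DocumentType.Unknown; // not JSON; YAML parsing is out of scope here
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return DocumentType.Unknown;
  }
  const doc = parsed as Record<string, unknown>;
  // Both OpenAPI versions require an `info` object, so use it as a sanity check.
  if (typeof doc["info"] !== "object" || doc["info"] === null) {
    return DocumentType.Unknown;
  }
  // OpenAPI 2.0 documents carry the literal string "2.0" in `swagger`.
  if (doc["swagger"] === "2.0") {
    return DocumentType.OpenAPI2;
  }
  // OpenAPI 3.x documents carry a version string such as "3.0.1" in `openapi`.
  const version = doc["openapi"];
  if (typeof version === "string" && /^3\.\d+\.\d+/.test(version)) {
    return DocumentType.OpenAPI3;
  }
  return DocumentType.Unknown;
}
```

Requiring `info` in addition to the version marker reduces false positives from unrelated JSON files that happen to contain a `swagger` or `openapi` key.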