# Datamodel Code Generator

A Python library and CLI tool that generates Python data models from structured input formats. It transforms OpenAPI schemas, JSON Schema, JSON/YAML/CSV data, Python dictionaries, and GraphQL schemas into ready-to-use Python data structures, including Pydantic BaseModel (v1 and v2), dataclasses, TypedDict, and msgspec.Struct types.

## Package Information

- **Package Name**: datamodel-code-generator
- **Language**: Python
- **Installation**: `pip install datamodel-code-generator`
- **Python Versions**: 3.9, 3.10, 3.11, 3.12, 3.13
- **Optional Dependencies**:
  - `[http]` for remote schema fetching
  - `[graphql]` for GraphQL schema support
  - `[validation]` for OpenAPI validation
  - `[debug]` for debugging features
  - `[ruff]` for Ruff code formatting

## Core Imports

```python
from datamodel_code_generator import generate
```

For CLI usage:

```bash
datamodel-codegen --input schema.yaml --output models.py
```

Common enums and types:

```python
from datamodel_code_generator import (
    DataModelType,
    InputFileType,
    PythonVersion,
    Error,
    OpenAPIScope,
    GraphQLScope
)
```

Format module imports:

```python
from datamodel_code_generator.format import (
    DatetimeClassType,
    Formatter,
    PythonVersionMin,
    DEFAULT_FORMATTERS
)
```

## Basic Usage

### CLI Usage

```bash
# Generate from OpenAPI schema
datamodel-codegen --input api.yaml --output models.py

# Generate Pydantic v2 models
datamodel-codegen --input schema.json --output models.py --output-model-type pydantic_v2.BaseModel

# Generate dataclasses
datamodel-codegen --input data.json --output models.py --output-model-type dataclasses.dataclass
```

### Programmatic Usage

```python
from datamodel_code_generator import generate, DataModelType, InputFileType
from pathlib import Path

# Generate from file
generate(
    input_=Path("api.yaml"),
    output=Path("models.py"),
    output_model_type=DataModelType.PydanticV2BaseModel
)

# Generate from string
schema_text = '''
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    }
}
'''

generate(
    input_=schema_text,
    input_file_type=InputFileType.JsonSchema,
    output=Path("models.py")
)
```

## Architecture

The code generator follows a modular architecture with three main components:

- **Parsers**: Transform input schemas into an internal representation (OpenAPIParser, JsonSchemaParser, GraphQLParser)
- **Model Types**: Generate output code in different Python model formats (Pydantic v1/v2, dataclasses, TypedDict, msgspec)
- **Code Formatting**: Apply consistent formatting using Black, isort, and Ruff

This design enables extensibility through custom templates, formatters, and model types while maintaining consistency across different input and output formats.

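
The three stages can be pictured with a toy pipeline. This is a minimal sketch using only the standard library, not the library's actual implementation: `FieldSpec`, `ModelSpec`, `parse`, `render_dataclass`, and `format_code` are hypothetical names, the type mapping covers only scalar JSON Schema types, and the "formatter" merely trims whitespace where the real tool would run Black, isort, or Ruff.

```python
import json
from dataclasses import dataclass


# Hypothetical internal representation; the real library's is far richer.
@dataclass
class FieldSpec:
    name: str
    type_hint: str


@dataclass
class ModelSpec:
    class_name: str
    fields: list


# Toy JSON Schema -> Python type mapping (assumption: scalar types only).
TYPE_MAP = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def parse(schema_text: str, class_name: str = "Model") -> ModelSpec:
    """Parser stage: turn a JSON Schema object into the internal representation."""
    schema = json.loads(schema_text)
    fields = [
        FieldSpec(name, TYPE_MAP.get(prop.get("type"), "Any"))
        for name, prop in schema.get("properties", {}).items()
    ]
    return ModelSpec(class_name, fields)


def render_dataclass(model: ModelSpec) -> str:
    """Model-type stage: emit source code for one output flavour (a dataclass)."""
    header = [
        "from dataclasses import dataclass",
        "from typing import Any",
        "",
        "",
        "@dataclass",
        f"class {model.class_name}:",
    ]
    body = [f"    {f.name}: {f.type_hint}" for f in model.fields] or ["    pass"]
    return "\n".join(header + body)


def format_code(code: str) -> str:
    """Formatting stage: stand-in for a real formatter; only trims whitespace."""
    return "\n".join(line.rstrip() for line in code.splitlines()).strip() + "\n"


schema_text = (
    '{"type": "object", "properties": '
    '{"name": {"type": "string"}, "age": {"type": "integer"}}}'
)
generated = format_code(render_dataclass(parse(schema_text, class_name="Person")))
print(generated)
```

Keeping the stages separate is what lets one parser feed several output flavours: swapping `render_dataclass` for another renderer would not touch the parsing step.
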
## Capabilities

### Main Generation Function

The primary programmatic interface for generating Python data models with extensive configuration options for input sources, output formats, and code generation behavior.

```python { .api }
def generate(
    input_: Path | str | ParseResult | Mapping[str, Any],
    *,
    input_filename: str | None = None,
    input_file_type: InputFileType = InputFileType.Auto,
    output: Path | None = None,
    output_model_type: DataModelType = DataModelType.PydanticBaseModel,
    target_python_version: PythonVersion = PythonVersionMin,
    **kwargs
) -> None:
    """
    Generate Python data models from input schema.

    Args:
        input_: Input source (file path, URL, string content, or dict)
        input_filename: Name for the input file (for metadata)
        input_file_type: Type of input format (auto-detected if Auto)
        output: Output file path (None for stdout)
        output_model_type: Python model type to generate
        target_python_version: Target Python version for compatibility
        **kwargs: 70+ additional configuration parameters
    """
```

[Main Generation API](./generation.md)

### CLI Interface

Command-line interface providing access to all generation features with argument parsing and user-friendly options.

```python { .api }
def main() -> None:
    """Main CLI entry point accessed via 'datamodel-codegen' command."""
```

[CLI Interface](./cli.md)

### Input Type Detection

Automatic detection of input schema formats with support for multiple structured data types.

```python { .api }
def infer_input_type(text: str) -> InputFileType:
    """
    Automatically detect input file type from content.

    Args:
        text: Input text content

    Returns:
        Detected InputFileType enum value
    """

def is_openapi(data: dict) -> bool:
    """Check if dictionary contains OpenAPI specification."""

def is_schema(data: dict) -> bool:
    """Check if dictionary contains JSON Schema."""
```
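
To make the idea of content-based detection concrete, here is a rough sketch of how such heuristics could look. It is an assumption-laden illustration, not the library's code: `looks_like_openapi`, `looks_like_schema`, and `sketch_infer_input_type` are hypothetical names, only JSON text is parsed (no YAML), the result is a plain string rather than an `InputFileType` member, and treating unparseable text as CSV is a guess at a sensible fallback.

```python
import json
from typing import Any, Dict

JSON_SCHEMA_URLS = ("http://json-schema.org/", "https://json-schema.org/")


def looks_like_openapi(data: Dict[str, Any]) -> bool:
    """Assumed heuristic: OpenAPI documents carry a top-level 'openapi' key."""
    return "openapi" in data


def looks_like_schema(data: Dict[str, Any]) -> bool:
    """Assumed heuristic: a json-schema.org '$schema' URL or schema-like keys."""
    schema_url = data.get("$schema")
    if isinstance(schema_url, str) and schema_url.startswith(JSON_SCHEMA_URLS):
        return True
    return isinstance(data.get("type"), str) or isinstance(data.get("properties"), dict)


def sketch_infer_input_type(text: str) -> str:
    """Classify text as 'openapi', 'jsonschema', 'json', or 'csv' (fallback)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Assumption: anything that is not parseable structured data is CSV.
        return "csv"
    if isinstance(data, dict):
        if looks_like_openapi(data):
            return "openapi"
        if looks_like_schema(data):
            return "jsonschema"
    return "json"


print(sketch_infer_input_type('{"openapi": "3.0.0", "paths": {}}'))
print(sketch_infer_input_type('{"type": "object", "properties": {}}'))
print(sketch_infer_input_type('{"name": "Ada", "age": 36}'))
print(sketch_infer_input_type("name,age\nAda,36"))
```

Heuristics like these can misfire: a raw JSON record that happens to contain a string-valued `type` key would be classified as a schema here, which is a good reason to pass `input_file_type` explicitly when the format is known.
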

### Core Data Types and Enums

Essential enums and constants for configuring generation behavior.

```python { .api }
class InputFileType(Enum):
    """Supported input schema formats."""
    Auto = "auto"
    OpenAPI = "openapi"
    JsonSchema = "jsonschema"
    Json = "json"
    Yaml = "yaml"
    Dict = "dict"
    CSV = "csv"
    GraphQL = "graphql"

class DataModelType(Enum):
    """Supported output Python model types."""
    PydanticBaseModel = "pydantic.BaseModel"
    PydanticV2BaseModel = "pydantic_v2.BaseModel"
    DataclassesDataclass = "dataclasses.dataclass"
    TypingTypedDict = "typing.TypedDict"
    MsgspecStruct = "msgspec.Struct"

class OpenAPIScope(Enum):
    """OpenAPI parsing scope options."""
    Schemas = "schemas"
    Paths = "paths"
    Tags = "tags"
    Parameters = "parameters"
```
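
The enum values are the same strings the CLI accepts for `--output-model-type`, so a string from a config file or command line can be turned into a member by value lookup. The sketch below uses a stand-in `DataModelType` copied from the listing above so that it runs without the package installed; with the package available, the class would be imported from `datamodel_code_generator` instead.

```python
from enum import Enum


class DataModelType(Enum):
    """Stand-in copied from the listing above (not imported from the package)."""
    PydanticBaseModel = "pydantic.BaseModel"
    PydanticV2BaseModel = "pydantic_v2.BaseModel"
    DataclassesDataclass = "dataclasses.dataclass"
    TypingTypedDict = "typing.TypedDict"
    MsgspecStruct = "msgspec.Struct"


# Look up a member by the string used on the command line.
model_type = DataModelType("pydantic_v2.BaseModel")
print(model_type.name)  # PydanticV2BaseModel

# Enumerate the accepted values, e.g. to build a help or error message.
choices = [member.value for member in DataModelType]
print(", ".join(choices))

# Unknown strings raise ValueError, which callers can turn into a clear error.
try:
    DataModelType("attrs.define")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```
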

[Data Types and Configuration](./types-config.md)

### Utility Functions

Helper functions for YAML processing, version info, and directory management.

```python { .api }
def get_version() -> str:
    """Get package version string."""

def load_yaml(stream: str | TextIO) -> Any:
    """Load YAML from string or stream."""

def load_yaml_from_path(path: Path, encoding: str) -> Any:
    """Load YAML from file path."""

@contextmanager
def chdir(path: Path | None) -> Iterator[None]:
    """Context manager for temporary directory changes."""
```
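
A context manager with the `chdir` signature above could plausibly be written as follows. This is a sketch of the general pattern rather than the library's source: `chdir_sketch` is a hypothetical name, and treating `None` as "stay where you are" is an assumption inferred from the `Path | None` parameter type.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


@contextmanager
def chdir_sketch(path: Optional[Path]) -> Iterator[None]:
    """Temporarily switch the working directory; do nothing when path is None."""
    if path is None:
        yield
        return
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the body raised.
        os.chdir(previous)


start = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    with chdir_sketch(Path(tmp)):
        # Relative paths now resolve against the temporary directory.
        inside_matches = Path.cwd().samefile(tmp)
restored = Path.cwd() == start

with chdir_sketch(None):
    unchanged = Path.cwd() == start

print(inside_matches, restored, unchanged)
```

The `try`/`finally` is the important part: without it, an exception raised inside the `with` body would leave the process in the wrong directory.
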

### Error Handling

Exception classes for error handling and validation.

```python { .api }
class Error(Exception):
    """Base exception class for datamodel-code-generator."""

    def __init__(self, message: str) -> None: ...

class InvalidClassNameError(Error):
    """Raised when generated class names are invalid."""

    def __init__(self, class_name: str) -> None: ...
```
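
Because `InvalidClassNameError` derives from `Error`, callers can catch the specific case first and fall back to the base class. The sketch below shows that ordering with stand-in classes so it runs without the package. The stored `message` and `class_name` attributes, the message wording, and the `check_class_name` and `describe` helpers are all assumptions made for illustration; only the class names and constructor signatures come from the listing above.

```python
class Error(Exception):
    """Stand-in mirroring the listing above (not imported from the package)."""

    def __init__(self, message: str) -> None:
        self.message = message  # assumption: the message is kept as an attribute
        super().__init__(message)


class InvalidClassNameError(Error):
    """Stand-in for the subclass raised on invalid class names."""

    def __init__(self, class_name: str) -> None:
        self.class_name = class_name  # assumption: the offending name is kept
        # Hypothetical message text; the real wording may differ.
        super().__init__(f"{class_name!r} is not a valid class name")


def check_class_name(name: str) -> str:
    """Hypothetical helper: reject names that are not Python identifiers."""
    if not name.isidentifier():
        raise InvalidClassNameError(name)
    return name


def describe(name: str) -> str:
    """Catch the specific error before the generic base class."""
    try:
        return f"ok: {check_class_name(name)}"
    except InvalidClassNameError as exc:
        return f"invalid class name: {exc.class_name}"
    except Error as exc:
        return f"generation failed: {exc.message}"


print(describe("User"))
print(describe("123-user"))
```
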

[Utility Functions and Error Handling](./utilities.md)

## Types

```python { .api }
# Version constants
MIN_VERSION: Final[int] = 9  # Python 3.9
MAX_VERSION: Final[int] = 13  # Python 3.13

# Default values
DEFAULT_BASE_CLASS: str = "pydantic.BaseModel"

# Schema detection constants
JSON_SCHEMA_URLS: tuple[str, ...] = (
    "http://json-schema.org/",
    "https://json-schema.org/",
)

# Raw data type formats
RAW_DATA_TYPES: list[InputFileType] = [
    InputFileType.Json,
    InputFileType.Yaml,
    InputFileType.Dict,
    InputFileType.CSV,
    InputFileType.GraphQL,
]

# Type aliases and protocols from types module
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from collections import defaultdict
    from datamodel_code_generator.model.pydantic_v2 import UnionMode
    from datamodel_code_generator.parser.base import Parser
    from datamodel_code_generator.types import StrictTypes
```