Pure Python implementation of the Git version control system providing comprehensive access to Git repositories without requiring the Git command-line tool
npx @tessl/cli install tessl/pypi-dulwich@0.24.00
# Dulwich

Dulwich is a pure Python implementation of the Git version control system that provides access to Git repositories without requiring the Git command-line tool to be installed. It offers both a low-level API for Git objects, references, and data structures and a higher-level 'porcelain' API for common Git operations such as cloning, committing, and pushing.

## Package Information

- **Package Name**: dulwich
- **Language**: Python
- **Installation**: `pip install dulwich`
- **Optional**: `pip install dulwich[fastimport,https,pgp,paramiko]` for additional features

## Core Imports

Basic repository access:

```python
from dulwich.repo import Repo
```

High-level Git operations (porcelain API):

```python
from dulwich import porcelain
```

Git objects:

```python
from dulwich.objects import Blob, Tree, Commit, Tag
```

Error handling:

```python
from dulwich.errors import NotGitRepository
```

## Basic Usage

```python
from dulwich.repo import Repo
from dulwich import porcelain

# Open an existing repository
repo = Repo('.')

# Get the current HEAD commit
head_commit = repo[repo.head()]
print(f"Latest commit: {head_commit.message.decode('utf-8').strip()}")

# High-level operations using porcelain (paths are passed as a list)
porcelain.add(repo, ['new_file.txt'])
porcelain.commit(repo, 'Added new file', author='John Doe <john@example.com>')

# Clone a repository
porcelain.clone('https://github.com/user/repo.git', 'local-repo')

# Show repository status
status = porcelain.status('.')
print(f"Modified files: {list(status.staged['modify'])}")
```

## Architecture

Dulwich is designed with multiple abstraction layers:

- **Porcelain Layer**: High-level Git command equivalents in `dulwich.porcelain`
- **Repository Layer**: Repository access and management via `Repo` classes
- **Object Model**: Git objects (Blob, Tree, Commit, Tag) with full manipulation capabilities
- **Storage Layer**: Object stores supporting filesystem, memory, and cloud backends
- **Protocol Layer**: Git protocol clients for HTTP, SSH, and local access
- **Transport Layer**: Network communication handling for remote operations

This layered design lets users choose the level of abstraction they need while remaining compatible with Git's internal formats and protocols.

## Capabilities

### High-Level Git Operations (Porcelain)

Git command equivalents providing familiar functionality for repository management, file operations, branching, merging, and remote synchronization.

```python { .api }
def init(path: str = ".", bare: bool = False) -> Repo: ...
def clone(source: str, target: str = None, **kwargs) -> Repo: ...
def add(repo, paths: List[str]) -> None: ...
def commit(repo, message: str, author: str = None, **kwargs) -> bytes: ...
def push(repo, remote_location: str = None, **kwargs) -> Dict: ...
def pull(repo, remote_location: str = None, **kwargs) -> None: ...
def status(repo) -> PorterStatus: ...
def log(repo, max_entries: int = None, **kwargs) -> None: ...
```

[Porcelain API](./porcelain.md)

### Repository Access and Management

Core repository classes for opening, creating, and managing Git repositories with support for filesystem and in-memory backends.

```python { .api }
class Repo:
    def __init__(self, root: str): ...
    def head(self) -> bytes: ...
    def __getitem__(self, key: bytes) -> ShaFile: ...
    def close(self) -> None: ...

class MemoryRepo:
    def __init__(self): ...
```

[Repository Management](./repository.md)

### Git Object Model

Complete implementation of Git's object model including blobs, trees, commits, and tags with full read/write capabilities and format compliance.

```python { .api }
class ShaFile:
    @property
    def id(self) -> bytes: ...
    def as_raw_string(self) -> bytes: ...

class Blob(ShaFile):
    def __init__(self, data: bytes = b""): ...
    @property
    def data(self) -> bytes: ...

class Tree(ShaFile):
    def __init__(self): ...
    def add(self, name: bytes, mode: int, hexsha: bytes) -> None: ...
    def items(self) -> Iterator[TreeEntry]: ...

class Commit(ShaFile):
    def __init__(self): ...
    @property
    def message(self) -> bytes: ...
    @property
    def author(self) -> bytes: ...
    @property
    def tree(self) -> bytes: ...
```

[Git Objects](./objects.md)

### Object Storage

Flexible object storage backends supporting filesystem, memory, pack files, and cloud storage with efficient object retrieval and storage operations.

```python { .api }
class BaseObjectStore:
    def __contains__(self, sha: bytes) -> bool: ...
    def __getitem__(self, sha: bytes) -> ShaFile: ...
    def add_object(self, obj: ShaFile) -> None: ...

class DiskObjectStore(BaseObjectStore): ...
class MemoryObjectStore(BaseObjectStore): ...
class PackBasedObjectStore(BaseObjectStore): ...
```

[Object Storage](./object-storage.md)

### Index and Staging

Git index manipulation for staging changes, managing file states, and preparing commits with support for conflict resolution and extensions.

```python { .api }
class Index:
    def __init__(self, filename: str): ...
    def __getitem__(self, name: bytes) -> IndexEntry: ...
    def __setitem__(self, name: bytes, entry: IndexEntry) -> None: ...
    def commit(self, object_store: BaseObjectStore) -> bytes: ...

class IndexEntry:
    def __init__(self, **kwargs): ...
```

[Index Management](./index-management.md)

### References and Branches

Complete reference management including branches, tags, symbolic references, and packed refs with validation and atomic updates.

```python { .api }
class RefsContainer:
    def __getitem__(self, name: bytes) -> bytes: ...
    def __setitem__(self, name: bytes, value: bytes) -> None: ...
    def keys(self) -> Iterator[bytes]: ...

def check_ref_format(refname: bytes) -> bool: ...
def parse_symref_value(contents: bytes) -> bytes: ...
```

[References](./references.md)

### Git Protocol Clients

Network protocol implementations for communicating with Git servers over HTTP, SSH, and Git protocols with authentication and progress tracking.

```python { .api }
class GitClient:
    def fetch_pack(self, path: str, **kwargs) -> FetchPackResult: ...
    def send_pack(self, path: str, **kwargs) -> SendPackResult: ...

class TCPGitClient(GitClient): ...
class SSHGitClient(GitClient): ...
class Urllib3HttpGitClient(GitClient): ...

def get_transport_and_path(uri: str) -> Tuple[GitClient, str]: ...
```

[Protocol Clients](./clients.md)

### Pack Files

Efficient pack file handling for Git's compressed object storage format with indexing, streaming, and delta compression support.

```python { .api }
class PackData:
    def __init__(self, filename: str): ...
    def __getitem__(self, offset: int) -> ShaFile: ...

class PackIndex:
    def get_pack_checksum(self) -> bytes: ...
    def object_sha1(self, index: int) -> bytes: ...

def write_pack_objects(f, objects: Iterator[ShaFile]) -> None: ...
```

[Pack Files](./pack-files.md)

### Configuration Management

Git configuration file parsing and manipulation supporting repository, user, and system-level configuration with type conversion and validation.

```python { .api }
class Config:
    def get(self, section: bytes, name: bytes) -> bytes: ...
    def set(self, section: bytes, name: bytes, value: bytes) -> None: ...

class ConfigFile(Config):
    def __init__(self, filename: str): ...

class StackedConfig(Config):
    def __init__(self, backends: List[Config]): ...
```

[Configuration](./configuration.md)

### Diff and Merge

Comprehensive diff generation and merge algorithms for comparing trees, detecting changes, and resolving conflicts with rename detection.

```python { .api }
def tree_changes(object_store, old_tree: bytes, new_tree: bytes) -> Iterator[TreeChange]: ...
def diff_tree_to_tree(object_store, old_tree: bytes, new_tree: bytes) -> bytes: ...

class RenameDetector:
    def __init__(self, object_store): ...
    def changes_with_renames(self, changes: List[TreeChange]) -> List[TreeChange]: ...
```

[Diff and Merge](./diff-merge.md)

### Command-Line Interface

Complete CLI implementation providing Git command equivalents with argument parsing, progress reporting, and consistent behavior.

```python { .api }
def main(argv: List[str] = None) -> int: ...

class Command:
    def run(self, args: List[str]) -> int: ...
```

[CLI Tools](./cli.md)

## Types

```python { .api }
# Core types used across the API
ObjectID = bytes  # 20-byte SHA-1 hash
TreeEntry = NamedTuple[bytes, int, bytes]  # name, mode, sha

class PorterStatus:
    staged: Dict[str, List[str]]
    unstaged: Dict[str, List[str]]
    untracked: List[str]

class FetchPackResult:
    refs: Dict[bytes, bytes]
    symrefs: Dict[bytes, bytes]

class SendPackResult:
    ref_status: Dict[bytes, str]
```