Use Markov chain Monte Carlo to analyze districting plans and gerrymanders
npx @tessl/cli install tessl/pypi-gerrychain@0.3.00
# GerryChain
1
2
A Python library for building ensembles of districting plans using Markov chain Monte Carlo methods. GerryChain enables quantitative analysis of electoral districts and gerrymandering by generating large collections of sample districting plans for comparison against initial plans, developed by the Metric Geometry and Gerrymandering Group.
3
4
## Package Information
5
6
- **Package Name**: gerrychain
7
- **Language**: Python
8
- **Installation**: `pip install gerrychain`
9
- **Optional Dependencies**: `pip install gerrychain[geo]` (for shapely and geopandas support)
10
11
## Core Imports
12
13
```python
14
from gerrychain import MarkovChain, Graph, Partition, GeographicPartition, Election
15
```
16
17
Constraint functions:
18
```python
19
from gerrychain.constraints import contiguous, within_percent_of_ideal_population
20
```
21
22
Proposal functions:
23
```python
24
from gerrychain.proposals import recom
25
```
26
27
Acceptance and metrics:
28
```python
29
from gerrychain.accept import always_accept
30
from gerrychain.metrics import mean_median, polsby_popper
31
```
32
33
## Basic Usage
34
35
```python
36
from functools import partial
37
from gerrychain import MarkovChain, Graph, GeographicPartition, Election
38
from gerrychain.constraints import contiguous, within_percent_of_ideal_population
39
from gerrychain.proposals import recom
40
from gerrychain.accept import always_accept
41
from gerrychain.updaters import Tally, cut_edges
42
from gerrychain.metrics import mean_median
43
44
# Load geographic data into a graph
45
graph = Graph.from_file("precincts.shp")
46
47
# Create initial partition with election data
48
partition = GeographicPartition(
49
graph,
50
assignment="district",
51
updaters={
52
"population": Tally("TOTPOP"),
53
"cut_edges": cut_edges,
54
"SEN18": Election("SEN18", ["SEN18D", "SEN18R"])
55
}
56
)
57
58
# Calculate ideal population for ReCom
59
ideal_population = sum(partition["population"].values()) / len(partition)
60
61
# Configure ReCom proposal with parameters
62
my_proposal = partial(
63
recom,
64
pop_col="TOTPOP",
65
pop_target=ideal_population,
66
epsilon=0.05,
67
node_repeats=2
68
)
69
70
# Set up constraints
71
constraints = [
72
contiguous,
73
within_percent_of_ideal_population(partition, 0.05)
74
]
75
76
# Create and run Markov chain
77
chain = MarkovChain(
78
proposal=my_proposal,
79
constraints=constraints,
80
accept=always_accept,
81
initial_state=partition,
82
total_steps=1000
83
)
84
85
# Analyze the chain
86
scores = []
87
for state in chain:
88
# Compute partisan metrics
89
score = mean_median(state["SEN18"])
90
scores.append(score)
91
92
# Print progress every 100 steps
93
if len(scores) % 100 == 0:
94
print(f"Step {len(scores)}: Mean-Median = {score:.3f}")
95
```
96
97
## Architecture
98
99
GerryChain follows a modular, functional design with four core components:
100
101
- **Graph**: NetworkX-based geographic graphs representing geographic units (precincts, blocks)
102
- **Partition**: District assignments with updatable attributes (population, elections, metrics)
103
- **MarkovChain**: Iterator that proposes new partitions based on constraints and acceptance criteria
104
- **Updaters**: Functions that compute partition attributes (elections, tallies, cut edges, metrics)
105
106
The modular design allows mixing and matching proposals, constraints, acceptance functions, and metrics for custom redistricting analysis workflows.
107
108
## Capabilities
109
110
### Graph Creation and Management
111
112
Create and manipulate geographic graphs from shapefiles, GeoDataFrames, and adjacency data. Supports spatial operations, data joining, and node/edge querying.
113
114
```python { .api }
115
class Graph:
116
@classmethod
117
def from_file(cls, filename: str, adjacency: str = "rook") -> "Graph": ...
118
@classmethod
119
def from_geodataframe(cls, dataframe, adjacency: str = "rook") -> "Graph": ...
120
@classmethod
121
def from_networkx(cls, graph) -> "Graph": ...
122
@classmethod
123
def from_json(cls, json_file: str) -> "Graph": ...
124
def join(self, other_dataframe, columns: List[str] = None) -> None: ...
125
def lookup(self, key: str, target_column: str) -> Dict: ...
126
def to_json(self, json_file: str, *, include_geometries_as_geojson: bool = False) -> None: ...
127
def issue_warnings(self) -> None: ...
128
```
129
130
[Graph Operations](./graph-operations.md)
131
132
### Partition Management
133
134
Create and manipulate district partitions with automatic attribute updates. Supports both basic partitions and geographic partitions with spatial data.
135
136
```python { .api }
137
class Partition:
138
def __init__(self, graph: Graph, assignment: Dict, updaters: Dict = None) -> None: ...
139
@classmethod
140
def from_random_assignment(
141
cls,
142
graph: Graph,
143
n_parts: int,
144
epsilon: float,
145
pop_col: str,
146
**kwargs
147
) -> "Partition": ...
148
def flip(self, flips: Dict) -> "Partition": ...
149
def crosses_parts(self, edge: Tuple) -> bool: ...
150
151
class GeographicPartition(Partition):
152
@classmethod
153
def from_file(cls, filename: str, **kwargs) -> "GeographicPartition": ...
154
@classmethod
155
def from_districtr_file(
156
cls,
157
graph: Graph,
158
districtr_file: str,
159
updaters: Dict = None
160
) -> "GeographicPartition": ...
161
```
162
163
[Partition Management](./partition-management.md)
164
165
### Markov Chain Analysis
166
167
Core MCMC functionality for generating ensembles of districting plans. The MarkovChain class orchestrates proposal generation, constraint validation, and acceptance decisions.
168
169
```python { .api }
170
class MarkovChain:
171
def __init__(
172
self,
173
proposal: Callable,
174
constraints: Union[Callable, List[Callable]],
175
accept: Callable,
176
initial_state: Partition,
177
total_steps: int
178
) -> None: ...
179
def with_progress_bar(self): ...
180
```
181
182
[Markov Chain Analysis](./markov-chain-analysis.md)
183
184
### Constraint Validation
185
186
Ensure proposed partitions meet districting requirements including contiguity, population balance, and administrative unit preservation.
187
188
```python { .api }
189
class Validator:
190
def __init__(self, constraints: List[Callable]) -> None: ...
191
def __call__(self, partition: Partition) -> bool: ...
192
193
def contiguous(partition: Partition) -> bool: ...
194
def within_percent_of_ideal_population(
195
partition: Partition,
196
percent: float
197
) -> Callable[[Partition], bool]: ...
198
def no_vanishing_districts(partition: Partition) -> bool: ...
199
```
200
201
[Constraints](./constraints.md)
202
203
### Proposal Algorithms
204
205
Generate new districting plans through various algorithms including ReCom (recombination), random flips, and tree-based partitioning.
206
207
```python { .api }
208
def recom(partition: Partition) -> Partition: ...
209
def propose_random_flip(partition: Partition) -> Partition: ...
210
def propose_chunk_flip(partition: Partition, k: int = 3) -> Partition: ...
211
```
212
213
[Proposal Algorithms](./proposal-algorithms.md)
214
215
### Data Aggregation and Tracking
216
217
Compute and track partition attributes including election results, demographic data, and structural properties like cut edges.
218
219
```python { .api }
220
class Election:
221
def __init__(self, name: str, columns: List[str], alias: str = None) -> None: ...
222
def percents(self, alias: str) -> Dict[str, float]: ...
223
@property
224
def counts(self) -> Dict[str, int]: ...
225
226
class Tally:
227
def __init__(self, columns: Union[str, List[str]], alias: str = None) -> None: ...
228
229
def cut_edges(partition: Partition) -> Set[Tuple]: ...
230
```
231
232
[Data Aggregation](./data-aggregation.md)
233
234
### Evaluation Metrics
235
236
Analyze districting plans using partisan metrics (efficiency gap, mean-median), compactness measures (Polsby-Popper, Schwartzberg), and demographic analysis.
237
238
```python { .api }
239
def mean_median(election_results: Dict) -> float: ...
240
def efficiency_gap(election_results: Dict) -> float: ...
241
def partisan_bias(election_results: Dict) -> float: ...
242
def polsby_popper(partition: Partition) -> Dict[int, float]: ...
243
def schwartzberg(partition: Partition) -> Dict[int, float]: ...
244
```
245
246
[Evaluation Metrics](./evaluation-metrics.md)
247
248
### Optimization Algorithms
249
250
Single-metric optimization and specialized analysis including Gingles analysis for Voting Rights Act compliance.
251
252
```python { .api }
253
class SingleMetricOptimizer:
254
def __init__(
255
self,
256
proposal: Callable,
257
constraints: List[Callable],
258
accept: Callable,
259
initial_state: Partition,
260
maximize: bool = True
261
) -> None: ...
262
def optimize(self) -> Tuple[Partition, float]: ...
263
264
class Gingles:
265
def __init__(
266
self,
267
proposal: Callable,
268
constraints: List[Callable],
269
accept: Callable,
270
initial_state: Partition,
271
minority_column: str,
272
threshold: float = 0.5
273
) -> None: ...
274
```
275
276
[Optimization](./optimization.md)
277
278
### Acceptance Functions
279
280
Control which proposed partitions are accepted in the Markov chain, including basic acceptance and Metropolis-Hastings rules.
281
282
```python { .api }
283
def always_accept(partition: Partition) -> bool: ...
284
def cut_edge_accept(partition: Partition) -> bool: ...
285
```
286
287
[Acceptance Functions](./acceptance-functions.md)
288
289
## Types
290
291
Core type definitions used throughout the API:
292
293
```python { .api }
294
# Type aliases for common patterns
295
Assignment = Dict[int, int] # Node ID -> District ID mapping
296
NodeId = Union[int, str] # Graph node identifier
297
DistrictId = int # District identifier
298
UpdaterFunction = Callable[[Partition], Any] # Updater function type
299
ConstraintFunction = Callable[[Partition], bool] # Constraint function type
300
ProposalFunction = Callable[[Partition], Partition] # Proposal function type
301
AcceptanceFunction = Callable[[Partition], bool] # Acceptance function type
302
```