# Data Aggregation

Compute and track partition attributes including election results, demographic data, and structural properties. Updater functions automatically calculate district-level summaries whenever partitions change.

## Capabilities

### Election Data Handling

Track and analyze election results across districts with automatic vote tallying and percentage calculations.

```python { .api }
class Election:
    def __init__(
        self,
        name: str,
        parties_to_columns: Union[Dict[str, str], List[str]],
        alias: Optional[str] = None
    ) -> None:
        """
        Create election updater for tracking vote data by district.

        Parameters:
        - name (str): Name identifier for this election
        - parties_to_columns (Union[Dict[str, str], List[str]]): Either a dict mapping party names to column names, or a list of column names that serve as both party names and columns
        - alias (str, optional): Alternative name for accessing results

        Returns:
        None
        """
```

The Election class returns an ElectionResults object when used as an updater:

```python { .api }
class ElectionResults:
    def percents(self, party: str) -> Tuple[float, ...]:
        """
        Get vote percentages for a party across all districts.

        Parameters:
        - party (str): Party name

        Returns:
        Tuple[float, ...]: Vote percentages by district
        """

    def counts(self, party: str) -> Tuple[int, ...]:
        """
        Get raw vote counts for a party across all districts.

        Parameters:
        - party (str): Party name

        Returns:
        Tuple[int, ...]: Vote counts by district
        """

    @property
    def totals_for_party(self) -> Dict[str, Dict[int, float]]:
        """
        Get vote totals organized by party and district.

        Returns:
        Dict[str, Dict[int, float]]: Party -> District -> votes
        """

    @property
    def totals(self) -> Dict[int, int]:
        """
        Get total votes by district.

        Returns:
        Dict[int, int]: District -> total votes
        """
```

Usage example:
```python
from gerrychain import GeographicPartition
from gerrychain.updaters import Election

# Set up election tracking
election = Election("SEN18", ["SEN18D", "SEN18R"])  # List format
# Or: election = Election("SEN18", {"Democratic": "SEN18D", "Republican": "SEN18R"})  # Dict format

# Use in partition (`graph` is an existing gerrychain Graph whose nodes carry these columns)
partition = GeographicPartition(
    graph,
    assignment="district",
    updaters={"SEN18": election}
)

# Access results
election_results = partition["SEN18"]  # Returns ElectionResults object
dem_votes = election_results.counts("SEN18D")  # Tuple of counts by district
dem_percents = election_results.percents("SEN18D")  # Tuple of vote shares by district (fractions from 0 to 1)
total_votes = election_results.totals  # Dict[district_id, total_votes]
```
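
To make the relationship between per-party counts, district totals, and vote shares concrete, here is a minimal pure-Python sketch of the arithmetic. It does not call gerrychain; `vote_shares` and the sample numbers are hypothetical, and shares are expressed as fractions between 0 and 1.

```python
from typing import Dict


def vote_shares(
    totals_for_party: Dict[str, Dict[int, int]]
) -> Dict[str, Dict[int, float]]:
    """Turn party -> district -> votes into party -> district -> share."""
    # Every district that appears under any party.
    districts = {d for votes in totals_for_party.values() for d in votes}
    # Total votes cast in each district, summed over all parties.
    totals = {
        d: sum(votes.get(d, 0) for votes in totals_for_party.values())
        for d in districts
    }
    # A district with no votes gets 0.0 instead of a division by zero.
    return {
        party: {
            d: (votes.get(d, 0) / totals[d] if totals[d] else 0.0)
            for d in sorted(districts)
        }
        for party, votes in totals_for_party.items()
    }


# Made-up counts for two districts (illustration only)
sample = {
    "SEN18D": {1: 600, 2: 300},
    "SEN18R": {1: 400, 2: 700},
}
shares = vote_shares(sample)
print(shares["SEN18D"])  # {1: 0.6, 2: 0.3}
```

The input has the same party -> district -> votes shape as `totals_for_party`, so the sketch can be compared against the values an `ElectionResults` object reports.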

### Generic Data Tallying

Aggregate arbitrary numeric data by district using flexible tally functions.

```python { .api }
class Tally:
    def __init__(
        self,
        fields: Union[str, List[str]],
        alias: Optional[str] = None,
        dtype: Type = int
    ) -> None:
        """
        Create tally updater for summing data by district.

        Parameters:
        - fields (Union[str, List[str]]): Node attribute name(s) to sum; a list is added into a single total per district
        - alias (str, optional): Alternative name for accessing results
        - dtype (Type): Numeric type of the totals (default int)

        Returns:
        None
        """

class DataTally:
    def __init__(
        self,
        data: Union[Dict, pandas.Series, str],
        alias: str
    ) -> None:
        """
        Tally numeric data that is not necessarily stored as node attributes.

        Parameters:
        - data (Union[Dict, pandas.Series, str]): Values keyed by node, or the name of a node attribute
        - alias (str): Name under which the tally is stored on the partition

        Returns:
        None
        """
```

Usage example:
```python
from gerrychain import GeographicPartition
from gerrychain.updaters import Tally

# Set up demographic tallies (`graph` is an existing gerrychain Graph)
partition = GeographicPartition(
    graph,
    assignment="district",
    updaters={
        "population": Tally("TOTPOP"),
        "vap": Tally("VAP"),  # Voting age population
        "minority_pop": Tally(["BVAP", "HVAP", "ASIANVAP"]),  # Sum of the three columns
        "households": Tally("households")
    }
)

# Access tallied data for one district (any key of partition.parts)
district_id = next(iter(partition.parts))
district_pop = partition["population"][district_id]
minority_pop = partition["minority_pop"][district_id]
```
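
Conceptually, a tally sums one or more node attributes over the nodes assigned to each district. The following is a minimal pure-Python sketch of that idea, not gerrychain's implementation; `tally_by_district`, `node_data`, and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def tally_by_district(
    node_data: Dict[Hashable, Dict[str, int]],
    assignment: Dict[Hashable, int],
    fields: List[str],
) -> Dict[int, int]:
    """Sum the given fields over the nodes of each district."""
    totals = defaultdict(int)
    for node, district in assignment.items():
        # With several fields, their values are added into one total.
        totals[district] += sum(node_data[node][f] for f in fields)
    return dict(totals)


# Made-up node attributes and a node -> district assignment
node_data = {
    "a": {"BVAP": 10, "HVAP": 5},
    "b": {"BVAP": 20, "HVAP": 0},
    "c": {"BVAP": 1, "HVAP": 4},
}
assignment = {"a": 1, "b": 1, "c": 2}

print(tally_by_district(node_data, assignment, ["BVAP"]))          # {1: 30, 2: 1}
print(tally_by_district(node_data, assignment, ["BVAP", "HVAP"]))  # {1: 35, 2: 5}
```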

### Structural Properties

Track graph-theoretic and geometric properties of partitions.

```python { .api }
def cut_edges(partition: Partition) -> Set[Tuple[NodeId, NodeId]]:
    """
    Find edges that cross district boundaries.

    Parameters:
    - partition (Partition): Partition to analyze

    Returns:
    Set[Tuple[NodeId, NodeId]]: Set of edges crossing districts
    """

def cut_edges_by_part(partition: Partition) -> Dict[DistrictId, Set[Tuple[NodeId, NodeId]]]:
    """
    Find cut edges grouped by district.

    Parameters:
    - partition (Partition): Partition to analyze

    Returns:
    Dict[DistrictId, Set[Tuple[NodeId, NodeId]]]: Cut edges by district
    """

def county_splits(partition_name: str, county_field_name: str) -> Callable:
    """
    Build an updater that tracks how counties are split across districts.

    Parameters:
    - partition_name (str): Name the updater is registered under in the partition
    - county_field_name (str): Node attribute holding county identifiers

    Returns:
    Callable: Updater mapping each county to a CountyInfo(split, nodes, contains) tuple
    """

def boundary_nodes(partition: Partition) -> Set[NodeId]:
    """
    Find all nodes on district boundaries.

    Parameters:
    - partition (Partition): Partition to analyze

    Returns:
    Set[NodeId]: Set of nodes on district boundaries
    """

def exterior_boundaries(partition: Partition) -> Set[Tuple[NodeId, NodeId]]:
    """
    Find edges on the exterior boundary of the partition.

    Parameters:
    - partition (Partition): Partition to analyze

    Returns:
    Set[Tuple[NodeId, NodeId]]: Exterior boundary edges
    """

def interior_boundaries(partition: Partition) -> Set[Tuple[NodeId, NodeId]]:
    """
    Find edges on interior boundaries between districts.

    Parameters:
    - partition (Partition): Partition to analyze

    Returns:
    Set[Tuple[NodeId, NodeId]]: Interior boundary edges
    """

def flows_from_changes(
    changes: Dict[NodeId, DistrictId],
    pop_col: str = "population"
) -> Dict[Tuple[DistrictId, DistrictId], float]:
    """
    Calculate population flows from partition changes.

    Parameters:
    - changes (Dict[NodeId, DistrictId]): Node assignment changes
    - pop_col (str): Population column name

    Returns:
    Dict[Tuple[DistrictId, DistrictId], float]: Flow between district pairs
    """

class CountySplit(Enum):
    """
    Split status of a county: not split, newly split by the latest
    step, or already split before it.
    """
    NOT_SPLIT = 0
    NEW_SPLIT = 1
    OLD_SPLIT = 2
```

Usage example:
```python
from gerrychain import GeographicPartition
from gerrychain.updaters import cut_edges, county_splits

# Track structural properties (`graph` is an existing gerrychain Graph)
partition = GeographicPartition(
    graph,
    assignment="district",
    updaters={
        "cut_edges": cut_edges,
        # county_splits is a factory: pass the updater's own key and the
        # county column (the column name "county" is assumed here)
        "county_splits": county_splits("county_splits", "county")
    }
)

# Access properties
num_cut_edges = len(partition["cut_edges"])
# Counties whose nodes fall in more than one district
split_counties = {
    county: info.contains
    for county, info in partition["county_splits"].items()
    if len(info.contains) > 1
}
```
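
The idea behind cut edges can be shown without a full partition: an edge is cut when its two endpoints are assigned to different districts. This is a minimal sketch on a hand-built 2x2 grid; `find_cut_edges`, `group_cut_edges_by_district`, and the sample data are hypothetical and do not reflect how gerrychain updates these sets incrementally.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def find_cut_edges(
    edges: Iterable[Edge], assignment: Dict[Hashable, Hashable]
) -> Set[Edge]:
    """Return the edges whose endpoints lie in different districts."""
    return {(u, v) for u, v in edges if assignment[u] != assignment[v]}


def group_cut_edges_by_district(
    edges: Iterable[Edge], assignment: Dict[Hashable, Hashable]
) -> Dict[Hashable, Set[Edge]]:
    """Group cut edges under each district they touch."""
    by_part: Dict[Hashable, Set[Edge]] = {d: set() for d in set(assignment.values())}
    for u, v in find_cut_edges(edges, assignment):
        # A cut edge belongs to both of the districts it separates.
        by_part[assignment[u]].add((u, v))
        by_part[assignment[v]].add((u, v))
    return by_part


# 2x2 grid with nodes laid out as
#   0 1
#   2 3
# and districts split into a left column (A) and a right column (B).
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
assignment = {0: "A", 2: "A", 1: "B", 3: "B"}

cut = find_cut_edges(edges, assignment)
by_part = group_cut_edges_by_district(edges, assignment)
print(sorted(cut))  # [(0, 1), (2, 3)]
```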

### Complete Updater Example

Example showing comprehensive data tracking in a real analysis workflow:

```python
import numpy as np

from gerrychain import GeographicPartition, Graph, MarkovChain
from gerrychain.accept import always_accept
from gerrychain.constraints import single_flip_contiguous
from gerrychain.proposals import propose_random_flip
from gerrychain.updaters import (
    Election, Tally, cut_edges, cut_edges_by_part, county_splits
)

# Load data
graph = Graph.from_file("precincts.shp")

# Set up comprehensive updaters
partition = GeographicPartition(
    graph,
    assignment="district",
    updaters={
        # Demographics (WVAP, BVAP, HVAP are voting-age counts)
        "population": Tally("TOTPOP"),
        "vap": Tally("VAP"),
        "white_pop": Tally("WVAP"),
        "black_pop": Tally("BVAP"),
        "hispanic_pop": Tally("HVAP"),

        # Elections
        "SEN18": Election("SEN18", ["SEN18D", "SEN18R"]),
        "GOV18": Election("GOV18", ["GOV18D", "GOV18R"]),
        "PRES16": Election("PRES16", ["PRES16D", "PRES16R"]),

        # Structure (the county column name "county" is assumed)
        "cut_edges": cut_edges,
        "cut_edges_by_part": cut_edges_by_part,
        "county_splits": county_splits("county_splits", "county"),

        # Economic data. Tally only sums, so rates need a custom updater,
        # and the median_income total is a sum of precinct medians,
        # not a district median.
        "median_income": Tally("median_income"),
        "poverty_count": Tally("poverty_count")
    }
)

# Use in analysis
sen18 = partition["SEN18"]
for district in partition.parts:
    print(f"District {district}:")
    print(f"  Population: {partition['population'][district]:,}")
    print(f"  % Black VAP: {100 * partition['black_pop'][district] / partition['vap'][district]:.1f}%")

    dem_votes = sen18.totals_for_party["SEN18D"][district]
    dem_pct = 100 * dem_votes / sen18.totals[district]
    print(f"  Senate Dem %: {dem_pct:.1f}%")

    print(f"  Cut edges: {len(partition['cut_edges_by_part'][district])}")
    print()

# Track changes over a Markov chain (single-flip proposal, every step accepted)
chain = MarkovChain(
    proposal=propose_random_flip,
    constraints=[single_flip_contiguous],
    accept=always_accept,
    initial_state=partition,
    total_steps=1000
)

populations = []
cut_edge_counts = []

for state in chain:
    populations.append(list(state["population"].values()))
    cut_edge_counts.append(len(state["cut_edges"]))

# Analyze distributions
print(f"Population std dev: {np.std(populations[-1]):.0f}")
print(f"Avg cut edges: {np.mean(cut_edge_counts):.1f}")
```
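
The per-step population lists collected in `populations` can be reduced to one balance figure per step. A minimal sketch, assuming each entry is a list of district populations; `max_population_deviation` is a hypothetical helper rather than a gerrychain function, and the sample steps are made up.

```python
from typing import List, Sequence


def max_population_deviation(district_pops: Sequence[float]) -> float:
    """Largest relative deviation of any district from the ideal population."""
    # The ideal population is the mean across districts.
    ideal = sum(district_pops) / len(district_pops)
    return max(abs(p - ideal) for p in district_pops) / ideal


# Made-up populations for three chain steps, three districts each
steps: List[List[int]] = [
    [100, 100, 100],
    [110, 95, 95],
    [120, 90, 90],
]
deviations = [max_population_deviation(pops) for pops in steps]
print([round(d, 3) for d in deviations])  # [0.0, 0.1, 0.2]
```

Tracking this value across the chain shows how far the sampled plans drift from equal population, which a single standard deviation of the final step does not capture.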

### Custom Updater Functions

Examples of creating custom updater functions for specialized analysis:

```python
import math

from gerrychain import GeographicPartition
from gerrychain.metrics import polsby_popper
from gerrychain.updaters import Tally

def minority_vap_percent(partition):
    """Calculate the minority share of VAP (a fraction from 0 to 1) by district."""
    result = {}
    for district in partition.parts:
        total_vap = partition["vap"][district]
        minority_vap = (partition["black_pop"][district] +
                        partition["hispanic_pop"][district] +
                        partition["asian_pop"][district])
        result[district] = minority_vap / total_vap if total_vap > 0 else 0.0
    return result

def compactness_scores(partition):
    """Calculate multiple compactness measures."""
    pp = polsby_popper(partition)  # district -> 4 * pi * area / perimeter**2
    return {
        "polsby_popper": pp,
        # Schwartzberg (perimeter over the circumference of an equal-area
        # circle) is derived here as 1 / sqrt(Polsby-Popper)
        "schwartzberg": {d: 1 / math.sqrt(score) for d, score in pp.items()}
    }

# Use custom updaters (`graph` is an existing gerrychain Graph)
partition = GeographicPartition(
    graph,
    assignment="district",
    updaters={
        "population": Tally("TOTPOP"),
        "vap": Tally("VAP"),
        "black_pop": Tally("BVAP"),
        "hispanic_pop": Tally("HVAP"),
        "asian_pop": Tally("ASIANVAP"),
        "minority_vap_pct": minority_vap_percent,
        "compactness": compactness_scores
    }
)
```
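
Because a custom updater is just a callable that receives a partition and returns a value, its arithmetic can be checked against a small stand-in object before it is attached to a real partition. The sketch below assumes the updater only needs `partition.parts` and key lookup; `FakePartition` and `vap_share` are hypothetical names, and a real `Partition` exposes far more than this.

```python
from typing import Any, Dict, Set


class FakePartition:
    """Hypothetical stand-in exposing only `parts` and key lookup."""

    def __init__(self, parts: Dict[int, Set[str]], values: Dict[str, Dict[int, Any]]):
        self.parts = parts      # district -> set of node ids
        self._values = values   # updater name -> district -> value

    def __getitem__(self, key: str) -> Dict[int, Any]:
        return self._values[key]


def vap_share(partition) -> Dict[int, float]:
    """Custom updater: voting-age share of total population by district."""
    result = {}
    for district in partition.parts:
        population = partition["population"][district]
        vap = partition["vap"][district]
        # Guard against empty districts.
        result[district] = vap / population if population > 0 else 0.0
    return result


fake = FakePartition(
    parts={1: {"a", "b"}, 2: {"c"}},
    values={
        "population": {1: 200, 2: 100},
        "vap": {1: 150, 2: 80},
    },
)
result = vap_share(fake)
print(result)  # {1: 0.75, 2: 0.8}
```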

## Types

```python { .api }
UpdaterFunction = Callable[[Partition], Any]
DistrictId = int
NodeId = Union[int, str]
VoteData = Dict[str, int]  # Party -> vote count
PercentageData = Dict[str, float]  # Party -> percentage
```