# Aggregation and Analysis

Tools for extracting insights from trajectory collections, including significant point detection, clustering, and flow analysis. These classes enable higher-level analysis of movement patterns across multiple trajectories.

## Capabilities

### Trajectory Collection Aggregation

Aggregates a trajectory collection in three steps: extract significant points from each trajectory, cluster those points, and derive flows between the clusters.

```python { .api }
class TrajectoryCollectionAggregator:
    def __init__(self, traj_collection, max_distance, min_distance, min_stop_duration, min_angle=45):
        """
        Aggregates trajectories by extracting significant points, clustering, and extracting flows.

        Parameters:
        - traj_collection: TrajectoryCollection to analyze
        - max_distance: Maximum distance for clustering analysis
        - min_distance: Minimum distance for filtering points
        - min_stop_duration: Minimum duration for stop detection
        - min_angle: Minimum angle change for significant point detection (degrees)
        """

    def get_significant_points_gdf(self):
        """
        Extract significant points from all trajectories.

        Significant points include:
        - Start and end points
        - Points with significant direction changes
        - Points where speed changes significantly
        - Stop locations

        Returns:
        GeoDataFrame with significant points and their attributes
        """

    def get_clusters_gdf(self):
        """
        Get clustered significant points.

        Groups nearby significant points into clusters to identify
        common locations across multiple trajectories.

        Returns:
        GeoDataFrame with cluster information including:
        - Cluster ID
        - Cluster centroid
        - Number of points in cluster
        - Trajectories represented in cluster
        """

    def get_flows_gdf(self):
        """
        Extract flow information between clusters.

        Analyzes movement patterns between significant point clusters
        to identify common routes and flows.

        Returns:
        GeoDataFrame with flow lines including:
        - Origin cluster ID
        - Destination cluster ID
        - Number of trajectories using this flow
        - Flow geometry (LineString)
        """
```
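
To make the notion of a flow concrete, the following plain-Python sketch counts directed transitions between cluster IDs. It is a conceptual illustration only, not the library's implementation: `count_flows` and the sample sequences are hypothetical, and it assumes each trajectory has already been reduced to the ordered list of cluster IDs it visits.

```python
from collections import Counter


def count_flows(cluster_sequences):
    """Count directed transitions between consecutive, distinct cluster IDs."""
    flows = Counter()
    for sequence in cluster_sequences:
        for origin, destination in zip(sequence, sequence[1:]):
            if origin != destination:  # staying in the same cluster is not a flow
                flows[(origin, destination)] += 1
    return flows


# Hypothetical input: one list of visited cluster IDs per trajectory
sequences = [[0, 1, 2], [0, 1, 1, 2], [2, 1, 0]]
flows = count_flows(sequences)

for (origin, destination), count in flows.most_common():
    print(f"{origin} -> {destination}: {count} trajectories")
```

Each key is an (origin, destination) pair, so the direction of travel is preserved and the reverse movement is counted as a separate flow.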

### Significant Point Extraction

Detailed analysis for extracting important points from individual trajectories.

```python { .api }
class PtsExtractor:
    def __init__(self, traj, max_distance, min_distance, min_stop_duration, min_angle=45):
        """
        Extracts significant points from trajectories.

        Parameters:
        - traj: Trajectory object to analyze
        - max_distance: Maximum distance for analysis
        - min_distance: Minimum distance between significant points
        - min_stop_duration: Minimum duration for stop detection
        - min_angle: Minimum angle change for significance (degrees)
        """

    def find_significant_points(self):
        """
        Find significant points in the trajectory.

        Identifies points that are important for understanding
        trajectory structure and behavior, including:
        - Start and end points
        - Direction change points
        - Speed change points
        - Stop locations

        Returns:
        GeoDataFrame with significant points and their classifications
        """
```
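
The role of `min_angle` can be illustrated with a small standalone sketch that flags vertices where the heading changes by at least a threshold. This is an assumption-laden simplification rather than the extractor's actual logic: it works on planar `(x, y)` tuples, compares only consecutive segments, and the helper names are hypothetical.

```python
import math


def heading(p, q):
    """Heading from p to q in degrees, measured clockwise from the +y axis."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360


def angular_difference(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def direction_change_points(coords, min_angle=45):
    """Return interior vertices where the heading changes by >= min_angle."""
    result = []
    for prev, cur, nxt in zip(coords, coords[1:], coords[2:]):
        change = angular_difference(heading(prev, cur), heading(cur, nxt))
        if change >= min_angle:
            result.append(cur)
    return result


# Hypothetical path: a gentle bend at (10, 0), then two sharp turns
path = [(0, 0), (10, 0), (20, 1), (20, 11), (30, 11)]
turns = direction_change_points(path, min_angle=45)
print(turns)
```

A lower `min_angle` reports more vertices as significant, while a higher value keeps only sharp turns.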

### Point Clustering

Grid-based clustering for grouping nearby points across trajectories.

```python { .api }
class PointClusterer:
    def __init__(self, points, max_distance, is_latlon):
        """
        Grid-based point clustering for trajectory analysis.

        Parameters:
        - points: GeoDataFrame or list of Point geometries to cluster
        - max_distance: Maximum distance for clustering (cluster size)
        - is_latlon: Boolean indicating if coordinates are latitude/longitude
        """

    def get_clusters(self):
        """
        Perform grid-based clustering of points.

        Groups points into spatial clusters based on proximity,
        useful for identifying common locations across trajectories.

        Returns:
        GeoDataFrame or list with cluster assignments and cluster information:
        - Original point data
        - Cluster ID for each point
        - Cluster centroid coordinates
        - Number of points in each cluster
        """
```
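
As a rough picture of what grid-based clustering means, the sketch below bins planar points into square cells and summarizes each occupied cell. It is a minimal illustration under stated assumptions, not the `PointClusterer` algorithm: `grid_cluster`, the `cell_size` parameter, and the returned dictionary keys are all hypothetical.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (x, y) points by the square grid cell they fall into."""
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        # The cluster centroid is the mean of its member coordinates
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        clusters.append({"centroid": (cx, cy), "n": len(members), "points": members})
    return clusters


points = [(0, 0), (0.1, 0.1), (0.2, 0.05), (5, 5), (5.1, 5.2), (10, 10), (10.3, 10.1)]
clusters = grid_cluster(points, cell_size=0.5)
for cluster in clusters:
    print(cluster["centroid"], cluster["n"])
```

A fixed grid is fast but sensitive to cell borders: a point such as `(4.9, 4.8)` would fall into a neighbouring cell of `(5, 5)` even though the two are close together, which is why practical implementations usually add a merging or reassignment step.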

## Usage Examples

### Trajectory Collection Analysis

```python
import movingpandas as mpd
import pandas as pd

# Assume we have a TrajectoryCollection
# collection = mpd.TrajectoryCollection(...)

# Create aggregator for comprehensive analysis
aggregator = mpd.TrajectoryCollectionAggregator(
    traj_collection=collection,
    max_distance=100,  # 100 meter clustering distance
    min_distance=50,  # 50 meter minimum point separation
    min_stop_duration=pd.Timedelta("5 minutes"),  # 5 minute minimum stops
    min_angle=30,  # 30 degree minimum angle change
)

# Extract significant points across all trajectories
significant_points = aggregator.get_significant_points_gdf()
print(f"Found {len(significant_points)} significant points")

# Get clustered locations
clusters = aggregator.get_clusters_gdf()
print(f"Identified {clusters['cluster_id'].nunique()} distinct location clusters")

# Analyze flows between clusters
flows = aggregator.get_flows_gdf()
print(f"Found {len(flows)} distinct flows between clusters")

# Examine the most common flows
top_flows = flows.nlargest(5, 'trajectory_count')
for idx, flow in top_flows.iterrows():
    print(f"Flow from cluster {flow['origin_cluster']} to {flow['dest_cluster']}: "
          f"{flow['trajectory_count']} trajectories")
```
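
The effect of a minimum point separation such as `min_distance=50` can be sketched without the library. The helper below is hypothetical and only approximates the idea: it keeps a point when it lies at least `min_distance` away from the previously kept point, assuming planar coordinates in the same unit as the threshold.

```python
import math


def thin_points(coords, min_distance):
    """Keep points that are at least min_distance from the last kept point."""
    if not coords:
        return []
    kept = [coords[0]]
    for point in coords[1:]:
        if math.dist(kept[-1], point) >= min_distance:
            kept.append(point)
    return kept


# Hypothetical track along the x axis, coordinates in meters
track = [(0, 0), (10, 0), (30, 0), (60, 0), (65, 0), (120, 0)]
thinned = thin_points(track, min_distance=50)
print(thinned)
```

Raising `min_distance` produces a sparser set of candidate points, which in turn tends to yield fewer and larger clusters.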

### Individual Trajectory Significant Point Analysis

```python
import pandas as pd
from movingpandas.trajectory_aggregator import PtsExtractor

# Analyze a single trajectory for significant points
# Assume `traj` is an existing movingpandas Trajectory

extractor = PtsExtractor(
    traj=traj,
    max_distance=200,
    min_distance=25,
    min_stop_duration=pd.Timedelta("2 minutes"),
    min_angle=45,
)

# Find significant points
sig_points = extractor.find_significant_points()

# Examine types of significant points found
point_types = sig_points['point_type'].value_counts()
print("Significant point types:")
for point_type, count in point_types.items():
    print(f"  {point_type}: {count}")
```
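
How `min_stop_duration` interacts with a distance threshold can be shown with a simplified stop detector. This is a sketch under assumptions, not the method used by the library: `find_stops` is a hypothetical helper that anchors each candidate stop at its first fix and measures planar distance from that anchor.

```python
import math
from datetime import datetime, timedelta


def find_stops(fixes, max_diameter, min_duration):
    """Return (start, end) times of stops in time-sorted (timestamp, x, y) fixes."""
    stops = []
    i = 0
    while i < len(fixes):
        j = i
        # Extend the window while the next fix stays close to the anchor fix
        while j + 1 < len(fixes) and math.dist(fixes[i][1:], fixes[j + 1][1:]) <= max_diameter:
            j += 1
        if fixes[j][0] - fixes[i][0] >= min_duration:
            stops.append((fixes[i][0], fixes[j][0]))
            i = j + 1  # continue after the detected stop
        else:
            i += 1
    return stops


t0 = datetime(2024, 1, 1, 12, 0)
fixes = [
    (t0, 0, 0),
    (t0 + timedelta(minutes=1), 1, 0),
    (t0 + timedelta(minutes=2), 0, 1),
    (t0 + timedelta(minutes=3), 1, 1),
    (t0 + timedelta(minutes=4), 50, 50),
    (t0 + timedelta(minutes=5), 100, 100),
]
stops = find_stops(fixes, max_diameter=5, min_duration=timedelta(minutes=3))
print(stops)
```

With a longer `min_duration`, the same cluster of fixes would no longer qualify as a stop.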

### Point Clustering Analysis

```python
import movingpandas as mpd
from shapely.geometry import Point

# Create sample points for clustering
points_data = [
    Point(0, 0), Point(0.1, 0.1), Point(0.2, 0.05),  # Cluster 1
    Point(5, 5), Point(5.1, 5.2), Point(4.9, 4.8),  # Cluster 2
    Point(10, 10), Point(10.3, 10.1),  # Cluster 3
]

# Create point clusterer
clusterer = mpd.PointClusterer(
    points=points_data,
    max_distance=0.5,  # 0.5 unit clustering distance
    is_latlon=False,  # Using projected coordinates
)

# Get clusters
clusters = clusterer.get_clusters()

# Analyze clustering results
print(f"Points clustered into {len(clusters)} groups")
```

### Comprehensive Movement Pattern Analysis

```python
import movingpandas as mpd
import pandas as pd


# Complete workflow for analyzing movement patterns
def analyze_movement_patterns(trajectory_collection):
    """Comprehensive analysis of movement patterns in trajectory collection."""

    # Set up aggregator with reasonable parameters
    aggregator = mpd.TrajectoryCollectionAggregator(
        traj_collection=trajectory_collection,
        max_distance=100,
        min_distance=25,
        min_stop_duration=pd.Timedelta("3 minutes"),
        min_angle=45,
    )

    # Extract all analysis components
    significant_points = aggregator.get_significant_points_gdf()
    clusters = aggregator.get_clusters_gdf()
    flows = aggregator.get_flows_gdf()

    # Summary statistics
    analysis_summary = {
        'total_trajectories': len(trajectory_collection),
        'significant_points': len(significant_points),
        'location_clusters': clusters['cluster_id'].nunique(),
        'distinct_flows': len(flows),
        'most_used_flow': flows.loc[flows['trajectory_count'].idxmax()] if len(flows) > 0 else None,
    }

    return {
        'summary': analysis_summary,
        'significant_points': significant_points,
        'clusters': clusters,
        'flows': flows,
    }


# Use the analysis function
# results = analyze_movement_patterns(my_collection)
# print("Analysis Summary:", results['summary'])
```

### Clustering Different Types of Points

```python
import movingpandas as mpd

# Cluster different types of trajectory points separately
# Assume `collection` is an existing TrajectoryCollection in geographic coordinates

# Extract start points from collection
start_points = collection.get_start_locations()
start_clusterer = mpd.PointClusterer(
    points=start_points.geometry.tolist(),
    max_distance=200,  # 200 meter clusters for origins
    is_latlon=True,
)
origin_clusters = start_clusterer.get_clusters()

# Extract end points from collection
end_points = collection.get_end_locations()
end_clusterer = mpd.PointClusterer(
    points=end_points.geometry.tolist(),
    max_distance=200,  # 200 meter clusters for destinations
    is_latlon=True,
)
destination_clusters = end_clusterer.get_clusters()

print(f"Found {len(origin_clusters)} origin clusters")
print(f"Found {len(destination_clusters)} destination clusters")
```
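
The example above passes `is_latlon=True` together with a distance in meters. The standalone sketch below shows why geographic coordinates need such a flag: the same difference in degrees corresponds to different ground distances depending on latitude. The haversine helper and the mean Earth radius constant are illustrative assumptions, not part of the library API.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# The same 0.001 degree step in longitude, at the equator and at 60 degrees north
d_equator = haversine_m(0.0, 0.0, 0.001, 0.0)
d_north = haversine_m(0.0, 60.0, 0.001, 60.0)
print(f"{d_equator:.1f} m at the equator, {d_north:.1f} m at 60 degrees north")
```

Because a degree of longitude shrinks toward the poles, treating degree differences as if they were planar distances would make clusters at high latitudes cover less ground than intended.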

## Analysis Outputs

Column names can differ between MovingPandas releases, so inspect `gdf.columns` on the returned GeoDataFrame before relying on any specific name listed below.

### Significant Points GDF Structure

The significant points GeoDataFrame typically contains:

- `geometry`: Point geometry of significant location
- `point_type`: Type of significant point (start, end, direction_change, speed_change, stop)
- `trajectory_id`: ID of source trajectory
- `timestamp`: Time when point occurred
- `speed`: Speed at this point (if available)
- `direction`: Direction/heading at this point (if available)
- `significance_score`: Numeric score indicating importance

### Clusters GDF Structure

The clusters GeoDataFrame typically contains:

- `geometry`: Centroid point of cluster
- `cluster_id`: Unique identifier for cluster
- `point_count`: Number of points in cluster
- `trajectory_count`: Number of different trajectories represented
- `cluster_radius`: Spatial extent of cluster
- `dominant_point_type`: Most common type of significant point in cluster

### Flows GDF Structure

The flows GeoDataFrame typically contains:

- `geometry`: LineString representing flow path
- `origin_cluster`: ID of origin cluster
- `dest_cluster`: ID of destination cluster
- `trajectory_count`: Number of trajectories using this flow
- `avg_duration`: Average time to travel this flow
- `avg_distance`: Average distance of this flow