Tessl Tile for pypi/movingpandas@0.22.0

# Aggregation and Analysis

Tools for extracting insights from trajectory collections, including significant point detection, clustering, and flow analysis. These classes enable higher-level analysis of movement patterns across multiple trajectories.

## Capabilities

### Trajectory Collection Aggregation

Advanced analysis of trajectory collections to extract meaningful patterns and insights.

```python { .api }
class TrajectoryCollectionAggregator:
    def __init__(self, traj_collection, max_distance, min_distance, min_stop_duration, min_angle=45):
        """
        Aggregates trajectories by extracting significant points, clustering, and extracting flows.

        Parameters:
        - traj_collection: TrajectoryCollection to analyze
        - max_distance: Maximum distance between significant points; also used as the clustering distance
          (CRS units, or meters if the CRS is geographic)
        - min_distance: Minimum distance between significant points
        - min_stop_duration: Minimum duration for stop detection (timedelta)
        - min_angle: Minimum angle change for significant point detection (degrees)
        """

    def get_significant_points_gdf(self):
        """
        Extract significant points from all trajectories.

        Significant points include:
        - Start and end points
        - Points with significant direction changes
        - Points reached after travelling more than max_distance
        - Stop locations

        Returns:
        GeoDataFrame with one row per significant point (`geometry` column)
        """

    def get_clusters_gdf(self):
        """
        Get clustered significant points.

        Groups nearby significant points into clusters to identify
        common locations across multiple trajectories.

        Returns:
        GeoDataFrame with one row per cluster:
        - geometry: Cluster centroid
        - n: Number of points in cluster
        """

    def get_flows_gdf(self):
        """
        Extract flow information between clusters.

        Analyzes movement patterns between significant point clusters
        to identify common routes and flows.

        Returns:
        GeoDataFrame with one row per flow:
        - geometry: Flow geometry (LineString from origin centroid to destination centroid)
        - weight: Number of transitions observed along this flow
        """
```
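
The flow extraction step described above can be pictured as counting directed transitions between clusters. The following is a minimal standard-library sketch of that idea, not movingpandas' implementation; `count_flows` and the sample `cluster_sequences` are hypothetical names introduced only for illustration.

```python
from collections import Counter


def count_flows(cluster_sequences):
    """Count directed transitions between consecutive, distinct cluster ids."""
    flows = Counter()
    for sequence in cluster_sequences:
        # Pair each visited cluster with the next one in the same trajectory.
        for origin, destination in zip(sequence, sequence[1:]):
            if origin != destination:  # skip consecutive points in the same cluster
                flows[(origin, destination)] += 1
    return flows


# Hypothetical input: the ordered cluster ids visited by three trajectories.
cluster_sequences = [
    ["A", "B", "C"],
    ["A", "B", "B", "D"],
    ["C", "B", "A"],
]
flows = count_flows(cluster_sequences)
for (origin, destination), weight in flows.most_common():
    print(f"{origin} -> {destination}: {weight}")
```

Flows are directed in this sketch, so `("A", "B")` and `("B", "A")` are counted separately.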

### Significant Point Extraction

Detailed analysis for extracting important points from individual trajectories.

```python { .api }
class PtsExtractor:
    def __init__(self, traj, max_distance, min_distance, min_stop_duration, min_angle=45):
        """
        Extracts significant points from trajectories.

        Parameters:
        - traj: Trajectory object to analyze
        - max_distance: Maximum distance between significant points
        - min_distance: Minimum distance between significant points
        - min_stop_duration: Minimum duration for stop detection (timedelta)
        - min_angle: Minimum angle change for significance (degrees)
        """

    def find_significant_points(self):
        """
        Find significant points in the trajectory.

        Identifies points that are important for understanding
        trajectory structure and behavior, including:
        - Start and end points
        - Direction change points
        - Points reached after travelling more than max_distance
        - Stop locations

        Returns:
        List of shapely Point geometries (the significant points, in order)
        """
```
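
To make the role of `min_angle` concrete, here is a minimal sketch of direction-change detection on planar coordinates. It illustrates the general technique only and is not the code `PtsExtractor` runs; the helper names and the sample `path` are hypothetical.

```python
import math


def heading_deg(p, q):
    """Heading from p to q in degrees in [0, 360); 0 points along +y, clockwise."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360


def angular_difference(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)


def direction_change_indices(points, min_angle=45):
    """Return indices of interior points where the heading changes by at least min_angle."""
    indices = []
    for i in range(1, len(points) - 1):
        incoming = heading_deg(points[i - 1], points[i])
        outgoing = heading_deg(points[i], points[i + 1])
        if angular_difference(incoming, outgoing) >= min_angle:
            indices.append(i)
    return indices


# Hypothetical path: east, east, north, north, east.
path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 2)]
turns = direction_change_indices(path, min_angle=45)
print(turns)
```

A lower `min_angle` flags gentler turns and therefore yields more significant points; a higher value keeps only sharp turns.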

### Point Clustering

Grid-based clustering for grouping nearby points across trajectories.

```python { .api }
class PointClusterer:
    def __init__(self, points, max_distance, is_latlon):
        """
        Grid-based point clustering for trajectory analysis.

        Parameters:
        - points: List of shapely Point geometries to cluster
        - max_distance: Maximum distance for clustering (cluster size)
        - is_latlon: Boolean indicating if coordinates are latitude/longitude
        """

    def get_clusters(self):
        """
        Perform grid-based clustering of points.

        Groups points into spatial clusters based on proximity,
        useful for identifying common locations across trajectories.

        Returns:
        List of cluster objects, each exposing:
        - centroid: Cluster centroid
        - points: Points assigned to the cluster
        """
```
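
As a rough picture of what "grid-based" means here, the sketch below buckets planar points into square cells and reports one centroid per occupied cell. It is a simplified illustration under that assumption, not the algorithm `PointClusterer` implements; `grid_cluster` and `cell_size` are hypothetical names.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (x, y) points by square grid cell and return one summary per occupied cell."""
    cells = defaultdict(list)
    for x, y in points:
        # Integer cell index along each axis.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        clusters.append({"centroid": (cx, cy), "n": len(members)})
    return clusters


points = [(0, 0), (0.1, 0.1), (0.2, 0.05), (5, 5), (5.1, 5.2), (10, 10), (10.3, 10.1)]
clusters = grid_cluster(points, cell_size=0.5)
for cluster in clusters:
    print(cluster)
```

Plain gridding is fast but sensitive to cell boundaries: a point such as `(4.9, 4.8)` lands in a different cell than `(5, 5)` even though the two are close.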

## Usage Examples

### Trajectory Collection Analysis

```python
import movingpandas as mpd
import pandas as pd

# Assume we have a TrajectoryCollection
# collection = mpd.TrajectoryCollection(...)

# Create aggregator for comprehensive analysis
aggregator = mpd.TrajectoryCollectionAggregator(
    traj_collection=collection,
    max_distance=100,        # 100 meter clustering distance
    min_distance=50,         # 50 meter minimum point separation
    min_stop_duration=pd.Timedelta("5 minutes"),  # 5 minute minimum stops
    min_angle=30            # 30 degree minimum angle change
)

# Extract significant points across all trajectories
significant_points = aggregator.get_significant_points_gdf()
print(f"Found {len(significant_points)} significant points")

# Get clustered locations (one row per cluster; column "n" holds the point count)
clusters = aggregator.get_clusters_gdf()
print(f"Identified {len(clusters)} distinct location clusters")

# Analyze flows between clusters (one row per flow; column "weight" holds the count)
flows = aggregator.get_flows_gdf()
print(f"Found {len(flows)} distinct flows between clusters")

# Examine the most common flows
top_flows = flows.nlargest(5, 'weight')
for idx, flow in top_flows.iterrows():
    # Each flow is a LineString from the origin centroid to the destination centroid
    start, end = flow.geometry.coords[0], flow.geometry.coords[-1]
    print(f"Flow from {start} to {end}: weight {flow['weight']}")
```

### Individual Trajectory Significant Point Analysis

```python
import pandas as pd
from movingpandas.trajectory_aggregator import PtsExtractor

# Analyze single trajectory for significant points
# traj = mpd.Trajectory(...)

extractor = PtsExtractor(
    traj=traj,
    max_distance=200,
    min_distance=25,
    min_stop_duration=pd.Timedelta("2 minutes"),
    min_angle=45
)

# Find significant points (returned as a list of shapely Points)
sig_points = extractor.find_significant_points()

# Examine the significant points found
print(f"Found {len(sig_points)} significant points:")
for point in sig_points:
    print(f"  {point}")
```

### Point Clustering Analysis

```python
from shapely.geometry import Point
from movingpandas.trajectory_aggregator import PointClusterer

# Create sample points for clustering
points_data = [
    Point(0, 0), Point(0.1, 0.1), Point(0.2, 0.05),  # Group near (0, 0)
    Point(5, 5), Point(5.1, 5.2), Point(4.9, 4.8),  # Group near (5, 5)
    Point(10, 10), Point(10.3, 10.1)                 # Group near (10, 10)
]

# Create point clusterer
clusterer = PointClusterer(
    points=points_data,
    max_distance=0.5,  # 0.5 unit clustering distance
    is_latlon=False    # Using projected coordinates
)

# Get clusters (a list of cluster objects)
clusters = clusterer.get_clusters()

# Analyze clustering results
print(f"Points clustered into {len(clusters)} groups")
for cluster in clusters:
    print(f"  centroid={cluster.centroid}, points={len(cluster.points)}")
```

### Comprehensive Movement Pattern Analysis

```python
import movingpandas as mpd
import pandas as pd

# Complete workflow for analyzing movement patterns
def analyze_movement_patterns(trajectory_collection):
    """Comprehensive analysis of movement patterns in trajectory collection."""

    # Set up aggregator with reasonable parameters
    aggregator = mpd.TrajectoryCollectionAggregator(
        traj_collection=trajectory_collection,
        max_distance=100,
        min_distance=25,
        min_stop_duration=pd.Timedelta("3 minutes"),
        min_angle=45
    )

    # Extract all analysis components
    significant_points = aggregator.get_significant_points_gdf()
    clusters = aggregator.get_clusters_gdf()
    flows = aggregator.get_flows_gdf()

    # Summary statistics (clusters and flows hold one row per cluster / flow)
    analysis_summary = {
        'total_trajectories': len(trajectory_collection),
        'significant_points': len(significant_points),
        'location_clusters': len(clusters),
        'distinct_flows': len(flows),
        'most_used_flow': flows.loc[flows['weight'].idxmax()] if len(flows) > 0 else None
    }

    return {
        'summary': analysis_summary,
        'significant_points': significant_points,
        'clusters': clusters,
        'flows': flows
    }

# Use the analysis function
# results = analyze_movement_patterns(my_collection)
# print("Analysis Summary:", results['summary'])
```

### Clustering Different Types of Points

```python
from movingpandas.trajectory_aggregator import PointClusterer

# Cluster different types of trajectory points separately
# collection = mpd.TrajectoryCollection(...)

# Extract start points from collection
start_points = collection.get_start_locations()
start_clusterer = PointClusterer(
    points=start_points.geometry.tolist(),
    max_distance=200,  # 200 meter clusters for origins
    is_latlon=True     # Coordinates are longitude/latitude
)
origin_clusters = start_clusterer.get_clusters()

# Extract end points from collection
end_points = collection.get_end_locations()
end_clusterer = PointClusterer(
    points=end_points.geometry.tolist(),
    max_distance=200,  # 200 meter clusters for destinations
    is_latlon=True     # Coordinates are longitude/latitude
)
destination_clusters = end_clusterer.get_clusters()

print(f"Found {len(origin_clusters)} origin clusters")
print(f"Found {len(destination_clusters)} destination clusters")
```
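
The example above gives `max_distance` in meters while the points are in longitude/latitude, which is why `is_latlon` matters: degrees cannot be compared with meters directly. As a rough illustration of how a metric distance between geographic coordinates can be computed, here is a standard haversine sketch. It is not taken from movingpandas; the function name and sample coordinates are hypothetical, and a spherical Earth is assumed.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Two hypothetical points at the same latitude, 0.0027 degrees of longitude apart.
distance = haversine_m(16.3700, 48.2100, 16.3727, 48.2100)
print(f"{distance:.0f} m")
```

At this latitude the 0.0027 degree offset is roughly 200 meters, while the same offset at the equator would be about 300 meters, so a fixed threshold in degrees would not correspond to a fixed ground distance.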

## Analysis Outputs

### Significant Points GDF Structure

The significant points GeoDataFrame returned by `get_significant_points_gdf()` is minimal and contains one row per point:

- `geometry`: Point geometry of significant location

Attributes such as point type, source trajectory ID, timestamp, speed, or direction are not part of this output; check `significant_points.columns` in your installed version and join back to the source trajectories if you need them.

### Clusters GDF Structure

The clusters GeoDataFrame returned by `get_clusters_gdf()` contains one row per cluster:

- `geometry`: Centroid point of cluster
- `n`: Number of points in cluster

There is no explicit cluster ID column; the row index identifies each cluster.

### Flows GDF Structure

The flows GeoDataFrame returned by `get_flows_gdf()` contains one row per flow:

- `geometry`: LineString connecting the origin cluster centroid to the destination cluster centroid
- `weight`: Number of transitions observed along this flow

Depending on the installed version, additional columns such as `obj_weight` (number of distinct moving objects using the flow) may be present; inspect `flows.columns` to confirm.
