# Aggregation and Analysis

Tools for extracting insights from trajectory collections, including significant point detection, clustering, and flow analysis. These classes enable higher-level analysis of movement patterns across multiple trajectories.

## Capabilities

### Trajectory Collection Aggregation

Advanced analysis of trajectory collections to extract meaningful patterns and insights.

```python { .api }
class TrajectoryCollectionAggregator:
    def __init__(self, traj_collection, max_distance, min_distance, min_stop_duration, min_angle=45):
        """
        Aggregates trajectories by extracting significant points, clustering, and extracting flows.

        Parameters:
        - traj_collection: TrajectoryCollection to analyze
        - max_distance: Maximum distance for clustering analysis
        - min_distance: Minimum distance for filtering points
        - min_stop_duration: Minimum duration for stop detection
        - min_angle: Minimum angle change for significant point detection (degrees)
        """

    def get_significant_points_gdf(self):
        """
        Extract significant points from all trajectories.

        Significant points include:
        - Start and end points
        - Points with significant direction changes
        - Points where speed changes significantly
        - Stop locations

        Returns:
        GeoDataFrame with significant points and their attributes
        """

    def get_clusters_gdf(self):
        """
        Get clustered significant points.

        Groups nearby significant points into clusters to identify
        common locations across multiple trajectories.

        Returns:
        GeoDataFrame with cluster information including:
        - Cluster ID
        - Cluster centroid
        - Number of points in cluster
        - Trajectories represented in cluster
        """

    def get_flows_gdf(self):
        """
        Extract flow information between clusters.

        Analyzes movement patterns between significant point clusters
        to identify common routes and flows.

        Returns:
        GeoDataFrame with flow lines including:
        - Origin cluster ID
        - Destination cluster ID
        - Number of trajectories using this flow
        - Flow geometry (LineString)
        """
```
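The flow extraction described above can be pictured as counting transitions between the clusters each trajectory visits in order. The following is a minimal sketch of that idea in plain Python, not the library's implementation. The helper `count_flows` and its input layout (a mapping from trajectory ID to an ordered list of cluster IDs) are assumptions made for illustration only.

```python
from collections import Counter


def count_flows(cluster_sequences):
    """Count directed transitions between consecutive, distinct clusters.

    cluster_sequences: dict mapping a trajectory ID to the ordered list of
    cluster IDs that trajectory passes through (hypothetical input layout).
    """
    flows = Counter()
    for sequence in cluster_sequences.values():
        for origin, destination in zip(sequence, sequence[1:]):
            if origin != destination:  # staying inside one cluster is not a flow
                flows[(origin, destination)] += 1
    return flows


sequences = {
    "t1": [0, 1, 2],
    "t2": [0, 1, 1, 2],
    "t3": [2, 1],
}
flows = count_flows(sequences)
for (origin, destination), count in flows.most_common():
    print(f"{origin} -> {destination}: {count}")
```

Flows are directed in this sketch, so movement from cluster 1 to cluster 2 is counted separately from movement from 2 to 1.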

### Significant Point Extraction

Detailed analysis for extracting important points from individual trajectories.

```python { .api }
class PtsExtractor:
    def __init__(self, traj, max_distance, min_distance, min_stop_duration, min_angle=45):
        """
        Extracts significant points from trajectories.

        Parameters:
        - traj: Trajectory object to analyze
        - max_distance: Maximum distance for analysis
        - min_distance: Minimum distance between significant points
        - min_stop_duration: Minimum duration for stop detection
        - min_angle: Minimum angle change for significance (degrees)
        """

    def find_significant_points(self):
        """
        Find significant points in the trajectory.

        Identifies points that are important for understanding
        trajectory structure and behavior, including:
        - Start and end points
        - Direction change points
        - Speed change points
        - Stop locations

        Returns:
        GeoDataFrame with significant points and their classifications
        """
```
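To illustrate what a `min_angle` threshold means in practice, the sketch below flags interior vertices where the heading changes by at least a given number of degrees. It is a simplified stand-in written in plain Python under the assumption of planar `(x, y)` coordinates. The helper names (`heading`, `angular_difference`, `direction_change_points`) are hypothetical, and the actual extractor may compare headings differently.

```python
import math


def heading(p, q):
    """Compass-style heading in degrees from point p to point q (planar x, y)."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360


def angular_difference(a, b):
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    diff = abs(a - b) % 360
    return 360 - diff if diff > 180 else diff


def direction_change_points(coords, min_angle=45):
    """Return interior vertices where the heading changes by at least min_angle."""
    significant = []
    for prev, curr, nxt in zip(coords, coords[1:], coords[2:]):
        turn = angular_difference(heading(prev, curr), heading(curr, nxt))
        if turn >= min_angle:
            significant.append(curr)
    return significant


# The slight bend at (10, 0) is ignored; the two right-angle turns are kept.
path = [(0, 0), (10, 0), (20, 1), (20, 11), (30, 11)]
turns = direction_change_points(path, min_angle=45)
print(turns)
```

Lowering `min_angle` keeps more vertices, which preserves the shape of the path at the cost of a larger set of significant points.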

### Point Clustering

Grid-based clustering for grouping nearby points across trajectories.

```python { .api }
class PointClusterer:
    def __init__(self, points, max_distance, is_latlon):
        """
        Grid-based point clustering for trajectory analysis.

        Parameters:
        - points: GeoDataFrame or list of Point geometries to cluster
        - max_distance: Maximum distance for clustering (cluster size)
        - is_latlon: Boolean indicating if coordinates are latitude/longitude
        """

    def get_clusters(self):
        """
        Perform grid-based clustering of points.

        Groups points into spatial clusters based on proximity,
        useful for identifying common locations across trajectories.

        Returns:
        GeoDataFrame or list with cluster assignments and cluster information:
        - Original point data
        - Cluster ID for each point
        - Cluster centroid coordinates
        - Number of points in each cluster
        """
```
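The idea behind grid-based clustering can be shown with a few lines of plain Python: each point is assigned to the square grid cell it falls into, and every non-empty cell becomes one cluster. This is only a sketch of the general technique, not the `PointClusterer` implementation. The function `grid_cluster`, the `cell_size` parameter, and the returned dictionary layout are assumptions for illustration.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (x, y) points by the square grid cell they fall into.

    Returns a list of dicts holding each cluster's centroid and member points.
    """
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))

    clusters = []
    for members in cells.values():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        clusters.append({"centroid": (cx, cy), "points": members})
    return clusters


points = [(0, 0), (0.1, 0.1), (0.2, 0.05), (5, 5), (5.1, 5.2), (10, 10)]
clusters = grid_cluster(points, cell_size=0.5)
for cluster in clusters:
    print(cluster["centroid"], len(cluster["points"]))
```

A plain grid is fast but sensitive to cell boundaries: with `cell_size=0.5`, a point such as `(4.9, 4.8)` would land in a different cell from `(5, 5)` even though the two are close together.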

## Usage Examples

### Trajectory Collection Analysis

```python
import movingpandas as mpd
import pandas as pd

# Assume we have a TrajectoryCollection
# collection = mpd.TrajectoryCollection(...)

# Create aggregator for comprehensive analysis
aggregator = mpd.TrajectoryCollectionAggregator(
    traj_collection=collection,
    max_distance=100,  # 100 meter clustering distance
    min_distance=50,  # 50 meter minimum point separation
    min_stop_duration=pd.Timedelta("5 minutes"),  # 5 minute minimum stops
    min_angle=30,  # 30 degree minimum angle change
)

# Extract significant points across all trajectories
significant_points = aggregator.get_significant_points_gdf()
print(f"Found {len(significant_points)} significant points")

# Column names used below follow the "Analysis Outputs" section;
# check clusters.columns and flows.columns for your installed version.

# Get clustered locations
clusters = aggregator.get_clusters_gdf()
print(f"Identified {clusters['cluster_id'].nunique()} distinct location clusters")

# Analyze flows between clusters
flows = aggregator.get_flows_gdf()
print(f"Found {len(flows)} distinct flows between clusters")

# Examine the most common flows
top_flows = flows.nlargest(5, 'trajectory_count')
for idx, flow in top_flows.iterrows():
    print(f"Flow from cluster {flow['origin_cluster']} to {flow['dest_cluster']}: "
          f"{flow['trajectory_count']} trajectories")
```

### Individual Trajectory Significant Point Analysis

```python
import movingpandas as mpd
import pandas as pd

# Analyze single trajectory for significant points
# traj = mpd.Trajectory(...)

extractor = mpd.PtsExtractor(
    traj=traj,
    max_distance=200,
    min_distance=25,
    min_stop_duration=pd.Timedelta("2 minutes"),
    min_angle=45,
)

# Find significant points
sig_points = extractor.find_significant_points()

# Examine types of significant points found
point_types = sig_points['point_type'].value_counts()
print("Significant point types:")
for point_type, count in point_types.items():
    print(f"  {point_type}: {count}")
```
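The role of `min_stop_duration` can be illustrated with a small stop detector written in plain Python. The sketch below treats a stop as a run of consecutive fixes that stay within a fixed radius of the run's first fix for at least a minimum duration. It is an assumption-laden illustration, not the extractor's algorithm. The function `detect_stops`, the `max_radius` parameter, and the `(timestamp, x, y)` tuple layout are hypothetical.

```python
import math
from datetime import datetime, timedelta


def detect_stops(fixes, max_radius, min_duration):
    """Find intervals where the object stays near one place long enough.

    fixes: time-ordered list of (timestamp, x, y) tuples in planar coordinates.
    Returns a list of (start_time, end_time) tuples.
    """
    stops = []
    start = 0
    while start < len(fixes):
        end = start
        # Extend the window while the next fix stays close to the window's first fix.
        while (end + 1 < len(fixes)
               and math.dist(fixes[start][1:], fixes[end + 1][1:]) <= max_radius):
            end += 1
        if fixes[end][0] - fixes[start][0] >= min_duration:
            stops.append((fixes[start][0], fixes[end][0]))
            start = end + 1  # continue after the stop
        else:
            start += 1
    return stops


t0 = datetime(2024, 1, 1, 8, 0)
fixes = [
    (t0, 0, 0),
    (t0 + timedelta(minutes=1), 50, 0),
    (t0 + timedelta(minutes=2), 100, 0),
    (t0 + timedelta(minutes=3), 101, 1),
    (t0 + timedelta(minutes=6), 99, 0),
    (t0 + timedelta(minutes=9), 100, 2),
    (t0 + timedelta(minutes=10), 200, 0),
]
stops = detect_stops(fixes, max_radius=10, min_duration=timedelta(minutes=5))
for begin, finish in stops:
    print(f"Stop from {begin:%H:%M} to {finish:%H:%M}")
```

Raising `min_duration` filters out brief pauses such as traffic lights, while a larger `max_radius` tolerates more positional noise around the stop location.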

### Point Clustering Analysis

```python
import movingpandas as mpd
from shapely.geometry import Point

# Create sample points for clustering
points_data = [
    Point(0, 0), Point(0.1, 0.1), Point(0.2, 0.05),  # Cluster 1
    Point(5, 5), Point(5.1, 5.2), Point(4.9, 4.8),  # Cluster 2
    Point(10, 10), Point(10.3, 10.1),  # Cluster 3
]

# Create point clusterer
clusterer = mpd.PointClusterer(
    points=points_data,
    max_distance=0.5,  # 0.5 unit clustering distance
    is_latlon=False,  # Using projected coordinates
)

# Get clusters
clusters = clusterer.get_clusters()

# Analyze clustering results
print(f"Points clustered into {len(clusters)} groups")
```

### Comprehensive Movement Pattern Analysis

```python
import movingpandas as mpd
import pandas as pd


# Complete workflow for analyzing movement patterns
def analyze_movement_patterns(trajectory_collection):
    """Comprehensive analysis of movement patterns in trajectory collection."""

    # Set up aggregator with reasonable parameters
    aggregator = mpd.TrajectoryCollectionAggregator(
        traj_collection=trajectory_collection,
        max_distance=100,
        min_distance=25,
        min_stop_duration=pd.Timedelta("3 minutes"),
        min_angle=45,
    )

    # Extract all analysis components
    significant_points = aggregator.get_significant_points_gdf()
    clusters = aggregator.get_clusters_gdf()
    flows = aggregator.get_flows_gdf()

    # Summary statistics
    analysis_summary = {
        'total_trajectories': len(trajectory_collection),
        'significant_points': len(significant_points),
        'location_clusters': clusters['cluster_id'].nunique(),
        'distinct_flows': len(flows),
        'most_used_flow': flows.loc[flows['trajectory_count'].idxmax()] if len(flows) > 0 else None,
    }

    return {
        'summary': analysis_summary,
        'significant_points': significant_points,
        'clusters': clusters,
        'flows': flows,
    }


# Use the analysis function
# results = analyze_movement_patterns(my_collection)
# print("Analysis Summary:", results['summary'])
```

### Clustering Different Types of Points

```python
import movingpandas as mpd

# Cluster different types of trajectory points separately
# collection = mpd.TrajectoryCollection(...)

# Extract start points from collection
start_points = collection.get_start_locations()
start_clusterer = mpd.PointClusterer(
    points=start_points.geometry.tolist(),
    max_distance=200,  # 200 meter clusters for origins
    is_latlon=True,
)
origin_clusters = start_clusterer.get_clusters()

# Extract end points from collection
end_points = collection.get_end_locations()
end_clusterer = mpd.PointClusterer(
    points=end_points.geometry.tolist(),
    max_distance=200,  # 200 meter clusters for destinations
    is_latlon=True,
)
destination_clusters = end_clusterer.get_clusters()

print(f"Found {len(origin_clusters)} origin clusters")
print(f"Found {len(destination_clusters)} destination clusters")
```
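The `is_latlon` flag matters because a distance given in meters cannot be compared directly with coordinate differences in degrees. The sketch below uses the haversine formula to show how much ground a fixed longitude difference covers at two latitudes. It illustrates the underlying issue only; how the library converts distances internally is not covered here, and the helper `haversine_m` and the spherical Earth radius are assumptions of this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# The same 0.001 degree longitude step covers less ground at higher latitudes.
at_equator = haversine_m(0.0, 0.0, 0.001, 0.0)
at_60_north = haversine_m(0.0, 60.0, 0.001, 60.0)
print(f"{at_equator:.1f} m at the equator, {at_60_north:.1f} m at 60 degrees north")
```

Because the step at 60 degrees north is roughly half the step at the equator, a threshold expressed in degrees would produce clusters of very different physical size depending on where the data lies.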

## Analysis Outputs

### Significant Points GDF Structure

The significant points GeoDataFrame typically contains:

- `geometry`: Point geometry of significant location
- `point_type`: Type of significant point (start, end, direction_change, speed_change, stop)
- `trajectory_id`: ID of source trajectory
- `timestamp`: Time when point occurred
- `speed`: Speed at this point (if available)
- `direction`: Direction/heading at this point (if available)
- `significance_score`: Numeric score indicating importance

### Clusters GDF Structure

The clusters GeoDataFrame typically contains:

- `geometry`: Centroid point of cluster
- `cluster_id`: Unique identifier for cluster
- `point_count`: Number of points in cluster
- `trajectory_count`: Number of different trajectories represented
- `cluster_radius`: Spatial extent of cluster
- `dominant_point_type`: Most common type of significant point in cluster

### Flows GDF Structure

The flows GeoDataFrame typically contains:

- `geometry`: LineString representing flow path
- `origin_cluster`: ID of origin cluster
- `dest_cluster`: ID of destination cluster
- `trajectory_count`: Number of trajectories using this flow
- `avg_duration`: Average time to travel this flow
- `avg_distance`: Average distance of this flow