# File I/O and Data Processing

File input/output utilities for geospatial data: reading and writing raster, vector, and LiDAR formats, converting between formats, inspecting metadata such as CRS and bounds, and downloading files from URLs.

## Capabilities

### File Download and Web Access

Download files from URLs with progress tracking, authentication support, and error handling for reliable data acquisition.

```python { .api }
def download_file(url, filename=None, **kwargs):
    """
    Download file from URL.

    Args:
        url (str): URL to download from
        filename (str): Local filename for downloaded file
        **kwargs: Download options (headers, timeout, verify, etc.)

    Returns:
        str: Path to downloaded file
    """

def download_from_url(url, out_file, **kwargs):
    """
    Advanced download with progress tracking.

    Args:
        url (str): Source URL
        out_file (str): Output file path
        **kwargs: Advanced options (chunk_size, progress, resume, etc.)

    Returns:
        str: Path to downloaded file
    """

def download_folder(url, out_dir, **kwargs):
    """
    Download entire folder from URL.

    Args:
        url (str): Folder URL
        out_dir (str): Output directory
        **kwargs: Download options (recursive, filters, etc.)

    Returns:
        list: List of downloaded file paths
    """
```
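
The chunked, progress-reporting download that `download_from_url` describes can be sketched with the standard library alone. The helper below, `stream_to_file`, is a hypothetical name and not part of the API above. It copies any binary stream to disk in fixed-size chunks and reports the running byte count through an optional callback; the 8192-byte default is an assumption.

```python
import io
import os
import tempfile


def stream_to_file(source, out_file, chunk_size=8192, on_progress=None):
    """Copy a binary stream to out_file in chunks; return the output path.

    Hypothetical helper for illustration, not a leafmap function.
    """
    written = 0
    with open(out_file, "wb") as dst:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            dst.write(chunk)
            written += len(chunk)
            if on_progress is not None:
                on_progress(written)
    return out_file


# Demonstrate with an in-memory stream standing in for an HTTP response.
payload = b"x" * 20000
progress = []
target = os.path.join(tempfile.mkdtemp(), "data.bin")
result = stream_to_file(io.BytesIO(payload), target, on_progress=progress.append)
```

For a real download, the object returned by `urllib.request.urlopen(url, timeout=...)` can be passed as `source`, since it exposes the same `read` method.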

### Raster Data I/O

Read, write, and process raster data in various formats with comprehensive metadata handling and format conversion.

```python { .api }
def read_raster(filename, **kwargs):
    """
    Read raster file into memory.

    Args:
        filename (str): Path to raster file
        **kwargs: Read options (bands, window, masked, etc.)

    Returns:
        tuple: (array, metadata) - raster data and metadata
    """

def write_raster(array, filename, **kwargs):
    """
    Write raster array to file.

    Args:
        array (numpy.ndarray): Raster data array
        filename (str): Output file path
        **kwargs: Write options (crs, transform, compress, etc.)

    Returns:
        str: Path to written raster file
    """

def raster_info(filename):
    """
    Get raster file information.

    Args:
        filename (str): Path to raster file

    Returns:
        dict: Raster metadata (bands, CRS, bounds, etc.)
    """

def read_netcdf(filename, variables=None, **kwargs):
    """
    Read NetCDF file.

    Args:
        filename (str): Path to NetCDF file
        variables (list): Variables to read
        **kwargs: NetCDF read options (time, bbox, etc.)

    Returns:
        xarray.Dataset: NetCDF data as xarray Dataset
    """
```

### Vector Data I/O

Read, write, and convert vector data supporting multiple formats with attribute handling and spatial indexing.

```python { .api }
def read_vector(filename, **kwargs):
    """
    Read vector file into GeoDataFrame.

    Args:
        filename (str): Path to vector file
        **kwargs: Read options (bbox, rows, columns, etc.)

    Returns:
        gpd.GeoDataFrame: Vector data as GeoDataFrame
    """

def write_vector(gdf, filename, **kwargs):
    """
    Write GeoDataFrame to vector file.

    Args:
        gdf (gpd.GeoDataFrame): GeoDataFrame to write
        filename (str): Output file path
        **kwargs: Write options (driver, crs, etc.)

    Returns:
        str: Path to written vector file
    """

def vector_info(filename):
    """
    Get vector file information.

    Args:
        filename (str): Path to vector file

    Returns:
        dict: Vector metadata (geometry_type, CRS, bounds, feature_count, etc.)
    """

def read_lidar(filename, **kwargs):
    """
    Read LiDAR data file.

    Args:
        filename (str): Path to LiDAR file (.las, .laz)
        **kwargs: LiDAR read options (classification, returns, etc.)

    Returns:
        dict: LiDAR points and metadata
    """
```

### Data Format Conversion

Convert between different geospatial data formats with comprehensive format support and customizable conversion options.

```python { .api }
def csv_to_geojson(filename, lat='latitude', lon='longitude', **kwargs):
    """
    Convert CSV file to GeoJSON.

    Args:
        filename (str): Path to CSV file
        lat (str): Latitude column name
        lon (str): Longitude column name
        **kwargs: Conversion options (crs, properties, etc.)

    Returns:
        str: Path to GeoJSON file
    """

def csv_to_gdf(filename, lat='latitude', lon='longitude', **kwargs):
    """
    Convert CSV to GeoDataFrame.

    Args:
        filename (str): Path to CSV file
        lat (str): Latitude column name
        lon (str): Longitude column name
        **kwargs: Conversion options (crs, etc.)

    Returns:
        gpd.GeoDataFrame: CSV data as GeoDataFrame
    """

def csv_to_shp(filename, output, lat='latitude', lon='longitude', **kwargs):
    """
    Convert CSV to shapefile.

    Args:
        filename (str): Path to CSV file
        output (str): Output shapefile path
        lat (str): Latitude column name
        lon (str): Longitude column name
        **kwargs: Conversion options (crs, etc.)

    Returns:
        str: Path to created shapefile
    """

def geojson_to_gdf(filename, **kwargs):
    """
    Convert GeoJSON to GeoDataFrame.

    Args:
        filename (str): Path to GeoJSON file
        **kwargs: Conversion options

    Returns:
        gpd.GeoDataFrame: GeoJSON as GeoDataFrame
    """

def shp_to_geojson(filename, **kwargs):
    """
    Convert shapefile to GeoJSON.

    Args:
        filename (str): Path to shapefile
        **kwargs: Conversion options (crs, precision, etc.)

    Returns:
        str: Path to GeoJSON file
    """

def vector_to_geojson(filename, **kwargs):
    """
    Convert any vector format to GeoJSON.

    Args:
        filename (str): Path to vector file
        **kwargs: Conversion options

    Returns:
        str: Path to GeoJSON file
    """
```
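
As a rough illustration of what a CSV-to-GeoJSON conversion involves, the sketch below builds a `FeatureCollection` of points using only the standard library. `csv_rows_to_geojson` is a hypothetical helper, not the implementation of `csv_to_geojson`. It assumes the coordinate columns hold decimal degrees and keeps every other column as a string property.

```python
import csv
import io
import json


def csv_rows_to_geojson(csv_text, lat="latitude", lon="longitude"):
    """Build a GeoJSON FeatureCollection (as a dict) from CSV text.

    Hypothetical helper for illustration only.
    """
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # GeoJSON positions are ordered [longitude, latitude].
        coords = [float(row.pop(lon)), float(row.pop(lat))]
        features.append(
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": coords},
                "properties": row,  # remaining columns, kept as strings
            }
        )
    return {"type": "FeatureCollection", "features": features}


sample = "name,latitude,longitude\nDenver,39.74,-104.99\nOslo,59.91,10.75\n"
collection = csv_rows_to_geojson(sample)
geojson_text = json.dumps(collection, indent=2)
```

The coordinate order is the detail most often gotten wrong: CSV files usually list latitude first, while GeoJSON requires longitude first.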

### Image Processing and Conversion

Process and convert images with geospatial metadata handling and format optimization.

```python { .api }
def image_to_cog(filename, output=None, **kwargs):
    """
    Convert image to Cloud Optimized GeoTIFF.

    Args:
        filename (str): Input image path
        output (str): Output COG path
        **kwargs: COG creation options (compress, overview, blocksize, etc.)

    Returns:
        str: Path to COG file
    """

def numpy_to_cog(array, filename, **kwargs):
    """
    Convert NumPy array to COG.

    Args:
        array (numpy.ndarray): Input array
        filename (str): Output COG path
        **kwargs: COG options (crs, transform, nodata, etc.)

    Returns:
        str: Path to COG file
    """

def array_to_image(array, filename, **kwargs):
    """
    Convert NumPy array to image file.

    Args:
        array (numpy.ndarray): Input array
        filename (str): Output image path
        **kwargs: Image options (format, quality, etc.)

    Returns:
        str: Path to image file
    """

def images_to_gif(images, output, **kwargs):
    """
    Convert image sequence to animated GIF.

    Args:
        images (list): List of image file paths
        output (str): Output GIF path
        **kwargs: GIF options (duration, loop, optimize, etc.)

    Returns:
        str: Path to GIF file
    """
```

### Archive and Compression

Handle compressed archives and files with support for various compression formats.

```python { .api }
def extract_archive(filename, output_dir=None, **kwargs):
    """
    Extract compressed archive.

    Args:
        filename (str): Path to archive file (.zip, .tar.gz, etc.)
        output_dir (str): Extraction directory
        **kwargs: Extraction options (members, password, etc.)

    Returns:
        list: List of extracted file paths
    """

def create_archive(files, output, **kwargs):
    """
    Create compressed archive from files.

    Args:
        files (list): List of file paths to archive
        output (str): Output archive path
        **kwargs: Archive options (compression, password, etc.)

    Returns:
        str: Path to created archive
    """

def compress_file(filename, **kwargs):
    """
    Compress single file.

    Args:
        filename (str): File to compress
        **kwargs: Compression options (level, format, etc.)

    Returns:
        str: Path to compressed file
    """
```
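
For ZIP and TAR archives, the Python standard library already covers the common cases. The sketch below round-trips a directory through a ZIP archive with `shutil.make_archive` and `shutil.unpack_archive`. The file and directory names are placeholders, and the example makes no claim about how `create_archive` or `extract_archive` are implemented.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
src_dir = os.path.join(work, "src")
os.makedirs(src_dir)
with open(os.path.join(src_dir, "points.csv"), "w") as f:
    f.write("name,latitude,longitude\n")

# make_archive appends the extension itself and returns the archive path.
archive_path = shutil.make_archive(os.path.join(work, "bundle"), "zip", root_dir=src_dir)

# unpack_archive infers the format from the file extension.
out_dir = os.path.join(work, "extracted")
shutil.unpack_archive(archive_path, out_dir)
extracted = sorted(os.listdir(out_dir))
```

`shutil` handles ZIP and the TAR variants only; 7Z and RAR archives need third-party packages.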

### Metadata and Information

Extract and manage metadata from geospatial files with comprehensive format support.

```python { .api }
def get_metadata(filename):
    """
    Get comprehensive metadata from geospatial file.

    Args:
        filename (str): Path to geospatial file

    Returns:
        dict: Complete metadata including CRS, bounds, format info, etc.
    """

def get_crs(filename):
    """
    Get coordinate reference system from file.

    Args:
        filename (str): Path to geospatial file

    Returns:
        pyproj.CRS: Coordinate reference system
    """

def get_bounds(filename):
    """
    Get spatial bounds from file.

    Args:
        filename (str): Path to geospatial file

    Returns:
        list: Bounding box [minx, miny, maxx, maxy]
    """

def file_size(filename):
    """
    Get file size in human-readable format.

    Args:
        filename (str): Path to file

    Returns:
        str: File size (e.g., '15.3 MB')
    """
```
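
The human-readable string that `file_size` returns can be produced with a few lines of arithmetic. The helper below is a hypothetical sketch, not the library's code. It assumes binary units (1 KB = 1024 bytes) and one decimal place, which matches the `'15.3 MB'` example but may differ from the actual formatting rules.

```python
def format_size(num_bytes):
    """Format a byte count as a short human-readable string.

    Hypothetical helper; assumes binary units (1 KB = 1024 bytes).
    """
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024  # move up to the next unit
    return f"{size:.1f} TB"


label = format_size(16_043_213)  # roughly 15.3 * 1024 * 1024 bytes
```

To apply it to a file on disk, pass the result of `os.path.getsize(path)`.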

## Usage Examples

### Basic File I/O Operations

```python
import leafmap

# Download a file
leafmap.download_file(
    'https://example.com/data.tif',
    'local_data.tif'
)

# Read raster data
array, metadata = leafmap.read_raster('local_data.tif')

print(f"Raster shape: {array.shape}")
print(f"CRS: {metadata['crs']}")

# Read vector data
gdf = leafmap.read_vector('vector_data.shp')
print(f"Features: {len(gdf)}")
```

### Data Format Conversion Workflow

```python
import leafmap

# Convert CSV with coordinates to GeoJSON
leafmap.csv_to_geojson(
    'points.csv',
    lat='lat_column',
    lon='lon_column',
    output='points.geojson'
)

# Convert shapefile to GeoJSON
leafmap.shp_to_geojson(
    'boundaries.shp',
    output='boundaries.geojson'
)

# Convert to GeoDataFrame for analysis
gdf = leafmap.geojson_to_gdf('points.geojson')

# Process and save back
gdf_processed = gdf[gdf['value'] > 100]  # Filter
leafmap.write_vector(gdf_processed, 'filtered_points.gpkg')
```

### Image Processing Pipeline

```python
import leafmap
import numpy as np

# Read raster
array, metadata = leafmap.read_raster('input.tif')

# Process array (example: threshold). Cast the 0/1 mask to uint8 so it is
# written with a compact dtype instead of NumPy's default integer type.
processed = np.where(array > 0.5, 1, 0).astype(np.uint8)

# Convert to COG
leafmap.numpy_to_cog(
    processed,
    'processed.tif',
    crs=metadata['crs'],
    transform=metadata['transform'],
    compress='lzw'
)

# Create visualization
leafmap.array_to_image(
    processed,
    'visualization.png'
)
```

### Batch File Processing

```python
import glob

import leafmap
import pandas as pd

# Find all shapefiles
shapefiles = glob.glob('data/*.shp')

# Convert all to GeoJSON
geojson_files = []
for shp in shapefiles:
    geojson = leafmap.shp_to_geojson(shp)
    geojson_files.append(geojson)

print(f"Converted {len(geojson_files)} files")

# Combine all into a single GeoDataFrame. geopandas has no concat();
# pandas.concat keeps the GeoDataFrame type. It raises ValueError on an
# empty list, so this assumes at least one shapefile was found.
frames = [leafmap.geojson_to_gdf(path) for path in geojson_files]
combined = pd.concat(frames, ignore_index=True)

# Save combined data
leafmap.write_vector(combined, 'combined_data.gpkg')
```

### NetCDF Data Processing

```python
import leafmap

# Read NetCDF file
dataset = leafmap.read_netcdf(
    'climate_data.nc',
    variables=['temperature', 'precipitation']
)

print(f"Variables: {list(dataset.variables)}")
print(f"Time range: {dataset.time.min().values} to {dataset.time.max().values}")

# Extract single time slice
temp_slice = dataset['temperature'].isel(time=0)

# Convert to raster
leafmap.numpy_to_cog(
    temp_slice.values,
    'temperature.tif',
    crs='EPSG:4326',
    bounds=[-180, -90, 180, 90]
)
```

### LiDAR Data Processing

```python
import leafmap

# Read LiDAR file
lidar_data = leafmap.read_lidar(
    'pointcloud.las',
    classification=[2, 3, 4, 5]  # Ground, low/medium/high vegetation
)

print(f"Points: {len(lidar_data['points'])}")
print(f"Classifications: {set(lidar_data['classification'])}")

# Select ground points (class 2) as input for a DEM
ground_points = lidar_data['points'][lidar_data['classification'] == 2]

# Grid the points into a raster. process_points_to_grid is a user-supplied
# function, not part of leafmap; this example is simplified.
dem_array = process_points_to_grid(ground_points)

leafmap.numpy_to_cog(
    dem_array,
    'dem.tif',
    crs='EPSG:32612'  # WGS 84 / UTM zone 12N
)
```
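
The LiDAR example calls a custom `process_points_to_grid` function without defining it. One possible shape for it, shown here as a pure-Python sketch rather than a production gridding routine, bins points into square cells and averages the elevations in each cell. The signature, the `cell_size` parameter, and the `(x, y, z)` point layout are all assumptions.

```python
import math


def process_points_to_grid(points, cell_size=1.0):
    """Average z per square cell; return rows ordered north to south.

    Illustrative sketch: `points` is an iterable of (x, y, z) values and
    empty cells are filled with NaN.
    """
    points = list(points)
    min_x = min(p[0] for p in points)
    max_x = max(p[0] for p in points)
    min_y = min(p[1] for p in points)
    max_y = max(p[1] for p in points)
    n_cols = int((max_x - min_x) // cell_size) + 1
    n_rows = int((max_y - min_y) // cell_size) + 1

    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = int((x - min_x) // cell_size)
        row = int((max_y - y) // cell_size)  # row 0 is the northern edge
        sums[row][col] += z
        counts[row][col] += 1

    return [
        [s / c if c else math.nan for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]


ground = [(0.2, 0.1, 10.0), (0.4, 0.3, 12.0), (1.5, 1.7, 20.0)]
dem = process_points_to_grid(ground, cell_size=1.0)
```

The result is a nested list; wrap it with `np.array(...)` before handing it to a function that expects a NumPy array. Real DEM workflows usually also interpolate across empty cells, which this sketch leaves as NaN.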

## File Format Support

### Raster Formats
- **GeoTIFF** (.tif, .tiff): Standard geospatial raster format
- **COG**: Cloud Optimized GeoTIFF for web access
- **NetCDF** (.nc, .nc4): Climate and oceanographic data
- **HDF5** (.h5, .hdf5): Hierarchical scientific data
- **JPEG2000** (.jp2): Compressed imagery
- **PNG/JPEG** (.png, .jpg): Standard image formats
- **ENVI** (.img): Remote sensing format

### Vector Formats
- **Shapefile** (.shp): Standard vector format
- **GeoJSON** (.geojson, .json): Web-friendly vector format
- **GeoPackage** (.gpkg): Modern SQLite-based format
- **KML/KMZ** (.kml, .kmz): Google Earth format
- **GML** (.gml): Geography Markup Language
- **PostGIS**: PostgreSQL spatial extension
- **FileGDB** (.gdb): Esri geodatabase

### Specialized Formats
- **LAS/LAZ** (.las, .laz): LiDAR point cloud data
- **CSV** (.csv): Comma-separated values with coordinates
- **Excel** (.xlsx, .xls): Spreadsheet formats
- **Parquet** (.parquet): Columnar data format

### Compression and Archives
- **ZIP** (.zip): Standard compression
- **TAR** (.tar, .tar.gz, .tgz): Unix archive format
- **7Z** (.7z): High compression ratio
- **RAR** (.rar): WinRAR format
591
## Configuration Options
592
593
### Download Options
594
595
```python
596
download_options = {
597
'headers': {'User-Agent': 'leafmap'}, # Custom headers
598
'timeout': 300, # Timeout in seconds
599
'verify': True, # SSL verification
600
'chunk_size': 8192, # Download chunk size
601
'resume': True, # Resume interrupted downloads
602
'progress': True # Show progress bar
603
}
604
```
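
Option dictionaries like this one are convenient to keep as defaults and override per call. The sketch below shows that pattern with a stand-in function, `fake_download`, which only echoes what it receives. It is a placeholder for illustration; whether a given download function accepts every key shown is an assumption to check against the function you actually call.

```python
DEFAULT_DOWNLOAD_OPTIONS = {
    "timeout": 300,      # seconds
    "verify": True,      # SSL verification
    "chunk_size": 8192,  # bytes per read
}


def fake_download(url, filename=None, **kwargs):
    """Stand-in that returns the options it was called with."""
    return {"url": url, "filename": filename, "options": kwargs}


# Later keys win, so per-call overrides replace the defaults.
overrides = {"timeout": 30}
call = fake_download(
    "https://example.com/data.tif",
    "local_data.tif",
    **{**DEFAULT_DOWNLOAD_OPTIONS, **overrides},
)
```

Merging into a new dictionary leaves the defaults untouched, so they can be reused across calls.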

### Raster I/O Options

```python
raster_options = {
    'bands': [1, 2, 3],              # Specific bands to read
    'window': ((0, 512), (0, 512)),  # Spatial window
    'masked': True,                  # Return masked array
    'dtype': 'float32',              # Data type
    'compress': 'lzw',               # Compression method
    'tiled': True,                   # Tiled output
    'blockxsize': 512,               # Tile width
    'blockysize': 512                # Tile height
}
```

### Vector I/O Options

```python
vector_options = {
    'bbox': [-180, -90, 180, 90],  # Bounding box filter
    'rows': slice(0, 1000),        # Row slice
    'columns': ['name', 'value'],  # Column selection
    'driver': 'GeoJSON',           # Output driver
    'precision': 6,                # Coordinate precision
    'drop_z': True                 # Drop Z coordinates
}
```
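
To make the `bbox` option concrete, the sketch below filters GeoJSON point features against a `[minx, miny, maxx, maxy]` box in pure Python. `filter_by_bbox` is a hypothetical helper. It handles only `Point` geometries and treats the box edges as inclusive, both of which are assumptions rather than documented behavior.

```python
def filter_by_bbox(features, bbox):
    """Keep Point features whose coordinates fall inside bbox (inclusive)."""
    minx, miny, maxx, maxy = bbox
    kept = []
    for feature in features:
        geometry = feature["geometry"]
        if geometry["type"] != "Point":
            continue  # other geometry types are out of scope for this sketch
        x, y = geometry["coordinates"][:2]  # ignore an optional Z value
        if minx <= x <= maxx and miny <= y <= maxy:
            kept.append(feature)
    return kept


features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [-104.99, 39.74]},
        "properties": {"name": "Denver"},
    },
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [10.75, 59.91, 23.0]},
        "properties": {"name": "Oslo"},
    },
]
western = filter_by_bbox(features, [-180, -90, 0, 90])
```

Here `western` keeps only the Denver feature, since Oslo's longitude lies east of the box.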