Tessl Tile for pypi/mdanalysis@2.9.0

# File I/O and Format Support

MDAnalysis provides comprehensive support for reading and writing molecular structure and trajectory data across many file formats commonly used in molecular dynamics simulations.

## Overview

The I/O system in MDAnalysis is built around three main components:

- **Readers**: Read trajectory data with support for sequential and random access
- **Writers**: Write coordinate data to various output formats
- **Parsers**: Extract topology information from structure files

All I/O operations use a unified interface with automatic format detection based on file extensions.
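
To make extension-based detection concrete, here is a minimal standard-library sketch of the idea. The helper `guess_format_from_name` and the set `COMPRESSION_SUFFIXES` are hypothetical names introduced for illustration; they are not the MDAnalysis implementation, which may apply additional rules.

```python
import os

# Hypothetical names for illustration only; not part of the MDAnalysis API.
COMPRESSION_SUFFIXES = {".gz", ".bz2"}


def guess_format_from_name(filename):
    """Return an upper-case format label taken from the file extension."""
    root, ext = os.path.splitext(filename)
    if ext.lower() in COMPRESSION_SUFFIXES:
        # Look at the extension to the left of the compression suffix
        root, ext = os.path.splitext(root)
    if not ext:
        raise ValueError(f"cannot guess format of {filename!r}: no extension")
    return ext[1:].upper()


print(guess_format_from_name("trajectory.xtc"))   # XTC
print(guess_format_from_name("frames.xyz.bz2"))   # XYZ
```

A file without a recognizable extension is the case where an explicit `format` argument is needed.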

## Core I/O Functions

### Reader Function

```python { .api }
def reader(filename, format=None, **kwargs):
    """
    Get a trajectory reader for the specified file.

    Parameters
    ----------
    filename : str or file-like
        Path to trajectory file or file-like object.
    format : str, optional
        File format override. If None, format is guessed from file extension.
    **kwargs
        Additional arguments passed to format-specific reader.

    Returns
    -------
    ReaderBase
        Trajectory reader object appropriate for the file format.

    Examples
    --------
    >>> from MDAnalysis.coordinates import reader
    >>> traj = reader("trajectory.xtc")
    >>> for ts in traj:
    ...     print(f"Frame {ts.frame}, Time: {ts.time}")
    """
```

### Writer Function

```python { .api }
def Writer(filename, n_atoms=None, format=None, multiframe=None, **kwargs):
    """
    Create a trajectory writer for the specified file format.

    Parameters
    ----------
    filename : str or file-like
        Output filename or file-like object.
    n_atoms : int, optional
        Number of atoms in the system (required for some formats).
    format : str, optional
        Output format. If None, guessed from filename extension.
    multiframe : bool, optional
        Whether to write multiple frames (True) or a single snapshot (False).
        If None, a multi-frame writer is preferred when the format has one.
    bonds : str, optional
        Passed through **kwargs to PDB writers: how to write bond
        records ('conect', 'all', or None).
    **kwargs
        Additional format-specific arguments.

    Returns
    -------
    WriterBase
        Writer object for the specified format.

    Examples
    --------
    >>> import MDAnalysis as mda
    >>> u = mda.Universe("topol.tpr", "trajectory.xtc")
    >>> W = mda.Writer("output.xtc", n_atoms=u.atoms.n_atoms)
    >>> for ts in u.trajectory:
    ...     W.write(u.atoms)
    >>> W.close()

    >>> # Context manager usage
    >>> with mda.Writer("output.dcd", n_atoms=u.atoms.n_atoms) as W:
    ...     for ts in u.trajectory:
    ...         W.write(u.atoms)
    """
```

## Supported File Formats

### Structure Formats (Topology)

MDAnalysis supports reading topology information from these formats:

#### CHARMM Formats

```python { .api }
import MDAnalysis as mda

# PSF (CHARMM/NAMD Topology)
u = mda.Universe("system.psf", "trajectory.dcd")

# CRD (CHARMM Coordinate)
u = mda.Universe("coordinates.crd")
```

**Capabilities:**
- **PSF**: Complete topology with bonds, angles, dihedrals, impropers
- **CRD**: Coordinate data, limited topology information

#### GROMACS Formats

```python { .api }
# TPR (GROMACS Binary Topology)
u = mda.Universe("topol.tpr", "trajectory.xtc")

# GRO (GROMACS Structure)
u = mda.Universe("system.gro")

# TOP/ITP (GROMACS Text Topology) - limited support
# The .top extension is otherwise treated as an AMBER topology,
# so name the ITP format explicitly
u = mda.Universe("topol.top", topology_format="ITP")
```

**Capabilities:**
- **TPR**: Complete binary topology with all parameters
- **GRO**: Coordinates, atom names, residue information
- **TOP**: Basic connectivity (bonds only)

#### AMBER Formats

```python { .api }
# PRMTOP (AMBER Topology)
u = mda.Universe("system.prmtop", "trajectory.nc")

# INPCRD (AMBER Coordinate); coordinates only, so a topology is still required
u = mda.Universe("system.prmtop", "coordinates.inpcrd")
```

**Capabilities:**
- **PRMTOP**: Complete topology with force field parameters
- **INPCRD**: Coordinates for a single frame

#### Standard Formats

```python { .api }
# PDB (Protein Data Bank)
u = mda.Universe("structure.pdb")

# PQR (PDB with Charges and Radii)
u = mda.Universe("system.pqr")

# MOL2 (Tripos Molecular Structure)
u = mda.Universe("molecule.mol2")

# PDBQT (AutoDock format)
u = mda.Universe("protein.pdbqt")
```

### Trajectory Formats

#### Binary Trajectory Formats

```python { .api }
# DCD (CHARMM/NAMD/LAMMPS)
u = mda.Universe("topology.psf", "trajectory.dcd")

# XTC (GROMACS Compressed)
u = mda.Universe("topol.tpr", "trajectory.xtc")

# TRR (GROMACS Full Precision)
u = mda.Universe("topol.tpr", "trajectory.trr")

# TNG (Trajectory Next Generation); needs the optional pytng package
u = mda.Universe("topol.tpr", "trajectory.tng")

# NetCDF (AMBER NetCDF)
u = mda.Universe("system.prmtop", "trajectory.nc")
```

#### Text Trajectory Formats

```python { .api }
# XYZ (Generic Coordinate)
u = mda.Universe("trajectory.xyz")

# LAMMPS data file plus dump trajectory; the dump format is named
# explicitly because it is not guessed from a .lammpstrj extension
u = mda.Universe("system.data", "dump.lammpstrj", format="LAMMPSDUMP")

# AMBER ASCII Trajectory (the file needs a recognized extension such as .mdcrd)
u = mda.Universe("system.prmtop", "trajectory.mdcrd")
```

## Reader Base Classes

### ReaderBase

```python { .api }
class ReaderBase:
    """
    Base class for trajectory readers supporting multiple frames.
    """

    @property
    def n_frames(self):
        """
        Total number of frames in trajectory.

        Returns
        -------
        int
            Number of trajectory frames.
        """

    @property
    def dt(self):
        """
        Time step between frames.

        Returns
        -------
        float
            Time step in picoseconds.
        """

    @property
    def totaltime(self):
        """
        Total simulation time span.

        Returns
        -------
        float
            Total time covered by trajectory in picoseconds.
        """

    def __iter__(self):
        """
        Iterate through all frames in trajectory.

        Yields
        ------
        Timestep
            Timestep object for each frame.

        Examples
        --------
        >>> for ts in u.trajectory:
        ...     print(f"Time: {ts.time}, Frame: {ts.frame}")
        """

    def __getitem__(self, frame):
        """
        Access specific frame(s) by index.

        Parameters
        ----------
        frame : int or slice
            Frame index or slice object.

        Returns
        -------
        Timestep or iterator
            Timestep object for an integer index; for a slice, an
            iterable over the selected frames.

        Examples
        --------
        >>> ts = u.trajectory[0]      # First frame
        >>> ts = u.trajectory[-1]     # Last frame
        >>> for ts in u.trajectory[10:20:2]:  # Slice with step
        ...     print(ts.frame)
        """

    def next(self):
        """
        Advance to next frame.

        Returns
        -------
        Timestep
            Timestep object for next frame.
        """

    def rewind(self):
        """
        Return to first frame of trajectory.

        Examples
        --------
        >>> u.trajectory.rewind()
        >>> assert u.trajectory.frame == 0
        """

    def close(self):
        """
        Close trajectory file and free resources.
        """
```
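
The indexing rules above follow ordinary Python sequence semantics, which the following standard-library sketch spells out. `frames_for_index` and `total_time` are hypothetical helpers written for this illustration, and `total_time` assumes evenly spaced frames so that the span is `(n_frames - 1) * dt`.

```python
def frames_for_index(index, n_frames):
    """Return the frame numbers that an int or slice index refers to."""
    if isinstance(index, slice):
        # slice.indices() resolves None, negative values, and the step
        return list(range(*index.indices(n_frames)))
    if index < 0:
        index += n_frames  # negative indices count from the end
    if not 0 <= index < n_frames:
        raise IndexError(f"frame {index} out of range for {n_frames} frames")
    return [index]


def total_time(n_frames, dt):
    """Time between the first and last frame, in the units of dt."""
    return (n_frames - 1) * dt


print(frames_for_index(slice(10, 20, 2), 100))  # [10, 12, 14, 16, 18]
print(frames_for_index(-1, 100))                # [99]
print(total_time(101, 2.0))                     # 200.0
```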

### SingleFrameReaderBase

```python { .api }
class SingleFrameReaderBase:
    """
    Base class for single-frame coordinate readers (e.g., PDB, GRO).
    """

    @property
    def n_frames(self):
        """
        Always returns 1 for single-frame readers.

        Returns
        -------
        int
            Always 1.
        """
```

## Writer Base Classes

### WriterBase

```python { .api }
class WriterBase:
    """
    Base class for coordinate writers.
    """

    def __init__(self, filename, n_atoms, **kwargs):
        """
        Initialize coordinate writer.

        Parameters
        ----------
        filename : str
            Output filename.
        n_atoms : int
            Number of atoms to write.
        **kwargs
            Format-specific arguments.
        """

    def write(self, obj):
        """
        Write the current coordinates of the selected atoms.

        Parameters
        ----------
        obj : AtomGroup or Universe
            Atoms to write to file. The coordinates are taken from
            the frame that is currently loaded.

        Examples
        --------
        >>> with Writer("output.pdb", n_atoms=protein.n_atoms) as W:
        ...     for ts in u.trajectory:
        ...         W.write(protein)
        """

    def close(self):
        """
        Close output file and finalize writing.
        """

    def __enter__(self):
        """
        Context manager entry.

        Returns
        -------
        WriterBase
            Self for context manager usage.
        """

    def __exit__(self, exc_type, exc_val, exc_tb):
        """
        Context manager exit, automatically closes file.
        """
```
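
The context manager behaviour described above can be illustrated with a toy class. This is a sketch of the protocol only: `SketchWriter` is a hypothetical stand-in that stores frames in a list rather than a file, and it is not how any MDAnalysis writer is implemented.

```python
class SketchWriter:
    """Toy writer (hypothetical) that records frames in a list."""

    def __init__(self, n_atoms):
        self.n_atoms = n_atoms
        self.frames = []
        self.closed = False

    def write(self, positions):
        if self.closed:
            raise ValueError("write to closed writer")
        if len(positions) != self.n_atoms:
            raise ValueError(
                f"expected {self.n_atoms} atoms, got {len(positions)}"
            )
        self.frames.append(list(positions))

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()   # runs on normal exit and when an exception escapes
        return False   # do not swallow exceptions


with SketchWriter(n_atoms=2) as W:
    W.write([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])

print(W.closed, len(W.frames))  # True 1
```

Because `__exit__` always calls `close()`, the output is finalized even if the loop body raises, which is the main reason to prefer the `with` form over a manual `close()`.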

## Timestep Class

```python { .api }
class Timestep:
    """
    Container for coordinate data from a single trajectory frame.
    """

    def __init__(self, n_atoms, **kwargs):
        """
        Create timestep for specified number of atoms.

        Parameters
        ----------
        n_atoms : int
            Number of atoms in the system.
        positions : bool, optional
            Whether to allocate position array (default True).
        velocities : bool, optional
            Whether to allocate velocity array (default False).
        forces : bool, optional
            Whether to allocate force array (default False).
        """

    @property
    def positions(self):
        """
        Atomic coordinates for current frame.

        Returns
        -------
        numpy.ndarray
            Array of shape (n_atoms, 3) with atomic coordinates.
        """

    @property
    def velocities(self):
        """
        Atomic velocities for current frame.

        Returns
        -------
        numpy.ndarray
            Array of shape (n_atoms, 3) with velocities.

        Raises
        ------
        NoDataError
            If the frame has no velocity information.
        """

    @property
    def forces(self):
        """
        Atomic forces for current frame.

        Returns
        -------
        numpy.ndarray
            Array of shape (n_atoms, 3) with forces.

        Raises
        ------
        NoDataError
            If the frame has no force information.
        """

    @property
    def dimensions(self):
        """
        Unit cell dimensions.

        Returns
        -------
        numpy.ndarray or None
            Array [a, b, c, alpha, beta, gamma] with box lengths in
            Angstroms and angles in degrees.
        """

    @property
    def volume(self):
        """
        Unit cell volume.

        Returns
        -------
        float or None
            Volume in cubic Angstroms, None if no box information.
        """

    @property
    def time(self):
        """
        Simulation time for this frame.

        Returns
        -------
        float
            Time in picoseconds.
        """

    @property
    def frame(self):
        """
        Frame number in trajectory.

        Returns
        -------
        int
            Zero-based frame index.
        """

    def copy(self):
        """
        Create independent copy of timestep.

        Returns
        -------
        Timestep
            Deep copy of timestep with independent arrays.
        """
```
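
For reference, the relationship between `dimensions` and `volume` can be sketched with the standard formula for the volume of a triclinic cell. The function name `triclinic_volume` is hypothetical and chosen for this illustration; it assumes lengths in Angstroms and angles in degrees, as in the `dimensions` array above.

```python
import math


def triclinic_volume(dimensions):
    """Volume of a unit cell given [a, b, c, alpha, beta, gamma].

    Lengths are in Angstroms and angles in degrees, so the result
    is in cubic Angstroms.
    """
    a, b, c, alpha, beta, gamma = dimensions
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha, beta, gamma)
    )
    factor = (
        1.0
        - cos_a * cos_a
        - cos_b * cos_b
        - cos_g * cos_g
        + 2.0 * cos_a * cos_b * cos_g
    )
    if factor <= 0.0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(factor)


# An orthorhombic box reduces to the product of its edge lengths
print(triclinic_volume([10.0, 20.0, 30.0, 90.0, 90.0, 90.0]))
```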

## Format-Specific Features

### GROMACS XTC/TRR

```python { .api }
# XTC compressed trajectories
u = mda.Universe("topol.tpr", "trajectory.xtc")

# Precision is a property of XTC compression; whether the reader exposes it
# as an attribute may depend on the version, so read it defensively
print(f"XTC precision: {getattr(u.trajectory, 'precision', 'not exposed')}")

# TRR full precision with velocities/forces
u = mda.Universe("topol.tpr", "trajectory.trr")
if u.trajectory.ts.has_velocities:
    velocities = u.trajectory.ts.velocities
```

### CHARMM/NAMD DCD

```python { .api }
u = mda.Universe("system.psf", "trajectory.dcd")

# DCD supports fixed atoms
if hasattr(u.trajectory, 'fixed'):
    fixed_atoms = u.trajectory.fixed

# Periodic boundary information
dimensions = u.trajectory.ts.dimensions
```

### AMBER NetCDF

```python { .api }
u = mda.Universe("system.prmtop", "trajectory.nc")

# NetCDF trajectories carry metadata; the attribute names on the reader
# are not guaranteed, so read them defensively
print(f"NetCDF conventions: {getattr(u.trajectory, 'Conventions', 'not exposed')}")
print(f"Application: {getattr(u.trajectory, 'application', 'not exposed')}")
```

## I/O Usage Patterns

### Reading Multiple Trajectories

```python { .api }
# Concatenate multiple trajectory files
u = mda.Universe("topology.psf", "part1.dcd", "part2.dcd", "part3.dcd")

# All files treated as continuous trajectory
print(f"Total frames: {u.trajectory.n_frames}")

# Or attach the trajectories after creating the Universe.
# load_new() replaces the current trajectory, so pass all parts in one call
u = mda.Universe("topology.psf")
u.load_new(["part1.dcd", "part2.dcd", "part3.dcd"])
```

### Writing Trajectories

```python { .api }
# Write subset of atoms
protein = u.select_atoms("protein")

with mda.Writer("protein_only.xtc", n_atoms=protein.n_atoms) as W:
    for ts in u.trajectory:
        W.write(protein)

# Write specific frames
with mda.Writer("every_10th.dcd", n_atoms=u.atoms.n_atoms) as W:
    for ts in u.trajectory[::10]:  # Every 10th frame
        W.write(u.atoms)

# Single frame output
u.atoms.write("final_frame.pdb")  # Current frame
u.trajectory[-1]  # Go to last frame
u.atoms.write("last_frame.gro")
```

### Memory-Efficient Processing

```python { .api }
# Process large trajectories in chunks
def process_in_chunks(universe, chunk_size=1000):
    n_frames = universe.trajectory.n_frames

    for start in range(0, n_frames, chunk_size):
        end = min(start + chunk_size, n_frames)

        # Stream only this chunk from disk. Calling transfer_to_memory() here
        # would replace the trajectory with the in-memory chunk and
        # invalidate the frame indices used by later chunks.
        for ts in universe.trajectory[start:end]:
            # Perform analysis
            pass
```
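
The chunk boundaries used in this pattern can be checked in isolation. The generator below is a small standard-library sketch; `chunk_ranges` is a hypothetical helper name, and the half-open `(start, stop)` pairs it yields match Python slice conventions.

```python
def chunk_ranges(n_frames, chunk_size):
    """Yield half-open (start, stop) frame ranges that cover n_frames."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_frames, chunk_size):
        # The last chunk is shortened so it never runs past the trajectory
        yield start, min(start + chunk_size, n_frames)


print(list(chunk_ranges(2500, 1000)))
# [(0, 1000), (1000, 2000), (2000, 2500)]
```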

### Format Conversion

```python { .api }
import MDAnalysis as mda


def convert_trajectory(input_files, output_file, selection="all"):
    """
    Convert trajectory between formats.

    Parameters
    ----------
    input_files : tuple
        (topology, trajectory) file paths.
    output_file : str
        Output trajectory file.
    selection : str, optional
        Atom selection to write (default "all").
    """
    u = mda.Universe(*input_files)
    atoms = u.select_atoms(selection)

    with mda.Writer(output_file, n_atoms=atoms.n_atoms) as W:
        for ts in u.trajectory:
            W.write(atoms)

# Example: Convert AMBER to GROMACS
convert_trajectory(("system.prmtop", "trajectory.nc"), "output.xtc")

# Example: Extract protein only
convert_trajectory(("system.psf", "trajectory.dcd"), "protein.xtc", "protein")
```

### Handling File Streams

```python { .api }
from io import StringIO

import MDAnalysis as mda
from MDAnalysis.lib.util import NamedStream

# Compressed text formats: the format is guessed from the extension
# before .gz/.bz2 (e.g. XYZ, PDB, GRO)
u = mda.Universe("trajectory.xyz.bz2")

# Binary formats such as XTC, TRR, and DCD must be decompressed on disk
# before they are read
u = mda.Universe("topology.tpr", "traj1.xtc", "traj2.xtc")

# In-memory data: wrap a text stream in NamedStream so the format
# can be guessed from the name given to the stream
with open("structure.pdb") as f:
    stream = NamedStream(StringIO(f.read()), "structure.pdb")
u = mda.Universe(stream)
```

## Error Handling

```python { .api }
import MDAnalysis as mda
from MDAnalysis.exceptions import NoDataError

try:
    u = mda.Universe("topology.psf", "trajectory.dcd")
except FileNotFoundError:
    print("Trajectory file not found")
    raise  # u is undefined past this point, so do not continue
except NoDataError as e:
    print(f"Missing required data: {e}")
    raise

# Check for optional data
if u.trajectory.ts.has_velocities:
    velocities = u.atoms.velocities
else:
    print("No velocity data available")

# Validate trajectory compatibility
if u.atoms.n_atoms != u.trajectory.n_atoms:
    raise ValueError("Atom count mismatch between topology and trajectory")
```
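
A complementary approach is to check the input paths before constructing a `Universe`, so that the error message can name every missing file at once. The sketch below uses only the standard library; `missing_inputs` is a hypothetical helper, not an MDAnalysis function.

```python
from pathlib import Path


def missing_inputs(*paths):
    """Return the given paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


missing = missing_inputs("topology.psf", "trajectory.dcd")
if missing:
    print(f"Missing input files: {', '.join(missing)}")
```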

## Performance Considerations

### Memory Usage

```python { .api }
# Load trajectory into memory for repeated access
u.transfer_to_memory()  # Load all frames

# Partial loading for large trajectories
u.transfer_to_memory(start=0, stop=1000, step=10)  # Every 10th frame

# Memory-efficient single pass
for ts in u.trajectory:  # Streaming access
    # Process frame immediately
    pass
```
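
Before loading frames into memory it can help to estimate the footprint. The sketch below is a rough lower bound under stated assumptions: coordinates are stored as 32-bit floats (4 bytes per value), only positions are kept, and `in_memory_size_bytes` is a hypothetical helper name.

```python
def in_memory_size_bytes(n_atoms, n_frames, start=0, stop=None, step=1,
                         bytes_per_value=4):
    """Estimate the bytes needed to hold positions for the selected frames.

    Assumes three coordinates per atom and float32 storage by default.
    """
    stop = n_frames if stop is None else min(stop, n_frames)
    n_loaded = len(range(start, stop, step))  # frames actually kept
    return n_loaded * n_atoms * 3 * bytes_per_value


# 1000 frames of a 100,000-atom system
print(in_memory_size_bytes(100_000, 1000))  # 1200000000

# Every 10th frame of the first 1000 frames of a longer trajectory
print(in_memory_size_bytes(100_000, 5000, start=0, stop=1000, step=10))  # 120000000
```

Velocities or forces, if loaded, would add the same amount again for each array.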

### Random Access Performance

```python { .api }
# Efficient for formats with frame offsets or fixed-size frames (XTC, TRR, DCD, NetCDF)
u.trajectory[1000]  # Direct access to frame 1000

# Less efficient for text formats that must be scanned (e.g. XYZ, AMBER ASCII)
# Consider loading into memory for random access
if u.trajectory.n_frames < 10000:  # Small enough for memory
    u.transfer_to_memory()

# Then random access is fast
u.trajectory[1000]
```
