# RDD Conversions

The ADAM API provides converter classes that implement the `Function2` interface for converting between genomic RDD types. Each converter pairs a source genomic RDD, which supplies the metadata, with an RDD of target records and returns a genomic RDD of the target type, so genomic metadata is preserved across the conversion.

## Capabilities

### Base Conversion Interface

All RDD converters implement the `Function2` interface for integration with Spark transformations.

```java { .api }
/**
 * Base trait for same-type RDD conversions
 * @param <T> The record type (e.g., AlignmentRecord, Variant)
 * @param <U> The RDD wrapper type (e.g., AlignmentRecordRDD, VariantRDD)
 */
interface SameTypeConversion<T, U extends GenomicRDD<T, U>> extends Function2<U, RDD<T>, U> {
    /**
     * Convert source RDD to target RDD type
     * @param v1 Source genomic RDD containing metadata
     * @param v2 Target RDD data
     * @return Converted genomic RDD with preserved metadata
     */
    U call(U v1, RDD<T> v2);
}
```
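To make the contract concrete, the following sketch mimics a same-type conversion with plain Java collections. Everything in it is a stand-in invented for illustration: `Function2` here is a local interface rather than Spark's, and `Reads` and `ReadsToReads` are hypothetical classes, not ADAM types. It only shows the shape of the contract described above: records come from `v2`, metadata comes from `v1`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative stand-ins only: these are NOT the ADAM or Spark classes.
class SameTypeConversionSketch {

    /** Stand-in for Spark's two-argument function interface. */
    interface Function2<A, B, R> {
        R call(A v1, B v2);
    }

    /** Stand-in for a genomic RDD wrapper: record data plus metadata. */
    static final class Reads {
        final List<Integer> mapqs;        // stand-in for the record data
        final String sequenceDictionary;  // stand-in for genomic metadata

        Reads(List<Integer> mapqs, String sequenceDictionary) {
            this.mapqs = mapqs;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    /** Same-type conversion: records come from v2, metadata is copied from v1. */
    static final class ReadsToReads implements Function2<Reads, List<Integer>, Reads> {
        @Override
        public Reads call(Reads v1, List<Integer> v2) {
            return new Reads(v2, v1.sequenceDictionary);
        }
    }

    public static void main(String[] args) {
        Reads original = new Reads(Arrays.asList(10, 42, 60), "GRCh38");

        // Transform the raw records outside the wrapper.
        List<Integer> filtered = original.mapqs.stream()
            .filter(q -> q > 30)
            .collect(Collectors.toList());

        // Re-attach the original metadata to the transformed records.
        Reads result = new ReadsToReads().call(original, filtered);
        System.out.println(result.mapqs + " " + result.sequenceDictionary);
    }
}
```

Running `main` prints `[42, 60] GRCh38`: the filtered records are new, while the metadata is the one supplied with the source.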

### Contig Fragment Converters

Convert nucleotide contig fragments (reference sequences) to other genomic data types.

```java { .api }
// Same-type conversion for contig fragments
class ContigsToContigsConverter implements SameTypeConversion<NucleotideContigFragment, NucleotideContigFragmentRDD> {}

// Convert contigs to other data types
class ContigsToCoverageConverter implements Function2<NucleotideContigFragmentRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(NucleotideContigFragmentRDD v1, RDD<Coverage> v2);
}

class ContigsToFeaturesConverter implements Function2<NucleotideContigFragmentRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(NucleotideContigFragmentRDD v1, RDD<Feature> v2);
}

class ContigsToFragmentsConverter implements Function2<NucleotideContigFragmentRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(NucleotideContigFragmentRDD v1, RDD<Fragment> v2);
}

class ContigsToAlignmentRecordsConverter implements Function2<NucleotideContigFragmentRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(NucleotideContigFragmentRDD v1, RDD<AlignmentRecord> v2);
}

class ContigsToGenotypesConverter implements Function2<NucleotideContigFragmentRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(NucleotideContigFragmentRDD v1, RDD<Genotype> v2);
}

class ContigsToVariantsConverter implements Function2<NucleotideContigFragmentRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(NucleotideContigFragmentRDD v1, RDD<Variant> v2);
}

class ContigsToVariantContextsConverter implements Function2<NucleotideContigFragmentRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(NucleotideContigFragmentRDD v1, RDD<VariantContext> v2);
}
```
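As the signatures above suggest, a cross-type converter receives the target records as its second argument, so the caller is the one who derives them from the source. The sketch below illustrates that division of labour with plain Java collections. All names in it are hypothetical stand-ins (`Function2` is a local interface, and `Contigs`, `Features`, and `ContigsToFeatures` are not ADAM classes); it is an assumption-laden illustration, not ADAM's implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative stand-ins only: these are NOT the ADAM or Spark classes.
class CrossTypeConversionSketch {

    /** Stand-in for Spark's two-argument function interface. */
    interface Function2<A, B, R> {
        R call(A v1, B v2);
    }

    /** Stand-in for a contig fragment dataset: sequences plus metadata. */
    static final class Contigs {
        final List<String> sequences;
        final String sequenceDictionary;

        Contigs(List<String> sequences, String sequenceDictionary) {
            this.sequences = sequences;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    /** Stand-in for a feature dataset: feature labels plus metadata. */
    static final class Features {
        final List<String> labels;
        final String sequenceDictionary;

        Features(List<String> labels, String sequenceDictionary) {
            this.labels = labels;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    /** Cross-type conversion: wraps caller-supplied records with the source's metadata. */
    static final class ContigsToFeatures implements Function2<Contigs, List<String>, Features> {
        @Override
        public Features call(Contigs v1, List<String> v2) {
            return new Features(v2, v1.sequenceDictionary);
        }
    }

    public static void main(String[] args) {
        Contigs contigs = new Contigs(Arrays.asList("ACGT", "GGCATA"), "GRCh38");

        // The caller derives the target records from the source records.
        List<String> labels = contigs.sequences.stream()
            .map(s -> "len=" + s.length())
            .collect(Collectors.toList());

        // The converter only changes the wrapper type and carries the metadata over.
        Features features = new ContigsToFeatures().call(contigs, labels);
        System.out.println(features.labels + " " + features.sequenceDictionary);
    }
}
```

Running `main` prints `[len=4, len=6] GRCh38`.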

### Coverage Converters

Convert coverage data to other genomic data types.

```java { .api }
// Same-type conversion for coverage
class CoverageToCoverageConverter implements SameTypeConversion<Coverage, CoverageRDD> {}

// Convert coverage to other data types
class CoverageToContigsConverter implements Function2<CoverageRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(CoverageRDD v1, RDD<NucleotideContigFragment> v2);
}

class CoverageToFeaturesConverter implements Function2<CoverageRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(CoverageRDD v1, RDD<Feature> v2);
}

class CoverageToFragmentsConverter implements Function2<CoverageRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(CoverageRDD v1, RDD<Fragment> v2);
}

class CoverageToAlignmentRecordsConverter implements Function2<CoverageRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(CoverageRDD v1, RDD<AlignmentRecord> v2);
}

class CoverageToGenotypesConverter implements Function2<CoverageRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(CoverageRDD v1, RDD<Genotype> v2);
}

class CoverageToVariantsConverter implements Function2<CoverageRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(CoverageRDD v1, RDD<Variant> v2);
}

class CoverageToVariantContextConverter implements Function2<CoverageRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(CoverageRDD v1, RDD<VariantContext> v2);
}
```

### Feature Converters

Convert genomic feature data to other genomic data types.

```java { .api }
// Same-type conversion for features
class FeaturesToFeatureConverter implements SameTypeConversion<Feature, FeatureRDD> {}

// Convert features to other data types
class FeaturesToContigsConverter implements Function2<FeatureRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(FeatureRDD v1, RDD<NucleotideContigFragment> v2);
}

class FeaturesToCoverageConverter implements Function2<FeatureRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(FeatureRDD v1, RDD<Coverage> v2);
}

class FeaturesToFragmentsConverter implements Function2<FeatureRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(FeatureRDD v1, RDD<Fragment> v2);
}

class FeaturesToAlignmentRecordsConverter implements Function2<FeatureRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(FeatureRDD v1, RDD<AlignmentRecord> v2);
}

class FeaturesToGenotypesConverter implements Function2<FeatureRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(FeatureRDD v1, RDD<Genotype> v2);
}

class FeaturesToVariantsConverter implements Function2<FeatureRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(FeatureRDD v1, RDD<Variant> v2);
}

class FeaturesToVariantContextConverter implements Function2<FeatureRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(FeatureRDD v1, RDD<VariantContext> v2);
}
```

### Fragment Converters

Convert sequencing fragment data to other genomic data types.

```java { .api }
// Same-type conversion for fragments
class FragmentsToFragmentConverter implements SameTypeConversion<Fragment, FragmentRDD> {}

// Convert fragments to other data types
class FragmentsToContigsConverter implements Function2<FragmentRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(FragmentRDD v1, RDD<NucleotideContigFragment> v2);
}

class FragmentsToCoverageConverter implements Function2<FragmentRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(FragmentRDD v1, RDD<Coverage> v2);
}

class FragmentsToFeaturesConverter implements Function2<FragmentRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(FragmentRDD v1, RDD<Feature> v2);
}

class FragmentsToAlignmentRecordsConverter implements Function2<FragmentRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(FragmentRDD v1, RDD<AlignmentRecord> v2);
}

class FragmentsToGenotypesConverter implements Function2<FragmentRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(FragmentRDD v1, RDD<Genotype> v2);
}

class FragmentsToVariantsConverter implements Function2<FragmentRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(FragmentRDD v1, RDD<Variant> v2);
}

class FragmentsToVariantContextConverter implements Function2<FragmentRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(FragmentRDD v1, RDD<VariantContext> v2);
}
```

### Alignment Record Converters

Convert alignment record data to other genomic data types.

```java { .api }
// Same-type conversion for alignment records
class AlignmentRecordsToAlignmentRecordsConverter implements SameTypeConversion<AlignmentRecord, AlignmentRecordRDD> {}

// Convert alignment records to other data types
class AlignmentRecordsToContigsConverter implements Function2<AlignmentRecordRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(AlignmentRecordRDD v1, RDD<NucleotideContigFragment> v2);
}

class AlignmentRecordsToCoverageConverter implements Function2<AlignmentRecordRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(AlignmentRecordRDD v1, RDD<Coverage> v2);
}

class AlignmentRecordsToFeaturesConverter implements Function2<AlignmentRecordRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(AlignmentRecordRDD v1, RDD<Feature> v2);
}

class AlignmentRecordsToFragmentsConverter implements Function2<AlignmentRecordRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(AlignmentRecordRDD v1, RDD<Fragment> v2);
}

class AlignmentRecordsToGenotypesConverter implements Function2<AlignmentRecordRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(AlignmentRecordRDD v1, RDD<Genotype> v2);
}

class AlignmentRecordsToVariantsConverter implements Function2<AlignmentRecordRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(AlignmentRecordRDD v1, RDD<Variant> v2);
}

class AlignmentRecordsToVariantContextConverter implements Function2<AlignmentRecordRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(AlignmentRecordRDD v1, RDD<VariantContext> v2);
}
```

### Genotype Converters

Convert genotype data to other genomic data types.

```java { .api }
// Same-type conversion for genotypes
class GenotypesToGenotypesConverter implements SameTypeConversion<Genotype, GenotypeRDD> {}

// Convert genotypes to other data types
class GenotypesToContigsConverter implements Function2<GenotypeRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(GenotypeRDD v1, RDD<NucleotideContigFragment> v2);
}

class GenotypesToCoverageConverter implements Function2<GenotypeRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(GenotypeRDD v1, RDD<Coverage> v2);
}

class GenotypesToFeaturesConverter implements Function2<GenotypeRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(GenotypeRDD v1, RDD<Feature> v2);
}

class GenotypesToFragmentsConverter implements Function2<GenotypeRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(GenotypeRDD v1, RDD<Fragment> v2);
}

class GenotypesToAlignmentRecordsConverter implements Function2<GenotypeRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(GenotypeRDD v1, RDD<AlignmentRecord> v2);
}

class GenotypesToVariantsConverter implements Function2<GenotypeRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(GenotypeRDD v1, RDD<Variant> v2);
}

class GenotypesToVariantContextConverter implements Function2<GenotypeRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(GenotypeRDD v1, RDD<VariantContext> v2);
}
```

### Variant Converters

Convert variant data to other genomic data types.

```java { .api }
// Same-type conversion for variants
class VariantsToVariantsConverter implements SameTypeConversion<Variant, VariantRDD> {}

// Convert variants to other data types
class VariantsToContigsConverter implements Function2<VariantRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(VariantRDD v1, RDD<NucleotideContigFragment> v2);
}

class VariantsToCoverageConverter implements Function2<VariantRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(VariantRDD v1, RDD<Coverage> v2);
}

class VariantsToFeaturesConverter implements Function2<VariantRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(VariantRDD v1, RDD<Feature> v2);
}

class VariantsToFragmentsConverter implements Function2<VariantRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(VariantRDD v1, RDD<Fragment> v2);
}

class VariantsToAlignmentRecordsConverter implements Function2<VariantRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(VariantRDD v1, RDD<AlignmentRecord> v2);
}

class VariantsToGenotypesConverter implements Function2<VariantRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(VariantRDD v1, RDD<Genotype> v2);
}

class VariantsToVariantContextConverter implements Function2<VariantRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(VariantRDD v1, RDD<VariantContext> v2);
}
```

### Variant Context Converters

Convert rich variant context data to other genomic data types.

```java { .api }
// Same-type conversion for variant contexts
class VariantContextsToVariantContextConverter implements SameTypeConversion<VariantContext, VariantContextRDD> {}

// Convert variant contexts to other data types
class VariantContextsToContigsConverter implements Function2<VariantContextRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(VariantContextRDD v1, RDD<NucleotideContigFragment> v2);
}

class VariantContextsToCoverageConverter implements Function2<VariantContextRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(VariantContextRDD v1, RDD<Coverage> v2);
}

class VariantContextsToFeaturesConverter implements Function2<VariantContextRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(VariantContextRDD v1, RDD<Feature> v2);
}

class VariantContextsToFragmentsConverter implements Function2<VariantContextRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(VariantContextRDD v1, RDD<Fragment> v2);
}

class VariantContextsToAlignmentRecordsConverter implements Function2<VariantContextRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(VariantContextRDD v1, RDD<AlignmentRecord> v2);
}

class VariantContextsToGenotypesConverter implements Function2<VariantContextRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(VariantContextRDD v1, RDD<Genotype> v2);
}

class VariantContextsToVariantsConverter implements Function2<VariantContextRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(VariantContextRDD v1, RDD<Variant> v2);
}
```

## Usage Examples

**Basic RDD type conversion:**

```java
import org.bdgenomics.adam.api.java.*;

// alignmentRDD, variantRDD, and genotypeRDD are assumed to be loaded elsewhere.
// call() expects a Scala RDD, so pass rdd() rather than the JavaRDD from jrdd().

// Wrap the variant records as a VariantRDD, taking metadata from alignmentRDD
AlignmentRecordsToVariantsConverter converter = new AlignmentRecordsToVariantsConverter();
VariantRDD variants = converter.call(alignmentRDD, variantRDD.rdd());

// Wrap the genotype records as a GenotypeRDD, taking metadata from variants
VariantsToGenotypesConverter varToGeno = new VariantsToGenotypesConverter();
GenotypeRDD genotypes = varToGeno.call(variants, genotypeRDD.rdd());
```

**Same-type conversions for data manipulation:**

```java
import org.apache.spark.api.java.JavaRDD;

// Same-type conversion preserving metadata
AlignmentRecordsToAlignmentRecordsConverter sameConverter =
    new AlignmentRecordsToAlignmentRecordsConverter();

// Apply an RDD transformation; filtering a JavaRDD yields a JavaRDD.
// getMapq() returns a nullable Integer, so check for null before comparing.
JavaRDD<AlignmentRecord> filteredRDD = alignmentRDD.jrdd()
    .filter(read -> read.getMapq() != null && read.getMapq() > 30);

// Convert back to AlignmentRecordRDD with the original metadata
AlignmentRecordRDD filteredAlignments = sameConverter.call(alignmentRDD, filteredRDD.rdd());
```

**Chaining multiple conversions:**

```java
// Load alignment data (jac is assumed to be a JavaADAMContext)
AlignmentRecordRDD alignments = jac.loadAlignments("sample.bam");

// Convert to fragments, then to coverage
AlignmentRecordsToFragmentsConverter toFragments = new AlignmentRecordsToFragmentsConverter();
FragmentsToCoverageConverter toCoverage = new FragmentsToCoverageConverter();

// fragmentRDD and coverageRDD are assumed to hold records computed elsewhere;
// call() expects a Scala RDD, so pass rdd() rather than jrdd()
FragmentRDD fragments = toFragments.call(alignments, fragmentRDD.rdd());
CoverageRDD coverage = toCoverage.call(fragments, coverageRDD.rdd());
```
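The chaining pattern can also be sketched without Spark or ADAM on the classpath. In the sketch below, `Function2`, `Dataset`, and `converter()` are hypothetical stand-ins written for this example, not part of either library. It assumes, as the converter signatures indicate, that each hop takes its metadata from the previous wrapper, so the metadata of the first source reaches the final result.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-ins only: these are NOT the ADAM or Spark classes.
class ChainedConversionSketch {

    /** Stand-in for Spark's two-argument function interface. */
    interface Function2<A, B, R> {
        R call(A v1, B v2);
    }

    /** Stand-in for any genomic dataset wrapper: records of type T plus metadata. */
    static final class Dataset<T> {
        final List<T> records;
        final String sequenceDictionary;

        Dataset(List<T> records, String sequenceDictionary) {
            this.records = records;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    /** Builds a stand-in converter that keeps the source metadata and swaps in new records. */
    static <S, T> Function2<Dataset<S>, List<T>, Dataset<T>> converter() {
        return (source, data) -> new Dataset<>(data, source.sequenceDictionary);
    }

    public static void main(String[] args) {
        Dataset<String> alignments = new Dataset<>(Arrays.asList("read1", "read2"), "GRCh38");

        // The generic parameters fix the source and target type of each hop.
        Function2<Dataset<String>, List<String>, Dataset<String>> toFragments = converter();
        Function2<Dataset<String>, List<Integer>, Dataset<Integer>> toCoverage = converter();

        // First hop: alignments to fragments; second hop: fragments to coverage.
        Dataset<String> fragments = toFragments.call(alignments, Arrays.asList("fragment1"));
        Dataset<Integer> coverage = toCoverage.call(fragments, Arrays.asList(2));

        System.out.println(coverage.records + " " + coverage.sequenceDictionary);
    }
}
```

Running `main` prints `[2] GRCh38`. Passing `fragments` to a converter declared for a different source type would be rejected by the compiler, which is the same check the typed converter classes provide.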

## Key Benefits

- **Metadata Preservation**: Conversions carry genomic metadata (sequence dictionary, record groups, etc.) from the source RDD to the result
- **Type Safety**: Generic type parameters tie each converter to one source type and one target type, so mismatched arguments fail at compile time
- **Spark Integration**: Converters implement the `Function2` interface, so they work with Spark transformations
- **Comprehensive Coverage**: A converter is listed for every ordered pair of the eight data types above, including same-type conversions
- **Performance**: Converters accept and return RDD-backed types, so data stays distributed across the Spark cluster