# RDD Conversions

The ADAM APIs provide a set of converter classes that implement the Function2 interface for transforming between genomic RDD types. Each converter takes a source genomic RDD and an RDD of target records, and returns a genomic RDD of the target type that preserves the source's genomic metadata.

## Capabilities

### Base Conversion Interface

All RDD converters implement the Function2 interface for integration with Spark transformations.

```java { .api }
/**
 * Base trait for same-type RDD conversions
 * @param <T> The record type (e.g., AlignmentRecord, Variant)
 * @param <U> The RDD wrapper type (e.g., AlignmentRecordRDD, VariantRDD)
 */
interface SameTypeConversion<T, U extends GenomicRDD<T, U>> extends Function2<U, RDD<T>, U> {
    /**
     * Convert source RDD to target RDD type
     * @param v1 Source genomic RDD containing metadata
     * @param v2 Target RDD data
     * @return Converted genomic RDD with preserved metadata
     */
    U call(U v1, RDD<T> v2);
}
```
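The contract described above (metadata from `v1`, records from `v2`) can be modelled without Spark or ADAM on the classpath. The sketch below is an illustration only, not ADAM code: `Fn2`, `Dataset`, and `SameType` are hypothetical stand-ins for Function2, a genomic RDD wrapper, and SameTypeConversion, and a `List` stands in for the RDD.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are ADAM or Spark classes.
final class SameTypeSketch {

    // Stand-in for a two-argument function interface such as Function2.
    interface Fn2<A, B, R> {
        R call(A v1, B v2);
    }

    // Stand-in for a genomic RDD wrapper: records plus metadata.
    static final class Dataset<T> {
        final List<T> records;
        final List<String> sequenceNames;

        Dataset(List<T> records, List<String> sequenceNames) {
            this.records = records;
            this.sequenceNames = sequenceNames;
        }

        // Returns a wrapper holding new records and the same metadata.
        Dataset<T> withRecords(List<T> newRecords) {
            return new Dataset<>(newRecords, sequenceNames);
        }
    }

    // Stand-in for a same-type conversion: metadata from v1, records from v2.
    static final class SameType<T> implements Fn2<Dataset<T>, List<T>, Dataset<T>> {
        @Override
        public Dataset<T> call(Dataset<T> v1, List<T> v2) {
            return v1.withRecords(v2);
        }
    }

    public static void main(String[] args) {
        Dataset<String> reads = new Dataset<>(
                Arrays.asList("r1", "r2", "r3"), Arrays.asList("chr1", "chr2"));
        Dataset<String> kept = new SameType<String>().call(reads, Arrays.asList("r1", "r3"));
        // prints: [r1, r3] [chr1, chr2]
        System.out.println(kept.records + " " + kept.sequenceNames);
    }
}
```

The result holds only the records passed as the second argument, while the sequence names come unchanged from the source wrapper.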

### Contig Fragment Converters

Convert nucleotide contig fragments (reference sequences) to other genomic data types.

```java { .api }
// Same-type conversion for contig fragments
class ContigsToContigsConverter implements SameTypeConversion<NucleotideContigFragment, NucleotideContigFragmentRDD> {}

// Convert contigs to other data types
class ContigsToCoverageConverter implements Function2<NucleotideContigFragmentRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(NucleotideContigFragmentRDD v1, RDD<Coverage> v2);
}

class ContigsToFeaturesConverter implements Function2<NucleotideContigFragmentRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(NucleotideContigFragmentRDD v1, RDD<Feature> v2);
}

class ContigsToFragmentsConverter implements Function2<NucleotideContigFragmentRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(NucleotideContigFragmentRDD v1, RDD<Fragment> v2);
}

class ContigsToAlignmentRecordsConverter implements Function2<NucleotideContigFragmentRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(NucleotideContigFragmentRDD v1, RDD<AlignmentRecord> v2);
}

class ContigsToGenotypesConverter implements Function2<NucleotideContigFragmentRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(NucleotideContigFragmentRDD v1, RDD<Genotype> v2);
}

class ContigsToVariantsConverter implements Function2<NucleotideContigFragmentRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(NucleotideContigFragmentRDD v1, RDD<Variant> v2);
}

class ContigsToVariantContextsConverter implements Function2<NucleotideContigFragmentRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(NucleotideContigFragmentRDD v1, RDD<VariantContext> v2);
}
```
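Every cross-type converter in this document follows the same shape: a source wrapper, an RDD of target records, and a target wrapper as the result. The following sketch models that shape in plain Java under stated assumptions: `Fn2`, `ContigSet`, `FeatureSet`, and `ContigsToFeatures` are hypothetical stand-ins (not ADAM or Spark classes), strings stand in for records, and a `List` stands in for the RDD.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are ADAM or Spark classes.
final class CrossTypeSketch {

    // Stand-in for a two-argument function interface such as Function2.
    interface Fn2<A, B, R> {
        R call(A v1, B v2);
    }

    // Stand-in for a contig wrapper: contig records plus sequence metadata.
    static final class ContigSet {
        final List<String> contigs;
        final List<String> sequenceNames;

        ContigSet(List<String> contigs, List<String> sequenceNames) {
            this.contigs = contigs;
            this.sequenceNames = sequenceNames;
        }
    }

    // Stand-in for a feature wrapper: feature records plus sequence metadata.
    static final class FeatureSet {
        final List<String> features;
        final List<String> sequenceNames;

        FeatureSet(List<String> features, List<String> sequenceNames) {
            this.features = features;
            this.sequenceNames = sequenceNames;
        }
    }

    // Stand-in for a contigs-to-features converter: records from v2, metadata from v1.
    static final class ContigsToFeatures implements Fn2<ContigSet, List<String>, FeatureSet> {
        @Override
        public FeatureSet call(ContigSet v1, List<String> v2) {
            return new FeatureSet(v2, v1.sequenceNames);
        }
    }

    public static void main(String[] args) {
        ContigSet contigs = new ContigSet(
                Arrays.asList("contigA", "contigB"), Arrays.asList("chr1", "chr2"));
        FeatureSet features = new ContigsToFeatures().call(contigs, Arrays.asList("gene1"));
        // prints: [gene1] [chr1, chr2]
        System.out.println(features.features + " " + features.sequenceNames);
    }
}
```

The source records play no part in the result here; only the metadata crosses over, which matches the parameter descriptions in the base interface.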

### Coverage Converters

Convert coverage data to other genomic data types.

```java { .api }
// Same-type conversion for coverage
class CoverageToCoverageConverter implements SameTypeConversion<Coverage, CoverageRDD> {}

// Convert coverage to other data types
class CoverageToContigsConverter implements Function2<CoverageRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(CoverageRDD v1, RDD<NucleotideContigFragment> v2);
}

class CoverageToFeaturesConverter implements Function2<CoverageRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(CoverageRDD v1, RDD<Feature> v2);
}

class CoverageToFragmentsConverter implements Function2<CoverageRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(CoverageRDD v1, RDD<Fragment> v2);
}

class CoverageToAlignmentRecordsConverter implements Function2<CoverageRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(CoverageRDD v1, RDD<AlignmentRecord> v2);
}

class CoverageToGenotypesConverter implements Function2<CoverageRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(CoverageRDD v1, RDD<Genotype> v2);
}

class CoverageToVariantsConverter implements Function2<CoverageRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(CoverageRDD v1, RDD<Variant> v2);
}

class CoverageToVariantContextConverter implements Function2<CoverageRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(CoverageRDD v1, RDD<VariantContext> v2);
}
```

### Feature Converters

Convert genomic feature data to other data types.

```java { .api }
// Same-type conversion for features
class FeaturesToFeatureConverter implements SameTypeConversion<Feature, FeatureRDD> {}

// Convert features to other data types
class FeaturesToContigsConverter implements Function2<FeatureRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(FeatureRDD v1, RDD<NucleotideContigFragment> v2);
}

class FeaturesToCoverageConverter implements Function2<FeatureRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(FeatureRDD v1, RDD<Coverage> v2);
}

class FeaturesToFragmentsConverter implements Function2<FeatureRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(FeatureRDD v1, RDD<Fragment> v2);
}

class FeaturesToAlignmentRecordsConverter implements Function2<FeatureRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(FeatureRDD v1, RDD<AlignmentRecord> v2);
}

class FeaturesToGenotypesConverter implements Function2<FeatureRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(FeatureRDD v1, RDD<Genotype> v2);
}

class FeaturesToVariantsConverter implements Function2<FeatureRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(FeatureRDD v1, RDD<Variant> v2);
}

class FeaturesToVariantContextConverter implements Function2<FeatureRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(FeatureRDD v1, RDD<VariantContext> v2);
}
```

### Fragment Converters

Convert sequencing fragment data to other genomic data types.

```java { .api }
// Same-type conversion for fragments
class FragmentsToFragmentConverter implements SameTypeConversion<Fragment, FragmentRDD> {}

// Convert fragments to other data types
class FragmentsToContigsConverter implements Function2<FragmentRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(FragmentRDD v1, RDD<NucleotideContigFragment> v2);
}

class FragmentsToCoverageConverter implements Function2<FragmentRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(FragmentRDD v1, RDD<Coverage> v2);
}

class FragmentsToFeaturesConverter implements Function2<FragmentRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(FragmentRDD v1, RDD<Feature> v2);
}

class FragmentsToAlignmentRecordsConverter implements Function2<FragmentRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(FragmentRDD v1, RDD<AlignmentRecord> v2);
}

class FragmentsToGenotypesConverter implements Function2<FragmentRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(FragmentRDD v1, RDD<Genotype> v2);
}

class FragmentsToVariantsConverter implements Function2<FragmentRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(FragmentRDD v1, RDD<Variant> v2);
}

class FragmentsToVariantContextConverter implements Function2<FragmentRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(FragmentRDD v1, RDD<VariantContext> v2);
}
```

### Alignment Record Converters

Convert alignment record data to other genomic data types.

```java { .api }
// Same-type conversion for alignment records
class AlignmentRecordsToAlignmentRecordsConverter implements SameTypeConversion<AlignmentRecord, AlignmentRecordRDD> {}

// Convert alignment records to other data types
class AlignmentRecordsToContigsConverter implements Function2<AlignmentRecordRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(AlignmentRecordRDD v1, RDD<NucleotideContigFragment> v2);
}

class AlignmentRecordsToCoverageConverter implements Function2<AlignmentRecordRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(AlignmentRecordRDD v1, RDD<Coverage> v2);
}

class AlignmentRecordsToFeaturesConverter implements Function2<AlignmentRecordRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(AlignmentRecordRDD v1, RDD<Feature> v2);
}

class AlignmentRecordsToFragmentsConverter implements Function2<AlignmentRecordRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(AlignmentRecordRDD v1, RDD<Fragment> v2);
}

class AlignmentRecordsToGenotypesConverter implements Function2<AlignmentRecordRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(AlignmentRecordRDD v1, RDD<Genotype> v2);
}

class AlignmentRecordsToVariantsConverter implements Function2<AlignmentRecordRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(AlignmentRecordRDD v1, RDD<Variant> v2);
}

class AlignmentRecordsToVariantContextConverter implements Function2<AlignmentRecordRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(AlignmentRecordRDD v1, RDD<VariantContext> v2);
}
```

### Genotype Converters

Convert genotype data to other genomic data types.

```java { .api }
// Same-type conversion for genotypes
class GenotypesToGenotypesConverter implements SameTypeConversion<Genotype, GenotypeRDD> {}

// Convert genotypes to other data types
class GenotypesToContigsConverter implements Function2<GenotypeRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(GenotypeRDD v1, RDD<NucleotideContigFragment> v2);
}

class GenotypesToCoverageConverter implements Function2<GenotypeRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(GenotypeRDD v1, RDD<Coverage> v2);
}

class GenotypesToFeaturesConverter implements Function2<GenotypeRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(GenotypeRDD v1, RDD<Feature> v2);
}

class GenotypesToFragmentsConverter implements Function2<GenotypeRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(GenotypeRDD v1, RDD<Fragment> v2);
}

class GenotypesToAlignmentRecordsConverter implements Function2<GenotypeRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(GenotypeRDD v1, RDD<AlignmentRecord> v2);
}

class GenotypesToVariantsConverter implements Function2<GenotypeRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(GenotypeRDD v1, RDD<Variant> v2);
}

class GenotypesToVariantContextConverter implements Function2<GenotypeRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(GenotypeRDD v1, RDD<VariantContext> v2);
}
```

### Variant Converters

Convert variant data to other genomic data types.

```java { .api }
// Same-type conversion for variants
class VariantsToVariantsConverter implements SameTypeConversion<Variant, VariantRDD> {}

// Convert variants to other data types
class VariantsToContigsConverter implements Function2<VariantRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(VariantRDD v1, RDD<NucleotideContigFragment> v2);
}

class VariantsToCoverageConverter implements Function2<VariantRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(VariantRDD v1, RDD<Coverage> v2);
}

class VariantsToFeaturesConverter implements Function2<VariantRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(VariantRDD v1, RDD<Feature> v2);
}

class VariantsToFragmentsConverter implements Function2<VariantRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(VariantRDD v1, RDD<Fragment> v2);
}

class VariantsToAlignmentRecordsConverter implements Function2<VariantRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(VariantRDD v1, RDD<AlignmentRecord> v2);
}

class VariantsToGenotypesConverter implements Function2<VariantRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(VariantRDD v1, RDD<Genotype> v2);
}

class VariantsToVariantContextConverter implements Function2<VariantRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(VariantRDD v1, RDD<VariantContext> v2);
}
```

### Variant Context Converters

Convert rich variant context data to other genomic data types.

```java { .api }
// Same-type conversion for variant contexts
class VariantContextsToVariantContextConverter implements SameTypeConversion<VariantContext, VariantContextRDD> {}

// Convert variant contexts to other data types
class VariantContextsToContigsConverter implements Function2<VariantContextRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(VariantContextRDD v1, RDD<NucleotideContigFragment> v2);
}

class VariantContextsToCoverageConverter implements Function2<VariantContextRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(VariantContextRDD v1, RDD<Coverage> v2);
}

class VariantContextsToFeaturesConverter implements Function2<VariantContextRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(VariantContextRDD v1, RDD<Feature> v2);
}

class VariantContextsToFragmentsConverter implements Function2<VariantContextRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(VariantContextRDD v1, RDD<Fragment> v2);
}

class VariantContextsToAlignmentRecordsConverter implements Function2<VariantContextRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(VariantContextRDD v1, RDD<AlignmentRecord> v2);
}

class VariantContextsToGenotypesConverter implements Function2<VariantContextRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(VariantContextRDD v1, RDD<Genotype> v2);
}

class VariantContextsToVariantsConverter implements Function2<VariantContextRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(VariantContextRDD v1, RDD<Variant> v2);
}
```

## Usage Examples

**Basic RDD type conversion:**

```java
import org.bdgenomics.adam.api.java.*;

// alignmentRDD, variantRDD and genotypeRDD are assumed to be loaded already.
// The converter wraps the existing RDD<Variant> with metadata from alignmentRDD;
// it does not derive variants from the reads.
AlignmentRecordsToVariantsConverter converter = new AlignmentRecordsToVariantsConverter();
// call() is declared with RDD<Variant>, so pass rdd() rather than the JavaRDD from jrdd()
VariantRDD variants = converter.call(alignmentRDD, variantRDD.rdd());

// Wrap an existing RDD<Genotype> with metadata from the variants
VariantsToGenotypesConverter varToGeno = new VariantsToGenotypesConverter();
GenotypeRDD genotypes = varToGeno.call(variants, genotypeRDD.rdd());
```

**Same-type conversions for data manipulation:**

```java
import org.apache.spark.api.java.JavaRDD;
import org.bdgenomics.formats.avro.AlignmentRecord;

// Same-type conversion preserving metadata
AlignmentRecordsToAlignmentRecordsConverter sameConverter =
    new AlignmentRecordsToAlignmentRecordsConverter();

// Apply an RDD transformation; filter() on jrdd() returns a JavaRDD.
// getMapq() is assumed to return a boxed Integer, hence the null check.
JavaRDD<AlignmentRecord> filteredRDD = alignmentRDD.jrdd()
    .filter(read -> read.getMapq() != null && read.getMapq() > 30);

// Convert back to AlignmentRecordRDD with original metadata (call() takes an RDD, not a JavaRDD)
AlignmentRecordRDD filteredAlignments = sameConverter.call(alignmentRDD, filteredRDD.rdd());
```
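One detail worth checking when filtering on mapping quality: if `getMapq()` returns a boxed `Integer` (an assumption here about the generated record class), comparing it with `>` unboxes the value, and unboxing `null` throws a `NullPointerException`. The sketch below shows a null-safe predicate on plain Java collections; `MapqFilterSketch` and `aboveThreshold` are hypothetical names, and a `List<Integer>` stands in for the records.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of ADAM.
final class MapqFilterSketch {

    // Keeps qualities strictly above the threshold; null entries are dropped
    // instead of being unboxed.
    static List<Integer> aboveThreshold(List<Integer> mapqs, int threshold) {
        return mapqs.stream()
                .filter(q -> q != null && q > threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> mapqs = Arrays.asList(60, null, 12, 31, 30);
        // prints: [60, 31]
        System.out.println(aboveThreshold(mapqs, 30));
    }
}
```

The same `q != null && q > threshold` guard can be used inside a Spark `filter` lambda.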

**Chaining multiple conversions:**

```java
// Load alignment data (jac is assumed to be a JavaADAMContext)
AlignmentRecordRDD alignments = jac.loadAlignments("sample.bam");

// Convert to fragments, then to coverage.
// fragmentRDD and coverageRDD are assumed to exist already; they supply the records.
AlignmentRecordsToFragmentsConverter toFragments = new AlignmentRecordsToFragmentsConverter();
FragmentsToCoverageConverter toCoverage = new FragmentsToCoverageConverter();

// call() is declared with an RDD, so pass rdd() rather than jrdd()
FragmentRDD fragments = toFragments.call(alignments, fragmentRDD.rdd());
CoverageRDD coverage = toCoverage.call(fragments, coverageRDD.rdd());
```

## Key Benefits

- **Metadata Preservation**: Conversions carry genomic metadata such as the sequence dictionary and record groups over from the source RDD
- **Type Safety**: Each converter's generic signature fixes the source, record, and target types, so mismatched arguments fail at compile time
- **Spark Integration**: Every converter implements Function2, so it works with Spark transformations
- **Comprehensive Coverage**: A converter is listed for every ordered pair of the eight genomic data types above, 64 in total including the eight same-type converters
- **Performance**: Conversions operate on distributed RDDs, so they leverage Spark's distributed computing
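
The coverage claim can be checked with simple counting: eight data types give 8 × 8 ordered (source, target) pairs. The sketch below enumerates them; the type labels are taken from the section headings in this document and are not ADAM class names, and `ConversionMatrixSketch` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of ADAM.
final class ConversionMatrixSketch {

    // Labels taken from the section headings; they are not ADAM class names.
    static final List<String> TYPES = Arrays.asList(
            "Contigs", "Coverage", "Features", "Fragments",
            "AlignmentRecords", "Genotypes", "Variants", "VariantContexts");

    // Builds every ordered (source, target) pair, same-type pairs included.
    static List<String[]> orderedPairs() {
        List<String[]> pairs = new ArrayList<>();
        for (String source : TYPES) {
            for (String target : TYPES) {
                pairs.add(new String[] {source, target});
            }
        }
        return pairs;
    }

    // Counts the pairs whose source and target are the same type.
    static long sameTypeCount(List<String[]> pairs) {
        return pairs.stream().filter(p -> p[0].equals(p[1])).count();
    }

    public static void main(String[] args) {
        List<String[]> pairs = orderedPairs();
        long same = sameTypeCount(pairs);
        // prints: 64 converters: 8 same-type, 56 cross-type
        System.out.println(pairs.size() + " converters: " + same + " same-type, "
                + (pairs.size() - same) + " cross-type");
    }
}
```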