
tessl/maven-org-bdgenomics-adam--adam-apis-2-10

Java/Python API wrappers for ADAM genomics analysis library enabling scalable genomic data processing with Apache Spark


RDD Conversions

The ADAM APIs module provides converter classes that implement the Function2 interface for moving between genomic RDD types. Each converter takes a source genomic RDD and an RDD of target records, and returns a genomic RDD of the target type that carries over the source's genomic metadata.

Capabilities

Base Conversion Interface

All RDD converters implement the Function2 interface for integration with Spark transformations.

/**
 * Base trait for same-type RDD conversions
 * @param <T> The record type (e.g., AlignmentRecord, Variant)
 * @param <U> The RDD wrapper type (e.g., AlignmentRecordRDD, VariantRDD)
 */
interface SameTypeConversion<T, U extends GenomicRDD<T, U>> extends Function2<U, RDD<T>, U> {
    /**
     * Wrap an RDD of records in the genomic RDD type, reusing existing metadata
     * @param v1 Source genomic RDD that supplies the metadata
     * @param v2 RDD of records for the result
     * @return Genomic RDD holding v2's records with v1's metadata preserved
     */
    U call(U v1, RDD<T> v2);
}
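
The contract above can be illustrated without Spark or ADAM on the classpath. The following is a minimal sketch that uses hypothetical stand-in types (`Function2`, `Wrapped`, `ReadSet`, and `ReadsToReadsConverter` are all invented here, with a `List` standing in for an RDD). The default `call` that delegates to a `replaceData` method is an assumption about how a same-type converter could be written, not a statement of ADAM's implementation.

```java
import java.util.Arrays;
import java.util.List;

final class SameTypeSketch {
    // Hypothetical stand-in for the two-argument Function2 interface.
    interface Function2<A, B, R> {
        R call(A v1, B v2);
    }

    // Hypothetical stand-in for GenomicRDD<T, U>: records plus metadata.
    interface Wrapped<T, U extends Wrapped<T, U>> {
        List<T> records();
        String sequenceDictionary();
        U replaceData(List<T> newRecords); // same metadata, new records
    }

    // Same-type conversion: keep v1's metadata, swap in v2's records.
    interface SameTypeConversion<T, U extends Wrapped<T, U>>
            extends Function2<U, List<T>, U> {
        @Override
        default U call(U v1, List<T> v2) {
            return v1.replaceData(v2);
        }
    }

    // Hypothetical wrapper whose records are read names.
    static final class ReadSet implements Wrapped<String, ReadSet> {
        private final List<String> records;
        private final String sequenceDictionary;

        ReadSet(List<String> records, String sequenceDictionary) {
            this.records = records;
            this.sequenceDictionary = sequenceDictionary;
        }

        @Override public List<String> records() { return records; }
        @Override public String sequenceDictionary() { return sequenceDictionary; }
        @Override public ReadSet replaceData(List<String> newRecords) {
            return new ReadSet(newRecords, sequenceDictionary);
        }
    }

    // Needs no body: the inherited default call() does the work.
    static final class ReadsToReadsConverter
            implements SameTypeConversion<String, ReadSet> {}

    public static void main(String[] args) {
        ReadSet source = new ReadSet(Arrays.asList("r1", "r2", "r3"), "GRCh38");
        ReadSet result = new ReadsToReadsConverter().call(source, Arrays.asList("r2"));
        System.out.println(result.records() + " " + result.sequenceDictionary());
    }
}
```

In this sketch the concrete converter is an empty class, which mirrors the empty same-type converter declarations listed in this document.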

Contig Fragment Converters

Convert nucleotide contig fragments (reference sequences) to other genomic data types.

// Same-type conversion for contig fragments
class ContigsToContigsConverter extends SameTypeConversion<NucleotideContigFragment, NucleotideContigFragmentRDD> {}

// Convert contigs to other data types
class ContigsToCoverageConverter extends Function2<NucleotideContigFragmentRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(NucleotideContigFragmentRDD v1, RDD<Coverage> v2);
}

class ContigsToFeaturesConverter extends Function2<NucleotideContigFragmentRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(NucleotideContigFragmentRDD v1, RDD<Feature> v2);
}

class ContigsToFragmentsConverter extends Function2<NucleotideContigFragmentRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(NucleotideContigFragmentRDD v1, RDD<Fragment> v2);
}

class ContigsToAlignmentRecordsConverter extends Function2<NucleotideContigFragmentRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(NucleotideContigFragmentRDD v1, RDD<AlignmentRecord> v2);
}

class ContigsToGenotypesConverter extends Function2<NucleotideContigFragmentRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(NucleotideContigFragmentRDD v1, RDD<Genotype> v2);
}

class ContigsToVariantsConverter extends Function2<NucleotideContigFragmentRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(NucleotideContigFragmentRDD v1, RDD<Variant> v2);
}

class ContigsToVariantContextsConverter extends Function2<NucleotideContigFragmentRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(NucleotideContigFragmentRDD v1, RDD<VariantContext> v2);
}

Coverage Converters

Convert coverage data to other genomic data types.

// Same-type conversion for coverage
class CoverageToCoverageConverter extends SameTypeConversion<Coverage, CoverageRDD> {}

// Convert coverage to other data types
class CoverageToContigsConverter extends Function2<CoverageRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(CoverageRDD v1, RDD<NucleotideContigFragment> v2);
}

class CoverageToFeaturesConverter extends Function2<CoverageRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(CoverageRDD v1, RDD<Feature> v2);
}

class CoverageToFragmentsConverter extends Function2<CoverageRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(CoverageRDD v1, RDD<Fragment> v2);
}

class CoverageToAlignmentRecordsConverter extends Function2<CoverageRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(CoverageRDD v1, RDD<AlignmentRecord> v2);
}

class CoverageToGenotypesConverter extends Function2<CoverageRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(CoverageRDD v1, RDD<Genotype> v2);
}

class CoverageToVariantsConverter extends Function2<CoverageRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(CoverageRDD v1, RDD<Variant> v2);
}

class CoverageToVariantContextConverter extends Function2<CoverageRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(CoverageRDD v1, RDD<VariantContext> v2);
}

Feature Converters

Convert genomic feature data to other data types.

// Same-type conversion for features
class FeaturesToFeatureConverter extends SameTypeConversion<Feature, FeatureRDD> {}

// Convert features to other data types  
class FeaturesToContigsConverter extends Function2<FeatureRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(FeatureRDD v1, RDD<NucleotideContigFragment> v2);
}

class FeaturesToCoverageConverter extends Function2<FeatureRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(FeatureRDD v1, RDD<Coverage> v2);
}

class FeaturesToFragmentsConverter extends Function2<FeatureRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(FeatureRDD v1, RDD<Fragment> v2);
}

class FeaturesToAlignmentRecordsConverter extends Function2<FeatureRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(FeatureRDD v1, RDD<AlignmentRecord> v2);
}

class FeaturesToGenotypesConverter extends Function2<FeatureRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(FeatureRDD v1, RDD<Genotype> v2);
}

class FeaturesToVariantsConverter extends Function2<FeatureRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(FeatureRDD v1, RDD<Variant> v2);
}

class FeaturesToVariantContextConverter extends Function2<FeatureRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(FeatureRDD v1, RDD<VariantContext> v2);
}

Fragment Converters

Convert sequencing fragment data to other genomic data types.

// Same-type conversion for fragments
class FragmentsToFragmentConverter extends SameTypeConversion<Fragment, FragmentRDD> {}

// Convert fragments to other data types
class FragmentsToContigsConverter extends Function2<FragmentRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(FragmentRDD v1, RDD<NucleotideContigFragment> v2);
}

class FragmentsToCoverageConverter extends Function2<FragmentRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(FragmentRDD v1, RDD<Coverage> v2);
}

class FragmentsToFeaturesConverter extends Function2<FragmentRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(FragmentRDD v1, RDD<Feature> v2);
}

class FragmentsToAlignmentRecordsConverter extends Function2<FragmentRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(FragmentRDD v1, RDD<AlignmentRecord> v2);
}

class FragmentsToGenotypesConverter extends Function2<FragmentRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(FragmentRDD v1, RDD<Genotype> v2);
}

class FragmentsToVariantsConverter extends Function2<FragmentRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(FragmentRDD v1, RDD<Variant> v2);
}

class FragmentsToVariantContextConverter extends Function2<FragmentRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(FragmentRDD v1, RDD<VariantContext> v2);
}

Alignment Record Converters

Convert alignment record data to other genomic data types.

// Same-type conversion for alignment records
class AlignmentRecordsToAlignmentRecordsConverter extends SameTypeConversion<AlignmentRecord, AlignmentRecordRDD> {}

// Convert alignment records to other data types
class AlignmentRecordsToContigsConverter extends Function2<AlignmentRecordRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(AlignmentRecordRDD v1, RDD<NucleotideContigFragment> v2);
}

class AlignmentRecordsToCoverageConverter extends Function2<AlignmentRecordRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(AlignmentRecordRDD v1, RDD<Coverage> v2);
}

class AlignmentRecordsToFeaturesConverter extends Function2<AlignmentRecordRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(AlignmentRecordRDD v1, RDD<Feature> v2);
}

class AlignmentRecordsToFragmentsConverter extends Function2<AlignmentRecordRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(AlignmentRecordRDD v1, RDD<Fragment> v2);
}

class AlignmentRecordsToGenotypesConverter extends Function2<AlignmentRecordRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(AlignmentRecordRDD v1, RDD<Genotype> v2);
}

class AlignmentRecordsToVariantsConverter extends Function2<AlignmentRecordRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(AlignmentRecordRDD v1, RDD<Variant> v2);
}

class AlignmentRecordsToVariantContextConverter extends Function2<AlignmentRecordRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(AlignmentRecordRDD v1, RDD<VariantContext> v2);
}

Genotype Converters

Convert genotype data to other genomic data types.

// Same-type conversion for genotypes
class GenotypesToGenotypesConverter extends SameTypeConversion<Genotype, GenotypeRDD> {}

// Convert genotypes to other data types
class GenotypesToContigsConverter extends Function2<GenotypeRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(GenotypeRDD v1, RDD<NucleotideContigFragment> v2);
}

class GenotypesToCoverageConverter extends Function2<GenotypeRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(GenotypeRDD v1, RDD<Coverage> v2);
}

class GenotypesToFeaturesConverter extends Function2<GenotypeRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(GenotypeRDD v1, RDD<Feature> v2);
}

class GenotypesToFragmentsConverter extends Function2<GenotypeRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(GenotypeRDD v1, RDD<Fragment> v2);
}

class GenotypesToAlignmentRecordsConverter extends Function2<GenotypeRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(GenotypeRDD v1, RDD<AlignmentRecord> v2);
}

class GenotypesToVariantsConverter extends Function2<GenotypeRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(GenotypeRDD v1, RDD<Variant> v2);
}

class GenotypesToVariantContextConverter extends Function2<GenotypeRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(GenotypeRDD v1, RDD<VariantContext> v2);
}

Variant Converters

Convert variant data to other genomic data types.

// Same-type conversion for variants
class VariantsToVariantsConverter extends SameTypeConversion<Variant, VariantRDD> {}

// Convert variants to other data types
class VariantsToContigsConverter extends Function2<VariantRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(VariantRDD v1, RDD<NucleotideContigFragment> v2);
}

class VariantsToCoverageConverter extends Function2<VariantRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(VariantRDD v1, RDD<Coverage> v2);
}

class VariantsToFeaturesConverter extends Function2<VariantRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(VariantRDD v1, RDD<Feature> v2);
}

class VariantsToFragmentsConverter extends Function2<VariantRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(VariantRDD v1, RDD<Fragment> v2);
}

class VariantsToAlignmentRecordsConverter extends Function2<VariantRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(VariantRDD v1, RDD<AlignmentRecord> v2);
}

class VariantsToGenotypesConverter extends Function2<VariantRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(VariantRDD v1, RDD<Genotype> v2);
}

class VariantsToVariantContextConverter extends Function2<VariantRDD, RDD<VariantContext>, VariantContextRDD> {
    VariantContextRDD call(VariantRDD v1, RDD<VariantContext> v2);
}

Variant Context Converters

Convert rich variant context data to other genomic data types.

// Same-type conversion for variant contexts
class VariantContextsToVariantContextConverter extends SameTypeConversion<VariantContext, VariantContextRDD> {}

// Convert variant contexts to other data types
class VariantContextsToContigsConverter extends Function2<VariantContextRDD, RDD<NucleotideContigFragment>, NucleotideContigFragmentRDD> {
    NucleotideContigFragmentRDD call(VariantContextRDD v1, RDD<NucleotideContigFragment> v2);
}

class VariantContextsToCoverageConverter extends Function2<VariantContextRDD, RDD<Coverage>, CoverageRDD> {
    CoverageRDD call(VariantContextRDD v1, RDD<Coverage> v2);
}

class VariantContextsToFeaturesConverter extends Function2<VariantContextRDD, RDD<Feature>, FeatureRDD> {
    FeatureRDD call(VariantContextRDD v1, RDD<Feature> v2);
}

class VariantContextsToFragmentsConverter extends Function2<VariantContextRDD, RDD<Fragment>, FragmentRDD> {
    FragmentRDD call(VariantContextRDD v1, RDD<Fragment> v2);
}

class VariantContextsToAlignmentRecordsConverter extends Function2<VariantContextRDD, RDD<AlignmentRecord>, AlignmentRecordRDD> {
    AlignmentRecordRDD call(VariantContextRDD v1, RDD<AlignmentRecord> v2);
}

class VariantContextsToGenotypesConverter extends Function2<VariantContextRDD, RDD<Genotype>, GenotypeRDD> {
    GenotypeRDD call(VariantContextRDD v1, RDD<Genotype> v2);
}

class VariantContextsToVariantsConverter extends Function2<VariantContextRDD, RDD<Variant>, VariantRDD> {
    VariantRDD call(VariantContextRDD v1, RDD<Variant> v2);
}

Usage Examples

Basic RDD type conversion:

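import org.bdgenomics.adam.api.java.*;

// alignmentRDD, variantRDD and genotypeRDD are assumed to be loaded already.
// call() takes a Scala RDD, so pass rdd() rather than the JavaRDD from jrdd().

// Wrap the variant records as a VariantRDD, taking metadata from the alignments
AlignmentRecordsToVariantsConverter converter = new AlignmentRecordsToVariantsConverter();
VariantRDD variants = converter.call(alignmentRDD, variantRDD.rdd());

// Wrap the genotype records as a GenotypeRDD, taking metadata from the variants
VariantsToGenotypesConverter varToGeno = new VariantsToGenotypesConverter();
GenotypeRDD genotypes = varToGeno.call(variants, genotypeRDD.rdd());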

Same-type conversions for data manipulation:

import org.apache.spark.api.java.JavaRDD;

// Same-type conversion preserving metadata
AlignmentRecordsToAlignmentRecordsConverter sameConverter =
    new AlignmentRecordsToAlignmentRecordsConverter();

// Apply an RDD transformation; jrdd().filter(...) yields a JavaRDD.
// The null check assumes getMapq() returns a nullable boxed Integer.
JavaRDD<AlignmentRecord> filteredRDD = alignmentRDD.jrdd()
    .filter(read -> read.getMapq() != null && read.getMapq() > 30);

// Convert back to AlignmentRecordRDD with original metadata (call() takes a Scala RDD)
AlignmentRecordRDD filteredAlignments = sameConverter.call(alignmentRDD, filteredRDD.rdd());
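
A filter that compares `getMapq()` to an `int` deserves some care. Assuming the getter returns a boxed `Integer` that can be null (as generated getters for optional fields commonly do), unboxing a missing value would throw a `NullPointerException`. The sketch below shows a null-safe predicate using a hypothetical stand-in `Read` class and plain Java streams in place of `AlignmentRecord` and Spark; the names `Read`, `mapqAbove`, and `namesPassing` are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class MapqFilterSketch {
    // Hypothetical stand-in for AlignmentRecord; only the nullable mapq matters.
    static final class Read {
        private final String name;
        private final Integer mapq; // null when no mapping quality is recorded

        Read(String name, Integer mapq) {
            this.name = name;
            this.mapq = mapq;
        }

        String getName() { return name; }
        Integer getMapq() { return mapq; }
    }

    // Null-safe predicate: reads without a mapping quality are dropped
    // instead of triggering a NullPointerException on unboxing.
    static Predicate<Read> mapqAbove(int threshold) {
        return read -> read.getMapq() != null && read.getMapq() > threshold;
    }

    // Names of the reads whose mapping quality is strictly above the threshold.
    static List<String> namesPassing(List<Read> reads, int threshold) {
        return reads.stream()
            .filter(mapqAbove(threshold))
            .map(Read::getName)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Read> reads = Arrays.asList(
            new Read("r1", 60), new Read("r2", null),
            new Read("r3", 30), new Read("r4", 31));
        System.out.println(namesPassing(reads, 30));
    }
}
```

Whether reads with no mapping quality should be dropped or kept is a design choice; this sketch drops them.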

Chaining multiple conversions:

// Load alignment data (jac is assumed to be an existing JavaADAMContext)
AlignmentRecordRDD alignments = jac.loadAlignments("sample.bam");

// Convert to fragments, then to coverage.
// fragmentRDD and coverageRDD are assumed to exist already and supply the records.
AlignmentRecordsToFragmentsConverter toFragments = new AlignmentRecordsToFragmentsConverter();
FragmentsToCoverageConverter toCoverage = new FragmentsToCoverageConverter();

// call() takes a Scala RDD, so pass rdd() rather than jrdd()
FragmentRDD fragments = toFragments.call(alignments, fragmentRDD.rdd());
CoverageRDD coverage = toCoverage.call(fragments, coverageRDD.rdd());
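
To make the data flow in a chain explicit, here is a small sketch with hypothetical stand-in types (`ReadSet`, `FragmentSet`, `CoverageSet`, and both converter classes are invented here, and a `List` stands in for an RDD). It assumes each cross-type converter takes its records from the second argument and its metadata from the first, as the parameter descriptions in this document suggest, so the sequence dictionary survives both hops.

```java
import java.util.Arrays;
import java.util.List;

final class ChainSketch {
    // Hypothetical stand-in for the two-argument Function2 interface.
    interface Function2<A, B, R> {
        R call(A v1, B v2);
    }

    // Hypothetical wrappers: records plus a sequence dictionary (a plain string here).
    static final class ReadSet {
        final List<String> reads;
        final String sequenceDictionary;
        ReadSet(List<String> reads, String sequenceDictionary) {
            this.reads = reads;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    static final class FragmentSet {
        final List<String> fragments;
        final String sequenceDictionary;
        FragmentSet(List<String> fragments, String sequenceDictionary) {
            this.fragments = fragments;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    static final class CoverageSet {
        final List<Double> coverage;
        final String sequenceDictionary;
        CoverageSet(List<Double> coverage, String sequenceDictionary) {
            this.coverage = coverage;
            this.sequenceDictionary = sequenceDictionary;
        }
    }

    // Cross-type converters: records come from v2, metadata from v1.
    static final class ReadsToFragments
            implements Function2<ReadSet, List<String>, FragmentSet> {
        @Override public FragmentSet call(ReadSet v1, List<String> v2) {
            return new FragmentSet(v2, v1.sequenceDictionary);
        }
    }

    static final class FragmentsToCoverage
            implements Function2<FragmentSet, List<Double>, CoverageSet> {
        @Override public CoverageSet call(FragmentSet v1, List<Double> v2) {
            return new CoverageSet(v2, v1.sequenceDictionary);
        }
    }

    public static void main(String[] args) {
        ReadSet reads = new ReadSet(Arrays.asList("r1", "r2"), "GRCh38");
        FragmentSet fragments = new ReadsToFragments().call(reads, Arrays.asList("f1"));
        CoverageSet coverage =
            new FragmentsToCoverage().call(fragments, Arrays.asList(12.0, 7.5));
        System.out.println(coverage.coverage + " " + coverage.sequenceDictionary);
    }
}
```

Note that the converters in this sketch never compute target records from source records; the caller supplies them, and the converter only attaches metadata.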

Key Benefits

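  • Metadata Preservation: Conversions carry genomic metadata (sequence dictionary, record groups, etc.) from the source RDD into the result
  • Type Safety: Each converter's generic signature fixes its source and target types, so a mismatched RDD is rejected at compile time
  • Spark Integration: Converters implement Function2, so they can be passed wherever a two-argument function of that type is expected
  • Comprehensive Coverage: A converter is listed for every ordered pair of the eight genomic types above, including the same-type cases
  • Performance: Converters work on distributed RDDs, so record data stays within Spark during a conversion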
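
The coverage claim can be checked with simple counting: eight record families give 8 × 8 = 64 ordered source/target pairs, of which 8 are same-type and 56 are cross-type, matching the 64 converter declarations listed in this document. The sketch below enumerates the pairs; the class name and the "Source -> Target" labels are illustrative only and are not converter class names, since a few of the listed classes use singular forms (for example `FeaturesToFeatureConverter`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ConversionMatrixSketch {
    // The eight record families covered in this document.
    static final List<String> TYPES = Arrays.asList(
        "Contigs", "Coverage", "Features", "Fragments",
        "AlignmentRecords", "Genotypes", "Variants", "VariantContexts");

    // Every ordered (source, target) pair, written as "Source -> Target".
    static List<String> orderedPairs() {
        List<String> pairs = new ArrayList<>();
        for (String source : TYPES) {
            for (String target : TYPES) {
                pairs.add(source + " -> " + target);
            }
        }
        return pairs;
    }

    // Pairs whose source and target differ.
    static int crossTypeCount() {
        return TYPES.size() * (TYPES.size() - 1);
    }

    public static void main(String[] args) {
        System.out.println("total pairs: " + orderedPairs().size());
        System.out.println("same-type: " + TYPES.size());
        System.out.println("cross-type: " + crossTypeCount());
    }
}
```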

Install with Tessl CLI

npx tessl i tessl/maven-org-bdgenomics-adam--adam-apis-2-10
