# adam-apis_2.11

Java and Python API bindings for the ADAM genomics analysis library, providing language-friendly interfaces for distributed genomic data processing.
## Installation

```bash
npx @tessl/cli install tessl/maven-org-bdgenomics-adam--adam-apis_2-11@0.23.0
```

Or add the Maven dependency directly:

```xml
<dependency>
  <groupId>org.bdgenomics.adam</groupId>
  <artifactId>adam-apis_2.11</artifactId>
  <version>0.23.0</version>
</dependency>
```

## Overview

Java and Python API bindings for the ADAM genomics analysis library. This package provides language-friendly interfaces for accessing ADAM's distributed genomic data processing capabilities from Java and Python applications, enabling integration with existing bioinformatics workflows and data science pipelines.

## Core Imports

Java:
```java
import org.bdgenomics.adam.api.java.JavaADAMContext;
import htsjdk.samtools.ValidationStringency;
```

Scala:
```scala
import org.bdgenomics.adam.api.java.JavaADAMContext
import org.bdgenomics.adam.api.java.GenomicDatasetConverters._
import org.bdgenomics.adam.api.java.GenomicRDDConverters._
import org.bdgenomics.adam.api.python.DataFrameConversionWrapper
```

## Basic Usage

```java
import org.apache.spark.api.java.JavaSparkContext;
import org.bdgenomics.adam.api.java.JavaADAMContext;
import org.bdgenomics.adam.rdd.ADAMContext;
import org.bdgenomics.adam.rdd.read.AlignmentRecordRDD;
import org.bdgenomics.adam.rdd.variant.VariantRDD;
// Initialize the ADAM context from an existing SparkSession's Spark context
JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
JavaADAMContext jac = new JavaADAMContext(new ADAMContext(jsc.sc()));
// Load genomic alignment data
AlignmentRecordRDD alignments = jac.loadAlignments("input.bam");
// Load variant data
VariantRDD variants = jac.loadVariants("variants.vcf");
// Access the underlying Spark context
JavaSparkContext sparkContext = jac.getSparkContext();
```

## Architecture

The adam-apis package is built around several key components that provide different levels of API access:

- **JavaADAMContext**: high-level Java entry point for loading genomic data
- **GenomicDatasetConverters**: type-safe, Dataset-based conversions between genomic data types
- **GenomicRDDConverters**: low-level, RDD-based conversions between genomic data types
- **DataFrameConversionWrapper**: DataFrame wrapper enabling Python (PySpark) interoperability
The package supports a comprehensive set of genomic data types, including alignments, variants, genotypes, features, coverage, fragments, and reference sequences. File formats are detected automatically from file extensions, and compressed inputs are supported.
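As a sketch of extension-driven loading (continuing the `jac` setup above; file names are illustrative):

```java
// Each loader infers the on-disk format from the path's extension
FeatureRDD features = jac.loadFeatures("annotations.bed");     // BED features
GenotypeRDD genotypes = jac.loadGenotypes("cohort.vcf");       // VCF genotypes
CoverageRDD coverage = jac.loadCoverage("depth.bed");          // per-base coverage
ReferenceFile reference = jac.loadReferenceFile("genome.fa");  // FASTA reference
```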
## Core APIs

### JavaADAMContext

Primary Java interface for loading and working with genomic data files. Provides high-level methods for reading common genomic formats, including BAM/SAM/CRAM, VCF, FASTA, FASTQ, BED, GFF, and more.
```java
class JavaADAMContext {
    JavaSparkContext getSparkContext();

    AlignmentRecordRDD loadAlignments(String pathName);
    AlignmentRecordRDD loadAlignments(String pathName, ValidationStringency stringency);

    VariantRDD loadVariants(String pathName);
    VariantRDD loadVariants(String pathName, ValidationStringency stringency);

    GenotypeRDD loadGenotypes(String pathName);
    FeatureRDD loadFeatures(String pathName);
    CoverageRDD loadCoverage(String pathName);
    FragmentRDD loadFragments(String pathName);
    NucleotideContigFragmentRDD loadContigFragments(String pathName);
    ReferenceFile loadReferenceFile(String pathName);
}
```
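The `ValidationStringency` overloads control how malformed records are handled; `LENIENT`, for example, reports validation problems instead of failing the whole load. A minimal sketch, continuing the setup above (the file name is illustrative):

```java
import htsjdk.samtools.ValidationStringency;

// LENIENT logs validation errors rather than throwing on the first bad record
AlignmentRecordRDD lenientReads =
    jac.loadAlignments("noisy_reads.sam", ValidationStringency.LENIENT);
```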
### GenomicDatasetConverters

Type-safe conversion system for transforming between different genomic dataset types using Spark DataFrames. Enables seamless interoperability between genomic data types within Spark workflows.
```scala
trait ToContigDatasetConversion[T, U]
trait ToCoverageDatasetConversion[T, U]
trait ToFeatureDatasetConversion[T, U]
trait ToFragmentDatasetConversion[T, U]
trait ToAlignmentRecordDatasetConversion[T, U]
trait ToGenotypeDatasetConversion[T, U]
trait ToVariantDatasetConversion[T, U]
```
### GenomicRDDConverters

Low-level conversion utilities for transforming between different genomic RDD types. Provides fine-grained control over data transformations and supports all genomic data type combinations.

```scala
trait SameTypeConversion[T, U] {
  def call(v1: RDD[T], v2: RDD[U]): RDD[U]
}
```
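As declared above, a conversion receives a source RDD and a transformed RDD and returns the resulting RDD. A minimal sketch of implementing that contract from Java, assuming the trait compiles down to a plain interface (the class name `PassThroughConversion` is hypothetical):

```java
import org.apache.spark.rdd.RDD;
import org.bdgenomics.formats.avro.AlignmentRecord;

// Hypothetical identity converter: returns the already-transformed RDD unchanged
public class PassThroughConversion
    implements SameTypeConversion<AlignmentRecord, AlignmentRecord> {

  @Override
  public RDD<AlignmentRecord> call(RDD<AlignmentRecord> v1,
                                   RDD<AlignmentRecord> v2) {
    return v2; // keep the transformed records
  }
}
```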
### DataFrameConversionWrapper

DataFrame wrapper functionality enabling Python access to ADAM's data conversion capabilities through PySpark integration.

```scala
class DataFrameConversionWrapper(newDf: DataFrame) extends JFunction[DataFrame, DataFrame] {
  def call(v1: DataFrame): DataFrame
}
```
### Genomic Data Types

```scala
// Core genomic data RDD types provided by ADAM
type AlignmentRecordRDD
type NucleotideContigFragmentRDD
type FragmentRDD
type FeatureRDD
type CoverageRDD
type GenotypeRDD
type VariantRDD
type VariantContextRDD
type ReferenceFile
```