tessl/maven-com-github-haifengl--smile-core

Statistical Machine Intelligence and Learning Engine providing comprehensive machine learning algorithms for classification, regression, clustering, and feature engineering in Java


Regression

Supervised learning algorithms for predicting continuous values. Smile Core provides comprehensive regression capabilities from traditional linear models to advanced ensemble methods, kernel machines, and neural networks.

Capabilities

Core Regression Interface

All regression algorithms implement the unified Regression<T> interface, providing consistent prediction methods and optional online learning support.

/**
 * Base regression interface for all supervised learning algorithms
 * @param <T> the type of input objects
 */
interface Regression<T> extends ToDoubleFunction<T>, Serializable {
    /** Predict the target value for input */
    double predict(T x);
    
    /** Online learning update; batch-only algorithms throw UnsupportedOperationException */
    default void update(T x, double y) {
        throw new UnsupportedOperationException();
    }
    
    /** Create ensemble of multiple regressors */
    static <T> Regression<T> ensemble(Regression<T>... regressors);
}
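To make the interface shape concrete, here is a minimal sketch: a hypothetical baseline regressor that always predicts the training mean. The interface is reproduced locally in simplified form so the snippet compiles without Smile on the classpath; `MeanRegression` is not part of Smile.

```java
import java.io.Serializable;
import java.util.function.ToDoubleFunction;

// Simplified local copy of the interface shape shown above.
interface Regression<T> extends ToDoubleFunction<T>, Serializable {
    double predict(T x);

    // Bridge to ToDoubleFunction so a model can be used in streams.
    default double applyAsDouble(T x) { return predict(x); }

    // Batch-only models reject online updates.
    default void update(T x, double y) {
        throw new UnsupportedOperationException();
    }
}

// Hypothetical baseline: predicts the training-set mean for every input.
class MeanRegression implements Regression<double[]> {
    private final double mean;

    MeanRegression(double[] y) {
        double sum = 0;
        for (double v : y) sum += v;
        this.mean = sum / y.length;
    }

    public double predict(double[] x) { return mean; }
}
```

Such a baseline is often worth keeping around: any real model should beat it, and it exercises the same `predict` contract as every algorithm below.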

Linear Regression Models

A family of linear regression algorithms with various regularization techniques.

/**
 * Ordinary Least Squares regression
 */
class OLS implements Regression<double[]> {
    /** Train OLS regression */
    public static OLS fit(double[][] x, double[] y);
    
    /** Train with intercept control */
    public static OLS fit(double[][] x, double[] y, boolean intercept);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get model coefficients */
    public double[] coefficients();
    
    /** Get intercept term */
    public double intercept();
    
    /** Get R-squared value */
    public double RSquared();
    
    /** Get adjusted R-squared */
    public double adjustedRSquared();
    
    /** Get residual sum of squares */
    public double RSS();
    
    /** Get total sum of squares */
    public double TSS();
}

/**
 * Ridge regression with L2 regularization
 */
class RidgeRegression implements Regression<double[]> {
    /** Train ridge regression */
    public static RidgeRegression fit(double[][] x, double[] y, double lambda);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get model coefficients */
    public double[] coefficients();
    
    /** Get intercept term */
    public double intercept();
    
    /** Get regularization parameter */
    public double lambda();
}

/**
 * LASSO regression with L1 regularization
 */
class LASSO implements Regression<double[]> {
    /** Train LASSO regression */
    public static LASSO fit(double[][] x, double[] y, double lambda);
    
    /** Train with tolerance and max iterations */
    public static LASSO fit(double[][] x, double[] y, double lambda, double tolerance, int maxIter);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get model coefficients */
    public double[] coefficients();
    
    /** Get intercept term */
    public double intercept();
    
    /** Get L1 penalty parameter */
    public double lambda();
}

/**
 * Elastic Net regression combining L1 and L2 regularization
 */
class ElasticNet implements Regression<double[]> {
    /** Train Elastic Net regression */
    public static ElasticNet fit(double[][] x, double[] y, double lambda1, double lambda2);
    
    /** Train with convergence parameters */
    public static ElasticNet fit(double[][] x, double[] y, double lambda1, double lambda2, double tolerance, int maxIter);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get model coefficients */
    public double[] coefficients();
    
    /** Get L1 penalty parameter */
    public double lambda1();
    
    /** Get L2 penalty parameter */
    public double lambda2();
}
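The four models above differ only in the penalty added to the least-squares objective: Ridge adds lambda * ||beta||_2^2, LASSO adds lambda * ||beta||_1, and Elastic Net combines both. A self-contained sketch of those penalty terms (not Smile code):

```java
// Penalty terms that distinguish the regularized linear models.
public class Penalties {
    /** L1 norm: sum of absolute coefficients (LASSO penalty). */
    static double l1(double[] beta) {
        double s = 0;
        for (double b : beta) s += Math.abs(b);
        return s;
    }

    /** Squared L2 norm: sum of squared coefficients (Ridge penalty). */
    static double l2Squared(double[] beta) {
        double s = 0;
        for (double b : beta) s += b * b;
        return s;
    }

    /** Elastic Net penalty: lambda1 * ||beta||_1 + lambda2 * ||beta||_2^2. */
    static double elasticNet(double[] beta, double lambda1, double lambda2) {
        return lambda1 * l1(beta) + lambda2 * l2Squared(beta);
    }
}
```

The L1 term drives coefficients exactly to zero (feature selection), while the L2 term shrinks them smoothly, which is why Elastic Net is often preferred when features are correlated.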

Usage Example:

import smile.regression.*;

// Train linear models
OLS ols = OLS.fit(x, y);
RidgeRegression ridge = RidgeRegression.fit(x, y, 0.1);
LASSO lasso = LASSO.fit(x, y, 0.01);

// Make predictions
double prediction = ols.predict(newSample);
double ridgePred = ridge.predict(newSample);

// Get model statistics
double r2 = ols.RSquared();
double[] coeffs = ridge.coefficients();
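The quantities `OLS.fit` exposes (coefficients, intercept, R-squared) can be illustrated with the one-feature closed form. This is a self-contained sketch, not Smile's implementation, which handles the general multivariate case:

```java
// Closed-form simple (one-feature) least squares, for illustration only.
public class SimpleOls {
    /** Returns {slope, intercept} minimizing the residual sum of squares. */
    static double[] fit(double[] x, double[] y) {
        double n = x.length, sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < x.length; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double intercept = (sy - slope * sx) / n;
        return new double[]{slope, intercept};
    }

    /** R-squared = 1 - RSS/TSS for the fitted line. */
    static double rSquared(double[] x, double[] y, double slope, double intercept) {
        double mean = 0;
        for (double v : y) mean += v;
        mean /= y.length;
        double rss = 0, tss = 0;
        for (int i = 0; i < x.length; i++) {
            double e = y[i] - (slope * x[i] + intercept);
            rss += e * e;
            tss += (y[i] - mean) * (y[i] - mean);
        }
        return 1.0 - rss / tss;
    }
}
```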

Tree-Based Regression

Regression algorithms based on decision trees and ensemble methods.

/**
 * Regression tree using CART algorithm
 */
class RegressionTree implements Regression<double[]>, DataFrameRegression {
    /** Train regression tree */
    public static RegressionTree fit(double[][] x, double[] y);
    
    /** Train with formula on DataFrame */
    public static RegressionTree fit(Formula formula, DataFrame data);
    
    /** Train with custom parameters */
    public static RegressionTree fit(double[][] x, double[] y, int maxDepth, int maxNodes, int nodeSize);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get feature importance */
    public double[] importance();
    
    /** Get tree structure */
    public String toString();
}

/**
 * Random Forest regression
 */
class RandomForest implements Regression<double[]>, DataFrameRegression {
    /** Train random forest regression */
    public static RandomForest fit(double[][] x, double[] y);
    
    /** Train with formula on DataFrame */
    public static RandomForest fit(Formula formula, DataFrame data);
    
    /** Train with custom parameters */
    public static RandomForest fit(double[][] x, double[] y, int numTrees, int mtry, int maxDepth, int nodeSize);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get out-of-bag RMSE */
    public double error();
    
    /** Get feature importance */
    public double[] importance();
}

/**
 * Gradient Tree Boosting regression
 */
class GradientTreeBoost implements Regression<double[]>, DataFrameRegression {
    /** Train gradient boosting regression */
    public static GradientTreeBoost fit(double[][] x, double[] y);
    
    /** Train with formula on DataFrame */
    public static GradientTreeBoost fit(Formula formula, DataFrame data);
    
    /** Train with custom parameters */
    public static GradientTreeBoost fit(double[][] x, double[] y, int numTrees, int maxDepth, double shrinkage, double subsample);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get feature importance */
    public double[] importance();
}
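The split criterion a CART regression tree optimizes can be shown with a single-feature "stump": choose the threshold that minimizes the summed squared error of the two child means. A hypothetical, self-contained sketch (not Smile's tree code):

```java
// Variance-reduction split search for one sorted feature.
public class RegressionStump {
    /** Sum of squared deviations from the mean over y[lo..hi). */
    static double sse(double[] y, int lo, int hi) {
        double mean = 0;
        for (int i = lo; i < hi; i++) mean += y[i];
        mean /= (hi - lo);
        double s = 0;
        for (int i = lo; i < hi; i++) s += (y[i] - mean) * (y[i] - mean);
        return s;
    }

    /** Best split index for x sorted ascending; children are [0,k) and [k,n). */
    static int bestSplit(double[] x, double[] y) {
        int best = 1;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int k = 1; k < x.length; k++) {
            double cost = sse(y, 0, k) + sse(y, k, x.length);
            if (cost < bestCost) { bestCost = cost; best = k; }
        }
        return best;
    }
}
```

A full tree applies this search recursively over all features; Random Forest and Gradient Tree Boosting then average or stage many such trees.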

Kernel Methods

Kernel-based regression algorithms including support vector regression and Gaussian processes.

/**
 * Support Vector Regression
 */
class SVM implements Regression<double[]> {
    /** Train SVR with RBF kernel */
    public static SVM fit(double[][] x, double[] y, double gamma, double C, double epsilon);
    
    /** Train SVR with custom kernel */
    public static SVM fit(double[][] x, double[] y, Kernel kernel, double C, double epsilon);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get support vectors */
    public SupportVector[] supportVectors();
    
    /** Get number of support vectors */
    public int numSupportVectors();
}

/**
 * Gaussian Process Regression
 */
class GaussianProcessRegression implements Regression<double[]> {
    /** Train Gaussian Process with RBF kernel */
    public static GaussianProcessRegression fit(double[][] x, double[] y, double sigma);
    
    /** Train with custom kernel and noise */
    public static GaussianProcessRegression fit(double[][] x, double[] y, Kernel kernel, double noise);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Predict with uncertainty estimate */
    public double predict(double[] x, double[] variance);
    
    /** Get posterior mean function */
    public double[] mean();
    
    /** Get kernel function */
    public Kernel kernel();
}
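The posterior mean a Gaussian process returns is a kernel-weighted combination of the training targets. The following self-contained sketch computes it for two training points with an explicit 2x2 solve, using the RBF kernel k(x, z) = exp(-|x - z|^2 / (2 * sigma^2)); it is an illustration of the math, not Smile's implementation:

```java
// GP posterior mean on two 1-D training points, solved by hand.
public class TinyGp {
    static double rbf(double x, double z, double sigma) {
        double d = x - z;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Posterior mean at query q given training points (x1,y1), (x2,y2). */
    static double posteriorMean(double x1, double y1, double x2, double y2,
                                double q, double sigma, double noise) {
        // Build K + noise*I for the two training points.
        double a = rbf(x1, x1, sigma) + noise, b = rbf(x1, x2, sigma);
        double c = b, d = rbf(x2, x2, sigma) + noise;
        // Solve (K + noise*I) alpha = y via the 2x2 inverse.
        double det = a * d - b * c;
        double alpha1 = (d * y1 - b * y2) / det;
        double alpha2 = (-c * y1 + a * y2) / det;
        // Posterior mean is the kernel-weighted sum of the alphas.
        return rbf(q, x1, sigma) * alpha1 + rbf(q, x2, sigma) * alpha2;
    }
}
```

With small noise the posterior interpolates the training data; increasing the noise parameter smooths the fit, which is the role of `noise` in `GaussianProcessRegression.fit` above.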

Neural Network Regression

Multi-layer perceptron for regression tasks with configurable architecture.

/**
 * Multi-Layer Perceptron regression
 */
class MLP implements Regression<double[]> {
    /** Train MLP regression */
    public static MLP fit(double[][] x, double[] y);
    
    /** Train with custom architecture */
    public static MLP fit(double[][] x, double[] y, int[] hiddenLayers, ActivationFunction activation);
    
    /** Train with full configuration */
    public static MLP fit(double[][] x, double[] y, Properties params);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Online learning update */
    public void update(double[] x, double y);
    
    /** Get network weights for layer */
    public double[][] getWeights(int layer);
}
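The online `update(x, y)` call corresponds to one stochastic-gradient step. The sketch below shows that step for a single linear unit under squared error; it is a hypothetical illustration (a real MLP backpropagates the error through its hidden layers):

```java
// One-unit SGD: the core of an online squared-error update.
public class SgdUnit {
    private final double[] w;
    private double bias;
    private final double lr;

    SgdUnit(int dim, double learningRate) {
        w = new double[dim];
        lr = learningRate;
    }

    double predict(double[] x) {
        double s = bias;
        for (int i = 0; i < w.length; i++) s += w[i] * x[i];
        return s;
    }

    /** One gradient step on 0.5 * (predict(x) - y)^2. */
    void update(double[] x, double y) {
        double err = predict(x) - y;
        for (int i = 0; i < w.length; i++) w[i] -= lr * err * x[i];
        bias -= lr * err;
    }
}
```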

Radial Basis Function Networks

RBF networks for non-linear regression with localized activation functions.

/**
 * Radial Basis Function Network regression
 */
class RBFNetwork implements Regression<double[]> {
    /** Train RBF network with Gaussian RBFs */
    public static RBFNetwork fit(double[][] x, double[] y, int numCenters);
    
    /** Train with custom RBF and centers */
    public static RBFNetwork fit(double[][] x, double[] y, RBF rbf, double[][] centers);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get RBF centers */
    public double[][] centers();
    
    /** Get output weights */
    public double[] weights();
}
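An RBF network's forward pass is simply a weighted sum of Gaussian bumps placed at the centers, which is why its prediction is dominated by the centers nearest the input. A self-contained sketch of that pass (hypothetical, not Smile's `RBFNetwork`):

```java
// RBF network forward pass with Gaussian basis functions.
public class RbfForward {
    static double predict(double[] x, double[][] centers, double[] weights, double sigma) {
        double sum = 0;
        for (int j = 0; j < centers.length; j++) {
            // Squared distance from x to center j.
            double d2 = 0;
            for (int i = 0; i < x.length; i++) {
                double d = x[i] - centers[j][i];
                d2 += d * d;
            }
            // Gaussian activation, weighted by the output weight.
            sum += weights[j] * Math.exp(-d2 / (2 * sigma * sigma));
        }
        return sum;
    }
}
```

Training an RBF network amounts to choosing the centers (e.g. by clustering) and then solving a linear system for the output weights, which is what the `fit` overloads above configure.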

Base Classes and Utilities

Abstract base classes and shared interfaces for regression algorithms.

/**
 * Base class for linear regression models
 */
abstract class LinearModel implements Regression<double[]> {
    /** Model coefficients */
    public abstract double[] coefficients();
    
    /** Intercept term */
    public abstract double intercept();
    
    /** Predict using linear combination */
    public double predict(double[] x);
}

/**
 * Base class for kernel-based regression
 */
abstract class KernelMachine<T> implements Regression<T> {
    /** Kernel function */
    public abstract Kernel<T> kernel();
    
    /** Support vectors or training instances */
    public abstract T[] instances();
    
    /** Instance weights */
    public abstract double[] weights();
}

/**
 * Interface for DataFrame-based regression
 */
interface DataFrameRegression {
    /** Train with formula on DataFrame */
    static Regression<Tuple> fit(Formula formula, DataFrame data);
}

Generalized Linear Models

GLM framework for regression with various distribution families.

/**
 * Generalized Linear Model
 */
class GLM implements Regression<double[]> {
    /** Train GLM with Gaussian family (linear regression) */
    public static GLM fit(double[][] x, double[] y);
    
    /** Train GLM with specified family and link */
    public static GLM fit(double[][] x, double[] y, GLM.Family family, GLM.Link link);
    
    /** Train with regularization */
    public static GLM fit(double[][] x, double[] y, GLM.Family family, GLM.Link link, double lambda, double alpha);
    
    /** Predict target value */
    public double predict(double[] x);
    
    /** Get model coefficients */
    public double[] coefficients();
    
    /** Get deviance */
    public double deviance();
    
    /** GLM distribution families */
    enum Family { GAUSSIAN, BINOMIAL, POISSON, GAMMA }
    
    /** GLM link functions */
    enum Link { IDENTITY, LOG, LOGIT, INVERSE, SQRT }
}
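Prediction in a fitted GLM applies the inverse link to the linear predictor eta = intercept + beta . x: identity for GAUSSIAN, exp for the LOG link, and the logistic function for LOGIT. A self-contained sketch (the coefficients are made up for illustration; this is not Smile's `GLM` code):

```java
// Inverse-link prediction for two common GLM families.
public class GlmPredict {
    static double linearPredictor(double[] beta, double intercept, double[] x) {
        double eta = intercept;
        for (int i = 0; i < beta.length; i++) eta += beta[i] * x[i];
        return eta;
    }

    /** POISSON family with LOG link: mean = exp(eta). */
    static double predictPoisson(double[] beta, double intercept, double[] x) {
        return Math.exp(linearPredictor(beta, intercept, x));
    }

    /** BINOMIAL family with LOGIT link: mean = 1 / (1 + exp(-eta)). */
    static double predictBinomial(double[] beta, double intercept, double[] x) {
        return 1.0 / (1.0 + Math.exp(-linearPredictor(beta, intercept, x)));
    }
}
```

The choice of link guarantees predictions stay in the family's valid range: positive counts for Poisson, probabilities in (0, 1) for binomial.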

Usage Example:

import smile.regression.GaussianProcessRegression;
import smile.regression.RandomForest;

// Gaussian Process with uncertainty
GaussianProcessRegression gp = GaussianProcessRegression.fit(x, y, 1.0);
double[] variance = new double[1];
double prediction = gp.predict(newSample, variance);
double uncertainty = Math.sqrt(variance[0]);

// Random Forest ensemble
RandomForest rf = RandomForest.fit(x, y, 500, 5, 20, 5);
double rfPrediction = rf.predict(newSample);
double oobError = rf.error();

Training Patterns

All regression algorithms follow consistent training patterns:

Array-based training:

Regression model = Algorithm.fit(double[][] x, double[] y);
Regression model = Algorithm.fit(double[][] x, double[] y, parameters...);

DataFrame-based training:

Regression model = Algorithm.fit(Formula formula, DataFrame data);

Prediction patterns:

double prediction = model.predict(double[] x);
double prediction = model.predict(double[] x, double[] uncertainty); // For probabilistic models

Common Parameters

Most regression algorithms support these common configuration options:

  • lambda: Regularization parameter (linear models)
  • alpha: Elastic net mixing parameter
  • maxDepth: Maximum tree depth (tree-based)
  • numTrees: Number of trees in ensemble
  • nodeSize: Minimum samples per leaf
  • shrinkage: Learning rate (boosting)
  • subsample: Fraction of samples for training
  • tolerance: Convergence tolerance
  • maxIter: Maximum iterations
  • seed: Random seed for reproducibility

Install with Tessl CLI

npx tessl i tessl/maven-com-github-haifengl--smile-core
