
domain-ml

Use when building ML/AI apps in Rust. Keywords: machine learning, ML, AI, tensor, model, inference, neural network, deep learning, training, prediction, ndarray, tch-rs, burn, candle, 机器学习, 人工智能, 模型推理

Install with Tessl CLI

npx tessl i github:actionbook/rust-skills --skill domain-ml

Machine Learning Domain

Layer 3: Domain Constraints

Domain Constraints → Design Implications

| Domain Rule | Design Constraint | Rust Implication |
| --- | --- | --- |
| Large data | Efficient memory | Zero-copy, streaming |
| GPU acceleration | CUDA/Metal support | candle, tch-rs |
| Model portability | Standard formats | ONNX |
| Batch processing | Throughput over latency | Batched inference |
| Numerical precision | Float handling | ndarray, careful f32/f64 |
| Reproducibility | Deterministic | Seeded random, versioning (sketch below) |
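
For the reproducibility row, a minimal sketch assuming the rand crate (0.8 API): pinning the seed makes shuffles and initializations identical across runs.

```rust
use rand::rngs::StdRng;
use rand::seq::SliceRandom;
use rand::{Rng, SeedableRng};

fn main() {
    // Same seed -> identical shuffle order and "random" values every run.
    let mut rng = StdRng::seed_from_u64(42);

    let mut indices: Vec<usize> = (0..8).collect();
    indices.shuffle(&mut rng);

    let init: Vec<f32> = (0..4).map(|_| rng.gen_range(-0.1f32..0.1)).collect();
    println!("{indices:?} {init:?}");
}
```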

Critical Constraints

Memory Efficiency

RULE: Avoid copying large tensors
WHY: Memory bandwidth is bottleneck
RUST: References, views, in-place ops
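
A minimal sketch with ndarray (shapes are illustrative): slicing produces a view that borrows the existing buffer, and mapv_inplace mutates it without allocating a second tensor.

```rust
use ndarray::{s, Array2};

fn main() {
    let mut batch = Array2::<f32>::zeros((1024, 512));

    // A view borrows the same buffer: no copy of the 1024x512 block.
    let first_rows = batch.slice(s![..16, ..]);
    assert_eq!(first_rows.shape(), &[16, 512]);

    // In-place scaling mutates the existing allocation, no new tensor.
    batch.mapv_inplace(|x| x * 0.5);
}
```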

GPU Utilization

RULE: Batch operations for GPU efficiency
WHY: GPU overhead per kernel launch
RUST: Batch sizes, async data loading
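
A minimal sketch assuming the candle-core crate (exact constructor signatures vary by version): one matmul over a (batch, features) tensor pays the kernel-launch overhead once instead of once per example.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Use CUDA device 0 when available, otherwise fall back to CPU.
    let device = Device::cuda_if_available(0)?;

    let batch = Tensor::randn(0f32, 1f32, (64, 512), &device)?; // 64 inputs at once
    let weights = Tensor::randn(0f32, 1f32, (512, 128), &device)?;

    // One kernel launch for the whole batch instead of 64 small ones.
    let out = batch.matmul(&weights)?;
    println!("{:?}", out.shape());
    Ok(())
}
```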

Model Portability

RULE: Use standard model formats
WHY: Train in Python, deploy in Rust
RUST: ONNX via tract or candle

Trace Down ↓

From constraints to design (Layer 2):

"Need efficient data pipelines"
    ↓ m10-performance: Streaming, batching
    ↓ polars: Lazy evaluation

"Need GPU inference"
    ↓ m07-concurrency: Async data loading
    ↓ candle/tch-rs: CUDA backend

"Need model loading"
    ↓ m12-lifecycle: Lazy init, caching
    ↓ tract: ONNX runtime
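
For the polars step above, a minimal sketch of lazy evaluation, assuming the polars crate with the lazy and csv features (column names are illustrative; the reader API differs across versions): nothing is read until collect(), so filters and projections are pushed down into the scan.

```rust
use polars::prelude::*;

fn load_features(path: &str) -> PolarsResult<DataFrame> {
    LazyCsvReader::new(path)
        .finish()?                                // builds a lazy plan, reads nothing yet
        .filter(col("label").is_not_null())       // predicate pushdown
        .select([col("feature_a"), col("label")]) // projection pushdown
        .collect()                                // single optimized pass over the file
}
```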

Use Case → Framework

| Use Case | Recommended | Why |
| --- | --- | --- |
| Inference only | tract (ONNX) | Lightweight, portable |
| Training + inference | candle, burn | Pure Rust, GPU |
| PyTorch models | tch-rs | Direct bindings |
| Data pipelines | polars | Fast, lazy eval |

Key Crates

| Purpose | Crate |
| --- | --- |
| Tensors | ndarray |
| ONNX inference | tract |
| ML framework | candle, burn |
| PyTorch bindings | tch-rs |
| Data processing | polars |
| Embeddings | fastembed (sketch below) |
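
For the embeddings row, a minimal sketch assuming the fastembed crate (constructor and options differ across versions; the default model is downloaded and cached on first use):

```rust
use fastembed::TextEmbedding;

fn main() -> anyhow::Result<()> {
    // Load the default embedding model with default options.
    let model = TextEmbedding::try_new(Default::default())?;

    let documents = vec!["Rust for ML inference", "ONNX model serving"];
    // None = default batch size.
    let embeddings = model.embed(documents, None)?;
    println!("{} vectors of dim {}", embeddings.len(), embeddings[0].len());
    Ok(())
}
```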

Design Patterns

| Pattern | Purpose | Implementation |
| --- | --- | --- |
| Model loading | Once, reuse | OnceLock<Model> |
| Batching | Throughput | Collect then process |
| Streaming | Large data | Iterator-based (sketch below) |
| GPU async | Parallelism | Data loading parallel to compute |
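
For the streaming row, a minimal sketch of iterator-based processing over a hypothetical CSV-of-floats file: rows are parsed one at a time, so memory use stays flat regardless of file size.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

// Yields one parsed row at a time; the whole file is never in memory.
fn stream_rows(path: &str) -> io::Result<impl Iterator<Item = Vec<f32>>> {
    let reader = BufReader::new(File::open(path)?);
    Ok(reader.lines().filter_map(|line| {
        let line = line.ok()?;
        // Parse a comma-separated row of floats; skip malformed rows.
        line.split(',').map(|v| v.trim().parse().ok()).collect()
    }))
}
```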

Code Pattern: Inference Server

```rust
use std::sync::OnceLock;
use tract_onnx::prelude::*;

// Alias for tract's optimized, runnable plan type.
type Model = SimplePlan<TypedFact, Box<dyn TypedOp>, Graph<TypedFact, Box<dyn TypedOp>>>;

static MODEL: OnceLock<Model> = OnceLock::new();

fn get_model() -> &'static Model {
    // First caller pays the load cost; every later caller shares the same plan.
    MODEL.get_or_init(|| {
        tract_onnx::onnx()
            .model_for_path("model.onnx")
            .expect("load ONNX model")
            .into_optimized()
            .expect("optimize graph")
            .into_runnable()
            .expect("make plan runnable")
    })
}

async fn predict(input: Vec<f32>) -> anyhow::Result<Vec<f32>> {
    let model = get_model();
    // Reshape the flat input into a batch of one: (1, n).
    let tensor: Tensor = tract_ndarray::arr1(&input)
        .into_shape((1, input.len()))?
        .into();
    let result = model.run(tvec!(tensor.into()))?;
    Ok(result[0].to_array_view::<f32>()?.iter().copied().collect())
}
```

Code Pattern: Batched Inference

```rust
async fn batch_predict(inputs: Vec<Vec<f32>>, batch_size: usize) -> anyhow::Result<Vec<Vec<f32>>> {
    // Assumes the ONNX model was exported with a dynamic batch dimension.
    let model = get_model();
    let mut results = Vec::with_capacity(inputs.len());

    for batch in inputs.chunks(batch_size) {
        // Stack the chunk into one (batch, features) tensor.
        let features = batch[0].len();
        let flat: Vec<f32> = batch.iter().flatten().copied().collect();
        let tensor: Tensor =
            tract_ndarray::Array2::from_shape_vec((batch.len(), features), flat)?.into();

        // One forward pass for the whole chunk.
        let output = model.run(tvec!(tensor.into()))?;

        // Unstack the (batch, out_dim) result into per-input vectors.
        let view = output[0].to_array_view::<f32>()?;
        results.extend(view.outer_iter().map(|row| row.iter().copied().collect::<Vec<f32>>()));
    }

    Ok(results)
}
```

Common Mistakes

| Mistake | Domain Violation | Fix |
| --- | --- | --- |
| Clone tensors | Memory waste | Use views |
| Single inference | GPU underutilized | Batch processing |
| Load model per request | Slow | Singleton pattern |
| Sync data loading | GPU idle | Async pipeline |

Trace to Layer 1

| Constraint | Layer 2 Pattern | Layer 1 Implementation |
| --- | --- | --- |
| Memory efficiency | Zero-copy | ndarray views |
| Model singleton | Lazy init | OnceLock<Model> |
| Batch processing | Chunked iteration | chunks() + parallel |
| GPU async | Concurrent loading | tokio::spawn + GPU (sketch below) |
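
For the GPU-async row, a minimal prefetching sketch on tokio; load_batch and infer are hypothetical stand-ins for real data loading and inference. The next batch is loaded while the current one is computed, so the accelerator is not left waiting on I/O.

```rust
use tokio::task;

type Batch = Vec<f32>;

// Hypothetical stand-ins for real data loading and GPU inference.
async fn load_batch(i: usize) -> anyhow::Result<Batch> {
    Ok(vec![i as f32; 1024])
}

async fn infer(_batch: Batch) -> anyhow::Result<()> {
    Ok(())
}

async fn pipeline(n_batches: usize) -> anyhow::Result<()> {
    // Prefetch the first batch before entering the loop.
    let mut next = task::spawn(load_batch(0));
    for i in 0..n_batches {
        let batch = next.await??;
        // Kick off the next load before computing, so I/O overlaps inference.
        if i + 1 < n_batches {
            next = task::spawn(load_batch(i + 1));
        }
        infer(batch).await?;
    }
    Ok(())
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    pipeline(4).await
}
```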

Related Skills

| When | See |
| --- | --- |
| Performance | m10-performance |
| Lazy initialization | m12-lifecycle |
| Async patterns | m07-concurrency |
| Memory efficiency | m01-ownership |
Repository: actionbook/rust-skills