
senior-data-scientist

World-class senior data scientist skill specialising in statistical modeling, experiment design, causal inference, and predictive analytics. Covers A/B testing (sample sizing, two-proportion z-tests, Bonferroni correction), difference-in-differences, feature engineering pipelines (Scikit-learn, XGBoost), cross-validated model evaluation (AUC-ROC, AUC-PR, SHAP), and MLflow experiment tracking — using Python (NumPy, Pandas, Scikit-learn), R, and SQL. Use when designing or analysing controlled experiments, building and evaluating classification or regression models, performing causal analysis on observational data, engineering features for structured tabular datasets, or translating statistical findings into data-driven business decisions.

81

1.25x
Quality: 88% (Does it follow best practices?)

Impact: 73%, 1.25x (Average score across 6 eval scenarios)

Security by Snyk: Passed. No known issues.


Evaluation results

41%

5%

A/B Test Design for Checkout Funnel Optimization

Experiment design with provided scripts

| Criteria | Without context | With context |
| --- | --- | --- |
| Uses experiment_designer script | 0% | 0% |
| Correct script flags | 0% | 0% |
| Power analysis script has type hints | 100% | 0% |
| Alpha = 0.05 used | 100% | 100% |
| Power = 0.80 used | 100% | 100% |
| Monitoring plan present | 100% | 100% |
| MLflow or W&B mentioned | 0% | 0% |
| Uptime/error rate target | 0% | 0% |
| Latency SLO referenced | 0% | 100% |
| Scikit-learn or statsmodels used | 0% | 100% |
| Batch processing mentioned | 0% | 0% |
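
For readers unfamiliar with the "Alpha = 0.05" and "Power = 0.80" criteria above, the following is a minimal sketch of a per-arm sample-size calculation for a two-sided two-proportion z-test. It is not the skill's `experiment_designer` script, whose interface is not shown on this page; the function name and the 10% and 12% conversion rates are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_baseline: float, p_treatment: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-arm sample size for a two-sided two-proportion z-test.

    Uses the standard normal-approximation formula with unpooled variances.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    variance = p_baseline * (1 - p_baseline) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_baseline
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical rates: 10% baseline conversion, 12% expected under treatment.
n_per_arm = sample_size_per_arm(0.10, 0.12)
print(f"Users needed per arm: {n_per_arm}")
```

Halving the detectable effect roughly quadruples the required sample, because the effect size enters the formula squared.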

61%

9%

Customer Churn Prediction Feature Pipeline

Feature engineering pipeline with reliability patterns

| Criteria | Without context | With context |
| --- | --- | --- |
| Uses feature pipeline script | 0% | 0% |
| Correct script flags | 0% | 0% |
| Type hints in pipeline | 100% | 100% |
| Batch processing design | 25% | 100% |
| Retry logic present | 0% | 0% |
| Circuit breaker or failure design | 28% | 71% |
| Data quality validation | 100% | 100% |
| Comprehensive tests written | 100% | 100% |
| Pandas or NumPy used | 100% | 100% |
| Comprehensive logging | 100% | 100% |
| 10x scalability noted | 0% | 0% |
| Feature catalog complete | 100% | 100% |

72%

16%

Production Readiness Review for Credit Risk Model

Model evaluation with security and monitoring

| Criteria | Without context | With context |
| --- | --- | --- |
| Uses model eval script | 0% | 0% |
| Correct script flags | 0% | 0% |
| PII anonymization addressed | 100% | 100% |
| Data encryption addressed | 100% | 100% |
| GDPR/CCPA compliance | 100% | 100% |
| Latency SLOs specified | 100% | 100% |
| Error rate target specified | 100% | 100% |
| MLflow or W&B for tracking | 0% | 100% |
| Canary or feature flag deployment | 100% | 100% |
| Type hints in eval script | 0% | 100% |
| Comprehensive logging in code | 0% | 0% |
| SSN/PII not logged raw | 100% | 100% |

84%

24%

Checkout Flow A/B Test Analysis

A/B test multi-metric analysis with corrections

| Criteria | Without context | With context |
| --- | --- | --- |
| Two-proportion z-test | 100% | 100% |
| Lift reported | 100% | 100% |
| Confidence interval reported | 100% | 100% |
| Bonferroni correction applied | 0% | 100% |
| Corrected alpha value | 0% | 100% |
| Sample ratio mismatch check | 100% | 100% |
| SRM threshold referenced | 0% | 100% |
| User-level randomization noted | 25% | 0% |
| Business cycle duration concern | 0% | 0% |
| Primary metric identified | 100% | 100% |
| scipy or statsmodels used | 100% | 100% |
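
Several of the criteria above (two-proportion z-test, lift, confidence interval, Bonferroni-corrected alpha, sample ratio mismatch check) can be illustrated in one place. The sketch below uses only the Python standard library rather than scipy or statsmodels, and it is not taken from the evaluated runs. The conversion counts, the number of metrics (3), and the SRM threshold of 0.001 are all assumptions made for illustration; the page does not state them.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in proportions (pooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Wald confidence interval for p_b - p_a (unpooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


def srm_p_value(n_a, n_b, expected_share_a=0.5):
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the traffic split."""
    total = n_a + n_b
    exp_a = total * expected_share_a
    exp_b = total - exp_a
    chi2 = (n_a - exp_a) ** 2 / exp_a + (n_b - exp_b) ** 2 / exp_b
    return math.erfc(math.sqrt(chi2 / 2))  # survival function of chi-square, 1 df


# Hypothetical data: control 1000/10000 conversions, treatment 1100/10000.
z, p_value = two_proportion_ztest(1000, 10_000, 1100, 10_000)
ci_low, ci_high = diff_confidence_interval(1000, 10_000, 1100, 10_000)
relative_lift = (1100 / 10_000) / (1000 / 10_000) - 1

NUM_METRICS = 3                          # assumed number of metrics tested
corrected_alpha = 0.05 / NUM_METRICS     # Bonferroni: divide alpha by the test count
SRM_ALPHA = 0.001                        # assumed SRM threshold, not from the page
srm_p = srm_p_value(10_000, 10_000)

print(f"z={z:.3f}, p={p_value:.4f}, lift={relative_lift:.1%}")
print(f"95% CI for difference: ({ci_low:.4f}, {ci_high:.4f})")
print(f"Significant after Bonferroni: {p_value < corrected_alpha}")
print(f"SRM detected: {srm_p < SRM_ALPHA}")
```

With these made-up numbers the result clears the uncorrected 0.05 level but not the Bonferroni-corrected one, which is the situation the "Corrected alpha value" criterion is meant to catch.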

92%

4%

Loyalty Program Impact Analysis

Causal inference with difference-in-differences

| Criteria | Without context | With context |
| --- | --- | --- |
| statsmodels formula API | 100% | 100% |
| Interaction term in formula | 100% | 100% |
| HC3 robust standard errors | 0% | 100% |
| Clustered standard errors | 100% | 100% |
| Parallel trends validation | 100% | 100% |
| ATT reported | 100% | 100% |
| Confidence interval for ATT | 100% | 100% |
| Propensity score matching considered | 100% | 0% |
| Baseline group comparison | 100% | 100% |
| Not just p-value | 100% | 100% |
| pandas used for data | 100% | 100% |
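
As background for the "Interaction term in formula" and "ATT reported" criteria above, here is a minimal sketch of the two-group, two-period difference-in-differences point estimate computed directly from group means. In this simple 2x2 setting the value equals the coefficient on the treated-by-post interaction in an ordinary least squares regression. The sketch uses the standard library only and does not produce standard errors, so it does not cover the HC3 or clustered-error criteria. The function name and spend figures are hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences: treated change minus control change.

    Interpretable as the ATT only if the parallel trends assumption holds.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical monthly spend per customer, before and after a program launch.
att = did_estimate(
    treated_pre=[100, 110, 90],
    treated_post=[130, 120, 140],
    control_pre=[80, 90, 100],
    control_post=[95, 100, 105],
)
print(f"Estimated ATT: {att}")
```

Subtracting the control group's change removes any shift common to both groups, which is why the baseline comparison and parallel trends checks in the table matter for interpreting the number.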

92%

35%

Customer Churn Prediction Model

Feature engineering specifics and imbalanced classification

| Criteria | Without context | With context |
| --- | --- | --- |
| Log-transform applied | 100% | 100% |
| High-cardinality target encoding | 0% | 20% |
| Cyclical time feature (sin/cos) | 0% | 100% |
| is_weekend feature | 0% | 100% |
| Fit on train only | 40% | 100% |
| Lag features before split | 100% | 100% |
| Feature business meaning documented | 100% | 100% |
| AUC-PR reported | 100% | 100% |
| AUC-ROC reported | 100% | 100% |
| DummyClassifier baseline | 100% | 100% |
| SHAP values computed | 0% | 100% |
| StratifiedKFold used | 100% | 100% |
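
The "Cyclical time feature (sin/cos)" and "is_weekend feature" criteria above refer to a common encoding for timestamps. A minimal sketch follows, assuming hour of day is the cyclical quantity; the function name, feature names, and example timestamp are hypothetical and not taken from the skill.

```python
import math
from datetime import datetime


def time_features(ts: datetime) -> dict:
    """Encode hour of day on the unit circle and flag Saturday/Sunday."""
    hour_angle = 2 * math.pi * ts.hour / 24
    return {
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "is_weekend": int(ts.weekday() >= 5),  # Monday is 0, so 5 and 6 are the weekend
    }


# 2024-01-06 is a Saturday.
late_evening = time_features(datetime(2024, 1, 6, 23, 0))
midnight = time_features(datetime(2024, 1, 7, 0, 0))

# Distance between 23:00 and 00:00 in the encoded space.
gap = math.hypot(
    late_evening["hour_sin"] - midnight["hour_sin"],
    late_evening["hour_cos"] - midnight["hour_cos"],
)
print(late_evening, f"gap={gap:.3f}")
```

The point of the sin/cos pair is that 23:00 and 00:00 end up close together, whereas a raw hour column would place them 23 units apart.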

Repository: alirezarezvani/claude-skills

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
