# Statistical Results

Core interface and implementation classes for accessing computed statistical measures from matrix stats aggregation results.

## Capabilities

### MatrixStats Interface

Primary interface for accessing statistical results from matrix stats aggregations.

```java { .api }
/**
 * Interface for MatrixStats Metric Aggregation results
 */
public interface MatrixStats extends Aggregation {
    /**
     * Return the total document count processed by this aggregation
     * @return Total number of documents
     */
    long getDocCount();

    /**
     * Return total field count (differs from docCount if there are missing values)
     * @param field The field name to get count for
     * @return Number of documents with non-missing values for this field
     */
    long getFieldCount(String field);

    /**
     * Return the field mean (average)
     * @param field The field name to get mean for
     * @return Arithmetic mean of the field values
     */
    double getMean(String field);

    /**
     * Return the field variance
     * @param field The field name to get variance for
     * @return Statistical variance of the field values
     */
    double getVariance(String field);

    /**
     * Return the skewness of the distribution
     * @param field The field name to get skewness for
     * @return Skewness measure (positive = right-skewed, negative = left-skewed)
     */
    double getSkewness(String field);

    /**
     * Return the kurtosis of the distribution
     * @param field The field name to get kurtosis for
     * @return Kurtosis measure (measures tail heaviness)
     */
    double getKurtosis(String field);

    /**
     * Return the covariance between field x and field y
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Covariance between the two fields
     */
    double getCovariance(String fieldX, String fieldY);

    /**
     * Return the correlation coefficient of field x and field y
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Pearson correlation coefficient (-1 to 1)
     */
    double getCorrelation(String fieldX, String fieldY);
}
```

**Usage Examples:**

```java
import org.elasticsearch.search.aggregations.matrix.stats.MatrixStats;

// Extract aggregation results from search response
MatrixStats stats = searchResponse.getAggregations().get("price_analysis");

// Basic statistics for individual fields
long totalDocs = stats.getDocCount();
long priceCount = stats.getFieldCount("price");
double avgPrice = stats.getMean("price");
double priceVariability = stats.getVariance("price");

// Distribution shape analysis
double priceSkew = stats.getSkewness("price");     // Distribution asymmetry
double priceKurtosis = stats.getKurtosis("price"); // Tail heaviness

// Cross-field relationship analysis
double priceQuantityCovariance = stats.getCovariance("price", "quantity");
double priceQuantityCorrelation = stats.getCorrelation("price", "quantity");

// Self-correlation always returns 1.0
double selfCorr = stats.getCorrelation("price", "price"); // Returns 1.0
```

### InternalMatrixStats Implementation

Internal implementation class used for distributed computation across Elasticsearch shards.

```java { .api }
/**
 * Internal implementation of MatrixStats for shard-level computation
 * Computes distribution statistics over multiple fields
 */
public class InternalMatrixStats extends InternalAggregation implements MatrixStats {
    /**
     * Constructor for per-shard statistics
     * @param name Aggregation name
     * @param count Document count
     * @param multiFieldStatsResults Running statistics from this shard
     * @param results Final computed results (null for intermediate reductions)
     * @param metadata Aggregation metadata
     */
    InternalMatrixStats(
        String name,
        long count,
        RunningStats multiFieldStatsResults,
        MatrixStatsResults results,
        Map<String, Object> metadata
    );

    // Implements all MatrixStats interface methods

    /**
     * Get the running statistics object (for internal use)
     * @return Running statistics instance
     */
    RunningStats getStats();

    /**
     * Get the computed results object (for internal use)
     * @return Final results instance, may be null for intermediate reductions
     */
    MatrixStatsResults getResults();

    /**
     * Reduce multiple shard results into a single result
     * @param aggregations List of shard-level aggregation results
     * @param reduceContext Context for the reduction operation
     * @return Combined aggregation result
     */
    public InternalAggregation reduce(List<InternalAggregation> aggregations, ReduceContext reduceContext);
}
```

### ParsedMatrixStats Client Implementation

Parsed version of MatrixStats for client-side usage, typically used when parsing aggregation results from JSON responses.

```java { .api }
/**
 * Parsed version of MatrixStats for client-side usage
 */
public class ParsedMatrixStats extends ParsedAggregation implements MatrixStats {
    // Implements all MatrixStats interface methods

    /**
     * Create ParsedMatrixStats from XContent parser
     * @param parser XContent parser positioned at the aggregation data
     * @param name Name of the aggregation
     * @return Parsed MatrixStats instance
     */
    public static ParsedMatrixStats fromXContent(XContentParser parser, String name) throws IOException;
}
```

### MatrixStatsResults Internal Class

Container class for computed statistical results (package-private, used internally).

```java { .api }
/**
 * Container for computed matrix statistics results
 * Descriptive stats gathered per shard, with final correlation and covariance computed on coordinating node
 */
class MatrixStatsResults implements Writeable {
    /**
     * Default constructor for empty results
     */
    MatrixStatsResults();

    /**
     * Constructor that computes results from running statistics
     * @param stats Running statistics to compute final results from
     */
    MatrixStatsResults(RunningStats stats);

    /**
     * Return document count
     * @return Total number of documents processed
     */
    public final long getDocCount();

    /**
     * Return the field count for the requested field
     * @param field Field name
     * @return Number of non-missing values for this field
     */
    public long getFieldCount(String field);

    /**
     * Return the mean for the requested field
     * @param field Field name
     * @return Arithmetic mean of field values
     */
    public double getMean(String field);

    /**
     * Return the variance for the requested field
     * @param field Field name
     * @return Statistical variance of field values
     */
    public double getVariance(String field);

    /**
     * Return the skewness for the requested field
     * @param field Field name
     * @return Skewness measure of the distribution
     */
    public double getSkewness(String field);

    /**
     * Return the kurtosis for the requested field
     * @param field Field name
     * @return Kurtosis measure of the distribution
     */
    public double getKurtosis(String field);

    /**
     * Return the covariance between two fields
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Covariance between the fields
     */
    public double getCovariance(String fieldX, String fieldY);

    /**
     * Return the correlation coefficient between two fields
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Pearson correlation coefficient
     */
    public Double getCorrelation(String fieldX, String fieldY);
}
```

### RunningStats Internal Class

Internal class for accumulating statistical data during aggregation processing using a single-pass algorithm.

```java { .api }
/**
 * Running statistics computation for matrix stats aggregation
 * Implements single-pass algorithm for computing statistics across large datasets
 * Based on parallel statistical computation algorithms
 */
public class RunningStats implements Writeable, Cloneable {
    /**
     * Constructor for deserialization from stream
     * @param in Stream input to read from
     */
    public RunningStats(StreamInput in) throws IOException;

    /**
     * Serialize running statistics to stream
     * @param out Stream output to write to
     */
    public void writeTo(StreamOutput out) throws IOException;

    /**
     * Add a document's field values to the running statistics
     * @param fieldNames Array of field names
     * @param fieldVals Array of corresponding field values
     */
    public void add(String[] fieldNames, double[] fieldVals);

    /**
     * Merge another RunningStats instance (from different shard) into this one
     * @param other RunningStats instance to merge
     */
    public void merge(RunningStats other);

    /**
     * Create a deep copy of this RunningStats instance
     * @return Cloned RunningStats instance
     */
    public RunningStats clone();
}
```
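
To make the single-pass and merge ideas concrete, here is a minimal sketch for one field only, using Welford's update and the standard pairwise combination formula. `SingleFieldRunningStats` is a hypothetical class written for illustration; it is not the Elasticsearch `RunningStats` implementation, and the sample (n - 1) variance is an assumption of the sketch.

```java
/**
 * Hypothetical single-field accumulator illustrating a single-pass algorithm.
 * Not the Elasticsearch RunningStats class, which tracks several fields at once.
 */
final class SingleFieldRunningStats {
    private long count;   // number of values seen so far
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    /** Add one value using Welford's update; no second pass over the data is needed. */
    void add(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
    }

    /** Fold another accumulator (for example, from a different shard) into this one. */
    void merge(SingleFieldRunningStats other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) count * other.count / total);
        mean += delta * other.count / total;
        count = total;
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Sample variance (n - 1 denominator, an assumption); NaN with fewer than two values. */
    double variance() {
        return count < 2 ? Double.NaN : m2 / (count - 1);
    }

    public static void main(String[] args) {
        SingleFieldRunningStats shardA = new SingleFieldRunningStats();
        SingleFieldRunningStats shardB = new SingleFieldRunningStats();
        for (double v : new double[] {1, 2, 3}) shardA.add(v);
        for (double v : new double[] {4, 5}) shardB.add(v);
        shardA.merge(shardB); // same result as adding all five values to one accumulator
        System.out.println("count=" + shardA.count()
                + " mean=" + shardA.mean()
                + " variance=" + shardA.variance());
    }
}
```

Because `merge` only needs each side's count, mean, and sum of squared deviations, per-shard accumulators can be combined in any order without revisiting the documents.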

## Statistical Measures Explanation

### Basic Statistics

- **Document Count**: Total number of documents processed
- **Field Count**: Number of documents with non-missing values for a specific field
- **Mean**: Arithmetic average of field values
- **Variance**: Measure of data spread around the mean

### Distribution Shape

- **Skewness**: Measures asymmetry of the distribution:
  - Positive: Right-skewed (tail extends to the right)
  - Negative: Left-skewed (tail extends to the left)
  - Zero: Symmetric distribution

- **Kurtosis**: Measures tail heaviness:
  - Higher values indicate heavier tails and more outliers
  - Lower values indicate lighter tails
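
The shape measures above can be illustrated with plain moment-based formulas. This is a sketch under stated assumptions: `ShapeStats` is a hypothetical helper, it uses population central moments and non-excess kurtosis, and the aggregation itself may use a different normalization.

```java
/** Hypothetical helper illustrating moment-based skewness and kurtosis. */
final class ShapeStats {
    /** k-th central moment: the average of (x - mean)^k over all values. */
    static double centralMoment(double[] values, int k) {
        double mean = 0;
        for (double v : values) mean += v;
        mean /= values.length;
        double sum = 0;
        for (double v : values) sum += Math.pow(v - mean, k);
        return sum / values.length;
    }

    /** Third central moment scaled by variance^1.5; the sign gives the tail direction. */
    static double skewness(double[] values) {
        return centralMoment(values, 3) / Math.pow(centralMoment(values, 2), 1.5);
    }

    /** Fourth central moment scaled by variance^2 (non-excess convention, an assumption). */
    static double kurtosis(double[] values) {
        double m2 = centralMoment(values, 2);
        return centralMoment(values, 4) / (m2 * m2);
    }

    public static void main(String[] args) {
        double[] symmetric = {1, 2, 3, 4, 5};
        double[] rightSkewed = {1, 2, 3, 4, 10}; // one large value stretches the right tail
        System.out.println("symmetric skewness:    " + skewness(symmetric));
        System.out.println("right-skewed skewness: " + skewness(rightSkewed));
        System.out.println("right-skewed kurtosis: " + kurtosis(rightSkewed));
    }
}
```

With identical values the variance is zero, so both ratios evaluate to `0.0 / 0.0` and the sketch yields `Double.NaN`.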

### Cross-Field Relationships

- **Covariance**: Measures how two variables change together:
  - Positive: Variables tend to increase together
  - Negative: One variable increases as the other decreases
  - Zero: No linear relationship

- **Correlation**: Normalized covariance (-1 to 1):
  - 1.0: Perfect positive correlation
  - -1.0: Perfect negative correlation
  - 0.0: No linear correlation
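
The relationship between covariance and correlation can be sketched in a few lines. `PairStats` is a hypothetical helper, and the sample (n - 1) denominator is an assumption; the denominator cancels in the correlation, so that value does not depend on the choice.

```java
/** Hypothetical helper illustrating covariance and Pearson correlation. */
final class PairStats {
    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.length;
    }

    /** Sample covariance: average product of paired deviations from each mean. */
    static double covariance(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double meanX = mean(x);
        double meanY = mean(y);
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (x.length - 1);
    }

    /** Covariance divided by the product of the two standard deviations. */
    static double correlation(double[] x, double[] y) {
        return covariance(x, y) / Math.sqrt(covariance(x, x) * covariance(y, y));
    }

    public static void main(String[] args) {
        double[] price = {1, 2, 3, 4};
        double[] doubled = {2, 4, 6, 8};   // rises with price
        double[] reversed = {8, 6, 4, 2};  // falls as price rises
        System.out.println("covariance:           " + covariance(price, doubled));
        System.out.println("correlation (up):     " + correlation(price, doubled));
        System.out.println("correlation (down):   " + correlation(price, reversed));
    }
}
```

Covariance carries the units of both fields, so its magnitude is hard to compare across field pairs; the correlation rescales it into the fixed range from -1 to 1.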

## Error Handling

Accessor methods signal invalid field names and undefined results in different ways:

- **IllegalArgumentException**: Thrown when the field name is null or does not exist in the aggregation results
- **Double.NaN**: Returned, not thrown, when a measure is mathematically undefined (e.g., the skewness or correlation of a field whose values are all identical, where the zero variance in the denominator leaves the ratio undefined)
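
Callers that want to treat both cases uniformly could wrap the accessors. The following is a minimal sketch, assuming the behavior described above; `SafeStats` and `read` are hypothetical names, and the `ToDoubleFunction<String>` parameter stands in for a method reference such as `stats::getSkewness`.

```java
import java.util.OptionalDouble;
import java.util.function.ToDoubleFunction;

/** Hypothetical wrapper that maps both failure modes to an empty result. */
final class SafeStats {
    /**
     * Calls a per-field accessor and returns empty when the field is rejected
     * (IllegalArgumentException) or the measure is undefined (NaN).
     */
    static OptionalDouble read(ToDoubleFunction<String> accessor, String field) {
        try {
            double value = accessor.applyAsDouble(field);
            return Double.isNaN(value) ? OptionalDouble.empty() : OptionalDouble.of(value);
        } catch (IllegalArgumentException e) {
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in accessor for illustration only; real code would pass stats::getMean.
        ToDoubleFunction<String> fakeMean = field -> {
            if (!"price".equals(field)) {
                throw new IllegalArgumentException("field " + field + " does not exist");
            }
            return 12.5;
        };
        System.out.println(read(fakeMean, "price").isPresent());   // known field
        System.out.println(read(fakeMean, "missing").isPresent()); // unknown field
    }
}
```

Collapsing both cases into an empty `OptionalDouble` loses the reason for the failure, so code that needs to distinguish a misspelled field from an undefined measure should handle the two separately.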