# Statistical Results

Core interface and implementation classes for accessing computed statistical measures from matrix stats aggregation results.

## Capabilities

### MatrixStats Interface

Primary interface for accessing statistical results from matrix stats aggregations.

```java { .api }
/**
 * Interface for MatrixStats Metric Aggregation results
 */
public interface MatrixStats extends Aggregation {
    /**
     * Return the total document count processed by this aggregation
     * @return Total number of documents
     */
    long getDocCount();

    /**
     * Return total field count (differs from docCount if there are missing values)
     * @param field The field name to get count for
     * @return Number of documents with non-missing values for this field
     */
    long getFieldCount(String field);

    /**
     * Return the field mean (average)
     * @param field The field name to get mean for
     * @return Arithmetic mean of the field values
     */
    double getMean(String field);

    /**
     * Return the field variance
     * @param field The field name to get variance for
     * @return Statistical variance of the field values
     */
    double getVariance(String field);

    /**
     * Return the skewness of the distribution
     * @param field The field name to get skewness for
     * @return Skewness measure (positive = right-skewed, negative = left-skewed)
     */
    double getSkewness(String field);

    /**
     * Return the kurtosis of the distribution
     * @param field The field name to get kurtosis for
     * @return Kurtosis measure (measures tail heaviness)
     */
    double getKurtosis(String field);

    /**
     * Return the covariance between field x and field y
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Covariance between the two fields
     */
    double getCovariance(String fieldX, String fieldY);

    /**
     * Return the correlation coefficient of field x and field y
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Pearson correlation coefficient (-1 to 1)
     */
    double getCorrelation(String fieldX, String fieldY);
}
```

**Usage Examples:**

```java
import org.elasticsearch.search.aggregations.matrix.stats.MatrixStats;

// Extract aggregation results from search response
MatrixStats stats = searchResponse.getAggregations().get("price_analysis");

// Basic statistics for individual fields
long totalDocs = stats.getDocCount();
long priceCount = stats.getFieldCount("price");
double avgPrice = stats.getMean("price");
double priceVariability = stats.getVariance("price");

// Distribution shape analysis
double priceSkew = stats.getSkewness("price");     // Distribution asymmetry
double priceKurtosis = stats.getKurtosis("price"); // Tail heaviness

// Cross-field relationship analysis
double priceQuantityCovariance = stats.getCovariance("price", "quantity");
double priceQuantityCorrelation = stats.getCorrelation("price", "quantity");

// Self-correlation always returns 1.0
double selfCorr = stats.getCorrelation("price", "price"); // Returns 1.0
```

### InternalMatrixStats Implementation

Internal implementation class used for distributed computation across Elasticsearch shards.

```java { .api }
/**
 * Internal implementation of MatrixStats for shard-level computation
 * Computes distribution statistics over multiple fields
 */
public class InternalMatrixStats extends InternalAggregation implements MatrixStats {
    /**
     * Constructor for per-shard statistics
     * @param name Aggregation name
     * @param count Document count
     * @param multiFieldStatsResults Running statistics from this shard
     * @param results Final computed results (null for intermediate reductions)
     * @param metadata Aggregation metadata
     */
    InternalMatrixStats(
        String name,
        long count,
        RunningStats multiFieldStatsResults,
        MatrixStatsResults results,
        Map<String, Object> metadata
    );

    // Implements all MatrixStats interface methods

    /**
     * Get the running statistics object (for internal use)
     * @return Running statistics instance
     */
    RunningStats getStats();

    /**
     * Get the computed results object (for internal use)
     * @return Final results instance, may be null for intermediate reductions
     */
    MatrixStatsResults getResults();

    /**
     * Reduce multiple shard results into a single result
     * @param aggregations List of shard-level aggregation results
     * @param reduceContext Context for the reduction operation
     * @return Combined aggregation result
     */
    public InternalAggregation reduce(List<InternalAggregation> aggregations, ReduceContext reduceContext);
}
```

### ParsedMatrixStats Client Implementation

Parsed version of MatrixStats for client-side usage, typically used when parsing aggregation results from JSON responses.

```java { .api }
/**
 * Parsed version of MatrixStats for client-side usage
 */
public class ParsedMatrixStats extends ParsedAggregation implements MatrixStats {
    // Implements all MatrixStats interface methods

    /**
     * Create ParsedMatrixStats from XContent parser
     * @param parser XContent parser positioned at the aggregation data
     * @param name Name of the aggregation
     * @return Parsed MatrixStats instance
     */
    public static ParsedMatrixStats fromXContent(XContentParser parser, String name) throws IOException;
}
```

### MatrixStatsResults Internal Class

Container class for computed statistical results (package-private, used internally).

```java { .api }
/**
 * Container for computed matrix statistics results
 * Descriptive stats gathered per shard, with final correlation and covariance computed on coordinating node
 */
class MatrixStatsResults implements Writeable {
    /**
     * Default constructor for empty results
     */
    MatrixStatsResults();

    /**
     * Constructor that computes results from running statistics
     * @param stats Running statistics to compute final results from
     */
    MatrixStatsResults(RunningStats stats);

    /**
     * Return document count
     * @return Total number of documents processed
     */
    public final long getDocCount();

    /**
     * Return the field count for the requested field
     * @param field Field name
     * @return Number of non-missing values for this field
     */
    public long getFieldCount(String field);

    /**
     * Return the mean for the requested field
     * @param field Field name
     * @return Arithmetic mean of field values
     */
    public double getMean(String field);

    /**
     * Return the variance for the requested field
     * @param field Field name
     * @return Statistical variance of field values
     */
    public double getVariance(String field);

    /**
     * Return the skewness for the requested field
     * @param field Field name
     * @return Skewness measure of the distribution
     */
    public double getSkewness(String field);

    /**
     * Return the kurtosis for the requested field
     * @param field Field name
     * @return Kurtosis measure of the distribution
     */
    public double getKurtosis(String field);

    /**
     * Return the covariance between two fields
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Covariance between the fields
     */
    public double getCovariance(String fieldX, String fieldY);

    /**
     * Return the correlation coefficient between two fields
     * @param fieldX First field name
     * @param fieldY Second field name
     * @return Pearson correlation coefficient
     */
    public Double getCorrelation(String fieldX, String fieldY);
}
```

### RunningStats Internal Class

Internal class for accumulating statistical data during aggregation processing using a single-pass algorithm.

```java { .api }
/**
 * Running statistics computation for matrix stats aggregation
 * Implements single-pass algorithm for computing statistics across large datasets
 * Based on parallel statistical computation algorithms
 */
public class RunningStats implements Writeable, Cloneable {
    /**
     * Constructor for deserialization from stream
     * @param in Stream input to read from
     */
    public RunningStats(StreamInput in) throws IOException;

    /**
     * Serialize running statistics to stream
     * @param out Stream output to write to
     */
    public void writeTo(StreamOutput out) throws IOException;

    /**
     * Add a document's field values to the running statistics
     * @param fieldNames Array of field names
     * @param fieldVals Array of corresponding field values
     */
    public void add(String[] fieldNames, double[] fieldVals);

    /**
     * Merge another RunningStats instance (from different shard) into this one
     * @param other RunningStats instance to merge
     */
    public void merge(RunningStats other);

    /**
     * Create a deep copy of this RunningStats instance
     * @return Cloned RunningStats instance
     */
    public RunningStats clone();
}
```
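The single-pass accumulation and cross-shard merge that `add` and `merge` describe can be illustrated for a single field. The sketch below is not the Elasticsearch implementation: `SimpleRunningStats` and all of its members are hypothetical names, and it assumes Welford's online update plus the standard pairwise merge formula, tracking only count, mean, and variance.

```java
/** Hypothetical single-field accumulator; not the Elasticsearch RunningStats class. */
class SimpleRunningStats {
    private long count;   // number of values seen so far
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    /** Welford's online update: folds in one value without storing earlier ones. */
    void add(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
    }

    /** Pairwise merge: combines another accumulator, e.g. one built on another shard. */
    void merge(SimpleRunningStats other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = count + other.count;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * count * other.count / total;
        mean += delta * other.count / total;
        count = total;
    }

    long getCount() { return count; }

    double getMean() { return mean; }

    /** Sample variance (n - 1 denominator); NaN when fewer than two values were added. */
    double getVariance() {
        return count < 2 ? Double.NaN : m2 / (count - 1);
    }

    public static void main(String[] args) {
        SimpleRunningStats shardA = new SimpleRunningStats();
        SimpleRunningStats shardB = new SimpleRunningStats();
        for (double v : new double[] {1, 2, 3}) shardA.add(v);
        for (double v : new double[] {4, 5}) shardB.add(v);
        shardA.merge(shardB); // shardA now summarizes all five values
        System.out.println("mean=" + shardA.getMean() + " variance=" + shardA.getVariance());
    }
}
```

Because each accumulator holds only a few numbers, per-shard results can be serialized and merged in any order without revisiting the underlying documents.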

## Statistical Measures Explanation

### Basic Statistics

- **Document Count**: Total number of documents processed
- **Field Count**: Number of documents with non-missing values for a specific field
- **Mean**: Arithmetic average of field values
- **Variance**: Measure of data spread around the mean
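To make the difference between document count and field count concrete, the following sketch computes both over a small in-memory data set. It assumes documents are modelled as `Map<String, Double>` with missing fields simply absent; `BasicStatsSketch` and its methods are hypothetical helpers, not part of the aggregation API.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper illustrating document count versus field count. */
class BasicStatsSketch {
    /** Number of documents that contain a value for the field. */
    static long fieldCount(List<Map<String, Double>> docs, String field) {
        return docs.stream().filter(d -> d.get(field) != null).count();
    }

    /** Arithmetic mean over the documents that have the field; NaN if none do. */
    static double mean(List<Map<String, Double>> docs, String field) {
        return docs.stream()
                .filter(d -> d.get(field) != null)
                .mapToDouble(d -> d.get(field))
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // The second document has no "quantity" value.
        List<Map<String, Double>> docs = List.of(
                Map.of("price", 10.0, "quantity", 2.0),
                Map.of("price", 20.0),
                Map.of("price", 30.0, "quantity", 4.0));
        System.out.println("documents: " + docs.size());
        System.out.println("quantity field count: " + fieldCount(docs, "quantity"));
        System.out.println("quantity mean: " + mean(docs, "quantity"));
    }
}
```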

### Distribution Shape

- **Skewness**: Measures asymmetry of the distribution:
  - Positive: Right-skewed (tail extends to the right)
  - Negative: Left-skewed (tail extends to the left)
  - Zero: Symmetric distribution

- **Kurtosis**: Measures tail heaviness:
  - Higher values indicate heavier tails and more outliers
  - Lower values indicate lighter tails
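The sign convention for skewness and the meaning of kurtosis can be checked on small arrays. This is a minimal sketch assuming the textbook population-moment definitions (third and fourth central moments divided by powers of the second); the exact normalization Elasticsearch applies is not stated here and may differ. `ShapeSketch` is a hypothetical name.

```java
/** Hypothetical helper computing moment-based shape statistics. */
class ShapeSketch {
    /** k-th central moment: the average of (x - mean)^k. */
    static double centralMoment(double[] xs, int k) {
        double total = 0;
        for (double x : xs) total += x;
        double mean = total / xs.length;
        double sum = 0;
        for (double x : xs) sum += Math.pow(x - mean, k);
        return sum / xs.length;
    }

    /** Skewness as m3 / m2^1.5; positive when the longer tail is on the right. */
    static double skewness(double[] xs) {
        return centralMoment(xs, 3) / Math.pow(centralMoment(xs, 2), 1.5);
    }

    /** Kurtosis as m4 / m2^2 (not excess kurtosis); larger means heavier tails. */
    static double kurtosis(double[] xs) {
        double m2 = centralMoment(xs, 2);
        return centralMoment(xs, 4) / (m2 * m2);
    }

    public static void main(String[] args) {
        double[] symmetric = {1, 2, 3, 4, 5};
        double[] rightTailed = {1, 1, 1, 1, 10};
        System.out.println("symmetric skewness: " + skewness(symmetric));
        System.out.println("right-tailed skewness: " + skewness(rightTailed));
        System.out.println("symmetric kurtosis: " + kurtosis(symmetric));
    }
}
```

With these definitions, a field whose values are all identical has zero variance, so both ratios evaluate to `0.0 / 0.0` and yield `NaN`.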

### Cross-Field Relationships

- **Covariance**: Measures how two variables change together:
  - Positive: Variables tend to increase together
  - Negative: One variable increases as the other decreases
  - Zero: No linear relationship

- **Correlation**: Normalized covariance (-1 to 1):
  - 1.0: Perfect positive correlation
  - -1.0: Perfect negative correlation
  - 0.0: No linear correlation
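The relationship between covariance and correlation can be sketched directly: correlation is covariance divided by the product of the two standard deviations. The code below assumes sample statistics (n - 1 denominator) and two arrays of equal length; `RelationSketch` is a hypothetical name and this is not how the aggregation computes its results internally.

```java
/** Hypothetical helper relating covariance to Pearson correlation. */
class RelationSketch {
    static double mean(double[] xs) {
        double total = 0;
        for (double x : xs) total += x;
        return total / xs.length;
    }

    /** Sample covariance; assumes xs and ys have the same length (at least 2). */
    static double covariance(double[] xs, double[] ys) {
        double mx = mean(xs);
        double my = mean(ys);
        double sum = 0;
        for (int i = 0; i < xs.length; i++) {
            sum += (xs[i] - mx) * (ys[i] - my);
        }
        return sum / (xs.length - 1);
    }

    /** Pearson correlation: covariance normalized by both standard deviations. */
    static double correlation(double[] xs, double[] ys) {
        // covariance(xs, xs) is the variance of xs.
        return covariance(xs, ys) / Math.sqrt(covariance(xs, xs) * covariance(ys, ys));
    }

    public static void main(String[] args) {
        double[] price = {1, 2, 3, 4};
        double[] doubled = {2, 4, 6, 8};  // rises exactly with price
        double[] reversed = {8, 6, 4, 2}; // falls exactly as price rises
        System.out.println("covariance: " + covariance(price, doubled));
        System.out.println("correlation (rising): " + correlation(price, doubled));
        System.out.println("correlation (falling): " + correlation(price, reversed));
    }
}
```

Covariance depends on the units of both fields, while correlation is unitless, which is why correlation is usually the easier value to compare across field pairs.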

## Error Handling

Methods signal invalid field names and undefined results in two ways:

- **IllegalArgumentException**: Thrown when a field name is null or does not exist in the aggregation results
- **Double.NaN**: Returned when a statistic is mathematically undefined (e.g., the correlation of two fields when either has zero variance)
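Callers may want to treat both outcomes uniformly. The following sketch wraps any statistic lookup in a `DoubleSupplier` and maps an `IllegalArgumentException` or a `NaN` result to an empty `OptionalDouble`. `SafeStat.read` is a hypothetical helper, not part of the library, and collapsing both cases into "no value" is an assumption about what the caller needs.

```java
import java.util.OptionalDouble;
import java.util.function.DoubleSupplier;

/** Hypothetical wrapper that normalizes failed or undefined statistic lookups. */
class SafeStat {
    /**
     * Runs the lookup and returns its value, or empty when the lookup
     * throws IllegalArgumentException or produces NaN.
     */
    static OptionalDouble read(DoubleSupplier lookup) {
        try {
            double value = lookup.getAsDouble();
            return Double.isNaN(value) ? OptionalDouble.empty() : OptionalDouble.of(value);
        } catch (IllegalArgumentException e) {
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        // With a real result object this could be written as:
        //   SafeStat.read(() -> stats.getCorrelation("price", "quantity"))
        // The lambdas below stand in for such calls.
        System.out.println(read(() -> 0.42).isPresent());
        System.out.println(read(() -> Double.NaN).isPresent());
        System.out.println(read(() -> {
            throw new IllegalArgumentException("unknown field");
        }).isPresent());
    }
}
```

If the two cases need different handling, for example logging unknown field names but silently skipping undefined statistics, they should be kept separate rather than merged as done here.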