|
static int | TDigestClusterCompare (const void *operand1_ptr, const void *operand2_ptr) |
| Function used to compare two nodes by looking at their mean values. This is an input to the qsort function.
|
|
static uint32_t | TDigestInterpolate (const uint32_t input_sample1, const uint32_t input_sample2, int sample_index, int total_interpolation_points) |
| Function used to interpolate from input_sample1 to input_sample2 by sample_index/num_samples_of_interest.
|
|
static void | TDigestSort (TDigest *td_ptr) |
| Function used to sort a TDigest structure.
|
|
static int | TDigestFindCluster (const TDigest *td_ptr, int desired_sample, int *total_samples_ptr) |
| Function used to find the cluster index where a given sample number resides. This function walks through all clusters counting samples in each one until it finds the cluster with the requested sample number.
|
|
static int | TDigestGetClusterLimit (const TDigest *td_ptr, int cluster_index) |
| Function used to figure out if a proposed max percentile for a cluster is under the percentile limit for that cluster.
|
|
static bool | TDigestMerge (TDigest *td_ptr) |
| Function used to merge all clusters of the t-digest. This function follows Algorithm 1 from the above-referenced white paper by Dunning and Ertl, which provides a means to merge a t-digest with a list of additional samples.
|
|
static uint32_t | TDigestCalculatePercentile (TDigest *td_ptr, int percentile) |
| Function to run the calculation for a percentile value.
|
|
bool | TDigestCreate (TDigestHandle *ret_td_handle_ptr) |
| Function used to create a new t-digest.
|
|
void | TDigestDestroy (TDigestHandle td_handle) |
| Function used to free t-digest memory.
|
|
void | TDigestClear (TDigestHandle td_handle) |
| Function used to reset t-digest to begin collecting a new set of statistics. This function initializes all the TDigest data members.
|
|
void | TDigestAddSample (TDigestHandle td_handle, uint32_t value) |
| Function used to create a new t-digest.
|
|
bool | TDigestGetPercentileValue (TDigestHandle td_handle, int percentile, uint32_t *value_at_percentile_ptr) |
| Function used to get the value at a given percentile.
|
|
int | TDigestGetCount (TDigestHandle td_handle) |
| Function used to get the number of samples in the digest.
|
|
An implementation of the t-digest percentile estimation algorithm developed by Ted Dunning and Otmar Ertl and available here: https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf.
This algorithm gathers samples of a given metric and stores them in clusters of samples such that each cluster contains a mean and sample count and no other information. As clusters grow and the number of clusters grows, clusters can be combined or created in order to meet algorithm requirements for the max number of clusters and cluster weight (number of samples per cluster). Clusters near the edges of the distribution of samples are scaled such that they contain less samples and clusters near the center of the distribution are scaled such that they contain more samples. Such scaling has the effect of keeping estimation error low. (Nearly) exact values for a given percentile can be calculated from this set of clusters by interpolating between cluster means.
static bool TDigestMerge |
( |
TDigest * | td_ptr | ) |
|
|
static |
Function used to merge all clusters of the t-digest. This function follows Algorithm 1 from the above-referenced white paper by Dunning and Ertl, which provides a means to merge a t-digest with a list of additional samples.
- Parameters
-
td_ptr | Pointer to the TDigest object to use. |
- Returns
- True if merge was successful; false if not.
NOTE: We choose to always loop forward to simplify the algorithm, but the white paper referenced above discusses a potential improvement to error rates near q=0 if looping is alternated between forward and reverse from merge to merge.