tdigest()

tdigest(
    buckets INTEGER,
    value DOUBLE PRECISION
) RETURNS TDigest

This constructs and returns a tdigest with the specified number of buckets over the given values.

TimescaleDB provides an implementation of the tdigest data structure for quantile approximations. A tdigest is a space-efficient aggregation that provides increased resolution at the edges of the distribution, which allows more accurate estimates of extreme quantiles than traditional methods.

Timescale's tdigest is implemented as an aggregate function in PostgreSQL. It does not support moving-aggregate mode and is not an ordered-set aggregate. It is currently restricted to float values. It is parallelizable and is a good candidate for continuous aggregation.
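As a rough sketch of the continuous aggregation use case, the following builds one digest per hour. It assumes that samples is a hypertable with a timestamp column named ts; that column and the view name samples_hourly_digest are hypothetical and are not part of the examples on this page.

```sql
-- Assumption: samples is a hypertable with a timestamp column ts
-- (hypothetical) alongside the DOUBLE PRECISION column data.
CREATE MATERIALIZED VIEW samples_hourly_digest
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', ts) AS bucket,  -- one row per hour
    tdigest(100, data) AS digest          -- 100-bucket digest of that hour's values
FROM samples
GROUP BY bucket;
```

Storing the digest rather than a precomputed percentile keeps the choice of quantile open until query time.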

The tdigest function is somewhat dependent on the order of its inputs. Percentile approximations should be nearly equal for the same underlying data, especially at the extremes of the quantile range where the tdigest is inherently more accurate, but they are unlikely to be identical if the digest is built in a different order. While this should have little effect on the accuracy of the estimates, repeated creation of the tdigest might produce subtly different results if PostgreSQL parallelizes the call.
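One way to observe this order dependence is to build the digest twice with an explicit input order and compare an extreme quantile. This is only a sketch: it assumes the approx_percentile accessor from the same toolkit is installed, and it relies on PostgreSQL's ORDER BY clause inside an aggregate call to fix the input order.

```sql
-- Assumption: approx_percentile(percentile, tdigest) is available.
-- The two results are expected to be close, but not necessarily identical.
SELECT
    approx_percentile(0.99, tdigest(100, data ORDER BY data ASC))  AS p99_ascending,
    approx_percentile(0.99, tdigest(100, data ORDER BY data DESC)) AS p99_descending
FROM samples;
```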

Required arguments

Name | Type | Description
buckets | INTEGER | Number of buckets in the digest. Increasing this provides more accurate quantile estimates, but requires more memory.
value | DOUBLE PRECISION | Column to aggregate

Returns

Column | Type | Description
tdigest | TDigest | A tdigest object that can be passed to other tdigest APIs

Sample usage

This example uses a table called samples, with a column called data that holds DOUBLE PRECISION values. This query returns a digest over that column:

SELECT tdigest(100, data) FROM samples;

This example builds a view from the aggregate that can be passed to other tdigest functions:

CREATE VIEW digest AS
    SELECT tdigest(100, data)
    FROM samples;
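As an illustration of passing the view's result to another tdigest function, the query below estimates the median and the 99th percentile from the digest view. It assumes the toolkit's approx_percentile accessor is installed, and that the view's single column has PostgreSQL's default name for an unaliased function call, tdigest.

```sql
-- Assumption: approx_percentile(percentile, tdigest) is available, and the
-- view column is named tdigest because the SELECT in the view has no alias.
SELECT
    approx_percentile(0.5, tdigest)  AS median_estimate,
    approx_percentile(0.99, tdigest) AS p99_estimate
FROM digest;
```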
