# Percentile approximation

Percentiles are useful for understanding how your data is distributed. The 50th percentile is the value that half of your data is greater than and half is less than. The 10th percentile is the value that 90% of the data is greater than and 10% is less than. The 99th percentile is the value that 1% of the data is greater than and 99% is less than.

The 50th percentile, or median, is often a more useful measure than the average, especially when your data contains outliers. Outliers can dramatically change the average, but do not affect the median as much. For example, if you have three rooms in your house and two of them are 40℉ (4℃) and one is 130℉ (54℃), the average room temperature is 70℉ (21℃), which doesn't tell you much. However, the 50th percentile temperature is 40℉ (4℃), which tells you that at least half your rooms are at refrigerator temperatures (also, you should probably get your heating checked!).
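The room example can be checked with a few lines of Python. This is only a sketch of the arithmetic using the standard library's `statistics` module; the variable names are made up for illustration.

```python
from statistics import mean, median

# Temperatures (in Fahrenheit) of the three rooms from the example.
room_temps_f = [40, 40, 130]

# The single hot room pulls the average up.
average_f = mean(room_temps_f)

# The median is the middle value of the sorted list, so the outlier has no effect.
median_f = median(room_temps_f)

print(f"average: {average_f} F, median: {median_f} F")
```

Changing the hot room from 130 to 1300 would change the average considerably, while the median would stay at 40.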

Percentiles are sometimes avoided because calculating them requires more CPU and
memory than an average or other aggregate measures. This is because an exact
computation of the percentile needs the full dataset as an ordered list.
Timescale uses approximation algorithms to calculate a percentile without
requiring all of the data. This also makes them more compatible with continuous
aggregates. By default, TimescaleDB uses `uddsketch`, but you can also choose to
use `tdigest`. For more information about these algorithms, see the
advanced aggregation methods documentation.
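To make the trade-off concrete, the following Python sketch compares an exact percentile, which sorts the full dataset, with a toy approximation that only keeps bucket counts. It is an illustration of the general idea, not the `uddsketch` or `tdigest` algorithm and not TimescaleDB code. The `LogBucketSketch` class, its parameter names, and the sample data are all hypothetical, and the sketch assumes strictly positive values.

```python
import math
import random
from collections import Counter


def exact_percentile(values, p):
    """Nearest-rank percentile: needs every value, held in sorted order."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


class LogBucketSketch:
    """Toy summary (hypothetical): counts values in geometrically growing buckets."""

    def __init__(self, max_relative_error=0.01):
        self.alpha = max_relative_error
        # Bucket i covers (gamma**(i - 1), gamma**i].
        self.gamma = (1 + self.alpha) / (1 - self.alpha)
        self.counts = Counter()
        self.total = 0

    def add(self, value):
        # Assumes value > 0; only the bucket index is stored, not the value.
        index = math.ceil(math.log(value) / math.log(self.gamma))
        self.counts[index] += 1
        self.total += 1

    def merge(self, other):
        # Two summaries combine by adding their bucket counts.
        self.counts.update(other.counts)
        self.total += other.total

    def percentile(self, p):
        if self.total == 0:
            raise ValueError("sketch is empty")
        rank = max(1, math.ceil(p * self.total))
        seen = 0
        for index in sorted(self.counts):
            seen += self.counts[index]
            if seen >= rank:
                # Representative value for the bucket; it is within roughly
                # a relative error of alpha of anything stored in that bucket.
                return 2 * self.gamma ** index / (self.gamma + 1)


# Made-up sample data with a long right tail.
random.seed(0)
values = [random.lognormvariate(0, 1) for _ in range(10_000)]

sketch = LogBucketSketch(max_relative_error=0.01)
for v in values:
    sketch.add(v)

for p in (0.5, 0.9, 0.99):
    print(p, exact_percentile(values, p), sketch.percentile(p))
print("values kept:", len(values), "buckets kept:", len(sketch.counts))
```

The summary stores far fewer buckets than there are values, and two summaries can be combined with `merge` without revisiting the raw data. That ability to combine partial results is the kind of property that makes approximate percentiles a better fit for pre-aggregated data than an exact calculation.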

note

- For more information about how percentile approximation works, read our percentile approximation blog.
- For more information about percentile approximation API calls, see the hyperfunction API documentation.
