stats_agg() (two variables) functions
ToolkitTimescaleDB Toolkit functions are available under Timescale Community Edition. They are automatically included with Timescale, but must be installed separately for self-hosted TimescaleDB. Click to learn more.Introduction
Perform linear regression analysis, for example to calculate correlation coefficient and covariance, on two-dimensional data. You can also calculate common statistics, such as average and standard deviation, on each dimension separately. These functions are similar to the PostgreSQL statistical aggregates, but they include more features and are easier to use in continuous aggregates and window functions. The linear regressions are based on the standard least-squares fitting method.
These functions work on two-dimensional data. To work with one-dimensional data,
for example to calculate the average and standard deviation of a single
variable, see the one-dimensional stats_agg
functions.
Related hyperfunction groups
Aggregate
- stats_agg (two variables)
- Aggregate data into an intermediate statistical aggregate form for further calculation
Accessor
- average_y, average_x
- Calculate the average from a two-dimensional statistical aggregate for the dimension specified
- corr
- Calculate the correlation coefficient from a two-dimensional statistical aggregate
- covariance
- Calculate the covariance from a two-dimensional statistical aggregate
- determination_coeff
- Calculate the determination coefficient from a two-dimensional statistical aggregate
- intercept
- Calculate the intercept from a two-dimensional statistical aggregate
- kurtosis_y, kurtosis_x
- Calculate the kurtosis from a two-dimensional statistical aggregate for the dimension specified
- num_vals
- Calculate the number of values in a two-dimensional statistical aggregate
- skewness_y, skewness_x
- Calculate the skewness from a two-dimensional statistical aggregate for the dimension specified
- slope
- Calculate the slope from a two-dimensional statistical aggregate
- stddev_y, stddev_x
- Calculate the standard deviation from a two-dimensional statistical aggregate for the dimension specified
- sum_y, sum_x
- Calculate the sum from a two-dimensional statistical aggregate for the dimension specified
- variance_y, variance_x
- Calculate the variance from a two-dimensional statistical aggregate for the dimension specified
- x_intercept
- Calculate the x-intercept from a two-dimensional statistical aggregate
Rollup
stats_agg(y DOUBLE PRECISION,x DOUBLE PRECISION) RETURNS StatsSummary2D
This is the first step for performing any statistical aggregate calculations on two-dimensional data. Use stats_agg
to create an intermediate aggregate (StatsSummary2D
) from your data. This intermediate form can then be used by one or more accessors in this group to compute the final results. Optionally, multiple such intermediate aggregate objects can be combined using rollup()
or rolling()
before an accessor is applied.
Required arguments
Name | Type | Description |
---|---|---|
y, x | DOUBLE PRECISION | The variables to use for the statistical aggregate. |
Returns
Column | Type | Description |
---|---|---|
stats_agg | StatsSummary2D | The statistical aggregate, containing data about the variables in an intermediate form. Pass the aggregate to accessor functions in the statistical aggregates API to perform final calculations. Or, pass the aggregate to rollup functions to combine multiple statistical aggregates into larger aggregates. |
average_y(summary StatsSummary 2D) RETURNS DOUBLE PRECISION
average_x(summary StatsSummary 2D) RETURNS DOUBLE PRECISION
Calculate the average from a two-dimensional aggregate for the given dimension. For example, average_y()
calculates the average for all the values of the y
variable, independent of the values of the x
variable.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
average_y , average_x | DOUBLE PRECISION | The average of the values in the statistical aggregate |
Examples
Calculate the average of the integers from 0 to 100:
SELECT average_x(stats_agg(y, x))FROM generate_series(1, 5) y,generate_series(0, 100) x;
average-----------50
corr(summary StatsSummary2D) RETURNS DOUBLE PRECISION
Calculate the correlation coefficient from a two-dimensional statistical aggregate. The calculation uses the standard least-squares fitting for linear regression.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
corr | DOUBLE PRECISION | The correlation coefficient of the least-squares fit line |
Examples
Calculate the correlation coefficient of independent variable y
and dependent variable x
for each 15-minute time bucket:
SELECTid,time_bucket('15 min'::interval, ts) AS bucket,corr(stats_agg(y, x)) AS summaryFROM fooGROUP BY id, time_bucket('15 min'::interval, ts)
covariance(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
Calculate the covariance from a two-dimensional statistical aggregate. The calculation uses the standard least-squares fitting for linear regression.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Optional arguments
Name | Type | Description |
---|---|---|
method | TEXT | The method used for calculating the covariance. The two options are population and sample , which can be abbreviated to pop or samp . Defaults to sample . |
Returns
Column | Type | Description |
---|---|---|
covariance | DOUBLE PRECISION | The covariance of the least-squares fit line |
Examples
Calculate the covariance of independent variable y
and dependent variable x
for each 15-minute time bucket:
SELECTid,time_bucket('15 min'::interval, ts) AS bucket,covariance(stats_agg(y, x)) AS summaryFROM fooGROUP BY id, time_bucket('15 min'::interval, ts)
determination_coeff(summary StatsSummary2D) RETURNS DOUBLE PRECISION
Calculate the determination coefficient from a two-dimensional statistical aggregate. The calculation uses the standard least-squares fitting for linear regression.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
determination_coeff | DOUBLE PRECISION | The determination coefficient of the least-squares fit line |
Examples
Calculate the determination coefficient of independent variable y
and dependent variable x
for each 15-minute time bucket:
SELECTid,time_bucket('15 min'::interval, ts) AS bucket,determination_coeff(stats_agg(y, x)) AS summaryFROM fooGROUP BY id, time_bucket('15 min'::interval, ts)
intercept(summary StatsSummary2D) RETURNS DOUBLE PRECISION
Calculate the y intercept from a two-dimensional statistical aggregate. The calculation uses the standard least-squares fitting for linear regression.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
intercept | DOUBLE PRECISION | The y intercept of the least-squares fit line |
Examples
Calculate the y intercept from independent variable y
and dependent variable x
for each 15-minute time bucket:
SELECTid,time_bucket('15 min'::interval, ts) AS bucket,intercept(stats_agg(y, x)) AS summaryFROM fooGROUP BY id, time_bucket('15 min'::interval, ts)
kurtosis_y(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
kurtosis_x(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
Calculate the kurtosis from a two-dimensional statistical aggregate for the given dimension. For example, kurtosis_y()
calculates the kurtosis for all the values of the y
variable, independent of values of the x
variable. The kurtosis is the fourth statistical moment. It is a measure of “tailedness” of a data distribution compared to a normal distribution.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Optional arguments
Name | Type | Description |
---|---|---|
method | TEXT | The method used for calculating the kurtosis. The two options are population and sample , which can be abbreviated to pop or samp . Defaults to sample . |
Returns
Column | Type | Description |
---|---|---|
kurtosis_y , kurtosis_x | DOUBLE PRECISION | The kurtosis of the values in the statistical aggregate |
Examples
Calculate the kurtosis of a sample containing the integers from 0 to 100:
SELECT kurtosis_y(stats_agg(data, data))FROM generate_series(0, 100) data;
kurtosis_y----------1.78195
num_vals(summary StatsSummary2D) RETURNS BIGINT
Calculate the number of values contained in a two-dimensional statistical aggregate.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
num_vals | DOUBLE PRECISION | The number of values in the statistical aggregate |
Examples
Calculate the number of values from 1 to 5, and from 0 to 100, inclusive:
SELECT num_vals(stats_agg(y, x))FROM generate_series(1, 5) y,generate_series(0, 100) x;
num_vals--------505
skewness_y(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
skewness_x(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
Calculate the skewness from a two-dimensional statistical aggregate for the given dimension. For example, skewness_y()
calculates the skewness for all the values of the y
variable, independent of values of the x
variable. The skewness is the third statistical moment. It is a measure of asymmetry in a data distribution.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Optional arguments
Name | Type | Description |
---|---|---|
method | TEXT | The method used for calculating the skewness. The two options are population and sample , which can be abbreviated to pop or samp . Defaults to sample . |
Returns
Column | Type | Description |
---|---|---|
skewness_y , skewness_x | DOUBLE PRECISION | The skewness of the values in the statistical aggregate |
Examples
Calculate the skewness of a sample containing the integers from 0 to 100:
SELECT skewness_x(stats_agg(data, data))FROM generate_series(0, 100) data;
skewness_x----------0
slope(summary StatsSummary2D) RETURNS DOUBLE PRECISION
Calculate the slope of the linear fitting line from a two-dimensional statistical aggregate. The calculation uses the standard least-squares fitting for linear regression.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
slope | DOUBLE PRECISION | The slope of the least-squares fit line |
Examples
Calculate the slope from independent variable y
and dependent variable x
for each 15-minute time bucket:
SELECTid,time_bucket('15 min'::interval, ts) AS bucket,slope(stats_agg(y, x)) AS summaryFROM fooGROUP BY id, time_bucket('15 min'::interval, ts)
stddev_y(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
stddev_x(summaryStatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
Calculate the standard deviation from a two-dimensional statistical aggregate for the given dimension. For example, stddev_y()
calculates the skewness for all the values of the y
variable, independent of values of the x
variable.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Optional arguments
Name | Type | Description |
---|---|---|
method | TEXT | The method used for calculating the standard deviation. The two options are population and sample , which can be abbreviated to pop or samp . Defaults to sample . |
Returns
Column | Type | Description |
---|---|---|
stddev_y , stddev_x | DOUBLE PRECISION | The standard deviation of the values in the statistical aggregate |
Examples
Calculate the standard deviation of a sample containing the integers from 0 to 100:
SELECT stddev_y(stats_agg(data, data))FROM generate_series(0, 100) data;
stddev_y--------29.3002
sum_y(summary StatsSummary2D) RETURNS DOUBLE PRECISION
sum_x(summary StatsSummary2D) RETURNS DOUBLE PRECISION
Calculate the sum from a two-dimensional statistical aggregate for the given dimension. For example, sum_y()
calculates the skewness for all the values of the y
variable, independent of values of the x
variable.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
sum | DOUBLE PRECISION | The sum of the values in the statistical aggregate |
Examples
Calculate the sum of the numbers from 0 to 100:
SELECT sum_y(stats_agg(data, data))FROM generate_series(0, 100) data;
sum_y-----5050
variance_y(summary StatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
variance_x(summaryStatsSummary2D,[ method TEXT ]) RETURNS DOUBLE PRECISION
Calculate the variance from a two-dimensional statistical aggregate for the given dimension. For example, variance_y()
calculates the skewness for all the values of the y
variable, independent of values of the x
variable.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Optional arguments
Name | Type | Description |
---|---|---|
method | TEXT | The method used for calculating the standard deviation. The two options are population and sample , which can be abbreviated to pop or samp . Defaults to sample . |
Returns
Column | Type | Description |
---|---|---|
variance | DOUBLE PRECISION | The variance of the values in the statistical aggregate |
Examples
Calculate the variance of a sample containing the integers from 0 to 100:
SELECT variance_y(stats_agg(data, data))FROM generate_series(0, 100) data;
variance_y----------858.5
x_intercept(summary StatsSummary2D) RETURNS DOUBLE PRECISION
Calculate the x intercept from a two-dimensional statistical aggregate. The calculation uses the standard least-squares fitting for linear regression.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
intercept | DOUBLE PRECISION | The x intercept of the least-squares fit line |
Examples
Calculate the x intercept from independent variable y
and dependent variable x
for each 15-minute time bucket:
SELECTid,time_bucket('15 min'::interval, ts) AS bucket,x_intercept(stats_agg(y, x)) AS summaryFROM fooGROUP BY id, time_bucket('15 min'::interval, ts)
rolling(ss StatsSummary2D) RETURNS StatsSummary2D
Combine multiple intermediate two-dimensional statistical aggregate
(StatsSummary2D
) objects into a single StatsSummary2D
object. It is
optimized for use in a window function context for computing tumbling window
statistical aggregates.
This is especially useful for computing tumbling window aggregates from a
continuous aggregate. It can be orders of magnitude faster because it uses
inverse transition and combine functions, with the possibility that bigger
floating point errors can occur in unusual scenarios.
For re-aggregation in a non-window function context, such as combining
hourly buckets into daily buckets, see rollup()
.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
rolling | StatsSummary2D | A new statistical aggregate produced by combining the input statistical aggregates |
rolling(ss StatsSummary2D) RETURNS StatsSummary2D
Combine multiple intermediate two-dimensional statistical aggregate (StatsSummary2D
) objects into a single StatsSummary2D
object. For example, you can use rollup
to combine statistical aggregates from 15-minute buckets into daily buckets.
For use in window function, see rolling()
.
Required arguments
Name | Type | Description |
---|---|---|
summary | StatsSummary2D | The statistical aggregate produced by a stats_agg call |
Returns
Column | Type | Description |
---|---|---|
rollup | StatsSummary2D | A new statistical aggregate produced by combining the input statistical aggregates |
Create a statistical aggregate that summarizes daily statistical data about two
variables, val2
and val1
, where val2
is the dependent variable and val1
is the independent variable. Use the statistical aggregate to calculate the
average of the dependent variable and the slope of the linear-regression fit:
WITH t as (SELECTtime_bucket('1 day'::interval, ts) as dt,stats_agg(val2, val1) AS stats2D,FROM fooWHERE id = 'bar'GROUP BY time_bucket('1 day'::interval, ts))SELECTaverage_x(stats2D),slope(stats2D)FROM t;
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.