# time_weight() functions

ToolkitTimescaleDB Toolkit functions are available under Timescale Community Edition. They are automatically included with Timescale, but must be installed separately for self-hosted TimescaleDB. Click to learn more.## Introduction

Calculate time-weighted summary statistics, such as averages (means) and integrals. Time weighting is used when data is unevenly sampled over time. In that case, a straight average gives misleading results, as it biases towards more frequently sampled values.

For example, a sensor might silently spend long periods of time in a steady state, and send data only when a significant change occurs. The regular mean counts the steady-state reading as only a single point, whereas a time-weighted mean accounts for the long period of time spent in the steady state. In essence, the time-weighted mean takes an integral over time, then divides by the elapsed time.

### Aggregate

- time_weight
- Aggregate data into an intermediate time-weighted aggregate form for further calculation

### Accessor

- average
- Calculate the time-weighted average of values in a
`TimeWeightSummary`

- first_time
- Get the first timestamp from a
`TimeWeightSummary`

aggregate - first_val
- Get the first value from a
`TimeWeightSummary`

aggregate - integral
- Calculate the integral from a
`TimeWeightSummary`

- interpolated_average
- Calculate the time-weighted average over an interval, while interpolating the interval bounds
- interpolated_integral
- Calculate the integral over an interval, while interpolating the interval bounds
- last_time
- Get the last timestamp from a
`TimeWeightSummary`

aggregate - last_val
- Get the last value from a
`TimeWeightSummary`

aggregate

### Rollup

- rollup
- Combine multiple
`TimeWeightSummaries`

time_weight(method TEXT,ts TIMESTAMPTZ,value DOUBLE PRECISION) RETURNS TimeWeightSummary

This is the first step for performing any time-weighted calculations. Use `time_weight`

to create an intermediate aggregate (`TimeWeightSummary`

) from your data. This intermediate form can then be used by one or more accessors in this group to compute final results.
Optionally, multiple such intermediate aggregate objects can be combined using `rollup()`

before an accessor is applied.

## Required arguments

Name | Type | Description |
---|---|---|

`method` | `TEXT` | The weighting method to use. The available methods are `linear` (or its alias `trapezoidal` , for those familiar with numeric integration methods) and `LOCF` , which stands for 'last observation carried forward'. `linear` fills in missing data by interpolating linearly between the start and end points of the gap. `LOCF` fills in the gap by assuming that the value remains constant until the next value is seen.
`LOCF` is most useful when a measurement is taken only when a value changes. `linear` is most useful if there are no such guarantees on the measurement.
The method names are case-insensitive. |

`ts` | `TIMESTAMPTZ` | The time at each point. Null values are ignored. An aggregate evaluated on only `null` values returns `null` . |

`value` | `DOUBLE PRECISION` | The value at each point to use for the time-weighted aggregate. Null values are ignored. An aggregate evaluated on only `null` values returns `null` . |

## Returns

Column | Type | Description |
---|---|---|

`time_weight` | `TimeWeightSummary` | A `TimeWeightSummary` object that can be passed to other functions within the time-weighting API |

## Examples

Aggregate data from column `val`

into daily time-weighted aggregates, using the linear interpolation method:

SELECTtime_bucket('1 day'::interval, ts) as dt,time_weight('Linear', ts, val) AS twFROM fooGROUP BY time_bucket('1 day'::interval, ts)

average(tws TimeWeightSummary) RETURNS DOUBLE PRECISION

Calculate the time-weighted average. Equal to `integral`

divided by the elapsed time.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

## Returns

Column | Type | Description |
---|---|---|

`average` | `DOUBLE PRECISION` | The time-weighted average. |

## Examples

Calculate the time-weighted average of the column `val`

, using the 'last observation carried forward' interpolation method:

SELECTid,average(tws)FROM (SELECTid,time_weight('LOCF', ts, val) AS twsFROM fooGROUP BY id) t

first_time(tw TimeWeightSummary) RETURNS TIMESTAMPTZ

Get the timestamp of the first point in a `TimeWeightSummary`

aggregate.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

## Returns

Column | Type | Description |
---|---|---|

`first_time` | `TIMESTAMPTZ` | The time of the first point in the `TimeWeightSummary` |

## Examples

Produce a linear `TimeWeightSummary`

over the column `val`

and get the first timestamp:

WITH t as (SELECTtime_bucket('1 day'::interval, ts) as dt,time_weight('Linear', ts, val) AS twFROM tableGROUP BY time_bucket('1 day'::interval, ts))SELECTdt,first_time(tw)FROM t;

first_val(tw TimeWeightSummary) RETURNS DOUBLE PRECISION

Get the value of the first point in a `TimeWeightSummary`

aggregate.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

## Returns

Column | Type | Description |
---|---|---|

`first_val` | `DOUBLE PRECISION` | The value of the first point in the `TimeWeightSummary` |

## Examples

Produce a linear `TimeWeightSummary`

over the column `val`

and get the first value:

WITH t as (SELECTtime_bucket('1 day'::interval, ts) as dt,time_weight('Linear', ts, val) AS twFROM tableGROUP BY time_bucket('1 day'::interval, ts))SELECTdt,first_val(tw)FROM t;

integral(tws TimeWeightSummary[, unit TEXT]) RETURNS DOUBLE PRECISION

Calculate the integral, or the area under the curve formed by the data points. Equal to `average`

multiplied by the elapsed time.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

## Optional arguments

Name | Type | Description |
---|---|---|

`unit` | `TEXT` | The unit of time to express the integral in. Can be `microsecond` , `millisecond` , `second` , `minute` , `hour` , or any alias for those units supported by PostgreSQL. Defaults to `second` . |

## Returns

Column | Type | Description |
---|---|---|

`integral` | `DOUBLE PRECISION` | The time-weighted integral. |

## Examples

Create a table to track irregularly sampled storage usage in bytes, and get the total storage used in byte-hours. Use the 'last observation carried forward' interpolation method:

-- Create a table to track irregularly sampled storage usageCREATE TABLE user_storage_usage(ts TIMESTAMP, storage_bytes BIGINT);INSERT INTO user_storage_usage(ts, storage_bytes) VALUES('01-01-2022 00:00', 0),('01-01-2022 00:30', 100),('01-01-2022 03:00', 300),('01-01-2022 03:10', 1000),('01-01-2022 03:25', 817);-- Get the total byte-hours usedSELECTintegral(time_weight('LOCF', ts, storage_bytes), 'hours')FROMuser_storage_usage;

interpolated_average(tws TimeWeightSummary,start TIMESTAMPTZ,interval INTERVAL[, prev TimeWeightSummary][, next TimeWeightSummary]) RETURNS DOUBLE PRECISION

Calculate the time-weighted average over an interval, while interpolating
the interval bounds.

Similar to `average`

, but allows an accurate calculation across
interval bounds when data has been bucketed into separate time intervals,
and there is no data point precisely at the interval bound. For example,
this is useful in a window function.

Values from the previous and next buckets are used to interpolate the values
at the bounds, using the same interpolation method used within the
`TimeWeightSummary`

itself.

Equal to `interpolated_integral`

divided by the
elapsed time.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

`start` | `TIMESTAMPTZ` | The start of the interval which the time-weighted average should cover (if there is a preceeding point). |

`interval` | `INTERVAL` | The length of the interval which the time-weighted average should cover. |

## Optional arguments

Name | Type | Description |
---|---|---|

`prev` | `TimeWeightSummary` | The `TimeWeightSummary` from the prior interval, used to interpolate the value at `start` . If NULL, the first timestamp in `tws` is used for the starting value. The prior interval can be determined from the PostgreSQL `lag()` function. |

`next` | `TimeWeightSummary` | The `TimeWeightSummary` from the next interval, used to interpolate the value at `start` + `interval` . If NULL, the first timestamp in `tws` is used for the starting value. The next interval can be determined from the PostgreSQL `lead()` function. |

## Returns

Column | Type | Description |
---|---|---|

`average` | `DOUBLE PRECISION` | The time-weighted average for the interval (`start` , `start` + `interval` ), computed from the `TimeWeightSummary` plus end points interpolated from `prev` and `next` |

## Examples

Calculate the time-weighted daily average of the column `val`

, interpolating over bucket bounds using the 'last observation carried forward' method:

SELECTid,time,interpolated_average(tws,time,'1 day',LAG(tws) OVER (PARTITION BY id ORDER by time),LEAD(tws) OVER (PARTITION BY id ORDER by time))FROM (SELECTid,time_bucket('1 day', ts) AS time,time_weight('LOCF', ts, val) AS twsFROM fooGROUP BY id, time) t

interpolated_integral(tws TimeWeightSummary,start TIMESTAMPTZ,interval INTERVAL[, prev TimeWeightSummary][, next TimeWeightSummary][, unit TEXT]) RETURNS DOUBLE PRECISION

Calculate the integral over an interval, while interpolating the interval
bounds.

Similar to `integral`

, but allows an accurate calculation across
interval bounds when data has been bucketed into separate time intervals,
and there is no data point precisely at the interval bound. For example,
this is useful in a window function.

Values from the previous and next buckets are used to interpolate the values
at the bounds, using the same interpolation method used within the
`TimeWeightSummary`

itself.

Equal to `interpolated_average`

multiplied by the
elapsed time.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

`start` | `TIMESTAMPTZ` | The start of the interval which the time-weighted integral should cover (if there is a preceding point). |

`interval` | `INTERVAL` | The length of the interval which the time-weighted integral should cover. |

## Optional arguments

Name | Type | Description |
---|---|---|

`prev` | `TimeWeightSummary` | The `TimeWeightSummary` from the prior interval, used to interpolate the value at `start` . If NULL, the first timestamp in `tws` is used for the starting value. The prior interval can be determined from the PostgreSQL `lag()` function. |

`next` | `TimeWeightSummary` | The `TimeWeightSummary` from the next interval, used to interpolate the value at `start` + `interval` . If NULL, the first timestamp in `tws` is used for the starting value. The next interval can be determined from the PostgreSQL `lead()` function. |

`unit` | `TEXT` | The unit of time to express the integral in. Can be `microsecond` , `millisecond` , `second` , `minute` , `hour` , or any alias for those units supported by PostgreSQL. Defaults to `second` . |

## Returns

Column | Type | Description |
---|---|---|

`integral` | `DOUBLE PRECISION` | The time-weighted integral for the interval (`start` , `start` + `interval` ), computed from the `TimeWeightSummary` plus end points interpolated from `prev` and `next` |

## Examples

Create a table to track irregularly sampled storage usage in bytes, and get the total storage used in byte-hours between January 1 and January 6. Use the 'last observation carried forward' interpolation method:

-- Create a table to track irregularly sampled storage usageCREATE TABLE user_storage_usage(ts TIMESTAMP, storage_bytes BIGINT);INSERT INTO user_storage_usage(ts, storage_bytes) VALUES('01-01-2022 20:55', 27),('01-02-2022 18:33', 100),('01-03-2022 03:05', 300),('01-04-2022 12:13', 1000),('01-05-2022 07:26', 817);-- Get the total byte-hours used between Jan. 1 and Jan. 6SELECTinterpolated_integral(time_weight('LOCF', ts, storage_bytes),'01-01-2022','5 days',NULL,NULL,'hours')FROMuser_storage_usage;

last_time(tw TimeWeightSummary) RETURNS TIMESTAMPTZ

Get the timestamp of the last point in a `TimeWeightSummary`

aggregate.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

## Returns

Column | Type | Description |
---|---|---|

`last_time` | `TIMESTAMPTZ` | The time of the last point in the `TimeWeightSummary` |

## Examples

Produce a linear `TimeWeightSummary`

over the column `val`

and get the last timestamp:

WITH t as (SELECTtime_bucket('1 day'::interval, ts) as dt,time_weight('Linear', ts, val) AS twFROM tableGROUP BY time_bucket('1 day'::interval, ts))SELECTdt,last_time(tw)FROM t;

last_val(tw TimeWeightSummary) RETURNS DOUBLE PRECISION

Get the value of the last point in a `TimeWeightSummary`

aggregate.

## Required arguments

Name | Type | Description |
---|---|---|

`tws` | `TimeWeightSummary` | The input `TimeWeightSummary` from a `time_weight()` call. |

## Returns

Column | Type | Description |
---|---|---|

`last_val` | `DOUBLE PRECISION` | The value of the last point in the `TimeWeightSummary` |

## Examples

Produce a linear `TimeWeightSummary`

over the column `val`

and get the last value:

WITH t as (SELECTtime_bucket('1 day'::interval, ts) as dt,time_weight('Linear', ts, val) AS twFROM tableGROUP BY time_bucket('1 day'::interval, ts))SELECTdt,last_val(tw)FROM t;

rollup(tws TimeWeightSummary) RETURNS TimeWeightSummary

Combine multiple intermediate time-weighted aggregate (`TimeWeightSummary`

) objects produced by `time_weight()`

into a single intermediate `TimeWeightSummary`

object. For example, you can use `rollup`

to combine time-weighted aggregates from 15-minute buckets into daily buckets.

## Required arguments

Name | Type | Description |
---|---|---|

`time_weight` | `TimeWeightSummary` | The `TimeWeightSummary` aggregate produced by a `time_weight` call |

## Returns

Column | Type | Description |
---|---|---|

`rollup` | `StatsSummary1D` | A new `TimeWeightSummary` aggregate produced by combining the input `TimeWeightSummary` aggregates |

Given a table `foo`

with data in a column `val`

, aggregate data into a daily
`TimeWeightSummary`

. Use that to calculate the average for column `val`

:

WITH t as (SELECTtime_bucket('1 day'::interval, ts) as dt,time_weight('Linear', ts, val) AS twFROM fooWHERE measure_id = 10GROUP BY time_bucket('1 day'::interval, ts))SELECTdt,average(tw)FROM t;

Time-weighted average calculations are not strictly parallelizable, as defined by PostgreSQL. These calculations require inputs to be strictly ordered, but in general, PostgreSQL parallelizes by assigning rows randomly to workers.

However, the algorithm can be parallelized if it is guaranteed that all rows
within some time range go to the same worker. This is the case for both
continuous aggregates and distributed hypertables. (Note that the partitioning
keys of the distributed hypertable must be within the `GROUP BY`

clause, but
this is usually the case.)

If you try to combine overlapping `TimeWeightSummaries`

, an error is thrown.
For example, you might create a `TimeWeightSummary`

for `device_1`

and a
separate `TimeWeightSummary`

for `device_2`

, both covering the same period of
time. You can't combine these because the interpolation techniques only make
sense when restricted to a single measurement series.

If you want to calculate a single summary statistic across all devices, use a simple average, like this:

WITH t as (SELECT measure_id,average(time_weight('LOCF', ts, val)) as time_weighted_averageFROM fooGROUP BY measure_id)SELECT avg(time_weighted_average) -- use the normal avg function to average the time-weighted averagesFROM t;

The time-weighted average functions are not strictly parallelizable in the PostgreSQL sense. PostgreSQL requires that parallelizable functions accept potentially overlapping input. As explained above, the time-weighted functions do not. However, they do support partial aggregation and partition-wise aggregation in multi-node setups.

Because the time-weighted aggregates require ordered sets, they build up a buffer of input data, sort it, and then perform the aggregation steps. When memory is too small to build up a buffer of points, you might see Out of Memory failures or other issues. In these cases, try using a multi-level aggregate. For example:

WITH t as (SELECT measure_id,time_bucket('1 day'::interval, ts),time_weight('LOCF', ts, val)FROM fooGROUP BY measure_id, time_bucket('1 day'::interval, ts))SELECT measure_id,average(rollup(time_weight))FROM tGROUP BY measure_id;

Keywords

Found an issue on this page?

Report an issue!