time_bucket_gapfill() functions
CommunityCommunity functions are available under Timescale Community Edition. Click to learn more.Introduction
Aggregate data by time interval, while filling in gaps of missing data.
time_bucket_gapfill
works similarly to time_bucket
, but adds
gapfilling capabilities. The other functions in this group must be used in the
same query as time_bucket_gapfill
. They control how missing values are treated.
Important
time_bucket_gapfill
must be used as a top-level expression in a query or subquery. You cannot, for example, nest time_bucket_gapfill
in another function (such as round(time_bucket_gapfill(...))
), or cast the result of the gapfilling call. If you need to cast, you can use time_bucket_gapfill
in a subquery, and let the outer query do the type cast.
Bucket
- time_bucket_gapfill
- Bucket rows by time interval while filling gaps in data
Interpolator
- interpolate
- Fill in missing values by linear interpolation
- locf
- Fill in missing values by carrying the last observed value forward
time_bucket_gapfill(bucket_width INTERVAL | INTEGER,time TIMESTAMPTZ | INTEGER,[, timezone TEXT][, start TIMESTAMPTZ | INTEGER][, finish TIMESTAMPTZ | INTEGER]) RETURNS TIMESTAMPTZ
Group data into buckets based on time interval, while filling in gaps of missing data. If you don't provide a gapfilling algorithm, such as locf
or interpolate
, gaps are left as NULL
in the returned data.
Required arguments
Name | Type | Description |
---|---|---|
bucket_width | INTERVAL , INTEGER | A PostgreSQL time interval to specify the length of each bucket. For example, use 1 day to get daily buckets. Use INTEGER only if your time column is integer-based. |
time | TIMESTAMPTZ , INTEGER | The timestamp on which to base the bucket |
Optional arguments
Name | Type | Description |
---|---|---|
timezone | TEXT | The timezone to use for bucketing. For example, Europe/Berlin . Available in TimescaleDB 2.9 or later. Does not work for integer-based time. If you have an untyped start or finish argument and a timezone argument, you might run into a problem where you are not passing your arguments for the parameter that you expect. To solve this, either name your arguments or explicitly type cast them. |
start | TIMESTAMPTZ , INTEGER | The start of the period to gapfill. Values before start are passed through, but no gapfilling is performed. Use INTEGER only if your time column is integer-based. Best practice is to use the WHERE clause. Specifying start is legacy. The WHERE is more performant, because the query planner can filter out chunks by constraint exclusion. |
finish | TIMESTAMPTZ , INTEGER | The end of the period to gapfill. Values after finish are passed through, but no gapfilling is performed. Use INTEGER only if your time column is integer-based. Best practice is to use the WHERE clause. Specifying finish is legacy. The WHERE is more performant, because the query planner can filter out chunks by constraint exclusion. |
Returns
Column | Type | Description |
---|---|---|
time_bucket_gapfill | TIMESTAMPTZ | The start time of the time bucket. |
interpolate(value SMALLINT | INTEGER | BIGINT | REAL | DOUBLE PRECISION[, prev EXPRESSION][, next EXPRESSION]) RETURNS SMALLINT | INTEGER | BIGINT | REAL | DOUBLE PRECISION
Fill in missing values by linear interpolation. Use in the same query as time_bucket_gapfill
. interpolate
cannot be nested inside another function call.
Required arguments
Name | Type | Description |
---|---|---|
value | SMALLINT , INTEGER , BIGINT , REAL , DOUBLE PRECISION | The value to interpolate |
Optional arguments
Name | Type | Description |
---|---|---|
prev | EXPRESSION | If no previous value is available for gapfilling, use the prev lookup expression to get a previous value. For example, you can use prev to fill in the first bucket in a queried time range. The expression must return a (time, value) tuple with types corresponding to the bucketed times and values. |
next | EXPRESSION | If no next value is available for gapfilling, use the next lookup expression to get a next value. For example, you can use next to fill in the last bucket in a queried time range. The expression must return a (time, value) tuple with types corresponding to the bucketed times and values. |
Returns
Column | Type | Description |
---|---|---|
interpolate | SMALLINT , INTEGER , BIGINT , REAL , DOUBLE PRECISION | The gapfilled value. The return type is the type of value . |
locf(value ANY ELEMENT[, prev EXPRESSION][, treat_null_as_missing BOOLEAN]) RETURNS ANY ELEMENT
Fill in missing values by carrying the last observed value forward. Use in the same query as time_bucket_gapfill
. locf
cannot be nested inside another function call.
Required arguments
Name | Type | Description |
---|---|---|
value | ANY ELEMENT | The value to carry forward |
Optional arguments
Name | Type | Description |
---|---|---|
prev | EXPRESSION | If no previous value is available for gapfilling, use the prev lookup expression to get a previous value. For example, you can use prev to fill in the first bucket in a queried time range. |
treat_null_as_missing | BOOLEAN | When true , NULL values are ignored, and only non-NULL values are carried forward. |
Returns
Column | Type | Description |
---|---|---|
locf | ANY ELEMENT | The gapfilled value. The return type is the type of value . |
Get the daily average metric value. Use time_bucket_gapfill
without specifying
a gapfilling algorithm. This leaves the missing values as NULL
:
SELECT time_bucket_gapfill('1 day', time) AS day,avg(value) as valueFROM metricsWHERE time > '2021-12-31 00:00:00+00'::timestamptzAND time < '2022-01-10 00:00:00-00'::timestamptzGROUP BY dayORDER BY day desc;
day | value-----------------------+--------------------2022-01-09 00:00:00+00 |2022-01-08 00:00:00+00 | 48.612931559931082022-01-07 00:00:00+00 | 54.3882675259864852022-01-06 00:00:00+00 |2022-01-05 00:00:00+00 | 58.2575206347852662022-01-04 00:00:00+00 | 46.091724242617652022-01-03 00:00:00+00 | 42.534987078200272022-01-02 00:00:00+00 |2022-01-01 00:00:00+00 | 47.844200014159752021-12-31 00:00:00+00 |(10 rows)
Get the daily average metric value. Use locf
to carry the last value forward
if a value is missing. Note that avg
is nested inside locf
, and not the
other way around.
SELECT time_bucket_gapfill('1 day', time) AS day,locf(avg(value)) as valueFROM metricsWHERE time > '2021-12-31 00:00:00+00'::timestamptzAND time < '2022-01-10 00:00:00-00'::timestamptzGROUP BY dayORDER BY day desc;
day | value-----------------------+--------------------2022-01-09 00:00:00+00 | 48.612931559931082022-01-08 00:00:00+00 | 48.612931559931082022-01-07 00:00:00+00 | 54.3882675259864852022-01-06 00:00:00+00 | 58.2575206347852662022-01-05 00:00:00+00 | 58.2575206347852662022-01-04 00:00:00+00 | 46.091724242617652022-01-03 00:00:00+00 | 42.534987078200272022-01-02 00:00:00+00 | 47.844200014159752022-01-01 00:00:00+00 | 47.844200014159752021-12-31 00:00:00+00 |(10 rows)
Get the daily average metric value. Use interpolate
to linearly interpolate
the value if it is missing. Note that avg
is nested inside interpolate
.
SELECT time_bucket_gapfill('1 day', time) AS day,interpolate(avg(value)) as valueFROM metricsWHERE time > '2021-12-31 00:00:00+00'::timestamptzAND time < '2022-01-10 00:00:00-00'::timestamptzGROUP BY dayORDER BY day desc;
day | value-----------------------+--------------------2022-01-09 00:00:00+00 |2022-01-08 00:00:00+00 | 48.612931559931082022-01-07 00:00:00+00 | 54.3882675259864852022-01-06 00:00:00+00 | 56.322894080385882022-01-05 00:00:00+00 | 58.2575206347852662022-01-04 00:00:00+00 | 46.091724242617652022-01-03 00:00:00+00 | 42.534987078200272022-01-02 00:00:00+00 | 45.1895935461800142022-01-01 00:00:00+00 | 47.844200014159752021-12-31 00:00:00+00 |(10 rows)
Get the daily average metric value, using Europe/Berlin
as the timezone. Note
that daily time buckets now start at 23:00 UTC
, which is equivalent to
midnight in Berlin for the selected dates:
SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') AS day,interpolate(avg(value)) as valueFROM metricsWHERE time > '2021-12-31 00:00:00+00'::timestamptzAND time < '2022-01-10 00:00:00-00'::timestamptzGROUP BY dayORDER BY day desc;
day | value-----------------------+--------------------2022-01-09 23:00:00+00 |2022-01-08 23:00:00+00 | 48.650791279137032022-01-07 23:00:00+00 | 47.318477770991542022-01-06 23:00:00+00 | 55.988457403438592022-01-05 23:00:00+00 | 55.616674017771082022-01-04 23:00:00+00 | 58.741155745220122022-01-03 23:00:00+00 | 45.779936359882732022-01-02 23:00:00+00 | 41.786899234532022022-01-01 23:00:00+00 | 24.3243134777439742021-12-31 23:00:00+00 | 48.866803776612612021-12-30 23:00:00+00 |(11 rows)
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.