Refresh continuous aggregates
Continuous aggregates can have a range of different refresh policies. In addition to refreshing the continuous aggregate automatically using a policy, you can also refresh it manually.
Continuous aggregates require a policy for automatic refreshing. You can adjust this to suit different use cases. For example, you can have the continuous aggregate and the hypertable stay in sync, even when data is removed from the hypertable, or you could keep source data in the continuous aggregate even after it is removed from the hypertable.
You can change the way your continuous aggregate is refreshed by adjusting the add_continuous_aggregate_policy. The policy takes three arguments:
start_offset: the start of the refresh window relative to when the policy runs
end_offset: the end of the refresh window relative to when the policy runs
schedule_interval: the refresh interval in minutes or hours. Defaults to 24 hours.
If you set the
end_offset to NULL, the range is open-ended
and extends to the beginning or end of time. However, we recommend that you
end_offset so that at least the most recent time bucket is excluded.
For time-series data that mostly contains writes that occur in time stamp order,
the time buckets that see lots of writes quickly have out-of-date
aggregates. You get better performance by excluding the time buckets that are
getting a lot of writes.
Changing a refresh policy to use a NULL start_offset
- At the
psqlprompt, create a new policy called
conditions_summary_hourlythat keeps the continuous aggregate up to date, and runs every hour:
SELECT add_continuous_aggregate_policy('conditions_summary_hourly', start_offset => NULL, end_offset => INTERVAL '1 h', schedule_interval => INTERVAL '1 h');
The policy in this example ensures that all the data in the continuous aggregate
is up to date with the hypertable except for anything written in the last hour.
It also does not refresh the last time bucket of the continuous aggregate.
Because it has an open-ended
start_offset parameter, any data that is removed
from the table, for example with a DELETE or with
drop_chunks, is also removed
from the continuous aggregate view. This means that the continuous aggregate
always reflects the data in the underlying hypertable.
If you want to keep data in the continuous aggregate even if it is removed from
the underlying hypertable, you can set the
start_offset to match the data
retention policy on the source hypertable. For example, if
you have a retention policy that removes data older than one month, set
start_offset to one month or less. This sets your policy so that it does
not refresh the dropped data.
Changing a refresh policy to keep removed data
- At the
psqlprompt, create a new policy called
conditions_summary_hourlythat keeps data removed from the hypertable in the continuous aggregate, and runs every hour:
SELECT add_continuous_aggregate_policy('conditions_summary_hourly', start_offset => INTERVAL '1 month', end_offset => INTERVAL '1 h', schedule_interval => INTERVAL '1 h');
It is important to consider your data retention policies when you're setting up continuous aggregate policies. If the continuous aggregate policy window covers data that is removed by the data retention policy, the data will be removed when the aggregates for those buckets are refreshed. For example, if you have a data retention policy that removes all data older than two weeks, the continuous aggregate policy will only have data for the last two weeks.
If you need to manually refresh a continuous aggeregate, you can use the
refresh command. This recomputes the data within the window that has changed in the
underlying hypertable since the last refresh. Therefore, if only a few
buckets need updating, the refresh runs quickly.
If you have recently dropped data from a hypertable with a continuous aggregate,
refresh_continuous_aggregate on a region containing dropped chunks
recalculates the aggregate without the dropped data. See
drop data for more information.
refresh command takes three arguments:
- The name of the continuous aggregate view to refresh
- The timestamp of the beginning of the refresh window
- The timestamp of the end of the refresh window
Only buckets that are wholly within the range specified are refreshed. For
example, if you specify
2021-05-01', '2021-06-01 the only buckets that are
refreshed are those up to but not including 2021-06-01. It is possible to
specify NULL in a manual refresh to get an open-ended range, but we do not
recommend using it, because you could inadvertently materialize a large amount
of data, slow down your performance, and have unintended consequences on other
policies like data retention.
Manually refreshing a continuous aggregate
- To manually refresh a continuous aggregate, use the
CALL refresh_continuous_aggregate('example', '2021-05-01', '2021-06-01');
Avoid refreshing time intervals that are likely to have a lot of writes. In general, this means you should never refresh the most recent time bucket. Because the of constant change in the underlying data, they are unlikely to produce accurate aggregates. Additionally, refreshing this data slows down the ingest rate of the hypertable due to write amplification. If you want to include the latest bucket in your queries, use real-time aggregation instead.
Found an issue on this page?Report an issue!