About continuous aggregates

Available on: Timescale Cloud (Performance, Scale, Enterprise), self-hosted products, MST

In modern applications, data usually grows very quickly. This means that aggregating it into useful summaries can become very slow. Timescale Cloud continuous aggregates make aggregating data lightning fast, accurate, and easy.

If you are collecting data very frequently, you might want to aggregate your data into minutes or hours instead. For example, if an IoT device takes temperature readings every second, you might want to find the average temperature for each hour. Every time you run this query, the database needs to scan the entire table and recalculate the average.
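
As a rough sketch of the kind of query described here, assume a hypothetical table named conditions with a "time" column and a temperature column (these names are illustrative). Computing the hourly average directly from the raw rows might look like this:

```sql
-- Hourly average temperature, recomputed from the raw rows on every run.
-- Assumes a table conditions("time" TIMESTAMPTZ, temperature FLOAT8).
SELECT
  time_bucket('1 hour', "time") AS bucket,
  AVG(temperature) AS avg_temp
FROM conditions
GROUP BY bucket
ORDER BY bucket;
```

Because nothing is pre-aggregated, each execution reads every raw row in the queried range again.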

Continuous aggregates are a kind of hypertable that is refreshed automatically in the background as new data is added, or old data is modified. Changes to your dataset are tracked, and the hypertable behind the continuous aggregate is automatically updated in the background.

You don't need to manually refresh your continuous aggregates, because they are continuously and incrementally updated in the background. Continuous aggregates also have a much lower maintenance burden than regular PostgreSQL materialized views, because the whole view is not created from scratch on each refresh. This means that you can get on with working with your data instead of maintaining your database.
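
In practice, the background refresh is usually scheduled with a refresh policy. The following is a minimal sketch, assuming a hypertable named conditions with "time" and temperature columns. The view name conditions_hourly and the interval values are illustrative choices, not recommendations:

```sql
-- Define the continuous aggregate without materializing any data yet.
-- Assumes a hypertable conditions("time" TIMESTAMPTZ, temperature FLOAT8).
CREATE MATERIALIZED VIEW conditions_hourly
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', "time") AS bucket,
  AVG(temperature) AS avg_temp
FROM conditions
GROUP BY bucket
WITH NO DATA;

-- Every hour, refresh the buckets that are between 3 days and 1 hour old.
SELECT add_continuous_aggregate_policy(
  'conditions_hourly',
  start_offset      => INTERVAL '3 days',
  end_offset        => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 hour'
);
```

Only the buckets whose underlying data changed inside the refresh window are recomputed, which is what keeps each refresh incremental.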

Because continuous aggregates are based on hypertables, you can query them in exactly the same way as your other tables. This includes continuous aggregates in the rowstore, compressed into the columnstore, or tiered to object storage. You can even create continuous aggregates on top of your continuous aggregates for even more fine-tuned aggregation.
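
For example, assuming a hypothetical continuous aggregate named conditions_hourly with the columns bucket and avg_temp, a query against it looks like a query against any other table:

```sql
-- conditions_hourly, bucket, and avg_temp are illustrative names.
-- Fetch the last 7 days of hourly averages, newest first.
SELECT bucket, avg_temp
FROM conditions_hourly
WHERE bucket >= now() - INTERVAL '7 days'
ORDER BY bucket DESC;
```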

Real-time aggregation enables you to combine pre-aggregated data from the materialized view with the most recent raw data. This gives you up-to-date results on every query.

In TimescaleDB v2.13 and later, real-time aggregates are DISABLED by default. In earlier versions, real-time aggregates are ENABLED by default; when you create a continuous aggregate, queries to that view include the results from the most recent raw data.
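
Real-time aggregation is controlled per continuous aggregate with the timescaledb.materialized_only option. A short sketch, assuming a continuous aggregate with the hypothetical name conditions_hourly:

```sql
-- Enable real-time aggregation: queries combine materialized buckets
-- with the most recent raw data that has not been materialized yet.
ALTER MATERIALIZED VIEW conditions_hourly
  SET (timescaledb.materialized_only = false);

-- Switch back to returning only materialized data.
ALTER MATERIALIZED VIEW conditions_hourly
  SET (timescaledb.materialized_only = true);
```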

Types of aggregation

There are three main ways to make aggregation easier: materialized views, continuous aggregates, and real-time aggregates.

Materialized views are a standard PostgreSQL feature. They are used to cache the result of a complex query so that you can reuse it later on. Materialized views are not updated automatically, although you can manually refresh them as required.
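
As an illustration, here is a standard PostgreSQL materialized view and its manual refresh. The table conditions and the view name conditions_daily_summary are assumptions made for this sketch:

```sql
-- Cache a daily summary. The result is computed once, at creation time.
-- Assumes a table conditions("time" TIMESTAMPTZ, temperature FLOAT8).
CREATE MATERIALIZED VIEW conditions_daily_summary AS
SELECT
  date_trunc('day', "time") AS bucket,
  AVG(temperature) AS avg_temp
FROM conditions
GROUP BY bucket;

-- Rows added to conditions later are not visible in the view
-- until you refresh it, which recomputes the whole result.
REFRESH MATERIALIZED VIEW conditions_daily_summary;
```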

Continuous aggregates are a Timescale-only feature. They work in a similar way to a materialized view, but they are updated automatically in the background as new data is added to your database. Continuous aggregates are updated continuously and incrementally, which means they are less resource intensive to maintain than materialized views. Continuous aggregates are based on hypertables, and you can query them in the same way as you do your other tables.

Real-time aggregates are a Timescale-only feature. They are the same as continuous aggregates, but they add the most recent raw data to the previously aggregated data to provide accurate and up-to-date results, without needing to aggregate data as it is being written.

Continuous aggregates on continuous aggregates

You can create a continuous aggregate on top of another continuous aggregate. This allows you to summarize data at different granularities. For example, you might have a raw hypertable that contains second-by-second data. Create a continuous aggregate on the hypertable to calculate hourly data. To calculate daily data, create a continuous aggregate on top of your hourly continuous aggregate.
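
The hourly and daily layers described here could be sketched as follows. The view and column names are hypothetical, and the sketch assumes a hypertable conditions with "time" and temperature columns. It uses MIN and MAX because they can be re-aggregated directly from hourly values, whereas an average of hourly averages is not, in general, the daily average:

```sql
-- Level 1: hourly minimum and maximum from the raw hypertable.
CREATE MATERIALIZED VIEW conditions_minmax_hourly
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', "time") AS bucket,
  MIN(temperature) AS min_temp,
  MAX(temperature) AS max_temp
FROM conditions
GROUP BY bucket
WITH NO DATA;

-- Level 2: daily values built from the hourly continuous aggregate.
-- The alias day_bucket avoids a clash with the input column bucket.
CREATE MATERIALIZED VIEW conditions_minmax_daily
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 day', bucket) AS day_bucket,
  MIN(min_temp) AS min_temp,
  MAX(max_temp) AS max_temp
FROM conditions_minmax_hourly
GROUP BY day_bucket
WITH NO DATA;
```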

For more information, see the documentation about continuous aggregates on continuous aggregates.

Continuous aggregates with a `JOIN` clause

Continuous aggregates support the following JOIN features:

Feature	TimescaleDB < 2.10.x	TimescaleDB 2.10.x to 2.15.x	TimescaleDB >= 2.16.x
INNER JOIN	❌	✅	✅
LEFT JOIN	❌	❌	✅
LATERAL JOIN	❌	❌	✅
Joins between ONE hypertable and ONE standard PostgreSQL table	❌	✅	✅
Joins between ONE hypertable and MANY standard PostgreSQL tables	❌	❌	✅
Join conditions must be equality conditions, and there can only be ONE `JOIN` condition	❌	✅	✅
Any join conditions	❌	❌	✅

Joins in continuous aggregates must meet the following conditions:

Only changes to the hypertable are tracked, and they are applied to the continuous aggregate when it is refreshed. Changes to standard PostgreSQL tables are not tracked.
You can use INNER, LEFT, and LATERAL joins; no other join type is supported.
Joins on the materialized hypertable of a continuous aggregate are not supported.
Hierarchical continuous aggregates can be created on top of a continuous aggregate with a JOIN clause, but cannot themselves have a JOIN clause.

JOIN examples

Given the following schema:


CREATE TABLE locations (
  id TEXT PRIMARY KEY,
  name TEXT
);

CREATE TABLE devices (
  id SERIAL PRIMARY KEY,
  location_id TEXT,
  name TEXT
);

CREATE TABLE conditions (
  "time" TIMESTAMPTZ,
  device_id INTEGER,
  temperature FLOAT8
) WITH (
  tsdb.hypertable,
  tsdb.partition_column='time'
);

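The following examples show JOIN clauses in continuous aggregates. Every example creates a view named conditions_by_day, so drop the existing view before you run the next one.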

INNER JOIN on a single equality condition, using the ON clause:


CREATE MATERIALIZED VIEW conditions_by_day WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, devices.name, MIN(temperature), MAX(temperature)
FROM conditions
JOIN devices ON devices.id = conditions.device_id
GROUP BY bucket, devices.name
WITH NO DATA;

INNER JOIN on a single equality condition, using the ON clause, with a further condition added in the WHERE clause:


CREATE MATERIALIZED VIEW conditions_by_day WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, devices.name, MIN(temperature), MAX(temperature)
FROM conditions
JOIN devices ON devices.id = conditions.device_id
WHERE devices.location_id = 'location123'
GROUP BY bucket, devices.name
WITH NO DATA;

INNER JOIN on a single equality condition specified in WHERE clause:


CREATE MATERIALIZED VIEW conditions_by_day WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, devices.name, MIN(temperature), MAX(temperature)
FROM conditions, devices
WHERE devices.id = conditions.device_id
GROUP BY bucket, devices.name
WITH NO DATA;

INNER JOIN on multiple equality conditions (TimescaleDB 2.16 and later, because earlier versions allow only one JOIN condition):


CREATE MATERIALIZED VIEW conditions_by_day WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, devices.name, MIN(temperature), MAX(temperature)
FROM conditions
JOIN devices ON devices.id = conditions.device_id AND devices.location_id = 'location123'
GROUP BY bucket, devices.name
WITH NO DATA;
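
The feature table above also lists LEFT JOIN and joins between one hypertable and many standard PostgreSQL tables for TimescaleDB 2.16 and later. The following sketches reuse the schema above to show what such definitions might look like. The view names are hypothetical, and the statements have not been checked against every TimescaleDB version:

```sql
-- LEFT JOIN: keeps readings whose device_id has no matching device row.
-- Assumes TimescaleDB 2.16 or later.
CREATE MATERIALIZED VIEW conditions_by_day_left WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, devices.name, MIN(temperature), MAX(temperature)
FROM conditions
LEFT JOIN devices ON devices.id = conditions.device_id
GROUP BY bucket, devices.name
WITH NO DATA;

-- One hypertable joined to two standard PostgreSQL tables.
-- Assumes TimescaleDB 2.16 or later.
CREATE MATERIALIZED VIEW conditions_by_day_and_location WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket, locations.name AS location_name, MIN(temperature), MAX(temperature)
FROM conditions
JOIN devices ON devices.id = conditions.device_id
JOIN locations ON locations.id = devices.location_id
GROUP BY bucket, locations.name
WITH NO DATA;
```

As with the other examples, only changes to the conditions hypertable are tracked. Later edits to devices or locations are not picked up automatically.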