state_agg() functions

ToolkitTimescaleDB Toolkit functions are available under Timescale Community Edition. They are automatically included with Timescale, but must be installed separately for self-hosted TimescaleDB. Click to learn more.

Timescale Cloud: Performance, Scale, Enterprise

Self-hosted products

MST

Given a system or value that switches between discrete states, track transitions between the states. For example, you can use state_agg to create a state of state transitions, or to calculate the durations of states. state_agg extends the capabilities of compact_state_agg.

state_agg is designed to work with a relatively small number of states. It might not perform well on datasets where states are mostly distinct between rows.

Because state_agg tracks more information, it uses more memory than compact_state_agg. If you want to minimize memory use and don't need to query the timestamps of state transitions, consider using compact_state_agg instead.

Related hyperfunction groups

This group of functions uses the two-step aggregation pattern.

Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function.

Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions.

The two-step aggregation pattern has several advantages:

More efficient because multiple accessors can reuse the same aggregate
Easier to reason about performance, because aggregation is separate from final computation
Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
Can perform retrospective analysis even when underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result

To learn more, see the blog post on two-step aggregates.

Functions in this group

state_agg: Aggregate state data into a state aggregate for further analysis

duration_in: Calculate the total time spent in a given state from a state aggregate
interpolated_duration_in: Calculate the total time spent in a given state from a state aggregate, interpolating values at time bucket boundaries
interpolated_state_periods: Get the time periods corresponding to a given state from a state aggregate, interpolating values at time bucket boundaries
interpolated_state_timeline: Get a state of all states from a state aggregate, interpolating values at time bucket boundaries
into_values: Expand the state aggregate into a set of rows, displaying the duration of each state
state_at: Deterimine the state at a given time
state_periods: Get the time periods corresponding to a given state from a state aggregate
state_timeline: Get a state of all states from a state aggregate

rollup: Combine multiple state aggregates

Function details

state_agg(
  ts TIMESTAMPTZ,
  value {TEXT | BIGINT}
) RETURNS StateAgg

Aggregate state data into a state aggregate to track state transitions. Unlike compact_state_agg, which only stores durations, state_agg also stores the timestamps of state transitions.

Required arguments

Name	Type	Description
`ts`	`TIMESTAMPTZ`	Timestamps associated with each state reading
`value`	`TEXT`, `BIGINT`	The state at that time

Returns

Column	Type	Description
`agg`	`StateAgg`	An object storing the periods spent in each state, including timestamps of state transitions

Examples

Create a state aggregate to track the status of some devices:


SELECT state_agg(time, status) FROM devices;

duration_in(
  agg StateAgg,
  state {TEXT | BIGINT}
  [, start TIMESTAMPTZ]
  [, interval INTERVAL]
) RETURNS INTERVAL

Given a state aggregate, calculate the total time spent in a state. If you need to interpolate missing values across time bucket boundaries, use interpolated_duration_in.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created with `state_agg`
`state`	`TEXT`, `BIGINT`	The state to query

Optional arguments

Name	Type	Description
`start`	`TIMESTAMPTZ`	If specified, only the time in the state after this time is returned.
`interval`	`INTERVAL`	If specified, only the time in the state from the start time to the end of the interval is returned.

Returns

Column	Type	Description
`duration_in`	`INTERVAL`	The time spent in the given state. Displayed in `days`, `hh:mm:ss`, or a combination of the two.

Examples

Create a test table that tracks when a system switches between starting, running, and error states. Query the table for the time spent in the running state.

If you prefer to see the result in seconds, EXTRACT the epoch from the returned result:


SET timezone TO 'UTC';
CREATE TABLE states(time TIMESTAMPTZ, state TEXT);
INSERT INTO states VALUES
  ('1-1-2020 10:00', 'starting'),
  ('1-1-2020 10:30', 'running'),
  ('1-3-2020 16:00', 'error'),
  ('1-3-2020 18:30', 'starting'),
  ('1-3-2020 19:30', 'running'),
  ('1-5-2020 12:00', 'stopping');

SELECT duration_in(
  state_agg(time, state),
  'running'
) FROM states;


duration_in  
---------------
3 days 22:00:00

interpolated_duration_in(
  agg StateAgg,
  state {TEXT | BIGINT},
  start TIMESTAMPTZ,
  interval INTERVAL
  [, prev StateAgg]
) RETURNS DOUBLE PRECISION

Calculate the total duration in a given state. Unlike duration_in, you can use this function across multiple state aggregates that cover multiple time buckets. Any missing values at the time bucket boundaries are interpolated from adjacent state aggregates.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created with `state_agg`
`state`	`TEXT`, `BIGINT`	The state to query
`start`	`TIMESTAMPTZ`	The start of the interval to be calculated
`interval`	`INTERVAL`	The length of the interval to be calculated

Optional arguments

Name	Type	Description
`prev`	`StateAgg`	The state aggregate from the prior interval, used to interpolate the value at `start`. If `NULL`, the first timestamp in `aggregate` is used as the start of the interval.

Returns

Column	Type	Description
`interpolated_duration_in`	`INTERVAL`	The total time spent in the queried state. Displayed as `days`, `hh:mm:ss`, or a combination of the two.

Examples

Create a test table that tracks when a system switches between starting, running, and error states. Query the table for the time spent in the running state. Use LAG and LEAD to get the neighboring aggregates for interpolation. If you prefer to see the result in seconds, EXTRACT the epoch from the returned result:


SELECT 
  time,
  interpolated_duration_in(
    agg,
    'running',
    time,
    '1 day',
    LAG(agg) OVER (ORDER BY time)
) FROM (
  SELECT
    time_bucket('1 day', time) as time,
    state_agg(time, state) as agg
  FROM
    states
  GROUP BY time_bucket('1 day', time)
) s;


time                    | interpolated_duration_in 
------------------------+--------------------------
2020-01-01 00:00:00+00  | 13:30:00
2020-01-02 00:00:00+00  | 16:00:00
2020-01-03 00:00:00+00  | 04:30:00
2020-01-04 00:00:00+00  | 12:00:00

interpolated_state_periods(
  agg StateAgg,
  state [TEXT | BIGINT],
  start TIMESTAMPTZ,
  interval INTERVAL,
  [, prev StateAgg]
) RETURNS (TIMESTAMPTZ, TIMESTAMPTZ)

Given a state aggregate and a specific state, list the periods when the system is in that state. Periods are defined by the start time and end time.

Unlike state_periods, you can use this function across multiple state aggregates that cover different time buckets. Any missing values at the time bucket boundaries are interpolated from adjacent state aggregates.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created with `state_agg`
`state`	`TEXT`, `BIGINT`	The state to query
`start`	`TIMESTAMPTZ`	The start of the interval to be calculated
`interval`	`INTERVAL`	The length of the interval to be calculated

Optional arguments

Name	Type	Description
`prev`	`StateAgg`	The state aggregate from the prior interval, used to interpolate the value at `start`. If `NULL`, the first timestamp in `aggregate` is used as the start of the interval.

Returns

Column	Type	Description
`start_time`	`TIMESTAMPTZ`	The time when the state started (inclusive)
`end_time`	`TIMESTAMPTZ`	The time when the state ended (exclusive)

Examples

Given state aggregates bucketed by 1-minute intervals, interpolate the states at the bucket boundaries and list all time periods corresponding to the state OK. To perform the interpolation, the LAG and LEAD functions are used to get the previous and next state aggregates:


SELECT
    bucket,
    (interpolated_state_periods(
        summary,
        'OK',
        bucket,
        '15 min',
        LAG(summary) OVER (ORDER by bucket)
    )).*
FROM (
    SELECT
        time_bucket('1 min'::interval, ts) AS bucket,
        state_agg(ts, state) AS summary
    FROM states_test
    GROUP BY time_bucket('1 min'::interval, ts)
) t;


bucket         |       start_time       |        end_time
------------------------+------------------------+------------------------
 2020-01-01 00:00:00+00 | 2020-01-01 00:00:11+00 | 2020-01-01 00:15:00+00
 2020-01-01 00:01:00+00 | 2020-01-01 00:01:03+00 | 2020-01-01 00:16:00+00

interpolated_state_timeline(
    agg StateAgg,
    start TIMESTAMPTZ,
    interval INTERVAL,
    [, prev StateAgg]
) RETURNS (TIMESTAMPTZ, TIMESTAMPTZ)

interpolated_state_int_timeline(
    agg StateAgg,
    start TIMESTAMPTZ,
    interval INTERVAL,
    [, prev StateAgg]
) RETURNS (TIMESTAMPTZ, TIMESTAMPTZ)

Get a state of all states, showing each time a state is entered and exited.

Unlike state_timeline, you can use this function across multiple state aggregates that cover different time buckets. Any missing values at the time bucket boundaries are interpolated from adjacent state aggregates.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created with `state_agg`
`start`	`TIMESTAMPTZ`	The start of the interval to be calculated
`interval`	`INTERVAL`	The length of the interval to be calculated

Optional arguments

Name	Type	Description
`prev`	`StateAgg`	The state aggregate from the prior interval, used to interpolate the value at `start`. If `NULL`, the first timestamp in `aggregate` is used as the start of the interval.

Returns

Column	Type	Description
`state`	`TEXT`, `BIGINT`	A state found in the state aggregate
`start_time`	`TIMESTAMPTZ`	The time when the state started (inclusive)
`end_time`	`TIMESTAMPTZ`	The time when the state ended (exclusive)

Examples

Given state aggregates bucketed by 1-minute intervals, interpolate the states at the bucket boundaries and get the history of all states.

To perform the interpolation, the LAG and LEAD functions are used to get the previous and next state aggregates:


SELECT
    bucket,
    (interpolated_state_timeline(
        summary,
        bucket,
        '15 min',
        LAG(summary) OVER (ORDER by bucket)
    )).*
FROM (
    SELECT
        time_bucket('1 min'::interval, ts) AS bucket,
        state_agg(ts, state) AS summary
    FROM states_test
    GROUP BY time_bucket('1 min'::interval, ts)
) t;


bucket         | state |       start_time       |        end_time
------------------------+-------+------------------------+------------------------
 2020-01-01 00:00:00+00 | START | 2020-01-01 00:00:00+00 | 2020-01-01 00:00:11+00
 2020-01-01 00:00:00+00 | OK    | 2020-01-01 00:00:11+00 | 2020-01-01 00:15:00+00
 2020-01-01 00:01:00+00 | ERROR | 2020-01-01 00:01:00+00 | 2020-01-01 00:01:03+00
 2020-01-01 00:01:00+00 | OK    | 2020-01-01 00:01:03+00 | 2020-01-01 00:16:00+00
 2020-01-01 00:02:00+00 | STOP  | 2020-01-01 00:02:00+00 | 2020-01-01 00:17:00+00

into_values(
  agg StateAgg
) RETURNS (TEXT, INTERVAL)

into_int_values(
  agg StateAgg
) RETURNS (INT, INTERVAL)

Unpack the state aggregate into a set of rows with two columns, displaying the duration of each state. By default, the columns are named state and duration. You can rename them using the same method as renaming a table.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created with `state_agg`

Returns

Column	Type	Description
`state`	`TEXT`, `BIGINT`	A state found in the state aggregate
`duration`	`INTERVAL`	The total time spent in that state

Examples

Create a state aggregate from the table states_test. The time column is named time, and the state column contains text values corresponding to different states of a system. Use into_values to display the data from the state aggregate:


SELECT state, duration FROM into_values(
  (SELECT state_agg(time, state) FROM states_test)
);


state | duration
------+----------
ERROR | 00:00:03
OK    | 00:01:46
START | 00:00:11

state_at(
  agg StateAgg,
  ts TIMESTAMPTZ
) RETURNS TEXT

state_at_int(
  agg StateAgg,
  ts TIMESTAMPTZ
) RETURNS BIGINT

Given a state aggregate, deterimine the state at a given time.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created with `state_agg`
`ts`	`TIMESTAMPTZ`	The time to get the state at.

Returns

Column	Type	Description
`state`	`TEXT`, `BIGINT`	The state at the given time.

Examples

Create a state aggregate and determine the state at a particular time:


SELECT state_at(
  (SELECT state_agg(ts, state) FROM states_test),
  '2020-01-01 00:00:05+00'
);


state_at
----------
START

state_periods(
    agg StateAgg,
    state [TEXT | BIGINT]
) RETURNS (TIMESTAMPTZ, TIMESTAMPTZ)

Given a state aggregate and a specific state, list the periods when the system is in that state. Periods are defined by the start time and end time.

If you have multiple state aggregates and need to interpolate the state across interval boundaries, use interpolated_state_periods.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	A state aggregate created using `state_agg`.
`state`	`TEXT`, `BIGINT`	The target state to get data for.

Returns

Column	Type	Description
`start_time`	`TIMESTAMPTZ`	The time when the state started (inclusive)
`end_time`	`TIMESTAMPTZ`	The time when the state ended (exclusive)

Examples

Create a state aggregate and list all periods corresponding to the state OK:


SELECT start_time, end_time FROM state_periods(
  (SELECT state_agg(ts, state) FROM states_test),
  'OK',
);


start_time       |        end_time
------------------------+------------------------
 2020-01-01 00:00:11+00 | 2020-01-01 00:01:00+00
 2020-01-01 00:01:03+00 | 2020-01-01 00:02:00+00

state_timeline(
    agg StateAgg
) RETURNS (TEXT, TIMESTAMPTZ, TIMESTAMPTZ)

state_int_timeline(
    agg StateAgg
) RETURNS (BIGINT, TIMESTAMPTZ, TIMESTAMPTZ)

Get a state of all states, showing each time a state is entered and exited.

If you have multiple state aggregates and need to interpolate the state across interval boundaries, use interpolated_state_timeline.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	The aggregate from which to get a state

Returns

Column	Type	Description
`state`	`TEXT`, `BIGINT`	A state found in the state aggregate
`start_time`	`TIMESTAMPTZ`	The time when the state started (inclusive)
`end_time`	`TIMESTAMPTZ`	The time when the state ended (exclusive)

Examples

Get the history of states from a state aggregate:


SELECT state, start_time, end_time
  FROM state_timeline(
    (SELECT state_agg(ts, state) FROM states_test)
  );


state |       start_time       |        end_time
-------+------------------------+------------------------
 START | 2020-01-01 00:00:00+00 | 2020-01-01 00:00:11+00
 OK    | 2020-01-01 00:00:11+00 | 2020-01-01 00:01:00+00
 ERROR | 2020-01-01 00:01:00+00 | 2020-01-01 00:01:03+00
 OK    | 2020-01-01 00:01:03+00 | 2020-01-01 00:02:00+00
 STOP  | 2020-01-01 00:02:00+00 | 2020-01-01 00:02:00+00

rollup(
    agg StateAgg
) RETURNS StateAgg

Combine multiple state aggregates into a single state aggregate. For example, you can use rollup to combine state aggregates from 15-minute buckets into daily buckets.

Required arguments

Name	Type	Description
`agg`	`StateAgg`	State aggregates created using `state_agg`

Returns

Column	Type	Description
`agg`	`StateAgg`	A new state aggregate that combines the input state aggregates

Examples

Combine multiple state aggregates and calculate the duration spent in the START state:


WITH buckets AS (SELECT
    time_bucket('1 minute', ts) as dt,
    state_agg(ts, state) AS sa
FROM states_test
GROUP BY time_bucket('1 minute', ts))
SELECT duration_in(
    'START',
    rollup(buckets.sa)
)
FROM buckets;

state_agg() functions

Related hyperfunction groups

Two-step aggregation

Functions in this group

Aggregate

Accessor

Rollup

Function details

state_agg()

duration_in()

interpolated_duration_in()

interpolated_state_periods()

interpolated_state_timeline()

into_values()

state_at()

state_periods()

state_timeline()

rollup()

state_agg() functions

Introduction

Related hyperfunction groups

Two-step aggregation

Functions in this group

Aggregate

Accessor

Rollup

Function details

state_agg()

duration_in()

interpolated_duration_in()

interpolated_state_periods()

interpolated_state_timeline()

into_values()

state_at()

state_periods()

state_timeline()

rollup()

Related Content