heartbeat_agg() functions
ToolkitTimescaleDB Toolkit functions are available under Timescale Community Edition. They are automatically included with Timescale, but must be installed separately for self-hosted TimescaleDB. Click to learn more.Introduction
Given a series of timestamped heartbeats and a liveness interval, determine the overall liveness of a system. This aggregate can be used to report total uptime or downtime as well as report the time ranges where the system was live or dead.
It's also possible to combine multiple heartbeat aggregates to determine the overall health of a service. For example, the heartbeat aggregates from a primary and standby server could be combine to see if there was ever a window where both machines were down at the same time.
Related hyperfunction groups
Aggregate
- heartbeat_agg
- Create a liveness aggregate from a set of heartbeats
Accessor
- dead_ranges
- Get the down intervals from a heartbeat_agg
- downtime
- Get the total time dead during a heartbeat aggregate
- interpolate
- Adjust a heartbeat aggregate with predecessor information
- interpolated_downtime
- Get the total time dead from a heartbeat aggregate and predecessor
- interpolated_uptime
- Get the total time live from a heartbeat aggregate and predecessor
- live_at
- Test if the aggregate has a heartbeat covering a given time
- live_ranges
- Get the live intervals from a heartbeat_agg
- num_gaps
- Count the number of gaps between live ranges
- num_live_ranges
- Count the number of live ranges
- trim_to
- Reduce the covered interval of a heartbeat aggregate
- uptime
- Get the total time live during a heartbeat aggregate
Rollup
- rollup
- Combine multiple heartbeat aggregates
heartbeat_agg(heartbeat TIMESTAMPTZ,agg_start TIMESTAMPTZ,agg_duration INTERVAL,heartbeat_liveness INTERVAL) RETURNS HeartbeatAgg
This takes a set of heartbeat timestamps and aggregates the liveness state of the underlying system for the specified time range.
Required arguments
Name | Type | Description |
---|---|---|
heartbeat | TIMESTAMPTZ | The column containing the timestamps of the heartbeats. |
agg_start | TIMESTAMPTZ | The start of the time range over which this aggregate is tracking liveness. |
agg_duration | INTERVAL | The length of the time range over which this aggregate is tracking liveness. Any point in this range that doesn't closely follow a heartbeat is considered to be dead. |
heartbeat_liveness | INTERVAL | How long the system is considered to be live after each heartbeat. |
Returns
Column | Type | Description |
---|---|---|
heartbeat_agg | HeartbeatAgg | The liveness data for the heartbeated system over the provided interval. |
Examples
Given a table called system_health
with a ping_time
column, construct an aggregate of system liveness for 10 days starting from Jan 1, 2022. This assumes a system is unhealthy if it hasn't been heard from in a 5 minute window:
SELECT heartbeat_agg(ping_time,'01-01-2022 UTC','10 days','5 min')FROM system_health;
dead_ranges(agg HEARTBEATAGG) RETURNS TABLE (start TIMESTAMPTZ,end TIMESTAMPTZ)
Given a heartbeat aggregate, this will return a set of (starttime, endtime) pairs representing when the underlying system did not have a valid heartbeat during the interval of the aggregate.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
Returns
Column | Type | Description |
---|---|---|
dead_ranges | TABLE (start TIMESTAMPTZ, end TIMESTAMPTZ) | The (start, end) pairs of when the system was down. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, we can use the following to get the intervals where the system was down during the week of Jan 9, 2022:
SELECT dead_ranges(health)FROM livenessWHERE date = '01-9-2022 UTC'
dead_ranges-----------------------------------------------------("2022-01-09 00:00:00+00","2022-01-09 00:00:30+00")("2022-01-12 15:27:22+00","2022-01-12 15:31:17+00")
downtime(agg HEARTBEATAGG) RETURNS INTERVAL
Given a heartbeat aggregate, this will sum all the ranges where the system
did not have a recent enough heartbeat.
There may appear to be some downtime between the start of the aggregate and
the first heartbeat. If there is a heartbeat aggregage covering the
previous period, you can use its last heartbeat to correct for this using
interpolated_downtime()
.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
Returns
Column | Type | Description |
---|---|---|
downtime | INTERVAL | The sum of all the dead ranges in the aggregate. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, we can use the following to get the total downtime of the system during the week of Jan 9, 2022:
SELECT downtime(health)FROM livenessWHERE date = '01-9-2022 UTC'
downtime--------00:04:25
interpolate(agg HEARTBEATAGG,pred HEARTBEATAGG) RETURNS HEARTBEATAGG
This takes a heartbeat aggregate and the aggregate immediately preceding it. It updates the aggregate to include any live ranges that should have been carried over from the last heartbeat in the predecessor, even if there aren't heartbeats for that range in the interval covered by this aggregate. It returns the updated aggregate, which can then be used with any of the heartbeat aggregate accessors.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate containing liveness data for a particular interval. |
Optional arguments
Name | Type | Description |
---|---|---|
pred | HeartbeatAgg | The heartbeat aggregate for the preceding interval, if one exists. |
Returns
Column | Type | Description |
---|---|---|
interpolate | HeartbeatAgg | A copy of agg which has been update to include any heartbeat intervals extending past the end of pred . |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, we can use the following to get the intervals where the system was unhealthy during the week of Jan 9, 2022. This correctly excludes any ranges covered by a heartbeat at the end of the Jan 2 week:
SELECT dead_ranges(interpolate(health,LAG(health) OVER (ORDER BY date)))FROM livenessWHERE date = '01-9-2022 UTC'
dead_ranges-----------------------------------------------------("2022-01-12 15:27:22+00","2022-01-12 15:31:17+00")
interpolated_downtime(agg HEARTBEATAGG,pred HEARTBEATAGG) RETURNS INTERVAL
This behaves very similarly to downtime()
, but it also takes the heartbeat aggregate from the preceding interval. It checks when the last heartbeat in the predecessor was received and makes sure not to consider the heartbeat interval after that time as unhealthy, even if it extends into the current aggregate prior to the first heartbeat.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
Optional arguments
Name | Type | Description |
---|---|---|
pred | HeartbeatAgg | The heartbeat aggregate for the interval before the one being measured, if one exists. |
Returns
Column | Type | Description |
---|---|---|
interpolated_downtime | INTERVAL | The sum of all the unhealthy ranges in the aggregate, excluding those covered by the last heartbeat of the previous interval. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, you can use this command to get the total interpolated downtime of the system during the week of Jan 9, 2022:
SELECT interpolated_downtime(health,LAG(health) OVER (ORDER BY date))FROM livenessWHERE date = '01-9-2022 UTC'
interpolated_downtime---------------------00:03:55
interpolated_uptime(agg HEARTBEATAGG,pred HEARTBEATAGG) RETURNS INTERVAL
This behaves very similarly to uptime()
, but it also takes the heartbeat aggregate from the preceding interval. It checks when the last heartbeat in the predecessor was received and makes sure that the entire heartbeat interval after that is considered live. This addresses the issue where uptime
would consider the interval between the start of the interval and the first heartbeat as dead.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
Optional arguments
Name | Type | Description |
---|---|---|
pred | HeartbeatAgg | The heartbeat aggregate for the interval before the one being measured, if one exists. |
Returns
Column | Type | Description |
---|---|---|
interpolated_uptime | INTERVAL | The sum of all the live ranges in the aggregate, including those covered by the last heartbeat of the previous interval. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, you can use this command to get the total interpolated uptime of the system during the week of Jan 9, 2022:
SELECT interpolated_uptime(health,LAG(health) OVER (ORDER BY date))FROM livenessWHERE date = '01-9-2022 UTC'
interpolated_uptime-------------------6 days 23:56:05
live_at(agg HEARTBEATAGG,test TIMESTAMPTZ) RETURNS BOOL
Given a heartbeat aggregate and a timestamp, this returns whether the
aggregate has a heartbeat indicating the system was live at the given time.
Note that this returns false for any time not covered by the aggregate.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
test | TimestampTz | The time to test the liveness of. |
Returns
Column | Type | Description |
---|---|---|
live_at | bool | True if the heartbeat aggregate had a heartbeat close before the test time. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, we can use the following to see if the sytem was live at a particular time:
SELECT live_at(health, '2022-01-12 15:30:00+00')FROM livenessWHERE date = '01-9-2022 UTC'
live_at---------f
live_ranges(agg HEARTBEATAGG) RETURNS TABLE (start TIMESTAMPTZ,end TIMESTAMPTZ)
Given a heartbeat aggregate, this returns a set of (starttime, endtime) pairs representing when the underlying system was live during the interval of the aggregate.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
Returns
Column | Type | Description |
---|---|---|
live_ranges | TABLE (start TIMESTAMPTZ, end TIMESTAMPTZ) | The (start, end) pairs of when the system was live. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, we can use the following to get the intervals where the system was live during the week of Jan 9, 2022:
SELECT live_ranges(health)FROM livenessWHERE date = '01-9-2022 UTC'
live_ranges-----------------------------------------------------("2022-01-09 00:00:30+00","2022-01-12 15:27:22+00")("2022-01-12 15:31:17+00","2022-01-16 00:00:00+00")
num_gaps(agg HEARTBEATAGG) RETURNS BIGINT
Given a heartbeat aggregate, this returns the number of gaps between the periods of liveness. Additionally, if the aggregate is not live at the start or end of its covered interval, these are also considered gaps.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the number of gaps from. |
Returns
Column | Type | Description |
---|---|---|
num_gaps | bigint | The number of gaps in the aggregate. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, use this query to see how many times the system was down in a particular week::
SELECT num_gaps(health)FROM livenessWHERE date = '01-9-2022 UTC'
num_gaps---------4
num_live_ranges(agg HEARTBEATAGG) RETURNS BIGINT
Given a heartbeat aggregate, this returns the number of live periods.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the number of ranges from. |
Returns
Column | Type | Description |
---|---|---|
num_live_ranges | bigint | The number of live ranges in the aggregate. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, use this query to see how many intervals the system was up in a given week::
SELECT num_live_ranges(health)FROM livenessWHERE date = '01-9-2022 UTC'
num_live_ranges---------5
trim_to(agg HEARTBEATAGG,start TIMESTAMPTZ,duration INTERVAL) RETURNS HEARTBEATAGG
Given a heartbeat aggregate, this reduces the time range covered by that aggregate. This can only be used to narrow the covered interval, passing arguments that would extend beyond the range covered by the initial aggregate gives an error.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to trim down. |
Optional arguments
Name | Type | Description |
---|---|---|
start | TimestampTz | The start of the trimmed range. If not provided, the returned heartbeat aggregate starts from the same time as the starting one. |
duration | Interval | How long the resulting aggregate should cover. If not provided, the returned heartbeat aggregate ends at the same time as the starting one. |
Returns
Column | Type | Description |
---|---|---|
trim_to | heartbeat_agg | The trimmed aggregate. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, use this query to roll up several weeks and trim the result to an exact month::
SELECT trim_to(rollup(health), '03-1-2022 UTC', '1 month')FROM livenessWHERE date > '02-21-2022 UTC' AND date < '3-7-2022 UTC'
uptime(agg HEARTBEATAGG) RETURNS INTERVAL
Given a heartbeat aggregate, this sums all the ranges where the system
was live and returns the total.
There may appear to be some downtime between the start of the aggregate and
the first heartbeat. If there is a heartbeat aggregage covering the
previous period, you can use its last heartbeat to correct for this using
interpolated_uptime()
.
Required arguments
Name | Type | Description |
---|---|---|
agg | HeartbeatAgg | A heartbeat aggregate to get the liveness data from. |
Returns
Column | Type | Description |
---|---|---|
uptime | INTERVAL | The sum of all the live ranges in the aggregate. |
Examples
Given a table called liveness
containing weekly heartbeat aggregates in column health
with timestamp column date
, you can use this command to get the total uptime of the system during the week of Jan 9, 2022:
SELECT uptime(health)FROM livenessWHERE date = '01-9-2022 UTC'
uptime-----------------6 days 23:55:35
rollup(heartbeatagg HEARTBEATAGG) RETURNS HEARTBEATAGG
This combines multiple heartbeat aggregates into one. This can be used
to combine aggregates into adjacent intervals into one larger interval,
such as rolling daily aggregates into a weekly or monthly aggregate.
Another use for this is to combine heartbeat aggregates for redundant
systems to determine if there were any overlapping failures. For instance,
a master and standby system can have their heartbeats combined to see if
there were any intervals where both systems were down at the same time. The
result of rolling overlapping heartbeats together like this is a
heartbeat aggregate which considers a time live if any of its component
aggregates were live.
Required arguments
Name | Type | Description |
---|---|---|
heartbeatagg | HeartbeatAgg | The heartbeat aggregates to roll up. |
Returns
Column | Type | Description |
---|---|---|
rollup | HeartbeatAgg | A heartbeat aggregate covering the interval from the earliest start time of its component aggregates to the latest end time. It combines the live ranges of all the components. |
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.