About Timescale hyperfunctions

Timescale hyperfunctions are a specialized set of functions that allow you to analyze time-series data. You can use hyperfunctions to analyze anything you have stored as time-series data, including IoT devices, IT systems, marketing analytics, user behavior, financial metrics, and cryptocurrency.

Hyperfunctions allow you to perform critical time-series queries quickly, analyze time-series data, and extract meaningful information. They aim to identify, build, and combine all of the functionality SQL needs to perform time-series analysis into a single extension.

Some hyperfunctions are included in the default TimescaleDB product. For additional hyperfunctions, you need to install the Timescale Toolkit PostgreSQL extension.

Hyperfunctions available with TimescaleDB and Timescale Toolkit

Here is a list of all the hyperfunctions provided by Timescale. Hyperfunctions marked 'Toolkit' require an installation of Timescale Toolkit. Hyperfunctions marked 'experimental' are still under development.

warning

Experimental features could have bugs. They might not be backwards compatible, and could be removed in future releases. Use these features at your own risk, and do not use any experimental features in production.

Approximate count distincts

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
HyperloglogAggregateapprox_count_distinct()
hyperloglog()
Accessordistinct_count()
stderror()
Rolluprollup()

Saturating math

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
Saturating mathOthersaturating_add()
saturating_add_pos()
saturating_mul()
saturating_sub()
saturating_sub_pos()

Statistical aggregates

Financial analysis

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
OpenHighLowCloseAggregateohlc()
Accessoropen() | high() | low() | close()
open_time() | high_time() | low_time() | close_time()
Rolluprollup()

Gapfilling and interpolation

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
InterpolationOtherinterpolate()
locf()
GapfillOthertime_bucket_gapfill()

Percentile approximation

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
Percentile approximationAggregatepercentile_agg()
Accessorapprox_percentile()
approx_percentile_rank()
error()
max_val()
mean()
min_val()
num_vals()
Rolluprollup()
Advanced aggregationAggregatetdigest()
uddsketch()

Metric aggregation

Time-weighted averages

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
Time-weighted averagesAggregatetime_weight()
Accessoraverage()
first_time() | last_time()
first_val() | last_val()
integral()
interpolated_average()
interpolated_integral()
Rolluprollup()

Downsample

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
DownsampleOtherasap_smooth()
gp_lttb()
lttb()

Frequency analysis

Hyperfunction subfamilyHyperfunction typeAPI callToolkitExperimental
CountMinSketchAggregatecount_min_sketch()
Accessorapprox_count()
StateAggAggregatestate_agg()
Accessorduration_in()
interpolated_duration_in()
into_values()
SpaceSavingAggregateAggregatefreq_agg()
topn_agg()
Accessorinto_values()
min_frequency() | max_frequency()
topn()

For more information about each of the API calls listed in this table, see our hyperfunction API documentation.

Function pipelines

Function pipelines are an experimental feature, designed to radically improve the developer ergonomics of analyzing data in PostgreSQL and SQL, by applying principles from functional programming and popular tools like Python's Pandas, and PromQL.

SQL is the best language for data analysis, but it is not perfect, and at times can get quite unwieldy. For example, this query gets data from the last day from the measurements table, sorts the data by the time column, calculates the delta between the values, takes the absolute value of the delta, and then takes the sum of the result of the previous steps:

SELECT device id,
sum(abs_delta) as volatility
FROM (
 SELECT device_id,
abs(val - lag(val) OVER last_day) as abs_delta
FROM measurements
WHERE ts >= now()-'1 day'::interval) calc_delta
GROUP BY device_id;

You can express the same query with a function pipeline like this:

SELECT device_id,
 timevector(ts, val) -> sort() -> delta() -> abs() -> sum() as volatility
FROM measurements
WHERE ts >= now()-'1 day'::interval
GROUP BY device_id;

Function pipelines are completely SQL compliant, meaning that any tool that speaks SQL is able to support data analysis using function pipelines.

For more information about how function pipelines work, read our blog post.

Toolkit feature development

Timescale Toolkit features are developed in the open. As features are developed they are categorized as experimental, beta, stable, or deprecated. This documentation covers the stable features, but more information on our experimental features in development can be found in the Toolkit repository.

Contribute to Timescale Toolkit

We want and need your feedback! What are the frustrating parts of analyzing time-series data? What takes far more code than you feel it should? What runs slowly, or only runs quickly after many rewrites? We want to solve community-wide problems and incorporate as much feedback as possible.

Found an issue on this page?

Report an issue!

Keywords

Related Content