To help you understand what is going on in your database, you can create your own custom visualizations and dashboards. TimescaleDB allows you to create custom dashboards for your data, using the full functionality of PostgreSQL monitoring. Of course, you can always use other commercial tools to monitor TimescaleDB, just as you can with PostgreSQL, but custom dashboards give you the most flexibility.
This tutorial shows you how to build a custom visualization that shows how many chunks a hypertable has, the state of the compression for each chunk, and the current total size of the database. The front-end is built in React, and connects to metrics about TimescaleDB using Hasura, a GraphQL service. This tutorial includes:
You can get the full code for this project from this GitHub repo.
This project works on any TimescaleDB instance, but if you’re interested in generating sample data to use, use our Simulating IoT sensor data tutorial.
TimescaleDB uses hypertables to store time-series data. TimescaleDB automatically partitions data in hypertables into smaller child tables called chunks. The chunks represent data for a given time period, which makes it easier to query and manage over time. For example, if you wanted to query data from 10am to 11am, instead of scanning your entire database, TimescaleDB would scan the specific chunks that contain data for just that period. All the interaction with the database still occurs on the hypertable using SQL, but TimescaleDB partitions the data to make large queries more efficient.
Many features in TimescaleDB rely on chunks, including continuous aggregates, data retention, and native compression. Native compression is particularly helpful with large time-series datasets. Time-series data can be relentless in quantity and speed, and difficult to store and query without a purpose-built time-series database. You can use TimescaleDB compression to save as much as 97% of your disk space for the same amount of data, and usually increase the speed of your queries over time.
Visualizing the state of your hypertables can help you gain a better understanding of how compression works and possibly even how different types impact compression efficiently. Visualization can help you see the results of compression table by table, and chunk by chunk. To do this, TimescaleDB provides multiple views and functions that can be queried for information about the state of your hypertables and chunks. Although there is no combined view that provides exactly the data we need for our visualization, TimescaleDB provides the building blocks to craft a custom SQL query that returns the data needed to better visualize the current hypertable and chunk compression state. For example, this query returns the name and time series range that this chunk covers:
tsdb=> SELECT chunk_name, range_start, range_end FROM timescaledb_information.chunks LIMIT 1; chunk_name | range_start | range_end ------------------+------------------------+------------------------ _hyper_2_2_chunk | 2021-04-29 00:00:00+00 | 2021-05-06 00:00:00+00 (1 row)
Hypertables that have data spanning massive time periods can have thousands of chunks, so visualizing them effectively is important. To provide a visual perspective of the table, the image area represents the total size of all table data before compression. Each circle represents a chunk, and the area of each circle represents the size of the chunk on disk.
Here’s an example of what this visualization looks like:
With this visualization, you can see a few things at a glance:
By using the uncompressed data size to represent the area of the image, you can quickly get a sense of how much space has been saved by the overall white space across the image. Smaller yellow chunks are compressed and their size represents their portion of space within the larger table, while larger dark chunks are uncompressed and take up more space in the image. You can also make the visualization interactive, so that you can click on a chunk and compress or uncompress it manually.
To build the visualization application, we created some new functions and views to:
To extract information from chunks, you can use the
timescaledb_information.chunks view that the TimescaleDB extension provides.
This query returns the time-series range of each chunk:
SELECT hypertable_schema, hypertable_name, chunk_name, range_start, range_end FROM timescaledb_information.chunks LIMIT 1;
Sample row vertically output to explore:
-[ RECORD 1 ]-----+----------------------- hypertable_schema | public hypertable_name | conditions chunk_name | _hyper_2_2_chunk range_start | 2021-04-29 00:00:00+00 range_end | 2021-05-06 00:00:00+00
The chunk name returned with the dataset is unique, and can be used in other
queries to retrieve enhanced details about each chunk. In this example, the
chunk has a
range_end that spans one week. As new data is
inserted into the table, any data that has a timestamp between 2021-04-29 and
2021-05-06 is stored on this specific chunk for the
When you know the name and time range of each chunk, you need to get more
detail about the compression status and how much disk is being saved by
compressing the data. You can get this additional information by querying the
chunk_compression_stats function with the
tsdb=> SELECT * FROM chunk_compression_stats('conditions'); -[ RECORD 1 ]------------------+---------------------- chunk_schema | _timescaledb_internal chunk_name | _hyper_6_913_chunk compression_status | Compressed before_compression_table_bytes | 204800 before_compression_index_bytes | 360448 before_compression_toast_bytes | 0 before_compression_total_bytes | 565248 after_compression_table_bytes | 8192 after_compression_index_bytes | 16384 after_compression_toast_bytes | 98304 after_compression_total_bytes | 122880 node_name | -[ RECORD 2 ]------------------+---------------------- chunk_schema | _timescaledb_internal chunk_name | _hyper_6_880_chunk compression_status | Uncompressed before_compression_table_bytes | before_compression_toast_bytes | before_compression_total_bytes | after_compression_table_bytes | after_compression_index_bytes | after_compression_toast_bytes | after_compression_total_bytes | node_name |
When the chunk is uncompressed, this query does not show the size of the chunk.
To get the size of uncompressed chunks, use the
and pass the hypertable name as a parameter:
tsdb=> SELECT * FROM chunks_detailed_size('conditions'); -[ RECORD 1 ]+---------------------- chunk_schema | _timescaledb_internal chunk_name | _hyper_6_853_chunk table_bytes | 8192 index_bytes | 40960 toast_bytes | 98304 total_bytes | 147456 node_name |
You can use the
total_bytes information in this function to see that the chunk
Now that you know how to gather all of the data you need to drive the visualization, it’s time to join it together in a view that can be queried using SQL (and eventually, our application).
CREATE OR REPLACE VIEW chunks_with_compression AS SELECT DISTINCT ch.chunk_name, ccs.chunk_schema, ch.hypertable_schema, ch.hypertable_name, ch.range_start, ch.range_end, COALESCE(ccs.before_compression_total_bytes, NULL, cds.total_bytes) AS before_compression_total_bytes, ccs.after_compression_total_bytes FROM ( SELECT hypertable_schema, hypertable_name, chunk_name, range_start, range_end FROM timescaledb_information.chunks) AS ch LEFT OUTER JOIN LATERAL chunk_compression_stats(ch.hypertable_name::regclass) ccs ON ch.chunk_name = ccs.chunk_name LEFT OUTER JOIN LATERAL chunks_detailed_size(ch.hypertable_name::regclass) cds ON ccs.chunk_schema = cds.chunk_schema AND ch.chunk_name = cds.chunk_name;
To test, use the name of a random chunk from the hypertable to query this view and check that you get all of the information you need. You should see the time range of the chunk, the hypertable information, and its size before and after compression.
In this example chunk, the
before_compression_total_bytes is ten times bigger
after_compression_total_bytes. Compression saved more than 90% of disk
SELECT * FROM chunks_with_compression; ... -[ RECORD 96 ]-----------------+----------------------- chunk_name | _hyper_2_37_chunk chunk_schema | _timescaledb_internal hypertable_schema | public hypertable_name | conditions range_start | 2021-05-27 00:00:00+00 range_end | 2021-06-03 00:00:00+00 before_compression_total_bytes | 90112 after_compression_total_bytes | 8192
In this example, we are using data generated by our Simulating IoT sensor data tutorial. This data results in a simple schema and data that mimics a number of IoT sensors with information on time, device, and temperature.
By following the tutorial, you have a table named
conditions, which stores the
temperature of example devices over time.
Use these commands to create the table and generate some sample data:
CREATE TABLE conditions ( time TIMESTAMPTZ NOT NULL, device INTEGER NOT NULL, temperature FLOAT NOT NULL, PRIMARY KEY(time, device) ); SELECT * FROM create_hypertable('conditions', 'time', 'device', 3); INSERT INTO conditions SELECT time, (random()*30)::int, random()*80 - 40 FROM generate_series(TIMESTAMP '2020-01-01 00:00:00', TIMESTAMP '2020-01-01 00:00:00' + INTERVAL '1 month', INTERVAL '1 min') AS time;
When you write a backend application, you need to protect the database and expose only the required information to an authorized user. The Hasura GraphQL Engine does this by providing GraphQL APIs over new or existing PostgreSQL databases. This allows you to create permission rules and dynamically expand your database resources.
When you have your sample database set up, you can use the Hasura cloud to connect the resources that we want to expose through GraphQL. Hasura is a good option because it connects to our TimescaleDB database and quickly exposes the tables, views, and functions you need. For more information about setting up a new data source on Hasura, check out their wizard.
We’re going to use two types of operations:
Hasura allows you to attach any resource and offer it as a query or a
subscription. In this example, you map the
chunks_with_compression view you
created earlier as a GraphQL resource, so it can be consumed as a query or
subscription. You can then map the changes, or mutations, as you compress and
decompress a chunk. This image describes a SQL view is tracked on Hasura:
Hasura can map custom types that come from table structures. To create the necessary mutations, functions need to return types that inherit from table structures. To create a new structure of the table from a query, call the query with limit 0:
CREATE TABLE compressed_chunk AS SELECT compress_chunk((c.chunk_schema ||'.' ||c.chunk_name)::regclass) FROM timescaledb_information.chunks c WHERE NOT c.is_compressed limit 0;
Hasura needs a function to be tracked as a mutation. Create a function to rewrap
compress_chunk from the TimescaleDB extension, and return the
"compressed_chunk" in a function that compresses the chunk:
CREATE OR REPLACE FUNCTION compress_chunk_named(varchar) returns setof compressed_chunk AS $$ SELECT compress_chunk((c.chunk_schema ||'.' ||$1)::regclass) FROM timescaledb_information.chunks c WHERE NOT c.is_compressed AND c.chunk_name = $1 limit 1 $$ LANGUAGE SQL VOLATILE;
Note that the function adds an extra
where clause so that it does not compress
a chunk that is already compressed.
You also need a similar function for decompression:
CREATE OR REPLACE FUNCTION decompress_chunk_named(varchar) returns setof compressed_chunk AS $$ SELECT decompress_chunk((c.chunk_schema ||'.' ||$1)::regclass) FROM timescaledb_information.chunks c WHERE c.is_compressed AND c.chunk_name = $1 limit 1 $$ LANGUAGE SQL VOLATILE;
The next step is to go to the Hasura cloud and connect the database as a new
data source. In the data panel, set up the PostgreSQL URI of your database, and
then you can track each function as a query or mutation. This is an example of
compress_chunk_named function. In our case, the subscription goes to the
chunks_with_compression function. You can also track
compress_chunk_named as GQL mutations with a single argument.
For the full code of our front-end application, see our GitHub repo. The front-end application connects to the Hasura GraphQL layer you created, then connects to the TimescaleDB database to retrieve information about chunks and compression status. The front-end application then renders the image for the visualization.
As a summary, the front-end:
TimescaleDB is a powerful relational database for time-series data, bringing the full spectrum of tools and dashboards available for PostgreSQL.
In this tutorial you learned how to collect hypertables metadata from TimescaleDB internals. Expose it through GraphQL and fetch the data using a React client.
You can get the full code for this project from this GitHub repo.
This tutorial was originally created for HasuraCon 2021.
We hope you find new ways to explore your data and make your decisions smarter and data-driven. If you get any interesting results or have any questions about this tutorial, drop us a line on our community Slack channel.
Found an issue on this page?Report an issue!