Hypercore is the TimescaleDB hybrid row-columnar storage engine used by hypertables. Hypertables partition your data in chunks. Chunks stored in the rowstore use a row-oriented data format optimized for high-speed inserts and updates. Chunks stored in the columnstore use a columnar data format optimized for analytics. You ingest hot data into the rowstore. As data cools and becomes more suited for analytics, Timescale Cloud automatically converts these chunks of data to the columnstore. You define the moment when data is converted using a columnstore policy.
When you convert chunks from the rowstore to the columnstore, multiple records are grouped into a single row. The columns of this row hold an array-like structure that stores all the data. For example, data in the following rowstore chunk:
Timestamp | Device ID | Device Type | CPU | Disk IO |
---|---|---|---|---|
12:00:01 | A | SSD | 70.11 | 13.4 |
12:00:01 | B | HDD | 69.70 | 20.5 |
12:00:02 | A | SSD | 70.12 | 13.2 |
12:00:02 | B | HDD | 69.69 | 23.4 |
12:00:03 | A | SSD | 70.14 | 13.0 |
12:00:03 | B | HDD | 69.70 | 25.2 |
Is converted and compressed into arrays in a row in the columnstore:
Timestamp | Device ID | Device Type | CPU | Disk IO |
---|---|---|---|---|
[12:00:01, 12:00:01, 12:00:02, 12:00:02, 12:00:03, 12:00:03] | [A, B, A, B, A, B] | [SSD, HDD, SSD, HDD, SSD, HDD] | [70.11, 69.70, 70.12, 69.69, 70.14, 69.70] | [13.4, 20.5, 13.2, 23.4, 13.0, 25.2] |
Because a single row takes up less disk space, you can reduce your chunk size by more than 90%, and can also speed up your queries. This saves on storage costs, and keeps your queries operating at lightning speed.
This page shows you how to get the best results when you set a policy to automatically convert chunks in a hypertable from the rowstore to the columnstore.
To follow the procedure on this page you need to:
- Create a target Timescale Cloud service
The code samples in this page use the
data from Try the key Timescale features.The compression ratio and query performance of data in the columnstore is dependent on the order and structure of your
data. Rows that change over a dimension should be close to each other. With time-series data, you orderby
the time
dimension. For example, Timestamp
:
Timestamp | Device ID | Device Type | CPU | Disk IO |
---|---|---|---|---|
12:00:01 | A | SSD | 70.11 | 13.4 |
This ensures that records are compressed and accessed in the same order. However, you would always have to
access the data using the time dimension, then filter all the rows using other criteria. To make your queries more
efficient, you segment your data based on the way you want to access it. For example, to rapidly access data about a
single device, you segmentby
the Device ID
column. This enables you to run much faster analytical queries on
data in the columnstore.
When Timescale Cloud converts a chunk to the columnstore, TimescaleDB automatically creates a different schema for your
data. TimescaleDB creates and uses custom indexes to incorporate the segmentby
and orderby
parameters when
you write to and read from the columstore.
To set up your Hypercore automation:
Connect to your Timescale Cloud service
In Timescale Console open an SQL editor. You can also connect to your service using psql.
Enable columnstore on a hypertable
Create a job that automatically moves chunks in a hypertable to the columnstore at a specific time interval. By default, your table is
orderedby
the time column. For efficient queries on columnstore data, remember tosegmentby
the column you will use most often to filter your data:- Use
ALTER TABLE
for a hypertableALTER TABLE crypto_ticks SET (timescaledb.enable_columnstore = true,timescaledb.segmentby = 'symbol'); - Use ALTER MATERIALIZED VIEW for a continuous aggregateBefore you sayALTER MATERIALIZED VIEW assets_candlestick_daily set (timescaledb.enable_columnstore = true,timescaledb.segmentby = 'symbol' );
huh
, a continuous aggregate is a specialized hypertable.
- Use
Add a policy to convert chunks to the columnstore at a specific time interval
For example, move yesterday's crypto trading data to the columnstore:
CALL add_columnstore_policy('crypto_ticks', after => INTERVAL '1d');Check the columstore policy
View your data space saving:
When you convert data to the columnstore, as well as being optimized for analytics, it is compressed by more than 90%. This saves on storage costs and keeps your queries operating at lightning speed. To see the amount of space saved:
SELECTpg_size_pretty(before_compression_total_bytes) as before,pg_size_pretty(after_compression_total_bytes) as afterFROM hypertable_compression_stats('crypto_ticks');You see something like:
before after 194 MB 24 MB View the policies that you set or the policies that already exist:
SELECT * FROM timescaledb_information.jobsWHERE proc_name='policy_compression';
Pause a columnstore policy
If you need to modify or add a lot of data to a chunk in the columnstore, best practice is to stop any jobs moving chunks to the columnstore, convert the chunk back to the rowstore, then modify the data. After the update, convert the chunk to the columnstore and restart the jobs.
SELECT * FROM timescaledb_information.jobs whereproc_name = 'policy_compression' AND relname = 'crypto_ticks'-- Select the JOB_ID from the resultsSELECT alter_job(JOB_ID, scheduled => false);See alter_job.
Restart a columnstore policy
SELECT alter_job(JOB_ID, scheduled => true);See alter_job.
Remove a columnstore policy
CALL remove_columnstore_policy('crypto_ticks');Disable columnstore
If your table has chunks in the columnstore, you have to convert the chunks back to the rowstore before you disable the columnstore.
ALTER TABLE crypto_ticks SET (timescaledb.enable_columnstore = false);
Timescale Cloud charges are based on the amount of storage you use. You don't pay for fixed storage size, and you don't need to worry about scaling disk size as your data grows - we handle it all for you. To reduce your data costs further, combine Hypercore, a data retention policy, and tiered storage.
For integers, timestamps, and other integer-like types, data is compressed using delta encoding, delta-of-delta, simple-8b, and run-length encoding. For columns with few repeated values,XOR-based and dictionary compression is used. For all other types, dictionary compression is used.
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.