Decompression
If you need to backfill or update data in a compressed chunk, decompress the chunk first. Inserting data into a compressed chunk is more computationally expensive than inserting into an uncompressed chunk, so decompressing first is especially important when you backfill large amounts of data.
Tip: When you are planning your compression strategy, make sure you leave enough additional storage capacity to decompress chunks when you need to.
These are the main steps for decompressing chunks in preparation for inserting or backfilling data (a combined sketch follows this list):
- Temporarily turn off any existing compression policy. This stops the policy from trying to compress chunks that you are currently working on.
- Decompress chunks.
- Perform the insertion or backfill.
- Re-enable the compression policy. This re-compresses the chunks you worked on.
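Put together, the workflow looks like this minimal sketch. The hypertable name metrics, the job ID 1000, the chunk name, and the column names are assumptions for illustration; the procedures below show how to find the real values:
-- Pause the compression policy (job ID is illustrative)
SELECT alter_job(1000, scheduled => false);
-- Decompress the chunk that covers the backfill period (chunk name is illustrative)
SELECT decompress_chunk('_timescaledb_internal._hyper_1_1_chunk');
-- Backfill the historical rows (columns are illustrative)
INSERT INTO metrics (time, device_id, temperature) VALUES ('2023-01-01 00:00:00', 1, 52.3);
-- Re-enable the policy; the chunk is recompressed on the next run
SELECT alter_job(1000, scheduled => true);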
Decompress chunks manually
There are several methods for selecting chunks and decompressing them.
Decompress individual chunks
To decompress a single chunk by name, run this command:
SELECT decompress_chunk('_timescaledb_internal.<chunk_name>');
In this example, `chunk_name` is the name of the chunk that you want to decompress.
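If you don't know the chunk name, you can list a hypertable's chunks first with `show_chunks`. This is a sketch, assuming a hypertable called metrics; the chunk name in the second statement is illustrative:
-- List all chunks for the hypertable
SELECT show_chunks('metrics');
-- Decompress one of the returned chunks
SELECT decompress_chunk('_timescaledb_internal._hyper_1_2_chunk');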
Decompress chunks by time
To decompress a set of chunks based on a time range, you can use the output of `show_chunks` to decompress each one:
SELECT decompress_chunk(i) FROM show_chunks('table_name', newer_than => <start_time>, older_than => <end_time>) i;
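For example, to decompress every chunk of a hypothetical metrics hypertable that holds data for January 2023 (the table name and timestamps are assumptions for illustration):
SELECT decompress_chunk(i)
FROM show_chunks('metrics', newer_than => '2023-01-01'::timestamptz, older_than => '2023-02-01'::timestamptz) i;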
Decompress chunks using more precise constraints
If you want to use more precise matching constraints, for example space partitioning, you can query the hypertable to find the chunks that contain matching rows:
SELECT tableoid::regclass FROM metrics
WHERE time = '2000-01-01' AND device_id = 1
GROUP BY tableoid;
tableoid
------------------------------------------
_timescaledb_internal._hyper_72_37_chunk
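You can then pass the returned chunk name to `decompress_chunk`:
SELECT decompress_chunk('_timescaledb_internal._hyper_72_37_chunk');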
Backfill historical data on compressed chunks
When you backfill data, you are inserting data that has a timestamp in the past into a corresponding chunk that has already been compressed.
In this section, we explain how to backfill data into a temporary table. Temporary tables only exist for the duration of the database session, and are then automatically dropped. This is the simplest method for doing a large backfill operation.
If you backfill regularly, you might prefer to use a regular table instead, so that multiple writers can insert into the table at the same time before the `decompress_backfill` process runs. In this case, after you are done backfilling the data, clean up by truncating your table in preparation for the next backfill, or drop it completely.
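For example, after a backfill completes, you could reset a staging table like this (a sketch, assuming the staging table is called cpu_temp, matching the procedure below):
-- Reuse the staging table for the next backfill
TRUNCATE cpu_temp;
-- Or remove it entirely
DROP TABLE cpu_temp;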
Backfill with a supplied function
If you need to insert a batch of backfilled data, the TimescaleDB extras GitHub repository includes functions for backfilling batch data to compressed chunks. In this procedure, we describe how to use the `decompress_backfill` function.
- At the psql prompt, create a temporary table with the same schema as the hypertable you want to backfill into. In this example, the hypertable is called `example`, and the temporary table is called `cpu_temp`:
CREATE TEMPORARY TABLE cpu_temp AS SELECT * FROM example WITH NO DATA;
- Insert your data into the backfill table. An example is shown after this procedure.
- Use a supplied backfill function. This function halts the compression
policy, identifies the compressed chunks that the backfilled data
corresponds to, decompresses the chunks, inserts data from the backfill
table into the main hypertable, and then re-enables the compression policy:
CALL decompress_backfill(staging_table=>'cpu_temp', destination_hypertable=>'example');
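To populate the backfill table in the second step, you might load rows from a CSV file or insert them directly. This is a sketch; the file path and the column names are assumptions for illustration:
\copy cpu_temp FROM 'backfill.csv' WITH (FORMAT csv, HEADER true)
-- Or insert rows directly (column names are illustrative)
INSERT INTO cpu_temp (time, device_id, temperature) VALUES ('2023-01-01 00:00:00', 1, 52.3);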
Backfill manually
If you don't want to use a supplied function, you can perform the steps manually. In this procedure, we describe how to identify and turn off your compression policy before manually decompressing chunks.
- At the psql prompt, find the `job_id` of the policy:
SELECT s.job_id
FROM timescaledb_information.jobs j
INNER JOIN timescaledb_information.job_stats s ON j.job_id = s.job_id
WHERE j.proc_name = 'policy_compression' AND s.hypertable_name = '<target table>';
- Pause compression, to prevent the policy from trying to compress chunks that you are currently working on:
SELECT alter_job(<job_id>, scheduled => false);
- Decompress the chunks that you want to modify:
SELECT decompress_chunk('_timescaledb_internal._hyper_2_2_chunk');
Repeat for each chunk. Alternatively, you can decompress a set of chunks based on a time range using `show_chunks`:
SELECT decompress_chunk(i) FROM show_chunks('conditions', newer_than => <start_time>, older_than => <end_time>) i;
- When you have decompressed all the chunks you want to modify, perform the `INSERT` or `UPDATE` commands to backfill the data.
- Restart the compression policy job:
SELECT alter_job(<job_id>, scheduled => true);
The next time the job runs, it recompresses any chunks that were decompressed. Alternatively, to recompress chunks immediately, use the `run_job` command:
CALL run_job(<job_id>);
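To confirm that the chunks were recompressed, you can query the timescaledb_information.chunks view. This is a sketch; the hypertable name conditions matches the example above:
SELECT chunk_name, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'conditions';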