When you backfill data, you insert data into a chunk that has already been compressed. As of TimescaleDB 2.10, this is greatly simplified: you can run INSERT commands directly on compressed chunks. When bulk backfilling, it is recommended to pause the compression job until you are finished, so the policy doesn't compress a chunk while you are working on it.
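
For example, a late-arriving row can be inserted directly, even if the target chunk is compressed. This is a minimal sketch, assuming a hypertable named example with time, device_id, and value columns (the schema and values are illustrative):

    -- Works on version 2.10 and above, even if the target chunk is compressed
    INSERT INTO example (time, device_id, value)
    VALUES ('2023-01-01 00:00:00+00', 1, 90.5);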

This section contains procedures for bulk backfilling, taking you through these steps:

  1. Temporarily turning off any existing compression policy. This stops the policy from trying to compress chunks that you are currently working on.
  2. Performing the insertion or backfill.
  3. Re-enabling the compression policy. This re-compresses the chunks you worked on.

Caveats:

  • Backfilling compressed chunks with unique constraints is supported only in version 2.11 and above.
  • To backfill data while enforcing unique constraints, TimescaleDB might need to decompress some of the existing data. If you are backfilling a large amount of data, it can be more performant to manually decompress the chunks you are working on, as shown in the Backfilling manually section below.
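
For example, you can check which chunks in your backfill window are currently compressed before deciding how to proceed. This is a sketch, assuming the example hypertable and illustrative timestamps:

    -- List chunks overlapping the backfill window and their compression state
    SELECT chunk_schema, chunk_name, is_compressed
    FROM timescaledb_information.chunks
    WHERE hypertable_name = 'example'
    AND range_start < '2023-01-02' AND range_end > '2023-01-01';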

To make backfilling easier, you can use the backfilling functions in the TimescaleDB extras GitHub repository. In particular, the decompress_backfill procedure automates many of the backfilling steps for you.

Note

This section shows you how to bulk backfill data using a temporary table. Temporary tables only exist for the duration of the database session, and then are automatically dropped. If you backfill regularly, you might prefer to use a regular table instead, so that multiple writers can insert into the table at the same time. In this case, after you are done backfilling the data, clean up by truncating your table in preparation for the next backfill.
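
For example, a reusable staging table could be set up once and truncated after each backfill. A minimal sketch, where the table name cpu_staging is illustrative:

    -- Create a persistent staging table with the same schema as the hypertable
    CREATE TABLE cpu_staging AS SELECT * FROM example WITH NO DATA;

    -- After each backfill completes, clean up for the next run
    TRUNCATE cpu_staging;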

  1. At the psql prompt, create a temporary table with the same schema as the hypertable you want to backfill into. In this example, the table is named example, and the temporary table is named cpu_temp:

    CREATE TEMPORARY TABLE cpu_temp AS SELECT * FROM example WITH NO DATA;
  2. Insert your data into the temporary table.
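
    For example, you can bulk-load the staged data from a CSV file using psql's \copy command (the file name backfill.csv is illustrative):

    \copy cpu_temp FROM 'backfill.csv' CSV HEADER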

  3. Call the decompress_backfill procedure. This procedure halts the compression policy, identifies the compressed chunks that the backfilled data corresponds to, decompresses the chunks, inserts data from the backfill table into the main hypertable, and then re-enables the compression policy:

    CALL decompress_backfill(
        staging_table => 'cpu_temp',
        destination_hypertable => 'example'
    );

Backfilling manually

If you don't want to use a supplied function, you can perform the steps manually.

  1. At the psql prompt, find the job_id of the policy:

    SELECT j.job_id
    FROM timescaledb_information.jobs j
    WHERE j.proc_name = 'policy_compression'
    AND j.hypertable_name = '<target table>';
  2. Pause compression, to prevent the policy from trying to compress chunks that you are currently working on:

    SELECT alter_job(<job_id>, scheduled => false);
  3. Decompress the chunks that you want to modify:

    SELECT decompress_chunk('_timescaledb_internal._hyper_2_2_chunk');

    Repeat for each chunk. Alternatively, you can decompress a set of chunks based on a time range using show_chunks:

    SELECT decompress_chunk(i)
    FROM show_chunks('example', newer_than => '<start_time>', older_than => '<end_time>') i;
  4. When you have decompressed all the chunks you want to modify, perform the INSERT or UPDATE commands to backfill the data.
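
    For example, if the backfill data is staged in a table like the cpu_temp table used above, you can move it into the hypertable like this (a minimal sketch, assuming the staging table matches the hypertable schema):

    INSERT INTO example SELECT * FROM cpu_temp;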

  5. Restart the compression policy job. The next time the job runs, it recompresses any chunks that were decompressed.

    SELECT alter_job(<job_id>, scheduled => true);

    Alternatively, to recompress chunks immediately, use the run_job command:

    CALL run_job(<job_id>);
