This document provides step-by-step instructions for using pgcopydb to perform a live migration from a source PostgreSQL database to Timescale.

You should provision a dedicated instance to run the migration steps from, ideally an AWS EC2 instance in the same region as the Timescale target service. For an ingestion load of 10,000 transactions/s, and assuming that the historical data copy takes 2 days, we recommend 4 CPUs with 4 to 8 GiB of RAM and 1.2 TiB of storage.

Before beginning the migration process, ensure that the tools psql, pg_dump, pg_dumpall, and pgcopydb are installed and available on the system that performs the migration.

For Debian and Ubuntu systems, you can install all the tools with:

sudo apt update
sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
sudo apt install -y pgcopydb
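
You can confirm that the tools are installed and on your PATH by checking their versions:

psql --version
pg_dump --version
pg_dumpall --version
pgcopydb --version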

On other distributions, you can use the following installation instructions:

  • pgcopydb: Installation instructions can be found in the official repository. When installing from package managers like apt, yum, or dnf, the other required tools are usually installed as dependencies of pgcopydb.

  • psql, pg_dump, and pg_dumpall: These can be installed by following the instructions in the How to Install psql on Mac, Ubuntu, Debian, Windows blog post. Although the instructions specifically mention psql, following them also installs pg_dump and pg_dumpall.

Note

In the context of migrations, your existing production database is referred to as the "source" database, while the new Timescale database that you intend to migrate your data to is referred to as the "target" database.

In detail, the migration process consists of the following steps:

  1. Set up a target database instance in Timescale.
  2. Prepare the source database for the live migration.
  3. Set up a replication slot and snapshot.
  4. Migrate roles and schema from source to target.
  5. Enable hypertables.
  6. Migrate initial data from source to target.
  7. Apply the replicated changes from source.
  8. Promote the target database as the new primary.

Tip

If you get stuck, you can get help by opening a support request, or by taking your issue to the #migration channel in the community Slack, where the developers of this migration method are ready to help.

You can open a support request directly from the Timescale console, or by email to [email protected].

Create a database service in Timescale.

If you intend to migrate more than 400 GB, open a support request to ensure that enough disk space is pre-provisioned on your Timescale instance.

You can open a support request directly from the Timescale console, or by email to [email protected].

It's important to ensure that the old_snapshot_threshold value is set to the default value of -1 in your source database. This prevents PostgreSQL from treating the data in a snapshot as outdated. If this value is set to anything other than -1, it might affect the existing data migration step.

To check the current value of old_snapshot_threshold, run the command:

psql -X -d $SOURCE -c 'show old_snapshot_threshold'

If the query returns something other than -1, you must change it.

If you have a superuser on a self-hosted database, run the following command:

psql -X -d $SOURCE -c 'alter system set old_snapshot_threshold=-1'

Otherwise, if you are using a managed service, use your cloud provider's configuration mechanism to set old_snapshot_threshold to -1.

Next, you should set wal_level to logical so that the write-ahead log (WAL) records information that is needed for logical decoding.

To check the current value of wal_level, run the command:

psql -X -d $SOURCE -c 'show wal_level'

If the query returns something other than logical, you must change it.

If you have a superuser on a self-hosted database, run the following command:

psql -X -d $SOURCE -c 'alter system set wal_level=logical'

Otherwise, if you are using a managed service, use your cloud provider's configuration mechanism to set wal_level to logical.

Restart your database for the changes to take effect, and verify that the settings are reflected in your database.
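
For example, on a self-hosted Linux instance managed by systemd, the restart and verification might look like this (the service name varies by distribution and PostgreSQL version):

sudo systemctl restart postgresql
psql -X -d $SOURCE -c 'show old_snapshot_threshold' -c 'show wal_level'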

The replication slot forms the backbone of the replication strategy.

A slot represents a stream of changes that can be replayed to a client in the order they were made on the origin server.

The stream of changes emitted by the slot is buffered on disk until the changes are applied on the target. The instance used to orchestrate the migration (the one running the commands) should have enough capacity to store these files, and it should be actively monitored to prevent issues caused by lack of disk space.
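
For example, you can periodically check the space consumed by the buffered changes and the free space remaining on the instance (assuming pgcopydb's default work directory under /tmp):

du -sh /tmp/pgcopydb
df -h /tmp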

Before starting, there's an important caveat. To replicate DELETE and UPDATE operations, the source table must either have a primary key or REPLICA IDENTITY set. Replica identity assists logical decoding in identifying the rows being modified. It defaults to using the table's primary key.

If a table doesn't have a primary key, you'll have to manually set the replica identity. One option is to use a unique, non-partial, non-deferrable index that includes only columns marked as NOT NULL. This can be set as the replica identity:

ALTER TABLE {table_name} REPLICA IDENTITY USING INDEX {index_name}

If there's no primary key or viable unique index to use, you'll have to set REPLICA IDENTITY to FULL. If you are expecting a large number of UPDATE or DELETE operations on the table, we do not recommend using FULL. For each UPDATE or DELETE statement, Postgres has to read the whole table to find all matching rows, which results in significantly slower replication.

ALTER TABLE {table_name} REPLICA IDENTITY FULL
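
To find the tables that need attention, a catalog query along these lines lists ordinary tables that have no primary key (a sketch; adjust the schema filter to your setup):

psql $SOURCE -f - <<'EOF'
SELECT c.relnamespace::regnamespace AS schema_name,
       c.relname AS table_name
FROM pg_class c
WHERE c.relkind = 'r'
  AND c.relnamespace::regnamespace::text NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
    SELECT 1 FROM pg_index i
    WHERE i.indrelid = c.oid AND i.indisprimary
  )
ORDER BY 1, 2;
EOF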

Once you're sure all your tables are properly configured for replication, use pgcopydb's follow command to create a replication slot:

pgcopydb follow \
--source "$SOURCE" \
--target "$TARGET" \
--fail-fast \
--plugin wal2json

This command is going to be active during most of the migration process. You can run it in a separate terminal instance, or start it in the background. To start it in the background, append > pgcopydb_follow.log 2>&1 & to the command. For example:

pgcopydb follow \
--source "$SOURCE" \
--target "$TARGET" \
--fail-fast \
--plugin wal2json > pgcopydb_follow.log 2>&1 &

The > pgcopydb_follow.log 2>&1 part redirects all the messages to the pgcopydb_follow.log file. This is optional but recommended: the pgcopydb follow command outputs many messages, and if they are not redirected, using the terminal becomes cumbersome due to the constant stream of output.
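
If you redirected the output, you can still follow the progress of the command with:

tail -f pgcopydb_follow.log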

The follow command not only creates the replication slot for streaming changes, but also exports a snapshot ID to /tmp/pgcopydb/snapshot. This ID can be used to migrate the data that was stored in the database prior to the creation of the replication slot.

A snapshot determines which data is visible to the transaction that is using the snapshot. Synchronized snapshots are necessary when two or more sessions need to see identical content in the database.

Before the stream of changes can be applied, the schema and data that existed prior to the creation of the replication slot must be migrated (step 4). The point that marks the beginning of the replication and buffering of changes is given by the exported snapshot. The larger the database, the more time it takes to perform the initial migration, and the longer the buffered files need to be stored.
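
You can verify that the snapshot was exported by printing the file:

cat /tmp/pgcopydb/snapshot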

Dump the roles from the source database:

pg_dumpall -d "$SOURCE" \
--quote-all-identifiers \
--roles-only \
--file=roles.sql

Timescale services do not support roles with superuser access. If your SQL dump includes roles that have such permissions, you'll need to modify the file to be compliant with Timescale's security model.

You can use the following sed command to remove unsupported statements and permissions from your roles.sql file:

sed -i -E \
-e '/CREATE ROLE "postgres";/d' \
-e '/ALTER ROLE "postgres"/d' \
-e 's/(NO)*SUPERUSER//g' \
-e 's/(NO)*REPLICATION//g' \
-e 's/(NO)*BYPASSRLS//g' \
-e 's/GRANTED BY "[^"]*"//g' \
roles.sql
Note

This command works only with the GNU implementation of sed (sometimes referred to as gsed). For the BSD implementation (the default on macOS), you need to add an extra argument to change the -i flag to -i ''.

To check the sed version, you can use the command sed --version. While the GNU version explicitly identifies itself as GNU, the BSD version of sed generally doesn't provide a straightforward --version flag and simply outputs an "illegal option" error.
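
For reference, the equivalent invocation with BSD sed (the macOS default) would be:

sed -i '' -E \
-e '/CREATE ROLE "postgres";/d' \
-e '/ALTER ROLE "postgres"/d' \
-e 's/(NO)*SUPERUSER//g' \
-e 's/(NO)*REPLICATION//g' \
-e 's/(NO)*BYPASSRLS//g' \
-e 's/GRANTED BY "[^"]*"//g' \
roles.sql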

A brief explanation of this script is:

  • CREATE ROLE "postgres"; and ALTER ROLE "postgres": These statements are removed because they require superuser access, which is not supported by Timescale.

  • (NO)SUPERUSER | (NO)REPLICATION | (NO)BYPASSRLS: These are permissions that require superuser access.

  • GRANTED BY role_specification: The GRANTED BY clause can also have permissions that require superuser access and should therefore be removed. Note: Per the TimescaleDB documentation, the GRANTOR in the GRANTED BY clause must be the current user, and this clause mainly serves the purpose of SQL compatibility. Therefore, it's safe to remove it.
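
After editing, you can spot-check that no unsupported statements remain in the file (an optional sanity check):

grep -E 'SUPERUSER|REPLICATION|BYPASSRLS|GRANTED BY' roles.sql || echo 'roles.sql is clean'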

Next, dump the schema from the source database, using the snapshot exported in step 3:

pg_dump -d "$SOURCE" \
--format=plain \
--quote-all-identifiers \
--no-tablespaces \
--no-owner \
--no-privileges \
--schema-only \
--file=dump.sql \
--snapshot=$(cat /tmp/pgcopydb/snapshot)

  • --schema-only is used to dump only the object definitions (schema), not data.

  • --snapshot is used to specify the synchronized snapshot when making a dump of the database.

  • --no-tablespaces is required because Timescale does not support tablespaces other than the default. This is a known limitation.

  • --no-owner is required because Timescale's tsdbadmin user is not a superuser and cannot assign ownership in all cases. This flag means that everything is owned by the user used to connect to the target, regardless of ownership in the source. This is a known limitation.

  • --no-privileges is required because Timescale's tsdbadmin user is not a superuser and cannot assign privileges in all cases. This flag means that privileges assigned to other users must be reassigned in the target database as a manual clean-up task. This is a known limitation.
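
For example, if a role on the source had read access to tables in the public schema, you would re-grant it manually on the target after the migration (the role name here is hypothetical):

psql -X -d "$TARGET" -c 'GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporting;'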

Load the roles and schema into the target database:

psql -X -d "$TARGET" \
-v ON_ERROR_STOP=1 \
--echo-errors \
-f roles.sql \
-f dump.sql

This is the ideal point to convert regular tables into hypertables. In simple terms, you'll want to convert the tables that contain time-series data. For each table that's going to be converted to a hypertable in the target database, run the following command:

psql -X -d "$TARGET" \
-v ON_ERROR_STOP=1 \
-c "SELECT create_hypertable('<table name>', '<time column name>')"

A more detailed explanation can be found in the hypertable documentation.
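
For example, for a hypothetical conditions table whose time column is named time (both names here are placeholders):

psql -X -d "$TARGET" \
-v ON_ERROR_STOP=1 \
-c "SELECT create_hypertable('conditions', 'time')"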

Once the table is converted, you can follow the guides to enable more Timescale features like retention and compression.

Note

This step is optional, but we strongly recommend that you perform it now.

While it is possible to convert a table to a hypertable after the migration is complete, that requires effectively rewriting all data in the table, which locks the table for the duration of the operation and prevents writes.

Use pgcopydb and the snapshot ID obtained in step 3 (/tmp/pgcopydb/snapshot) to copy the data. This action can take a significant amount of time, up to several days, depending on the size of the source database. You must wait until it finishes before moving on to the next step. To start copying data, execute the following command:

pgcopydb copy table-data \
--source "$SOURCE" \
--target "$TARGET" \
--snapshot $(cat /tmp/pgcopydb/snapshot) \
--split-tables-larger-than '10 GB' \
--table-jobs 8

  • --snapshot specifies the synchronized snapshot to use when retrieving the data. It must be set to the snapshot exported in step 3, which guarantees that the only data copied is the data that existed in the database prior to the creation of the replication slot.

  • --split-tables-larger-than defines the table size threshold above which more than one process is used at the same time to copy a single source table.

  • --table-jobs is the number of concurrent COPY jobs to run.

With the schema and initial data loaded, you can now apply the buffered and live changes emitted by the replication slot:

pgcopydb stream sentinel set apply --source $SOURCE

The replication progress can be monitored by querying the pg_stat_replication view:

psql $SOURCE \
-f - <<'EOF'
SELECT replay_lag
FROM pg_stat_replication
WHERE application_name='pgcopydb' AND state='streaming'
EOF

  • replay_lag: Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written, flushed and applied it.
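
To watch how the lag evolves, you can poll the view in a simple loop (a sketch; adjust the interval as needed):

while true; do
  psql -X -t -d "$SOURCE" -c "SELECT replay_lag FROM pg_stat_replication WHERE application_name = 'pgcopydb' AND state = 'streaming';"
  sleep 10
done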

In this step you will switch production traffic over to the new database. In order to achieve transactional consistency, your application must stop writing data to the source database, and all writes should be flushed from the source to the target before your application starts writing to the target database.

This means that your application will experience partial downtime (in writes). The duration of this downtime is proportional to the steady-state replay_lag. Ideally, this is on the order of seconds or minutes.

Before you proceed, ensure that you have checked the current value of replay_lag, and that you are comfortable with a partial downtime of this duration.

Update the table statistics by running ANALYZE on all data:

psql $TARGET -c "ANALYZE;"
Note

Application downtime begins here

Stop the write traffic to the source database. With no new data being ingested, the replay_lag should decrease considerably faster.

Instruct pgcopydb to stop when it's done applying the changes that have been generated up to this point:

pgcopydb stream sentinel set endpos --current --source $SOURCE

Data written to the source database after this point won't be replicated.

Changes to sequences are not replicated. Fortunately, pgcopydb has a command to migrate them:

pgcopydb copy sequences \
--source "$SOURCE" \
--target "$TARGET" \
--resume \
--not-consistent

Wait until the pgcopydb follow process from step 3 finishes its execution. The process runs until the end position is reached. If you started pgcopydb follow in the background you can bring it to the foreground with the fg command.

A successful execution of the follow command should have logs stating that the end position has been reached and that the process is done:

07:58:28.859 119 INFO Transform reached end position 0/2ECDB58 at 0/2ECDB58
07:58:29.108 120 INFO Replay reached end position 0/2ECDB58 at 0/2ECDB58
07:58:29.168 120 INFO Replayed up to replay_lsn 0/1AB61F8, stopping
07:58:29.473 8 INFO Subprocesses for prefetch, transform, and catchup have now all exited
07:58:29.534 8 INFO Current sentinel replay_lsn is 0/2ECDB58, endpos is 0/2ECDB58
07:58:29.534 8 INFO Current endpos 0/2ECDB58 has been reached at 0/2ECDB58
07:58:29.534 8 INFO Follow mode is now done, reached replay_lsn 0/2ECDB58 with endpos 0/2ECDB58

If the command output was redirected to a file the messages won't be shown in the terminal even if you bring the process to the foreground. In this case, inspect the log file.
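
For example, you can search the log for the completion messages:

grep -E 'endpos|done' pgcopydb_follow.log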

Switch your application to use the target database, and start accepting writes again.

Note

Application downtime ends here

Finally, clean up the pgcopydb artifacts:

pgcopydb stream cleanup \
--source "$SOURCE" \
--target "$TARGET"
