Live migration

Timescale Cloud: Performance, Scale, Enterprise

Self-hosted products

MST

Live-migration is an end-to-end solution that copies the database schema and data to your target Timescale Cloud service, then replicates the database activity in your source database to the target service in real-time. Live-migration uses the PostgreSQL logical decoding functionality and leverages pgcopydb.

You use the live-migration Docker image to move 100GB-10TB+ of data to a Timescale Cloud service seamlessly with only a few minutes downtime.

Important

If you want to migrate more than 400GB of data, create a Timescale Console support request, or send us an email at support@timescale.com saying how much data you want to migrate. We will pre-provision your Timescale Cloud instance for you.

Best practice is to use live-migration when:

Modifying your application logic to perform dual writes is a significant effort.
The insert workload does not exceed 20,000 rows per second, and inserts are batched.
Use Dual write and backfill for greater workloads.
Your source database:
- Uses UPDATE and DELETE statements on uncompressed time-series data.
  Live-migration does not support replicating INSERT/UPDATE/DELETE statements on compressed data.
- Has large, busy tables with primary keys.
- Does not have many UPDATE or DELETE statements.

This page shows you how to move your data from a self-hosted database to a Timescale Cloud service using the live-migration Docker image.

Prerequisites

Best practice is to use an Ubuntu EC2 instance hosted in the same region as your Timescale Cloud service to move data. That is, the machine you run the commands on to move your data from your source database to your target Timescale Cloud service.

Before you move your data:

Create a target Timescale Cloud service.
Each Timescale Cloud service has a single database that supports the most popular extensions. Timescale Cloud services do not support tablespaces, and there is no superuser associated with a service. Best practice is to create a Timescale Cloud service with at least 8 CPUs for a smoother experience. A higher-spec instance can significantly reduce the overall migration window.
To ensure that maintenance does not run while migration is in progress, best practice is to adjust the maintenance window.

Install Docker on your migration machine.
This machine needs sufficient space to store the buffered changes that occur while your data is being copied. This space is proportional to the amount of new uncompressed data being written to the Timescale Cloud service during migration. A general rule of thumb is between 100GB and 500GB. The CPU specifications of this EC2 instance should match those of your Timescale Cloud instance for optimal performance. For example, if your Timescale Cloud instance has an 8-CPU configuration, then your EC2 instance should also have 8 CPUs.
Before starting live-migration, read the Frequently Asked Questions.

Migrate to Timescale Cloud

To move your data from a self-hosted database to a Timescale Cloud service:

And you are done, your data is now in your Timescale Cloud service.

Troubleshooting

This section shows you how to workaround issues frequently seen issues using Live-migration.

ERROR: relation "xxx.yy" does not exist

This may happen when a relation is removed after executing the snapshot command. A relation can be a table, index, view, or materialized view. When you see you this error:

Do not perform any explicit DDL operation on the source database during the course of migration.
If you are migrating from self-hosted TimescaleDB or MST, disable the chunk retention policy on your source database until you have finished migration.

FATAL: remaining connection slots are reserved for non-replication superuser connections

This may happen when the number of connections exhaust max_connections defined in your target Timescale Cloud service. By default, live-migration needs around ~6 connections on the source and ~12 connections on the target.

Migration seems to be stuck with “x GB copied to Target DB (Source DB is y GB)”

When you are migrating a lot of data involved in aggregation, or there are many materialized views taking time to complete the materialization, this may be due to REFRESH MATERIALIZED VIEWS happening at the end of initial data migration.

To resolve this issue:

See what is happening on the target Timescale Cloud service:


psql $TARGET -c "select * from pg_stat_activity where application_name ilike '%pgcopydb%';"

When you run the migrate, add the following flags to exclude specific materialized views being materialized:
```
--skip-table-data <matview1> <matview2>”
```
When migrate has finished, manually refresh the materialized views you excluded.

Restart migration from scratch after a non-resumable failure

If the migration halts due to a failure, such as a misconfiguration of the source or target database, you may need to restart the migration from scratch. In such cases, you can reuse the original Timescale target instance created for the migration by utilizing the --drop-if-exists flag with the migrate command.

This flag ensures that the existing target objects created by the previous migration are dropped, allowing the migration to proceed without trouble.

Note: This flag also requires you to manually recreate the TimescaleDB extension on the target.

Here’s an example command sequence to restart the migration:


psql $TARGET -c "DROP EXTENSION timescaledb CASCADE"

psql $TARGET -c 'CREATE EXTENSION timescaledb VERSION "<desired version>"'

docker run --rm -it --name live-migration-migrate \
    -e PGCOPYDB_SOURCE_PGURI=$SOURCE \
    -e PGCOPYDB_TARGET_PGURI=$TARGET \
    --pid=host \
    -v ~/live-migration:/opt/timescale/ts_cdc \
    timescale/live-migration:latest migrate --drop-if-exists

This approach provides a clean slate for the migration process while reusing the existing target instance.

Inactive or lagging replication slots

If you encounter an “Inactive or lagging replication slots” warning on your cloud provider console after using live-migration, it might be due to lingering replication slots created by the live-migration tool on your source database.

To clean up resources associated with live migration, use the following command:


docker run --rm -it --name live-migration-clean \
    -e PGCOPYDB_SOURCE_PGURI=$SOURCE \
    -e PGCOPYDB_TARGET_PGURI=$TARGET \
    --pid=host \
    -v ~/live-migration:/opt/timescale/ts_cdc \
    timescale/live-migration:latest clean --prune

The --prune flag is used to delete temporary files in the ~/live-migration directory that were needed for the migration process. It's important to note that executing the clean command means you cannot resume the interrupted live migration.

Role passwords

Because of issues dumping passwords from various managed service providers, Live-migration migrates roles without passwords. You have to migrate passwords manually.

Table privileges

Live-migration does not migrate table privileges. After completing Live-migration:

Grant all roles to tsdbadmin.


psql -d $SOURCE -t -A -c "SELECT FORMAT('GRANT %I TO tsdbadmin;', rolname) FROM 
pg_catalog.pg_roles WHERE rolname not like 'pg_%' AND rolname != 'tsdbadmin' 
AND NOT rolsuper" | psql -d $TARGET -f -

On your migration machine, edit /tmp/grants.psql to match table privileges on your source database.


pg_dump --schema-only --quote-all-identifiers 
--exclude-schema=_timescaledb_catalog --format=plain --dbname "$SOURCE" | grep 
"(ALTER.*OWNER.*|GRANT|REVOKE)"  > /tmp/grants.psql

Run grants.psql on your target Timescale Cloud service.
```
psql -d $TARGET -f /tmp/grants.psql
```

Postgres to Timescale: “live-replay not keeping up with source load”

Go to the Timescale cloud console -> Insights tab and find the query which takes significant time
If the query is either UPDATE/DELETE, make sure the columns used on the WHERE clause have necessary indexes.
If the query is either UPDATE/DELETE on the tables which are converted as hypertables, make sure the REPLIDA IDENTITY(defaults to primary key) on the source is compatible with the target primary key. If not, create an UNIQUE index source database by including the hypertable partition column and make it as a REPLICA IDENTITY. Also, create the same UNIQUE index on target.

ERROR: out of memory (or) Failed on request of size xxx in memory context "yyy" on Timescale instance

This error occurs when the Out of Memory (OOM) guard is triggered due to memory allocations exceeding safe limits. It typically happens when multiple concurrent connections to the TimescaleDB instance are performing memory-intensive operations. For example, during live migrations, this error can occur when large indexes are being created simultaneously.

The live-migration tool includes a retry mechanism to handle such errors. However, frequent OOM crashes may significantly delay the migration process.

One of the following can be used to avoid the OOM errors:

Upgrade to Higher Memory Spec Instances: To mitigate memory constraints, consider using a TimescaleDB instance with higher specifications, such as an instance with 8 CPUs and 32 GB RAM (or more). Higher memory capacity can handle larger workloads and reduce the likelihood of OOM errors.
Reduce Concurrency: If upgrading your instance is not feasible, you can reduce the concurrency of the index migration process using the --index-jobs=<value> flag in the migration command. By default, the value of --index-jobs matches the GUC max_parallel_workers. Lowering this value reduces the memory usage during migration but may increase the total migration time.

By taking these steps, you can prevent OOM errors and ensure a smoother migration experience with TimescaleDB.

Important

Related Content