Live migration tooling is currently experimental. You may run into the following shortcomings:
- Live migration does not yet support mutable compression (
INSERT
,UPDATE
,DELETE
on compressed data). - By default, numeric fields containing
NaN
/+Inf
/-Inf
values are not correctly replicated, and will be converted toNULL
. A workaround is available, but is not enabled by default.
Should you run into any problems, please open a support request before losing any time debugging issues.
You can open a support request directly from the Timescale console, or by email to support@timescale.com.
Live migration involves several background processes to manage different stages of
the migration. The logs of these processes can be helpful for troubleshooting
unexpected behavior. You can find these logs in the <volume_mount>/logs
directory.
When you migrate a self-hosted or Managed Service for TimescaleDB (MST) database to Timescale, the source database and the destination Timescale Service must run the same version of TimescaleDB.
Before you start live migration:
Check the version of TimescaleDB running on the source database and the target Timescale Service:
select extversion from pg_extension where extname = 'timescaledb';If the version of TimescaleDB on the source database is lower than your Timescale Service, either:
Downgrade: reinstall an older version of TimescaleDB on your Timescale Service that matches the source database:
Connect to your Timescale Service and check the versions of TimescaleDB available:
SELECT version FROM pg_available_extension_versions WHERE name = 'timescaledb' ORDER BY 1 DESC;If an available TimescaleDB release matches your source database:
Uninstall TimescaleDB from your Timescale Service:
DROP EXTENSION timescaledb;Reinstall the correct version of TimescaleDB:
CREATE EXTENSION timescaledb VERSION '<version>';
Note
You may need to reconnect to your Timescale Service using
psql -X
when you're creating the TimescaleDB extension.
Upgrade: for self-hosted databases, upgrade TimescaleDB to match your Timescale Service.
Live migration logs a warning WARNING: no tuple identifier for UPDATE in table
when it cannot determine which specific rows should be updated after receiving an
UPDATE
statement from the source database during replication. This occurs when tables
in the source database that receive UPDATE
statements lack either a PRIMARY KEY
or
a REPLICA IDENTITY
setting. For live migration to successfully replicate UPDATE
and
DELETE
statements, tables must have either a PRIMARY KEY
or REPLICA IDENTITY
set
as a prerequisite.
If your PostgreSQL tables use native partitioning, setting REPLICA IDENTITY
on the
root (parent) table will not automatically apply it to the partitioned child tables.
You must manually set REPLICA IDENTITY
on each partitioned child table.
Live migration does not support replication from read or failover replicas. You must provide a connection string that points directly to your source database for live migration.
Live migration does not support connection poolers. You must provide a connection string that points directly to your source and target databases for live migration to work smoothly.
No, Timescale Cloud cannot be used as a source database for live migration.
At present, live migration does not allow for excluding schemas or tables from
replication, but this feature is expected to be added in future releases.
However, a workaround is available for skipping table data using the --skip-table-data
flag.
For more information, please refer to the help text under the migrate
subcommand.
Timescale's platform automatically manages the underlying disk volume. Due to
platform limitations, it is only possible to resize the disk once every six
hours. Depending on the rate at which you're able to copy data, you may be
affected by this restriction. Affected instances are unable to accept new data
and error with: FATAL: terminating connection due to administrator command
.
If you intend on migrating more than 400 GB of data to Timescale, open a support request requesting the required storage to be pre-allocated in your Timescale instance.
You can open a support request directly from the Timescale console, or by email to support@timescale.com.
When pg_dump
starts, it takes an ACCESS SHARE
lock on all tables which it
dumps. This ensures that tables aren't dropped before pg_dump
is able to drop
them. A side effect of this is that any query which tries to take an
ACCESS EXCLUSIVE
lock on a table is be blocked by the ACCESS SHARE
lock.
A number of timescale-internal processes require taking ACCESS EXCLUSIVE
locks to ensure consistency of the data. The following is a non-exhaustive list
of potentially affected operations:
- compress/decompress/recompress chunk
- continuous aggregate refresh (before 2.12)
- create hypertable with foreign keys, truncate hypertable
- enable compression on hypertable
- drop chunks
The most likely impact of the above is that background jobs for retention
policies, compression policies, and continuous aggregate refresh policies are
blocked for the duration of the pg_dump
command. This may have unintended
consequences for your database performance.
When using the pg_dump
directory format, it is possible to use concurrency to
use multiple connections to the source database to dump data. This speeds up
the dump process. Due to the fact that there are multiple connections, it is
possible for pg_dump
to end up in a deadlock situation. When it detects a
deadlock it aborts the dump.
In principle, any query which takes an ACCESS EXCLUSIVE
lock on a table
causes such a deadlock. As mentioned above, some common operations which take
an ACCESS EXCLUSIVE
lock are:
- retention policies
- compression policies
- continuous aggregate refresh policies
If you would like to use concurrency nonetheless, turn off all background jobs
in the source database before running pg_dump
, and turn them on once the dump
is complete. If the dump procedure takes longer than the continuous aggregate
refresh policy's window, you must manually refresh the continuous aggregate in
the correct time range. For more information, consult the
refresh policies documentation.
To turn off the jobs:
SELECT public.alter_job(id::integer, scheduled=>false)FROM _timescaledb_config.bgw_jobWHERE id >= 1000;
To turn on the jobs:
SELECT public.alter_job(id::integer, scheduled=>true)FROM _timescaledb_config.bgw_jobWHERE id >= 1000;
If the directory format is used for pg_dump
and pg_restore
, concurrency can be
employed to speed up the process. Unfortunately, loading the tables in the
timescaledb_catalog
schema concurrently causes errors. Furthermore, the
tsdbadmin
user does not have sufficient privileges to turn off triggers in
this schema. To get around this limitation, load this schema serially, and then
load the rest of the database concurrently.
# first, serially load JUST the _timescaledb_catalogpg_restore -d "$TARGET" \--format=directory \--schema='_timescaledb_catalog' \--exit-on-error \--no-tablespaces \--no-owner \--no-privileges \dump# next, concurrently load everything EXCEPT the _timescaledb_catalogpg_restore -d "$TARGET" \--format=directory \--jobs=8 \--exclude-schema='_timescaledb_catalog' \--exit-on-error \--disable-triggers \--no-tablespaces \--no-owner \--no-privileges \dump
The _timescaledb_config.bgw_jobs
table is used to manage background jobs.
This includes both user-defined actions, compression policies, retention
policies, and continuous aggregate refresh policies. On Timescale, this table
has a trigger which ensures that no database user can create or modify jobs
owned by another database user. This trigger can provide an obstacle for migrations.
If the --no-owner
flag is used with pg_dump
and pg_restore
, all
objects in the target database are owned by the user that ran
pg_restore
, likely tsdbadmin
.
If all the background jobs in the source database were owned by a user of the
same name as the user running the restore (again likely tsdbadmin
), then
loading the _timescaledb_config.bgw_jobs
table should work.
If the background jobs in the source were owned by the postgres
user, they
are be automatically changed to be owned by the tsdbadmin
user. In this case,
one just needs to verify that the jobs do not make use of privileges that the
tsdbadmin
user does not possess.
If background jobs are owned by one or more users other than the user
employed in restoring, then there could be issues. To work around this
issue, do not dump this table with pg_dump
. Provide either
--exclude-table-data='_timescaledb_config.bgw_job'
or
--exclude-table='_timescaledb_config.bgw_job'
to pg_dump
to skip
this table. Then, use psql
and the COPY
command to dump and
restore this table with modified values for the owner
column.
# dump the _timescaledb_config.bgw_job table to a csv file replacing the owner# values with tsdbadminpsql -d "$SOURCE" -X -v ON_ERROR_STOP=1 --echo-errors -f - <<'EOF'begin;select string_agg( case attnamewhen 'owner' then $$'tsdbadmin' as "owner"$$else format('%I', attname)end, ', ') as colsfrom pg_namespace ninner join pg_class con (n.nspname = '_timescaledb_config'and n.oid = c.relnamespaceand c.relname = 'bgw_job')inner join pg_attribute aon (c.oid = a.attrelid and a.attnum > 0)\gsetcopy(select :colsfrom _timescaledb_config.bgw_jobwhere id >= 1000) to stdout with (format csv, header true)\g bgw_job.csvrollback;\qEOF# copy the csv file into the target's _timescaledb_config.bgw_jobpsql -X -d "$TARGET" -v ON_ERROR_STOP=1 --echo-errors \-c "\copy _timescaledb_config.bgw_job from 'bgw_job.csv' with (format csv, header match)"
Once the table has been loaded and the restore completed, you may then use SQL to adjust the ownership of the jobs and/or the associated stored procedures and functions as you wish.
There are a vast number of PostgreSQL extensions available in the wild. Timescale supports many of the most popular extensions, but not all extensions. Before migrating, check that the extensions you are using are supported on Timescale. Consult the list of supported extensions.
When self-hosting, the timescaledb extension may be installed in an arbitrary
schema. Timescale only supports installing the timescaledb extension in the
public
schema. How to go about resolving this depends heavily on the
particular details of the source schema and the migration approach chosen.
Timescale does not support using custom tablespaces. Providing the
--no-tablespaces
flag to pg_dump
and pg_restore
when
dumping/restoring the schema results in all objects being in the
default tablespace as desired.
While PostgreSQL clusters can contain many databases, Timescale instances are limited to a single database. When migrating a cluster with multiple databases to Timescale, one can either migrate each source database to a separate Timescale instance or "merge" source databases to target schemas.
The tsdbadmin
database user is the most powerful available on Timescale, but it
is not a true superuser. Review your application for use of superuser privileged
operations and mitigate before migrating.
In order to improve the performance and compatibility of continuous aggregates, TimescaleDB v2.7 replaces partial continuous aggregates with finalized continuous aggregates.
To test your database for partial continuous aggregates, run the following query:
SELECT exists (SELECT 1 FROM timescaledb_information.continuous_aggregates WHERE NOT finalized);
If you have partial continuous aggregates in your database, migrate them from partial to finalized before you migrate your database.
If you accidentally migrate partial continuous aggregates across PostgreSQL versions, you see the following error when you query any continuous aggregates:
ERROR: insufficient data left in message.
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.