You use the Livesync Docker image to synchronize all data, or specific tables, from a PostgreSQL database instance to a Timescale Cloud service in real time. You run Livesync continuously, turning PostgreSQL into a primary database with a Timescale Cloud service as a logical replica. This enables you to leverage Timescale Cloud's real-time analytics capabilities on your replica data.
Warning
You use Livesync for data synchronization, rather than migration. Livesync is in alpha and is not recommended for production use.
Livesync leverages the PostgreSQL logical replication protocol, a well-established and widely understood feature in the PostgreSQL ecosystem. By relying on this protocol, Livesync ensures compatibility, familiarity, and a broader knowledge base, making it easier for you to adopt and integrate.
You use Livesync to:
- Copy existing data from a PostgreSQL instance to a Timescale Cloud service:
  - Copy data at up to 150 GB/hr. You need at least a 4 CPU/16GB source database, and a 4 CPU/16GB target service.
  - Copy the publication tables in parallel. However, large tables are still copied using a single connection. Parallel copying is in the backlog.
  - Forget foreign key relationships. Livesync disables foreign key validation during the sync. For example, if a `metrics` table refers to the `id` column on the `tags` table, you can still sync only the `metrics` table without worrying about their foreign key relationships.
  - Track progress. PostgreSQL exposes `COPY` progress in `pg_stat_progress_copy`. See the query sketch after this list.
- Synchronize real-time changes from a PostgreSQL instance to a Timescale Cloud service.
- Add and remove tables on demand using the PostgreSQL PUBLICATION interface.
- Enable features such as hypertables, columnstore, and continuous aggregates on your logical replica.
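For example, one way to watch the initial table copy on the source while the sync runs. This is a convenience query against the standard `pg_stat_progress_copy` view (PostgreSQL 14 and later); `$SOURCE` is the source connection string defined later on this page:

```bash
# Show per-table COPY progress on the source during the initial sync.
psql $SOURCE -c "SELECT relid::regclass AS table_name, bytes_processed, tuples_processed FROM pg_stat_progress_copy;"
```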
If you have any questions or feedback, talk to us in #livesync in Timescale Community.
Best practice is to use an Ubuntu EC2 instance hosted in the same region as your Timescale Cloud service as the sync machine: that is, the machine you run the commands on to move your data from your source database to your target Timescale Cloud service.
Before you move your data:
Create a target Timescale Cloud service.
Each Timescale Cloud service has a single database that supports the most popular extensions. Timescale Cloud services do not support tablespaces, and there is no superuser associated with a service. Best practice is to create a Timescale Cloud service with at least 8 CPUs for a smoother experience. A higher-spec instance can significantly reduce the overall migration window.
To ensure that maintenance does not run while migration is in progress, best practice is to adjust the maintenance window.
Install Docker on your sync machine. You need at least a 4 CPU/16GB EC2 instance to run Livesync.
Install the PostgreSQL client tools on your sync machine.
This includes `psql`, `pg_dump`, and `pg_dumpall`.
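A quick sanity check, assuming the client tools are on your PATH:

```bash
# Confirm the PostgreSQL client tools are installed on the sync machine.
psql --version
pg_dump --version
pg_dumpall --version
```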
- Livesync does not migrate the schema. You use `pg_dump`/`psql` to migrate the schema, as shown later on this page.
- Schema changes must be coordinated. Make compatible changes to the schema in your Timescale Cloud service first, then make the same changes to the source PostgreSQL instance. See the sketch after this list.
- There is WAL volume growth on the source PostgreSQL instance during large table copy.
- Only PostgreSQL databases are supported as a source. TimescaleDB is not yet supported as a source.
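For example, a minimal sketch of a coordinated schema change, assuming a hypothetical `metrics` table and a new nullable column, and using the `$SOURCE` and `$TARGET` connection strings defined later on this page:

```bash
# 1. Make the compatible change on the target Timescale Cloud service first.
psql $TARGET -c "ALTER TABLE public.metrics ADD COLUMN note TEXT;"

# 2. Then make the same change on the source PostgreSQL instance.
psql $SOURCE -c "ALTER TABLE public.metrics ADD COLUMN note TEXT;"
```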
The `<user>` in the `SOURCE` connection must have the replication role granted in order to create a replication slot.
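For example, a sketch of granting the replication attribute on a self-hosted source. Run it as a superuser; managed platforms often use their own replication grants instead:

```bash
# Run as a PostgreSQL superuser on the source database.
# Replace <user> with the role from your SOURCE connection string.
psql "postgres://<superuser>:<password>@<source host>:<source port>/<db_name>" \
  -c "ALTER ROLE <user> WITH REPLICATION;"
```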
These variables hold the connection information for the source database and target Timescale Cloud service. In Terminal on your migration machine, set the following:
```bash
export SOURCE="postgres://<user>:<password>@<source host>:<source port>/<db_name>"
export TARGET="postgres://tsdbadmin:<PASSWORD>@<HOST>:<PORT>/tsdb?sslmode=require"
```
You find the connection information for your Timescale Cloud service in the configuration file you downloaded when you created the service.
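A quick way to confirm that both connection strings work before you continue:

```bash
# Each command should return a single row if the connection string is valid.
psql $SOURCE -c "SELECT 1;"
psql $TARGET -c "SELECT 1;"
```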
Important
Avoid connection strings that route through connection poolers such as PgBouncer or similar tools. Livesync requires a direct connection to the database to function properly.
You need to tune the Write Ahead Log (WAL) on the PostgreSQL source database:
- Set the PostgreSQL GUC `wal_level` to `logical`.
- Set the PostgreSQL GUC `max_wal_senders` to `10`.
To do this:
```bash
psql $SOURCE -c "ALTER SYSTEM SET wal_level = logical;"
psql $SOURCE -c "ALTER SYSTEM SET max_wal_senders = 10;"
```

Both settings take effect only after you restart the source PostgreSQL instance. On managed platforms, change these parameters through the provider's configuration instead.
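After the restart, you can verify the settings:

```bash
psql $SOURCE -c "SHOW wal_level;"
psql $SOURCE -c "SHOW max_wal_senders;"
```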
Replica identity assists data replication by identifying the rows being modified. By default, each table and hypertable in the source database uses the table's primary key as its replica identity. However, you can also have:
- A viable unique index: each table has a unique, non-partial, non-deferrable index that includes only columns marked as `NOT NULL`. If a `UNIQUE` index does not exist, create one to assist the migration; you can delete it after live sync. See the sketch after this list. For each table, set `REPLICA IDENTITY` to the viable unique index:

  ```bash
  psql -X -d $SOURCE -c 'ALTER TABLE <table name> REPLICA IDENTITY USING INDEX <index name>'
  ```
- No primary key or viable unique index: use brute force. For each table, set `REPLICA IDENTITY` to `FULL`:

  ```bash
  psql -X -d $SOURCE -c 'ALTER TABLE <table name> REPLICA IDENTITY FULL'
  ```

  For each `UPDATE` or `DELETE` statement, PostgreSQL reads the whole table to find all matching rows. This results in significantly slower replication. If you are expecting a large number of `UPDATE` or `DELETE` operations on the table, best practice is to not use `FULL`.
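For example, a minimal sketch of creating a unique index to use as the replica identity, assuming a hypothetical `metrics` table with `NOT NULL` columns `time` and `device_id`:

```bash
# Create a unique index on NOT NULL columns, then use it as the replica identity.
psql -X -d $SOURCE -c 'CREATE UNIQUE INDEX metrics_time_device_idx ON public.metrics (time, device_id)'
psql -X -d $SOURCE -c 'ALTER TABLE public.metrics REPLICA IDENTITY USING INDEX metrics_time_device_idx'
```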
To capture only `INSERT`s and ignore `UPDATE`s and `DELETE`s, use a `publish` config while creating the publication.
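For example, a sketch of a publication that replicates only inserts. The publication and table names are illustrative; publication setup is covered in more detail later on this page:

```bash
psql $SOURCE -c "CREATE PUBLICATION analytics FOR TABLE metrics WITH (publish = 'insert');"
```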
Use `pg_dump` to:

1. Download the schema from the source database:

   ```bash
   pg_dump $SOURCE \
     --no-privileges \
     --no-owner \
     --no-publications \
     --no-subscriptions \
     --no-table-access-method \
     --no-tablespaces \
     --schema-only \
     --file=schema.sql
   ```

1. Apply the schema on the target service:

   ```bash
   psql $TARGET -f schema.sql
   ```
For efficient querying and analysis, you can convert tables that contain time-series or event data, as well as tables that are already partitioned using PostgreSQL declarative partitioning, into hypertables.
Convert tables to hypertables
Run the following on each table in the target Timescale Cloud service to convert it to a hypertable:
```bash
psql -X -d $TARGET -c "SELECT create_hypertable('<table>', by_range('<partition column>', '<chunk interval>'::interval));"
```

For example, to convert the `metrics` table into a hypertable with `time` as the partition column and 1 day as the partition interval:

```bash
psql -X -d $TARGET -c "SELECT create_hypertable('public.metrics', by_range('time', '1 day'::interval));"
```

Convert PostgreSQL partitions to hypertables
Rename the partitioned table, create a new regular table with the original name, then convert it to a hypertable:
```bash
psql $TARGET -f - <<EOF
BEGIN;
ALTER TABLE public.events RENAME TO events_part;
CREATE TABLE public.events(LIKE public.events_part INCLUDING ALL);
SELECT create_hypertable('public.events', by_range('time', '1 day'::interval));
COMMIT;
EOF
```
You use the Livesync Docker image to synchronize changes in real time from a PostgreSQL database instance to a Timescale Cloud service:
Start Livesync
As you run Livesync continuously, best practice is to run it as a background process.
```bash
docker run -d --rm --name livesync timescale/live-sync:v0.0.0-alpha.1-amd64 start \
  --publication analytics --subscription livesync --source $SOURCE --target $TARGET
```

Trace progress
Once Livesync is running as a background Docker process, you can capture its logs:
```bash
docker logs -f livesync
```

View the tables being synchronized
```bash
psql $TARGET -c "SELECT * FROM _ts_live_sync.subscription_rel"

 subname  | schemaname | tablename | rrelid | state | lsn
----------+------------+-----------+--------+-------+-----
 livesync | public     | metrics   |  17261 | d     |
```

Possible values for `state` are:

- `d`: initial table data sync
- `f`: initial table data sync completed
- `s`: catching up with the latest change
- `r`: table is ready, syncing live changes
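For example, to list only the tables that have not yet reached the ready state, you can filter the same view:

```bash
psql $TARGET -c "SELECT schemaname, tablename, state FROM _ts_live_sync.subscription_rel WHERE state <> 'r'"
```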
Stop Livesync
```bash
docker stop livesync
```

Cleanup
You need to manually execute a SQL snippet to clean up the replication slots created by Livesync.
```bash
psql $SOURCE -f - <<EOF
SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots WHERE slot_name LIKE 'livesync%';
SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots WHERE slot_name LIKE 'ts%';
EOF
```

A command to clean up is coming shortly.
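To confirm that no Livesync replication slots remain on the source:

```bash
psql $SOURCE -c "SELECT slot_name FROM pg_replication_slots WHERE slot_name LIKE 'livesync%' OR slot_name LIKE 'ts%';"
```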
After the Livesync Docker container is up and running, you `CREATE PUBLICATION` on the SOURCE database to specify the list of tables you intend to synchronize. Once you create a `PUBLICATION`, it is automatically picked up by Livesync, which starts syncing the tables included in it.
For example:
- Create a publication named `analytics` which publishes the `metrics` and `tags` tables.

  `PUBLICATION` enables you to add all the tables in the schema, or even all the tables in the database. However, this requires superuser privileges on most managed PostgreSQL offerings. See the sketch after this list.

  ```sql
  CREATE PUBLICATION analytics FOR TABLE metrics, tags;
  ```

- Add tables to an existing publication with a call to `ALTER PUBLICATION`:

  ```sql
  ALTER PUBLICATION analytics ADD TABLE events;
  ```

- Publish a PostgreSQL declaratively partitioned table.

  To publish declaratively partitioned table changes to your Timescale Cloud service, set the special `publish_via_partition_root` `PUBLICATION` config to `true`:

  ```sql
  ALTER PUBLICATION analytics SET (publish_via_partition_root = true);
  ```

- Stop syncing a table in the `PUBLICATION` with a call to `DROP TABLE`:

  ```sql
  ALTER PUBLICATION analytics DROP TABLE tags;
  ```
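For example, a sketch of publishing a whole schema or every table in the database. Both require elevated privileges, the publication name is illustrative, and `FOR TABLES IN SCHEMA` requires PostgreSQL 15 or later:

```bash
# Publish every table in the public schema (PostgreSQL 15+).
psql $SOURCE -c "CREATE PUBLICATION analytics FOR TABLES IN SCHEMA public;"

# Or publish every table in the database (requires superuser).
psql $SOURCE -c "CREATE PUBLICATION analytics FOR ALL TABLES;"
```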