You use livesync to synchronize CSV and Parquet files from an S3 bucket to your Timescale Cloud service in real time. Livesync runs continuously, enabling you to leverage Timescale Cloud as your analytics database with data constantly synced from S3. This lets you take full advantage of Timescale Cloud's real-time analytics capabilities without having to develop or manage custom ETL solutions between S3 and Timescale Cloud.

You can use livesync to synchronize your existing and new data. Here's what livesync can do:
Sync data from an S3 bucket instance to a Timescale Cloud service:
- Use glob patterns to identify the objects to sync.
- Livesync uses the objects returned for subsequent queries. This efficient approach means files are synced in lexicographical order
.
- Livesync watches an S3 bucket for new files and imports them automatically. It runs on a configurable schedule and tracks processed files.
- For large backlogs, livesync checks every minute until caught up.
Sync data from multiple file formats:
- CSV: files are checked for compression in
.gz
and.zip
format, then processed using timescaledb-parallel-copy.
- Parquet: files are converted to CSV, then processed using timescaledb-parallel-copy
.
- CSV: files are checked for compression in
Livesync offers an option to enable a hypertable during the file-to-table schema mapping setup. You can enable columnstore and continuous aggregates through the SQL editor once livesync has started.
Livesync offers a default 1-minute polling interval. This means that Timescale Cloud checks the S3 source every minute for new data. You can customize this interval by setting up a cron expression.
Livesync for S3 continuously imports data from an Amazon S3 bucket into your database. It monitors your S3 bucket for new files matching a specified pattern and automatically imports them into your designated database table.
Note: livesync for S3 currently only syncs existing and new files—it does not support updating or deleting records based on updates and deletes from S3 to tables in a Timescale Cloud service.
Early access: livesync is not supported for production use. If you have any questions or feedback, talk to us in #livesync in Timescale CommunityTo follow the steps on this page:
Create a target Timescale Cloud service with time-series and analytics enabled.
You need your connection details.
Ensure access to a standard Amazon S3 bucket containing your data files.
Directory buckets are not supported.
Configure access credentials for the S3 bucket.
The following credentials are supported:
Configure the trust policy. Set the:
Principal
:arn:aws:iam::142548018081:role/timescale-s3-connections
.ExternalID
: set to the Timescale Cloud project and Timescale Cloud service ID of the service you are syncing to in the format<projectId>/<serviceId>
.This is to avoid the confused deputy problem
.
Give the following access permissions:
s3:GetObject
.s3:ListBucket
.
CSV:
Maximum file size: 1 GB
To increase this limit, contact sales@timescale.com
Maximum row size: 2 MB
Supported compressed formats:
.gz
.zip
Advanced settings:
- Delimiter: the default character is
,
, you can choose a different delimiter - Skip header: skip the first row if your file has headers
- Delimiter: the default character is
Parquet:
- Maximum file size: 1 GB
- Maximum row group uncompressed size: 200 MB
- Maximum row size: 2 MB
Sync iteration:
To prevent system overload, livesync tracks up to 100 files for each sync iteration. Additional checks only fill empty queue slots.
To sync data from your S3 bucket to your Timescale Cloud service using Timescale Console:
Connect to your Timescale Cloud service
In Timescale Console
, select the service to sync live data to.
Start livesync
- Click
Actions
>Livesync for S3
. - Click
New livesync for S3
.
- Click
Connect the source S3 bucket to the target service
In
Livesync for S3
, set theBucket name
andAuthentication method
, then clickContinue
.For instruction on creating the IAM role you need to connect your S3 bucket, click
Learn how
:Timescale Console connects to the source bucket.In
Define files to sync
, choose theFile type
and set theGlob pattern
.Use the following patterns:
<folder name>/*
: match all files in a folder. Also, any pattern ending with/
is treated as/*
.<folder name>/**
: match all recursively.<folder name>/**/*.csv
: match a specific file type.
Livesync uses prefix filters where possible, place patterns carefully at the end of your glob expression. AWS S3 doesn't support complex filtering. If your expression filters too many files, the list operation may timeout.
Click the search icon, you see files to sync. Click
Continue
.
Optimize the data to synchronize in hypertables
Timescale Console checks the file schema and, if possible, suggests the column to use as the time dimension in a hypertable.
Choose the
Data type
for each column, then clickContinue
.Choose the interval. This can be a minute, an hour, or use a cron expression
.
Repeat this step for each table you want to sync.
Click
Start Livesync
.Timescale Console starts livesync between the source database and the target service and displays the progress.
Monitor syncronization
To view the progress of the livesync, click the name of the livesync process.
You see the status of the file being synced. Only one file runs at a time.
To pause and restart livesync, click the buttons on the right of the livesync process and select an action.
During pauses, you can edit the configuration before resuming.
And that is it, you are using livesync to synchronize all the data, or specific files, from an S3 bucket to your Timescale Cloud service in real time.
Keywords
Found an issue on this page?Report an issue or Edit this page
in GitHub.