You use livesync to synchronize tabular data from an S3 bucket to your Timescale Cloud service in real time. You run livesync continuously, turning S3 into a primary database with your Timescale Cloud service as a logical replica. This enables you to leverage Timescale Cloud’s real-time analytics capabilities on your replica data.

You use livesync for data synchronization, rather than migration. Livesync can:

  • Sync data from an S3 bucket to a Timescale Cloud service:

    • livesync uses Glob patterns to identify the objects to sync.
    • livesync uses the objects returned for subsequent queries. This efficient approach means files are synced in lexicographical order.
    • livesync watches an S3 bucket for new files and imports them automatically. livesync runs on a configurable schedule and tracks processed files.
    • For large backlogs, livesync checks every minute until caught up.
  • Sync data from multiple file formats: CSV and Parquet.

  • Enable features such as hypertables, columnstore, and continuous aggregates on your logical replica.
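Because files are synced in lexicographical order, object names that embed counters or dates only sync in chronological order when the names themselves sort that way, for example with zero-padded numbers. The following Python sketch illustrates this with plain string sorting. It is not livesync's implementation: the function `keys_after`, the `last_synced` cursor, and the sample keys are hypothetical names used for illustration only.

```python
from bisect import bisect_right


def keys_after(keys, last_synced=None, limit=100):
    """Return up to `limit` keys that sort after `last_synced`.

    Hypothetical helper: it mimics resuming a listing from the last key
    already handled, using lexicographical (code point) order.
    """
    ordered = sorted(keys)
    # Start from the beginning, or just past the last key already synced.
    start = 0 if last_synced is None else bisect_right(ordered, last_synced)
    return ordered[start:start + limit]


# Sample keys (made up). Note that "file-10" sorts before "file-2".
keys = ["data/file-10.csv", "data/file-2.csv", "data/file-1.csv"]

first = keys_after(keys, limit=2)
rest = keys_after(keys, last_synced=first[-1], limit=2)

print(first)  # ['data/file-1.csv', 'data/file-10.csv']
print(rest)   # ['data/file-2.csv']
```

If upload order matters for your data, a naming scheme such as `file-0001.csv` or an ISO 8601 timestamp in the key keeps lexicographical and chronological order aligned.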

livesync for S3 continuously imports data from an Amazon S3 bucket into your database. It monitors your S3 bucket for new files matching a specified pattern and automatically imports them into your designated database table.

Early access: livesync is not supported for production use. If you have any questions or feedback, talk to us in #livesync in Timescale Community.

To follow the steps on this page, you need:

  • Access to a standard Amazon S3 bucket containing your data files. Directory buckets are not supported.

  • Access credentials for the S3 bucket.

livesync for S3 has the following limits:

  • CSV:
    • Maximum file size: 1GB. To increase these limits, contact sales@timescale.com.
    • Maximum row size: 2MB
    • Supported compressed formats:
      • .gz
      • .zip
    • Advanced settings:
      • Delimiter: the default character is a comma (,). You can choose a different delimiter.
      • Skip Header: skip the first row if your file has headers
  • Parquet:
    • Maximum file size: 1GB
    • Maximum row group uncompressed size: 200MB
    • Maximum row size: 2MB
  • Sync iteration: To prevent system overload, livesync tracks up to 100 files for each sync iteration. Additional checks only fill empty queue slots.
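Before uploading an uncompressed CSV file to the bucket, you may want to check it against the file size and row size limits above. The following Python sketch shows one possible pre-flight check. It rests on several assumptions: "1GB" and "2MB" are read as binary units, each physical line is treated as one row (which is not true for quoted fields that contain newlines), and the helper name `check_csv` is hypothetical. The page does not say whether the size limit applies to compressed or uncompressed data, so this sketch only handles uncompressed files.

```python
import os
import tempfile

MAX_FILE_BYTES = 1 * 1024**3  # assumption: "1GB" read as 2**30 bytes
MAX_ROW_BYTES = 2 * 1024**2   # assumption: "2MB" read as 2 * 2**20 bytes


def check_csv(path, max_file_bytes=MAX_FILE_BYTES, max_row_bytes=MAX_ROW_BYTES):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    size = os.path.getsize(path)
    if size > max_file_bytes:
        problems.append(f"file is {size} bytes, limit is {max_file_bytes}")
    # Read in binary mode so that lengths are measured in bytes, not characters.
    with open(path, "rb") as f:
        for line_no, line in enumerate(f, start=1):
            if len(line) > max_row_bytes:
                problems.append(
                    f"line {line_no} is {len(line)} bytes, limit is {max_row_bytes}"
                )
    return problems


# Usage with a small throwaway file (contents are made up).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"time,value\n2025-01-01T00:00:00Z,1.5\n")
sample_path = tmp.name

ok = check_csv(sample_path)                         # within the default limits
too_strict = check_csv(sample_path, max_row_bytes=5)  # artificially low limit
os.remove(sample_path)

print(ok)
print(too_strict)
```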

To sync data from your S3 bucket to your Timescale Cloud service using Timescale Console:

  1. Connect to your Timescale Cloud service

    In Timescale Console, select the service to sync live data to.

  2. Start livesync

    1. Click Actions > livesync for S3.
    2. Click New Livesync for S3.
  3. Connect the source S3 bucket to the target service

    1. In Livesync for S3, set the Bucket name and Authentication method, then click Continue.

      For instructions on creating the IAM role you need to connect your S3 bucket, click Learn how.

      Timescale Console connects to the source bucket.

    2. In Define files to sync, choose the File type and set the Glob pattern.

      Use the following patterns:

      • <folder name>/*: match all files in a folder. Also, any pattern ending with / is treated as /*.
      • <folder name>/**: match all recursively.
      • <folder name>/**/*.csv: match a specific file type.

      livesync uses prefix filters where possible, so place wildcard patterns as close to the end of your glob expression as you can. AWS S3 doesn't support complex filtering. If your expression has to filter too many files, the list operation may time out.

    3. Click the search icon. You see the files to sync. Click Continue.

  4. Optimize the data to synchronize in hypertables

    Timescale Console checks the file schema and, if possible, suggests the column to use as the time dimension in a hypertable.

    1. Choose the Data type for each column, then click Continue.

    2. Choose the interval. This can be a minute, an hour, or a custom schedule that you define with a cron expression.

    3. Repeat this step for each table you want to sync.

    4. Press Start Livesync.

      Timescale Console starts livesync between the source bucket and the target service and displays the progress.

  5. Monitor synchronization

    1. To view the progress of the livesync, click the name of the livesync process. You see the status of the file being synced. Only one file is synced at a time.
    2. To pause and restart livesync, click the buttons on the right of the livesync process and select an action. During pauses, you can edit the configuration before resuming.

That's it. You are using livesync to synchronize all the data, or specific files, from an S3 bucket to your Timescale Cloud service in real time.
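As an optional aid when choosing the Glob pattern in step 3, the following Python sketch shows why wildcard placement matters for prefix filtering. S3 list operations accept a literal key prefix, so only the part of a pattern before its first wildcard can narrow the listing. This is an illustration, not livesync's implementation: the helper names `normalize` and `literal_prefix`, the sample patterns, and the choice to treat `?` and `[` as wildcards alongside `*` are all assumptions.

```python
import re

# Assumption: "*", "?" and "[" start a wildcard. Only "*" appears on this page.
_WILDCARD = re.compile(r"[*?\[]")


def normalize(pattern):
    """Apply the rule from step 3: a pattern ending with "/" is treated as "/*"."""
    return pattern + "*" if pattern.endswith("/") else pattern


def literal_prefix(pattern):
    """Return the part of the pattern before its first wildcard character."""
    pattern = normalize(pattern)
    match = _WILDCARD.search(pattern)
    return pattern if match is None else pattern[:match.start()]


# Sample patterns (made up).
examples = ["metrics/2025/**/*.csv", "metrics/", "**/2025/*.csv"]
prefixes = {p: literal_prefix(p) for p in examples}

for pattern, prefix in prefixes.items():
    print(f"{pattern!r} -> prefix {prefix!r}")
```

In this sketch, `metrics/2025/**/*.csv` yields the prefix `metrics/2025/`, while `**/2025/*.csv` yields an empty prefix, which would mean listing the whole bucket before filtering.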
