CSV is a file format that is widely used for data migration.
This page shows you how to import data into your Timescale Cloud service from a CSV file.
Before you import your data:
Create a target Timescale Cloud service.
Each Timescale Cloud service has a single database that supports the most popular extensions. Timescale Cloud services do not support tablespaces, and there is no superuser associated with a Timescale service.
Install Go v1.13 or later
Install timescaledb-parallel-copy
timescaledb-parallel-copy improves performance for large datasets by parallelizing the import process. It also preserves row order and uses a round-robin approach to optimize memory management and disk operations.
To verify your installation, run
timescaledb-parallel-copy --version
Ensure that the time column in the CSV file uses the
TIMESTAMPTZ
data type.
For faster data transfer, best practice is to run the data import from a system in the same region as your target service.
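For reference, here is a minimal sketch of a CSV file whose time column PostgreSQL parses as TIMESTAMPTZ. The file name weather.csv and the column names are illustrative, not part of the procedure:

```shell
# Write a small sample CSV; the first column uses ISO 8601 timestamps
# with an explicit UTC offset, which PostgreSQL reads as TIMESTAMPTZ.
# The file name and columns are illustrative.
cat > weather.csv <<'EOF'
ts,location,temperature
2024-01-01T00:00:00+00:00,office,21.5
2024-01-01T00:10:00+00:00,office,21.7
EOF

# Show the header row
head -n 1 weather.csv
```

A time column without an offset also imports, but it is then interpreted in the session time zone; an explicit offset avoids ambiguity.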
To import data from a CSV file:
Set up your service connection string
This variable holds the connection information for the target Timescale Cloud service.
In Terminal on the source machine, set the following:
export TARGET=postgres://tsdbadmin:<PASSWORD>@<HOST>:<PORT>/tsdb?sslmode=require

You find the connection information for your Timescale Cloud service in the configuration file you downloaded when you created the service.
Create a hypertable to hold your data
Create a new empty table with a schema that is compatible with the data in your CSV file.
For example, if your CSV file contains the columns ts, location, and temperature with types TIMESTAMP, STRING, and DOUBLE:

psql $TARGET -c "CREATE TABLE <TABLE_NAME> ( \
  ts TIMESTAMPTZ NOT NULL, \
  location TEXT NOT NULL, \
  temperature DOUBLE PRECISION NULL \
);"

If you prefer using a secure UI to the command line, use Data mode in Timescale Console.
Convert the empty table to a hypertable:
In the following command, replace <TABLE_NAME> with the name of the table you just created, and <COLUMN_NAME> with the partitioning column in <TABLE_NAME>:

psql $TARGET -c "SELECT create_hypertable('<TABLE_NAME>', by_range('<COLUMN_NAME>'))"
Import your data
In the folder containing your CSV files, either:
Use timescaledb-parallel-copy:
For the best performance while avoiding resource competition, set <NUM_WORKERS> to twice the number of CPUs in your service, but less than the available CPU cores.

timescaledb-parallel-copy \
  --connection $TARGET \
  --db-name tsdb \
  --table <TABLE_NAME> \
  --file <FILE_NAME>.csv \
  --workers <NUM_WORKERS> \
  --reporting-period 30s

For self-hosted TimescaleDB, set $TARGET to host=localhost user=postgres sslmode=disable.
Use psql:
psql COPY is single-threaded, and may be slower for large datasets.

psql $TARGET
\c <DATABASE_NAME>
\COPY <TABLE_NAME> FROM <FILENAME>.csv CSV
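The worker-count guidance above can be sketched as a small shell helper. SERVICE_CPUS is an assumed placeholder for your service's CPU count, and the cap against the client's cores is an illustrative reading of that guidance, not an official formula:

```shell
# Illustrative heuristic for <NUM_WORKERS>: twice the service's CPUs,
# capped below the client's available cores. SERVICE_CPUS is an
# assumption -- replace it with your service's actual CPU count.
SERVICE_CPUS=4
CLIENT_CORES=$(nproc 2>/dev/null || sysctl -n hw.ncpu)
NUM_WORKERS=$(( SERVICE_CPUS * 2 ))
if [ "$NUM_WORKERS" -ge "$CLIENT_CORES" ]; then
  NUM_WORKERS=$(( CLIENT_CORES - 1 ))
  if [ "$NUM_WORKERS" -lt 1 ]; then NUM_WORKERS=1; fi
fi
echo "$NUM_WORKERS"
```

Pass the resulting value as --workers; watch the --reporting-period output and adjust if the import saturates either machine.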
Verify the data was imported correctly into your service
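One simple spot check is to compare the CSV's data row count with the table's row count. A sketch of the client-side half, with an illustrative weather.csv created inline so the example is self-contained; the commented psql call is the service-side comparison:

```shell
# Illustrative CSV standing in for the file you imported.
cat > weather.csv <<'EOF'
ts,location,temperature
2024-01-01T00:00:00+00:00,office,21.5
2024-01-01T00:10:00+00:00,office,21.7
EOF

# Data rows in the CSV, excluding the header.
CSV_ROWS=$(( $(wc -l < weather.csv) - 1 ))
echo "$CSV_ROWS"

# Compare with the count in your service (uncomment to run):
# psql $TARGET -t -c "SELECT count(*) FROM <TABLE_NAME>;"
```

If the two counts match, every CSV row reached the hypertable.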
And that is it, you have imported your data from a CSV file.