This tutorial uses a dataset that contains historical data from New York's
yellow taxi network, in a hypertable named rides
. It also includes a separate
tables of payment types and rates, in a regular PostgreSQL table named
payment_types
, and rates
.
A service in Timescale is a cloud instance which contains your database.
Each service contains a single database, named tsdb
.
You can connect to a service from your local system using the psql
command-line utility. If you've used PostgreSQL before, you might already have
psql
installed. If not, check out the installing psql section.
In the Timescale portal, click
Create service
.Click
Download the cheatsheet
to download an SQL file that contains the login details for your new service. You can also copy the details directly from this page. When you have copied your password, clickI stored my password, go to service overview
at the bottom of the page.When your service is ready to use, is shows a green
Running
label in theService Overview
. You also receive an email confirming that your service is ready to use.On your local system, at the command prompt, connect to the service using the
Service URL
from the SQL file that you downloaded. When you are prompted, enter the password:psql -x "<SERVICE_URL>"Password for user tsdbadmin:If your connection is successful, you'll see a message like this, followed by the
psql
prompt:psql (13.3, server 12.8 (Ubuntu 12.8-1.pgdg21.04+1))SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)Type "help" for help.tsdb=>
This tutorial uses historical data from New York's yellow taxi network, provided by the New York City Taxi and Limousine Commission NYC TLC.
Hypertables are the core of Timescale. Hypertables enable Timescale to work efficiently with time-series data. Because Timescale is PostgreSQL, all the standard PostgreSQL tables, indexes, stored procedures and other objects can be created alongside your Timescale hypertables. This makes creating and working with Timescale tables similar to standard PostgreSQL.
Create a standard PostgreSQL table to store the taxi trip data using
CREATE TABLE
:CREATE TABLE "rides"(vendor_id TEXT,pickup_datetime TIMESTAMP WITHOUT TIME ZONE NOT NULL,dropoff_datetime TIMESTAMP WITHOUT TIME ZONE NOT NULL,passenger_count NUMERIC,trip_distance NUMERIC,pickup_longitude NUMERIC,pickup_latitude NUMERIC,rate_code INTEGER,dropoff_longitude NUMERIC,dropoff_latitude NUMERIC,payment_type INTEGER,fare_amount NUMERIC,extra NUMERIC,mta_tax NUMERIC,tip_amount NUMERIC,tolls_amount NUMERIC,improvement_surcharge NUMERIC,total_amount NUMERIC);Convert the standard table into a hypertable partitioned on the
time
column using thecreate_hypertable()
function provided by Timescale. You must provide the name of the table and the column in that table that holds the timestamp data to use for partitioning:SELECT create_hypertable('rides', by_range('pickup_datetime'), create_default_indexes=>FALSE);SELECT add_dimension('rides', by_hash('payment_type', 2));Note
The
by_range
andby_hash
dimension builder is an addition to TimescaleDB 2.13.Create an index to support efficient queries by vendor, rate code, and passenger count:
CREATE INDEX ON rides (vendor_id, pickup_datetime DESC);CREATE INDEX ON rides (rate_code, pickup_datetime DESC);CREATE INDEX ON rides (passenger_count, pickup_datetime DESC);
When you have other relational data that enhances your time-series data, you can
create standard PostgreSQL tables just as you would normally. For this dataset,
there are two other tables of data, called payment_types
and rates
.
Add a table to store the payment types data:
CREATE TABLE IF NOT EXISTS "payment_types"(payment_type INTEGER,description TEXT);INSERT INTO payment_types(payment_type, description) VALUES(1, 'credit card'),(2, 'cash'),(3, 'no charge'),(4, 'dispute'),(5, 'unknown'),(6, 'voided trip');Add a table to store the rates data:
CREATE TABLE IF NOT EXISTS "rates"(rate_code INTEGER,description TEXT);INSERT INTO rates(rate_code, description) VALUES(1, 'standard rate'),(2, 'JFK'),(3, 'Newark'),(4, 'Nassau or Westchester'),(5, 'negotiated fare'),(6, 'group ride');
You can confirm that the scripts were successful by running the \dt
command in
the psql
command line. You should see this:
List of relationsSchema | Name | Type | Owner--------+---------------+-------+----------public | payment_types | table | tsdbadminpublic | rates | table | tsdbadminpublic | rides | table | tsdbadmin(3 rows)
When you have your database set up, you can load the taxi trip data into the
rides
hypertable.
Important
This is a large dataset, so it might take a long time, depending on your network connection.
Download the dataset:
Use your file manager to decompress the downloaded dataset, and take a note of the path to the
nyc_data_rides.csv
file.At the psql prompt, copy the data from the
nyc_data_rides.csv
file into your hypertable. Make sure you point to the correct path, if it is not in your current working directory:\COPY rides FROM nyc_data_rides.csv CSV;
You can check that the data has been copied successfully with this command:
SELECT * FROM rides LIMIT 5;
You should get five records that look like this:
-[ RECORD 1 ]---------+--------------------vendor_id | 1pickup_datetime | 2016-01-01 00:00:01dropoff_datetime | 2016-01-01 00:11:55passenger_count | 1trip_distance | 1.20pickup_longitude | -73.979423522949219pickup_latitude | 40.744613647460938rate_code | 1dropoff_longitude | -73.992034912109375dropoff_latitude | 40.753944396972656payment_type | 2fare_amount | 9extra | 0.5mta_tax | 0.5tip_amount | 0tolls_amount | 0improvement_surcharge | 0.3total_amount | 10.3
The queries in this tutorial are suitable for visualizing in Grafana. If you want to visualize the results of your queries, connect your Grafana account to the NYC taxi cab dataset.
Grafana is and open source analytics and monitoring solution. You use Grafana to visualize queries directly from your Timescale Cloud service.
Best practice is to use an Ubuntu EC2 instance hosted in the same region as your Timescale Cloud service as a migration machine. That is, the machine you run the commands on to move your data from your source database to your target Timescale Cloud service.
Before you migrate your data:
Create a target Timescale Cloud service.
Each Timescale Cloud service has a single database that supports the most popular extensions. Timescale Cloud services do not support tablespaces, and there is no superuser associated with a service. Best practice is to create a Timescale Cloud services with at least 8 CPUs for a smoother experience. A higher-spec instance can significantly reduce the overall migration window.
To ensure that maintenance does not run during the process, adjust the maintenance window.
- Install self-managed Grafana, or sign up for Grafana Cloud
To connect the data in your Timescale Cloud service to Grafana:
Log in to Grafana
In your browser, log in to either :
- Self-hosted Grafana: at
http://localhost:3000/
. The default credentials areadmin
,admin
. - Grafana Cloud: use the URL and credentials you set when you created your account.
- Self-hosted Grafana: at
Add your Timescale Cloud service as a data source
In the Grafana dashboard, navigate to
Configuration
>Data sources
, then clickAdd data source
.In
Add data source
, selectPostgreSQL
.Configure the data source using the connection in
$TARGET
:Name
: the name to use for the datasetHost
: the host and port for your service, in this format:<HOST>:<PORT>
.For example:
example.tsdb.cloud.timescale.com:35177
.Database
:tsdb
User
:tsdbadmin
, or another privileged userPassword
: the password forUser
TLS/SSL Mode
: selectrequire
PostgreSQL details
: enableTimescaleDB
Leave the default setting for all other fields
Click
Save & test
.
Grafana checks that your details are set correctly.
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.