This tutorial uses a dataset that contains historical data from New York's
yellow taxi network, in a hypertable named
rides. It also includes a separate
tables of payment types and rates, in a regular PostgreSQL table named
If you have been provided with a pre-loaded dataset on your Timescale service, go directly to the queries section.
A service in Timescale is a cloud instance which contains your database.
Each service contains a single database, named
You can connect to a service from your local system using the
command-line utility. If you've used PostgreSQL before, you might already have
psql installed. If not, check out the installing psql section.
In the Timescale portal, click
Download the cheatsheetto download an SQL file that contains the login details for your new service. You can also copy the details directly from this page. When you have copied your password, click
I stored my password, go to service overviewat the bottom of the page.
When your service is ready to use, is shows a green
Runninglabel in the
Service Overview. You also receive an email confirming that your service is ready to use.
On your local system, at the command prompt, connect to the service using the
Service URLfrom the SQL file that you downloaded. When you are prompted, enter the password:psql -x "<SERVICE_URL>"Password for user tsdbadmin:
If your connection is successful, you'll see a message like this, followed by the
psqlprompt:psql (13.3, server 12.8 (Ubuntu 12.8-1.pgdg21.04+1))SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)Type "help" for help.tsdb=>
This tutorial uses historical data from New York's yellow taxi network, provided by the New York City Taxi and Limousine Commission NYC TLC.
Hypertables are the core of Timescale. Hypertables enable Timescale to work efficiently with time-series data. Because Timescale is PostgreSQL, all the standard PostgreSQL tables, indexes, stored procedures and other objects can be created alongside your Timescale hypertables. This makes creating and working with Timescale tables similar to standard PostgreSQL.
Create a standard PostgreSQL table to store the taxi trip data using
CREATE TABLE:CREATE TABLE "rides"(vendor_id TEXT,pickup_datetime TIMESTAMP WITHOUT TIME ZONE NOT NULL,dropoff_datetime TIMESTAMP WITHOUT TIME ZONE NOT NULL,passenger_count NUMERIC,trip_distance NUMERIC,pickup_longitude NUMERIC,pickup_latitude NUMERIC,rate_code INTEGER,dropoff_longitude NUMERIC,dropoff_latitude NUMERIC,payment_type INTEGER,fare_amount NUMERIC,extra NUMERIC,mta_tax NUMERIC,tip_amount NUMERIC,tolls_amount NUMERIC,improvement_surcharge NUMERIC,total_amount NUMERIC);
Convert the standard table into a hypertable partitioned on the
timecolumn using the
create_hypertable()function provided by Timescale. You must provide the name of the table and the column in that table that holds the timestamp data to use for partitioning:SELECT create_hypertable('rides', 'pickup_datetime', 'payment_type', 2, create_default_indexes=>FALSE);
Create an index to support efficient queries by vendor, rate code, and passenger count:CREATE INDEX ON rides (vendor_id, pickup_datetime DESC);CREATE INDEX ON rides (rate_code, pickup_datetime DESC);CREATE INDEX ON rides (passenger_count, pickup_datetime DESC);
When you have other relational data that enhances your time-series data, you can
create standard PostgreSQL tables just as you would normally. For this dataset,
there are two other tables of data, called
Add a table to store the payment types data:CREATE TABLE IF NOT EXISTS "payment_types"(payment_type INTEGER,description TEXT);INSERT INTO payment_types(payment_type, description) VALUES(1, 'credit card'),(2, 'cash'),(3, 'no charge'),(4, 'dispute'),(5, 'unknown'),(6, 'voided trip');
Add a table to store the rates data:CREATE TABLE IF NOT EXISTS "rates"(rate_code INTEGER,description TEXT);INSERT INTO rates(rate_code, description) VALUES(1, 'standard rate'),(2, 'JFK'),(3, 'Newark'),(4, 'Nassau or Westchester'),(5, 'negotiated fare'),(6, 'group ride');
You can confirm that the scripts were successful by running the
\dt command in
psql command line. You should see this:
List of relationsSchema | Name | Type | Owner--------+---------------+-------+----------public | payment_types | table | tsdbadminpublic | rates | table | tsdbadminpublic | rides | table | tsdbadmin(3 rows)
When you have your database set up, you can load the taxi trip data into the
This is a large dataset, so it might take a long time, depending on your network connection.
Download the dataset:
Use your file manager to decompress the downloaded dataset, and take a note of the path to the
At the psql prompt, copy the data from the
nyc_data_rides.csvfile into your hypertable. Make sure you point to the correct path, if it is not in your current working directory:\COPY rides FROM nyc_data_rides.csv CSV;
You can check that the data has been copied successfully with this command:
SELECT * FROM rides LIMIT 5;
You should get five records that look like this:
-[ RECORD 1 ]---------+--------------------vendor_id | 1pickup_datetime | 2016-01-01 00:00:01dropoff_datetime | 2016-01-01 00:11:55passenger_count | 1trip_distance | 1.20pickup_longitude | -73.979423522949219pickup_latitude | 40.744613647460938rate_code | 1dropoff_longitude | -73.992034912109375dropoff_latitude | 40.753944396972656payment_type | 2fare_amount | 9extra | 0.5mta_tax | 0.5tip_amount | 0tolls_amount | 0improvement_surcharge | 0.3total_amount | 10.3
Found an issue on this page?Report an issue!