Time-series data

Time-series data represents how a system, process, or behavior changes over time. For example, if you take measurements from a temperature gauge every five minutes, you are collecting time-series data. Other common examples are stock prices and the battery level of your smartphone. As these measurements change over time, each data point is recorded alongside its timestamp, allowing it to be measured, analyzed, and visualized.

Time-series data can be collected very frequently, such as financial data, or infrequently, such as weather or system measurements. It can also be collected regularly, such as every millisecond or every hour, or irregularly, such as only when a change occurs.

Databases have always had time fields, but a database built specifically for time-series data handles that data more efficiently. Specialized time-series databases, like Timescale, are designed to handle large volumes of writes, so ingestion is faster. They are also optimized to handle schema changes, and use more flexible indexing, so you don't need to spend time migrating your data whenever you make a change.

Time-series data is everywhere, but there are some environments where it is especially important to use a specialized time-series database, like Timescale:

Monitoring computer systems: virtual machines, servers, container metrics, CPU, free memory, net/disk IOPs, service and application metrics such as request rates, and request latency.
Financial trading systems: securities, cryptocurrencies, payments, and transaction events.
Internet of things: data from sensors on industrial machines and equipment, wearable devices, vehicles, physical containers, pallets, and consumer devices for smart homes.
Eventing applications: user or customer interaction data such as clickstreams, pageviews, logins, and signups.
Business intelligence: tracking key metrics and the overall health of the business.
Environmental monitoring: temperature, humidity, pressure, pH, pollen count, air flow, carbon monoxide, nitrogen dioxide, or particulate matter.

To explore Timescale's features, you need some sample data. This guide uses real-time stock trade data, also known as tick data, from Twelve Data.

About the dataset

The dataset contains second-by-second stock-trade data for the top 100 most-traded symbols, in a hypertable named stocks_real_time. It also includes a separate table of company symbols and company names, in a regular PostgreSQL table named company.

The dataset is updated nightly and contains data from the last four weeks, typically ~8 million rows. Stock trades are recorded in real time Monday through Friday, during normal trading hours of the New York Stock Exchange (9:30 AM - 4:00 PM Eastern Time).

Ingest the dataset

To ingest data into the tables that you created, you need to download the dataset and copy the data to your database.

Ingesting the dataset

Download the real_time_stock_data.zip file. The file contains two .csv files: one with company information, and one with real-time stock trades for the past month. Download:
real_time_stock_data.zip
In a new terminal window, run this command to unzip the .csv files:
```
unzip real_time_stock_data.zip
```
At the psql prompt, use the \COPY command to transfer data into your Timescale instance. If the .csv files aren't in your current directory, specify the file paths in the following commands:
```
\COPY stocks_real_time from './tutorial_sample_tick.csv' DELIMITER ',' CSV HEADER;
```
```
\COPY company from './tutorial_sample_company.csv' DELIMITER ',' CSV HEADER;
```
Because there are millions of rows of data, the COPY process may take a few minutes depending on your internet connection and local client resources.
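To confirm that the load is complete, it can help to know how many rows each .csv file holds. The following is a minimal sketch, not part of the official procedure: `csv_data_rows` is a hypothetical helper name, the file names are the ones used in the \COPY commands above, and the count assumes that no field contains an embedded line break.

```shell
#!/bin/sh
# Hypothetical helper: count the data rows in a CSV file, skipping the
# single header line. Assumes no quoted field spans multiple lines.
csv_data_rows() {
    # NR > 1 skips the header; "n + 0" prints 0 for a header-only file.
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# File names taken from the \COPY commands above; adjust the paths if
# the files are not in your current directory.
for f in ./tutorial_sample_tick.csv ./tutorial_sample_company.csv; do
    if [ -f "$f" ]; then
        printf '%s: %s data rows\n' "$f" "$(csv_data_rows "$f")"
    else
        printf '%s: not found in the current directory\n' "$f"
    fi
done
```

After the \COPY commands finish, you can compare these numbers with the result of `SELECT count(*) FROM stocks_real_time;` and `SELECT count(*) FROM company;` at the psql prompt. If the tables were empty before the load, the counts should match.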