Insert and query Bitcoin transactions

This section of the tutorial provides an example database schema that you can use to ingest and store Bitcoin blockchain data in TimescaleDB. The schema consists of only one table called transactions.

Bitcoin transaction data fields

The sample Bitcoin dataset for this tutorial has the following fields:

FieldDescription
timeTimestamp of the transaction
block_idBlock ID
hashHash ID of the transaction
sizeSize of the transaction in KB
weightSize of the transaction in weight units
is_coinbaseWhether the transaction is the first transaction in a block, which includes the miner's reward
output_totalValue of the transaction in Satoshi (sat)
output_total_usdValue of the transaction in USD
feeTransaction fee in Satoshi (sat)
fee_usdTransaction fee in USD

Table definition

Create a table named transactions to hold the Bitcoin data. Run the following query:

CREATE TABLE transactions (
   time TIMESTAMPTZ,
   block_id INT,
   hash TEXT,
   size INT,
   weight INT,
   is_coinbase BOOLEAN,
   output_total BIGINT,
   output_total_usd DOUBLE PRECISION,
   fee BIGINT,
   fee_usd DOUBLE PRECISION,
   details JSONB
);

The table schema includes all the fields described above, plus an additional JSONB column named details. This column stores a JSONB string with extra information about each transaction. Data from this column isn't used in this tutorial, but you can explore the data and get inspired to perform your own analyses.

Turn the table into a hypertable by using the create_hypertable() function. A hypertable gives you performance improvements by using TimescaleDB's chunking feature behind the scenes. This function needs two parameters: the name of the table and the name of the TIMESTAMP column. In this case, the names are transactions and time.

SELECT create_hypertable('transactions', 'time');

Next, create some additional indexes on the hypertable. This optimizes execution of later SQL queries.

Create indexes

When you create a hypertable, TimescaleDB automatically adds a B-tree index on the timestamp column. This improves queries where you filter by the time column.

To speed up queries where you search for individual transactions with the hash column, add a HASH INDEX to the column:

CREATE INDEX hash_idx ON public.transactions USING HASH (hash)

Next, speed up block-level queries by adding an index on the block_id column:

CREATE INDEX block_idx ON public.transactions (block_id)

To ensure that you don't accidentally insert duplicate records, add a UNIQUE INDEX on the time and hash columns.

CREATE UNIQUE INDEX time_hash_idx ON public.transactions (time, hash)

Ingest Bitcoin transactions

You created the hypertable and added proper indexes. Next, ingest some Bitcoin transactions. The sample data file contains Bitcoin transactions from the past five days. This CSV file is updated daily so you always download recent Bitcoin transactions. Insert this dataset into your TimescaleDB instance.

Ingesting Bitcoin transactions

  1. Download the sample .csv file:

    wget https://assets.timescale.com/docs/downloads/bitcoin-blockchain/bitcoin_sample.zip
  2. Unzip the file and change the directory if you need to:

    unzip bitcoin_sample.zip
    cd bitcoin_sample
  3. At the psql prompt, insert the content of the .csv file into the database.

    psql -x "postgres://tsdbadmin:<YOUR_PASSWORD_HERE>@<YOUR_HOSTNAME_HERE>:<YOUR_PORT_HERE>/tsdb?sslmode=require"
    
    \COPY transactions FROM 'tutorial_bitcoin_sample.csv' CSV HEADER;

    The process should complete in 3-5 minutes.

Once the ingestion finishes, your database contains around 1.5 million Bitcoin transactions. Now, you can make your first queries.

Query Bitcoin transactions

Query for the five most recent non-coinbase transactions:

SELECT time, hash, block_id, weight  FROM transactions 
WHERE is_coinbase IS NOT TRUE
ORDER BY time DESC
LIMIT 5

The result looks something like this:

timehashblock_idweight
2022-05-30 01:42:176543a8e489eade391f099df7066f17783ea2f9d19d644d818ac22bd8fb86005e738489863
2022-05-30 01:42:17a9e2bb3e734e0c0535da4e8ab6e3d0352a44db443d48a861bd5b196575dfd3ff738489577
2022-05-30 01:42:17fd0a9a8c31962107d0a5a0c4ef2a5702e2c9fad6d989e7ac543d87783205a980738489758
2022-05-30 01:42:17e2aedc6026459381485cd57f3e66ea88121e5094c03fa4634193417069058609738489766
2022-05-30 01:42:17429c0d00282645b54bd3c0082800a85d1c952d1764c54dc2a591f97b97c93fbd738489766

note

A coinbase transaction is the first transaction in each block. This transaction contains the miner's reward.

Here's another example query that returns the five most recent blocks, with statistics such as block weight, transaction count, and value in USD:

WITH recent_blocks AS (
    SELECT block_id FROM transactions 
    WHERE is_coinbase IS TRUE
    ORDER BY time DESC 
    LIMIT 5
) 
SELECT
    t.block_id, count(*) AS transaction_count,
    SUM(weight) AS block_weight,
    SUM(output_total_usd) AS block_value_usd
FROM transactions t
INNER JOIN recent_blocks b ON b.block_id = t.block_id
WHERE is_coinbase IS NOT TRUE
GROUP BY t.block_id

Result:

block_idtransaction_countblock_weightblock_value_usd
73848925083991873402592534.37649953
73848722313991983560811110.1410986
73848832083991994422477674.0440979
73848621543996197481098865.6260999
73848526023991871761258578.3764017

At this point, you have Bitcoin blockchain data in your database and you've made your first SQL queries. In the next section, dig deeper into the blockchain and use TimescaleDB hyperfunctions to generate insights!

Found an issue on this page?

Report an issue!

Keywords

Related Content