Query data in Promscale
You can query the data stored in Promscale using SQL (metrics and traces) and PromQL (only metrics).
PromQL queries have to be directed to the Promscale Connector.
Alternatively, you can direct PromQL queries to the Prometheus instance,
which reads data from the Connector using the remote_read
interface. The Connector, in turn, fetches data from TimescaleDB. You would
typically use a visualization tool to run PromQL queries.
Learn more about PromQL in the Prometheus documentation.
SQL queries are handled directly by TimescaleDB. You can query the data
in Promscale with your preferred SQL tool. In this section, we use psql
.
For more information about installing and using psql
, see the
installing psql section.
Query metric data with SQL
This section covers information about the different SQL queries you can use for metrics data.
Query a metric
When you query a metric, the query is performed against the view of the metric
you're interested in. This example queries a metric named go_dc_duration
for
its samples in the past five minutes. This metric is a measurement for how long
garbage collection is taking in Go applications:
SELECT * from go_gc_duration_seconds
WHERE time > now() - INTERVAL '5 minutes';
An example of the output for this query:
| time | value | series_id | labels | instance_id | job_id | quantile_id |
|----------------------------|-------------|-----------|-------------------|-------------|--------|-------------|
| 2021-01-27 18:43:42.389+00 | 0 | 495 | {208,43,51,212} | 43 | 51 | 212 |
| 2021-01-27 18:43:42.389+00 | 0 | 497 | {208,43,51,213} | 43 | 51 | 213 |
| 2021-01-27 18:43:42.389+00 | 0 | 498 | {208,43,51,214} | 43 | 51 | 214 |
| 2021-01-27 18:43:42.389+00 | 0 | 499 | {208,43,51,215} | 43 | 51 | 215 |
| 2021-01-27 18:43:42.389+00 | 0 | 500 | {208,43,51,216} | 43 | 51 | 216 |
In this output, each row includes a series_id
field, which uniquely identifies
its measurements label set. This enables efficient aggregation by series.
Each row also includes a labels
field, which contains an array of foreign keys
to label key-value pairs making up the label set.
While the labels
array is the entire label set, there are also separate fields
for each label key in the label set, to simplify access. These fields end with
the suffix _id
.
Query values for label keys
Each label key is expanded into its own column, which stores foreign key
identifiers to their value. This allows you to JOIN
, aggregate, and filter by
label keys and values.
To retrieve the text represented by a label ID, you can use the val(field_id)
function. This allows you to do things like aggregation across all series with a
particular label key.
For example, to find the median value for the go_gc_duration_seconds
metric,
grouped by the job associated with it:
SELECT
val(job_id) as job,
percentile_cont(0.5) within group (order by value) AS median
FROM
go_gc_duration_seconds
WHERE
time > now() - INTERVAL '5 minutes'
GROUP BY job_id;
An example of the output for this query:
| job | median |
|---------------|---------- |
| prometheus | 6.01e-05 |
| node-exporter | 0.0002631 |
Query label sets for a metric
The labels
field in any metric row represents the full set of labels
associated with the measurement. It is represented as an array of identifiers.
To return the entire labelset in JSON, you can use the jsonb()
function, like
this:
SELECT
time, value, jsonb(labels) as labels
FROM
go_gc_duration_seconds
WHERE
time > now() - INTERVAL '5 minutes';
An example of the output for this query:
| time | value | labels |
|----------------------------|-------------|--------------------------------------------------------------------------------------------------------------------|
| 2021-01-27 18:43:48.236+00 | 0.000275625 | {"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0.5"} |
| 2021-01-27 18:43:48.236+00 | 0.000165632 | {"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0.25"} |
| 2021-01-27 18:43:48.236+00 | 0.000320684 | {"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0.75"} |
| 2021-01-27 18:43:52.389+00 | 1.9633e-05 | {"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0"} |
| 2021-01-27 18:43:52.389+00 | 1.9633e-05 | {"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "1"} |
| 2021-01-27 18:43:52.389+00 | 1.9633e-05 | {"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0.5"} |
This query returns the label set for the metric go_gc_duration
in JSON format,
so you can read or further interact with it.
Advanced query: percentiles aggregated over time and series
This query calculates the 99th percentile over both time and series (app_id
)
for the metric named go_gc_duration_seconds
. This metric is a measurement for
how long garbage collection is taking in Go applications:
SELECT
val(instance_id) as app,
percentile_cont(0.99) within group(order by value) p99
FROM
go_gc_duration_seconds
WHERE
value != 'NaN' AND val(quantile_id) = '1' AND instance_id > 0
GROUP BY instance_id
ORDER BY p99 desc;
An example of the output for this query:
| app | p99 |
|-------------------|------------ |
|node_exporter:9100 | 0.002790063 |
|localhost:9090 | 0.00097977 |
This query is unique to Promscale, as it aggregates over both time and series and returns an accurate calculation of the percentile. It is not possible to use PromQL alone to accurately calculate percentiles when aggregating over both time and series.
Filter by labels
You can filter by labels, because matching operators correspond to the selectors in
PromQL. The operators are used in a WHERE
clause, in the
labels ? (<label_key> <operator> <pattern>)
.
The four matching operators are:
Operator | Description |
---|---|
== | Matches tag values that are equal to the pattern |
!== | Matches tag values that are not equal to the pattern |
==~ | Matches tag values that match the pattern regex |
!=~ | Matches tag values that are not equal to the pattern regex |
Each operator corresponds to a selector in PromQL, although they have slightly
different spellings to avoid clashing with other PostgreSQL operators. You can
combine them using any Boolean logic, with any arbitrary WHERE
clauses. For
example, if you want only metrics from the job called node-exporter
, you can
filter by labels like this:
SELECT
time, value, jsonb(labels) as labels
FROM
go_gc_duration_seconds
WHERE
labels ? ('job' == 'node-exporter')
AND time > now() - INTERVAL '5 minutes';
An example of the output for this query:
| time | value | labels |
|----------------------------|-----------|------------------------------------------------------------------------------------------------------------------|
| 2021-01-28 02:01:18.066+00 | 3.05e-05 | {"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0"}|
| 2021-01-28 02:01:28.066+00 | 3.05e-05 | {"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0"}|
|2021-01-28 02:01:38.032+00 | 3.05e-05 | {"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0"}|
Query the number of datapoints in a series
Each row in a metric's view has a series_id
that uniquely identifies the
measurement's label set. This allows you to aggregate by series more
efficiently. You can retrieve the labels array from a series_id
using the
labels(series_id)
function. For example, this query shows how many data points
we have in each series:
SELECT jsonb(labels(series_id)) as labels, count(*)
FROM go_gc_duration_seconds
GROUP BY series_id;
An example of the output for this query:
| labels | count |
|---------------------------------------------------------------------------------------------------------------------|-------|
|{"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0.75"} | 631 |
|{"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0.75"} | 631 |
|{"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "1"} | 631 |
|{"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0.5"} | 631 |
|{"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0.5"} | 631 |
|{"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0"} | 631 |
|{"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "1"} | 631 |
|{"job": "node-exporter", "__name__": "go_gc_duration_seconds", "instance": "node_exporter:9100", "quantile": "0.25"} | 631 |
|{"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0.25"} | 631 |
|{"job": "prometheus", "__name__": "go_gc_duration_seconds", "instance": "localhost:9090", "quantile": "0"} | 631 |
Other complex queries
The examples in this section are for querying metrics from Prometheus and
node_exporter
. A more complex example provided by Dan Luu
shows how you can discover Kubernetes containers that are over-provisioned. In
this query, you find containers whose 99th percentile memory utilization is low,
like this:
WITH memory_allowed as (
SELECT
labels(series_id) as labels,
value,
min(time) start_time,
max(time) as end_time
FROM container_spec_memory_limit_bytes total
WHERE value != 0 and value != 'NaN'
GROUP BY series_id, value
)
SELECT
val(memory_used.container_id) container,
percentile_cont(0.99)
within group(order by memory_used.value/memory_allowed.value)
AS percent_used_p99,
max(memory_allowed.value) max_memory_allowed
FROM container_memory_working_set_bytes AS memory_used
INNER JOIN memory_allowed
ON (memory_used.time >= memory_allowed.start_time AND
memory_used.time <= memory_allowed.end_time AND
eq(memory_used.labels,memory_allowed.labels))
WHERE memory_used.value != 'NaN'
GROUP BY container
ORDER BY percent_used_p99 ASC
LIMIT 100;
An example of the output for this query:
| container | percent_used_p99 | total |
|--------------------------------|-------------------------|-------------|
| cluster-overprovisioner-system | 6.961822509765625e-05 | 4294967296 |
| sealed-secrets-controller | 0.00790748596191406 | 1073741824 |
| dumpster | 0.0135690307617187 | 268435456 |
This example uses cAdvisor
, as an example of the sorts of sophisticated
analysis enabled by Promscale's support to query your data in SQL.
Found an issue on this page?
Report an issue!Keywords