count_min_sketch() functions
ToolkitTimescaleDB Toolkit functions are available under Timescale Community Edition. They are automatically included with Timescale, but must be installed separately for self-hosted TimescaleDB. Click to learn more.Introduction
Count the number of times a value appears in a column, using the probabilistic
count-min sketch
data structure and its associated
algorithms. For applications where a small error rate is tolerable, this can
result in huge savings in both CPU time and memory, especially for large
datasets.
Related hyperfunction groups
Warning
This function group includes some experimental functions. Experimental functions might change or be removed in future releases. We do not recommend using them in production. Experimental functions are marked with an Experimental tag.
Aggregate
- count_min_sketch
- ExperimentalAggregate data into a
CountMinSketch
for approximate counting
Accessor
- approx_count
- ExperimentalEstimate the number of times a value appears from a
CountMinSketch
count_min_sketch(values TEXT,error DOUBLE PRECISION,probability DOUBLE PRECISION,) RETURNS CountMinSketch
Aggregate data into a CountMinSketch
object, which you can use to estimate the number of times a given item appears in a column.
The sketch produces a biased estimator of frequency.
It might overestimate the item count, but it can't underestimate.
You can control the relative error and the probability that the estimate falls outside the error bounds.
Required arguments
Name | Type | Description |
---|---|---|
values | TEXT | The column of values to count |
error | DOUBLE PRECISION | Error tolerance in estimate, calculated relative to the number of values added to the sketch |
probability | DOUBLE PRECISION | Probability that an estimate falls outside the error bounds |
Returns
Column | Type | Description |
---|---|---|
count_min_sketch | CountMinSketch | An object storing a table of counters |
approx_count (item TEXT,agg CountMinSketch) RETURNS INTEGER
Estimate the number of times a given text value appears in a column.
Required arguments
Name | Type | Description |
---|---|---|
item | TEXT | The value you want to estimate occurrences of |
agg | CountMinSketch | A CountMinSketch object created using count_min_sketch |
Returns
Column | Type | Description |
---|---|---|
approx_count | INTEGER | The estimated number of times item appeared in the sketch |
Examples
Given a table of stock data, estimate how many times the symbol AAPL
appears:
WITH t AS (SELECT toolkit_experimental.count_min_sketch(symbol, 0.01, 0.01) AS symbol_sketchFROM stocks_real_time)SELECT toolkit_experimental.approx_count('AAPL', symbol_sketch)FROM t;
Keywords
Found an issue on this page?Report an issue or Edit this page in GitHub.