# timescaledb_toolkit

Library of analytical hyperfunctions, time-series pipelining, and other SQL utilities.

## Overview

| ID | Extension | Alias | Version | Category | License | Language |
|----|-----------|-------|---------|----------|---------|----------|
| 1010 | timescaledb_toolkit | timescaledb_toolkit | 1.21.0 | TIME | Timescale | Rust |
## Attributes
Yes | No | Yes | No | Yes | No | Yes | - |
## Packages

| OS | Repo | Package Name | Version | Deps | PG Versions |
|----|------|--------------|---------|------|-------------|
| EL | PIGSTY | timescaledb-toolkit_$v | 1.21.0 | - | 17 16 15 14 13 |
| Debian | PIGSTY | postgresql-$v-timescaledb-toolkit | 1.21.0 | - | 17 16 15 14 13 |
## Comments

`pgrx=0.12.9`
## Availability

| OS/Arch | PG17 | PG16 | PG15 | PG14 | PG13 |
|---|---|---|---|---|---|
| el8.x86_64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| el8.aarch64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| el9.x86_64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| el9.aarch64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| d12.x86_64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| d12.aarch64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| u22.x86_64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| u22.aarch64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| u24.x86_64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
| u24.aarch64 | 1.21.0 | 1.21.0 | 1.21.0 | 1.19.0 | ✗ |
## Download

To add the required PGDG / PIGSTY upstream repository, use:

```bash
pig repo add pgdg -u    # add PGDG repo and update cache (leave existing repos)
pig repo add pigsty -u  # add PIGSTY repo and update cache (leave existing repos)
pig repo add pgsql -u   # add PGDG + PIGSTY repos and update cache (leave existing repos)
pig repo set all -u     # set repo to all = NODE + PGSQL + INFRA (remove existing repos)
```

Pigsty users can also add the repositories with the bundled playbook:

```bash
./node.yml -t node_repo -e node_repo_modules=node,pgsql   # -l <cluster>
```
## Install

Install this extension with:

```bash
pig ext install timescaledb_toolkit        # install by extension name, for the currently active PG version
pig ext install timescaledb_toolkit -v 17  # install for PG 17
pig ext install timescaledb_toolkit -v 16  # install for PG 16
pig ext install timescaledb_toolkit -v 15  # install for PG 15
pig ext install timescaledb_toolkit -v 14  # install for PG 14
```

Or install the package directly on EL systems:

```bash
dnf install timescaledb-toolkit_17
dnf install timescaledb-toolkit_16
dnf install timescaledb-toolkit_15
dnf install timescaledb-toolkit_14
```

On Debian / Ubuntu systems:

```bash
apt install postgresql-17-timescaledb-toolkit
apt install postgresql-16-timescaledb-toolkit
apt install postgresql-15-timescaledb-toolkit
apt install postgresql-14-timescaledb-toolkit
```

Pigsty users can install it on an existing cluster with the playbook:

```bash
./pgsql.yml -t pg_ext -e '{"pg_extensions": ["timescaledb_toolkit"]}'   # -l <cls>
```

Create this extension in the database with:

```sql
CREATE EXTENSION timescaledb_toolkit;
```
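To verify the installation, check the extension catalog (a standard PostgreSQL query, not toolkit-specific):

```sql
-- Confirm the extension is created and show its version
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'timescaledb_toolkit';
```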
## Usage

TimescaleDB Toolkit provides specialized functions for time-series analytics built around a two-step aggregation pattern: an aggregate function first builds a compact intermediate representation (a sketch or summary), and accessor functions then query that representation. Because the intermediate result can be stored, rolled up, and read repeatedly, a single pass over the raw data supports multiple analyses.
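As a minimal sketch of that pattern (table and column names here are illustrative), the inner aggregate builds the summary once and the outer accessors read it several ways:

```sql
-- Step 1: build one intermediate summary per scan
-- (percentile_agg is the toolkit's default percentile sketch);
-- Step 2: apply multiple accessors to the same summary.
WITH summary AS (
    SELECT percentile_agg(response_time) AS pct
    FROM requests
)
SELECT
    approx_percentile(0.50, pct) AS median,
    approx_percentile(0.99, pct) AS p99
FROM summary;
```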
### Approximate Analytics

#### HyperLogLog - Distinct Count Estimation

Probabilistic distinct counting with configurable precision for high-cardinality datasets.

```sql
-- Estimate unique users per day
SELECT
    date_trunc('day', timestamp) AS day,
    distinct_count(hyperloglog(64, user_id)) AS unique_users
FROM events
GROUP BY day;
```

```sql
-- Combine counts across partitions
SELECT distinct_count(rollup(hll))
FROM (SELECT hyperloglog(32, session_id) AS hll FROM events_2023
      UNION ALL
      SELECT hyperloglog(32, session_id) FROM events_2024) t;
```
#### T-Digest - Quantile Approximation

High-accuracy percentile estimation optimized for tail quantiles (P95, P99).

```sql
-- Track response time percentiles
SELECT
    service_name,
    approx_percentile(0.50, tdigest(100, response_time)) AS p50,
    approx_percentile(0.95, tdigest(100, response_time)) AS p95,
    approx_percentile(0.99, tdigest(100, response_time)) AS p99
FROM api_metrics
GROUP BY service_name;
```

```sql
-- Hourly percentiles as a continuous aggregate
-- (requires requests to be a TimescaleDB hypertable)
CREATE MATERIALIZED VIEW hourly_percentiles
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', timestamp) AS hour,
    tdigest(200, response_time) AS digest
FROM requests
GROUP BY hour;
```
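The stored digests can then be read back with accessors, or merged into coarser buckets with `rollup` (assuming the `hourly_percentiles` view above):

```sql
-- Hourly p95 straight from the stored digests
SELECT hour, approx_percentile(0.95, digest) AS p95
FROM hourly_percentiles;

-- Merge hourly digests into a daily p95 without rescanning raw data
SELECT
    time_bucket('1 day', hour) AS day,
    approx_percentile(0.95, rollup(digest)) AS daily_p95
FROM hourly_percentiles
GROUP BY day;
```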
#### UddSketch - Bounded Error Quantiles

Quantile estimation with guaranteed maximum relative error bounds.

```sql
-- CPU utilization percentiles with 1% max error
SELECT
    host_id,
    approx_percentile(0.95, uddsketch(100, 0.01, cpu_percent)) AS p95_cpu,
    error(uddsketch(100, 0.01, cpu_percent)) AS actual_error
FROM system_metrics
GROUP BY host_id;
```
### Counter Analytics

#### Counter Aggregates - Monotonic Metrics

Handle counters that increase monotonically, with automatic detection of counter resets.

```sql
-- Request rate calculation
SELECT
    time_bucket('5 min', timestamp) AS bucket,
    rate(counter_agg(timestamp, request_count)) AS requests_per_sec,
    delta(counter_agg(timestamp, request_count)) AS total_requests
FROM metrics
GROUP BY bucket;
```
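The reset handling can also be inspected directly; for example, the `num_resets` accessor reports how many counter resets the aggregate detected (same illustrative schema as above):

```sql
-- Count detected counter resets per bucket
SELECT
    time_bucket('5 min', timestamp) AS bucket,
    num_resets(counter_agg(timestamp, request_count)) AS resets
FROM metrics
GROUP BY bucket;
```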
```sql
-- Extrapolated rate for partial buckets ('prometheus' extrapolation method)
SELECT
    time_bucket('1 hour', timestamp) AS bucket,
    extrapolated_rate(
        counter_agg(timestamp, bytes_sent,
                    bounds => time_bucket_range('1 hour', timestamp)),
        'prometheus'
    ) AS bytes_per_second
FROM network_stats
GROUP BY bucket;
```
#### Gauge Aggregates - Varying Metrics

Analytics for metrics that vary up and down (temperature, memory usage).

```sql
-- Temperature change analysis
SELECT
    sensor_id,
    delta(gauge_agg(timestamp, temperature)) AS temp_delta,
    rate(gauge_agg(timestamp, temperature)) AS temp_rate_per_sec
FROM weather_data
GROUP BY sensor_id;
```
### Time-Weighted Analytics

#### Time-Weighted Averages

Handle irregularly sampled data with interpolation methods: LOCF (last observation carried forward) or linear interpolation.

```sql
-- Weighted average for irregular sensor readings
SELECT
    device_id,
    average(time_weight('LOCF', timestamp, sensor_value)) AS weighted_avg,
    average(time_weight('Linear', timestamp, sensor_value)) AS linear_avg
FROM iot_readings
GROUP BY device_id;
```

```sql
-- Combining multiple time ranges
SELECT average(rollup(tw))
FROM (SELECT time_weight('LOCF', ts, val) AS tw FROM readings_2023
      UNION ALL
      SELECT time_weight('LOCF', ts, val) FROM readings_2024) t;
```
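Besides `average`, the time-weight summary exposes an `integral` accessor that accumulates value × time, useful for totals such as energy from power readings. A sketch with an assumed `power_readings` table; the integral is in value-seconds by default:

```sql
-- Integrate power (kW) over time; divide by 3600 for kWh
SELECT
    device_id,
    integral(time_weight('LOCF', ts, power_kw)) / 3600.0 AS kwh
FROM power_readings
GROUP BY device_id;
```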
### Data Visualization

#### LTTB Downsampling

Downsample a time series with Largest Triangle Three Buckets (LTTB) while preserving its visual shape for charts.

```sql
-- Reduce 100K points to 1K for visualization
SELECT time, value
FROM unnest((
    SELECT lttb(timestamp, price, 1000)
    FROM stock_prices
    WHERE symbol = 'AAPL'
));
```
#### ASAP Smoothing

Generate human-readable graphs by reducing noise while preserving trends.

```sql
-- Smooth daily data to weekly resolution
SELECT time, value
FROM unnest((
    SELECT asap_smooth(date, daily_sales, 52)
    FROM sales_data
    WHERE date >= '2023-01-01'
));
```
### Statistical Analysis

#### Stats Aggregates

Comprehensive statistical analysis with 1-D summary statistics and 2-D linear-regression accessors.

```sql
-- Multi-variable analysis
SELECT
    -- Basic statistics (1-D stats_agg)
    average(stats_agg(response_time)) AS avg_response,
    stddev(stats_agg(response_time)) AS response_stddev,
    -- Regression analysis (2-D stats_agg takes y first, then x)
    slope(stats_agg(response_time, request_size)) AS size_impact,
    corr(stats_agg(response_time, request_size)) AS correlation,
    determination_coeff(stats_agg(response_time, request_size)) AS r_squared
FROM performance_data;
```
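`stats_agg` summaries also compose with window frames: the toolkit's `rolling` combiner merges per-bucket summaries into a moving aggregate. A sketch, assuming a `daily_stats` view that stores one 1-D `stats_agg` per day:

```sql
-- 7-day moving average built from daily stats_agg summaries
SELECT
    day,
    average(rolling(stats) OVER (
        ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    )) AS avg_7d
FROM daily_stats;
```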
#### Timevector Data Type

An efficient intermediate representation for time-series operations.

```sql
-- Create and manipulate a timevector
CREATE VIEW cpu_series AS
SELECT host_id, timevector(timestamp, cpu_percent) AS ts
FROM system_metrics GROUP BY host_id;

-- Chain operations on the timevector
SELECT host_id, unnest(lttb(ts, 100))
FROM cpu_series;
```
### Integration Patterns

#### Continuous Aggregation Support

Most toolkit functions work seamlessly with TimescaleDB continuous aggregates:

```sql
CREATE MATERIALIZED VIEW hourly_analytics
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', timestamp) AS hour,
    service_name,
    tdigest(100, response_time) AS response_digest,
    counter_agg(timestamp, request_count) AS request_counter,
    hyperloglog(64, user_id) AS unique_users
FROM api_events
GROUP BY hour, service_name;

-- Query pre-computed aggregates
SELECT
    hour,
    approx_percentile(0.95, response_digest) AS p95_response,
    rate(request_counter) AS req_per_sec,
    distinct_count(unique_users) AS unique_users
FROM hourly_analytics
WHERE hour >= NOW() - INTERVAL '24 hours';
```
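As with any continuous aggregate, a TimescaleDB refresh policy keeps it up to date (the offsets and interval below are illustrative):

```sql
-- Refresh the recent window every 30 minutes
SELECT add_continuous_aggregate_policy('hourly_analytics',
    start_offset      => INTERVAL '3 hours',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '30 minutes');
```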
#### Two-Step Analysis Pattern

Store intermediate aggregates once, then run multiple analyses against them:

```sql
-- Step 1: Create aggregates
CREATE TABLE daily_summaries AS
SELECT
    date_trunc('day', timestamp) AS day,
    tdigest(200, response_time) AS response_digest,
    stats_agg(response_time, request_size) AS stats
FROM requests GROUP BY day;

-- Step 2: Multiple analyses from the same data
SELECT
    day,
    approx_percentile(0.50, response_digest) AS median,
    approx_percentile(0.99, response_digest) AS p99,
    average_y(stats) AS avg_response,  -- 2-D summaries use the _x/_y accessors
    slope(stats) AS size_correlation
FROM daily_summaries;
```
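Because these summaries merge with `rollup`, the same daily table answers coarser-grained questions without rescanning the raw `requests` data:

```sql
-- Re-aggregate daily summaries into monthly figures
SELECT
    date_trunc('month', day) AS month,
    approx_percentile(0.99, rollup(response_digest)) AS monthly_p99,
    average_y(rollup(stats)) AS monthly_avg_response
FROM daily_summaries
GROUP BY month;
```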
All functions in the experimental schema (`toolkit_experimental`) may change between releases. Use stable functions for production workloads that require API stability.