pg_pathman
supports Postgres versions [9.5..13], but most probably it won't be ported to 14 and later releases. Native partitioning is pretty mature now and has almost everything implemented in pg_pathman
'; we encourage users switching to it. We are still maintaining the project (fixing bugs in supported versions), but no new development is going to happen here.
The pg_pathman
module provides optimized partitioning mechanism and functions to manage partitions.
The extension is compatible with:
- PostgreSQL 9.5, 9.6, 10, 11, 12, 13;
- Postgres Pro Standard 9.5, 9.6, 10, 11, 12;
- Postgres Pro Enterprise;
Take a look at our Wiki out there.
Partitioning means splitting one large table into smaller pieces. Each row in such table is moved to a single partition according to the partitioning key. PostgreSQL <= 10 supports partitioning via table inheritance: each partition must be created as a child table with CHECK CONSTRAINT:
CREATE TABLE test (id SERIAL PRIMARY KEY, title TEXT);
CREATE TABLE test_1 (CHECK ( id >= 100 AND id < 200 )) INHERITS (test);
CREATE TABLE test_2 (CHECK ( id >= 200 AND id < 300 )) INHERITS (test);
PostgreSQL 10 provides native partitioning:
CREATE TABLE test(id int4, value text) PARTITION BY RANGE(id);
CREATE TABLE test_1 PARTITION OF test FOR VALUES FROM (1) TO (10);
CREATE TABLE test_2 PARTITION OF test FOR VALUES FROM (10) TO (20);
It's not so different from the classic approach; there are implicit check constraints, and most of its limitations are still relevant.
Despite the flexibility, this approach forces the planner to perform an exhaustive search and to check constraints on each partition to determine whether it should be present in the plan or not. Large amount of partitions may result in significant planning overhead.
The pg_pathman
module features partition managing functions and optimized planning mechanism which utilizes knowledge of the partitions' structure. It stores partitioning configuration in the pathman_config
table; each row contains a single entry for a partitioned table (relation name, partitioning column and its type). During the initialization stage the pg_pathman
module caches some information about child partitions in the shared memory, which is used later for plan construction. Before a SELECT query is executed, pg_pathman
traverses the condition tree in search of expressions like:
VARIABLE OP CONST
where VARIABLE
is a partitioning key, OP
is a comparison operator (supported operators are =, <, <=, >, >=), CONST
is a scalar value. For example:
WHERE id = 150
Based on the partitioning type and condition's operator, pg_pathman
searches for the corresponding partitions and builds the plan. Currently pg_pathman
supports two partitioning schemes:
- RANGE - maps rows to partitions using partitioning key ranges assigned to each partition. Optimization is achieved by using the binary search algorithm;
- HASH - maps rows to partitions using a generic hash function.
More interesting features are yet to come. Stay tuned!
- HASH and RANGE partitioning schemes;
- Partitioning by expression and composite key;
- Both automatic and manual partition management;
- Support for integer, floating point, date and other types, including domains;
- Effective query planning for partitioned tables (JOINs, subselects etc);
RuntimeAppend
&RuntimeMergeAppend
custom plan nodes to pick partitions at runtime;PartitionFilter
: an efficient drop-in replacement for INSERT triggers;PartitionRouter
andPartitionOverseer
for cross-partition UPDATE queries (instead of triggers);- Automatic partition creation for new INSERTed data (only for RANGE partitioning);
- Improved
COPY FROM
statement that is able to insert rows directly into partitions; - User-defined callbacks for partition creation event handling;
- Non-blocking concurrent table partitioning;
- FDW support (foreign partitions);
- Various GUC toggles and configurable settings.
- Partial support of
declarative partitioning
(from PostgreSQL 10).
To install pg_pathman
, execute this in the module's directory:
make install USE_PGXS=1
Important: Don't forget to set the
PG_CONFIG
variable (make PG_CONFIG=...
) in case you want to testpg_pathman
on a non-default or custom build of PostgreSQL. Read more here.
Modify the shared_preload_libraries
parameter in postgresql.conf
as following:
shared_preload_libraries = 'pg_pathman'
Important:
pg_pathman
may cause conflicts with some other extensions that use the same hook functions. For example,pg_pathman
usesProcessUtility_hook
to handle COPY queries for partitioned tables, which means it may interfere withpg_stat_statements
from time to time. In this case, try listing libraries in certain order:shared_preload_libraries = 'pg_stat_statements, pg_pathman'
.
It is essential to restart the PostgreSQL instance. After that, execute the following query in psql:
CREATE EXTENSION pg_pathman;
Done! Now it's time to setup your partitioning schemes.
Windows-specific: pg_pathman imports several symbols (e.g. None_Receiver, InvalidObjectAddress) from PostgreSQL, which is fine by itself, but requires that those symbols are marked as
PGDLLIMPORT
. Unfortunately, some of them are not exported from vanilla PostgreSQL, which means that you have to either use Postgres Pro Standard/Enterprise (which includes all necessary patches), or patch and build your own distribution of PostgreSQL.
In order to update pg_pathman:
- Install the latest stable release of pg_pathman.
- Restart your PostgreSQL cluster.
- Execute the following queries:
/* only required for major releases, e.g. 1.4 -> 1.5 */
ALTER EXTENSION pg_pathman UPDATE;
SET pg_pathman.enable = t;
pathman_version()
Although it's possible to get major and minor version numbers using \dx pg_pathman
, it doesn't show the actual patch number. This function returns a complete version number of the loaded pg_pathman module in MAJOR.MINOR.PATCH
format.
create_hash_partitions(parent_relid REGCLASS,
expression TEXT,
partitions_count INTEGER,
partition_data BOOLEAN DEFAULT TRUE,
partition_names TEXT[] DEFAULT NULL,
tablespaces TEXT[] DEFAULT NULL)
Performs HASH partitioning for relation
by partitioning expression expr
. The partitions_count
parameter specifies the number of partitions to create; it cannot be changed afterwards. If partition_data
is true
then all the data will be automatically copied from the parent table to partitions. Note that data migration may took a while to finish and the table will be locked until transaction commits. See partition_table_concurrently()
for a lock-free way to migrate data. Partition creation callback is invoked for each partition if set beforehand (see set_init_callback()
).
create_range_partitions(parent_relid REGCLASS,
expression TEXT,
start_value ANYELEMENT,
p_interval ANYELEMENT,
p_count INTEGER DEFAULT NULL
partition_data BOOLEAN DEFAULT TRUE)
create_range_partitions(parent_relid REGCLASS,
expression TEXT,
start_value ANYELEMENT,
p_interval INTERVAL,
p_count INTEGER DEFAULT NULL,
partition_data BOOLEAN DEFAULT TRUE)
create_range_partitions(parent_relid REGCLASS,
expression TEXT,
bounds ANYARRAY,
partition_names TEXT[] DEFAULT NULL,
tablespaces TEXT[] DEFAULT NULL,
partition_data BOOLEAN DEFAULT TRUE)
Performs RANGE partitioning for relation
by partitioning expression expr
, start_value
argument specifies initial value, p_interval
sets the default range for auto created partitions or partitions created with append_range_partition()
or prepend_range_partition()
(if NULL
then auto partition creation feature won't work), p_count
is the number of premade partitions (if not set then pg_pathman
tries to determine it based on expression's values). The bounds
array can be built using generate_range_bounds()
. Partition creation callback is invoked for each partition if set beforehand.
generate_range_bounds(p_start ANYELEMENT,
p_interval INTERVAL,
p_count INTEGER)
generate_range_bounds(p_start ANYELEMENT,
p_interval ANYELEMENT,
p_count INTEGER)
Builds bounds
array for create_range_partitions()
.
partition_table_concurrently(relation REGCLASS,
batch_size INTEGER DEFAULT 1000,
sleep_time FLOAT8 DEFAULT 1.0)
Starts a background worker to move data from parent table to partitions. The worker utilizes short transactions to copy small batches of data (up to 10K rows per transaction) and thus doesn't significantly interfere with user's activity. If the worker is unable to lock rows of a batch, it sleeps for sleep_time
seconds before the next attempt and tries again up to 60 times, and quits if it's still unable to lock the batch.
stop_concurrent_part_task(relation REGCLASS)
Stops a background worker performing a concurrent partitioning task. Note: worker will exit after it finishes relocating a current batch.
Triggers are no longer required nor for INSERTs, neither for cross-partition UPDATEs. However, user-supplied triggers are supported:
- Each inserted row results in execution of
BEFORE/AFTER INSERT
trigger functions of a corresponding partition. - Each updated row results in execution of
BEFORE/AFTER UPDATE
trigger functions of a corresponding partition. - Each moved row (cross-partition update) results in execution of
BEFORE UPDATE
+BEFORE/AFTER DELETE
+BEFORE/AFTER INSERT
trigger functions of corresponding partitions.
replace_hash_partition(old_partition REGCLASS,
new_partition REGCLASS,
lock_parent BOOLEAN DEFAULT TRUE)
Replaces specified partition of HASH-partitioned table with another table. The lock_parent
parameter will prevent any INSERT/UPDATE/ALTER TABLE queries to parent table.
split_range_partition(partition_relid REGCLASS,
split_value ANYELEMENT,
partition_name TEXT DEFAULT NULL,
tablespace TEXT DEFAULT NULL)
Split RANGE partition
in two by split_value
. Partition creation callback is invoked for a new partition if available.
merge_range_partitions(variadic partitions REGCLASS[])
Merge several adjacent RANGE partitions. Partitions are automatically ordered by increasing bounds; all the data will be accumulated in the first partition.
append_range_partition(parent_relid REGCLASS,
partition_name TEXT DEFAULT NULL,
tablespace TEXT DEFAULT NULL)
Append new RANGE partition with pathman_config.range_interval
as interval.
prepend_range_partition(parent_relid REGCLASS,
partition_name TEXT DEFAULT NULL,
tablespace TEXT DEFAULT NULL)
Prepend new RANGE partition with pathman_config.range_interval
as interval.
add_range_partition(parent_relid REGCLASS,
start_value ANYELEMENT,
end_value ANYELEMENT,
partition_name TEXT DEFAULT NULL,
tablespace TEXT DEFAULT NULL)
Create new RANGE partition for relation
with specified range bounds. If start_value
or end_value
are NULL then corresponding range bound will be infinite.
drop_range_partition(partition TEXT, delete_data BOOLEAN DEFAULT TRUE)
Drop RANGE partition and all of its data if delete_data
is true.
attach_range_partition(parent_relid REGCLASS,
partition_relid REGCLASS,
start_value ANYELEMENT,
end_value ANYELEMENT)
Attach partition to the existing RANGE-partitioned relation. The attached table must have exactly the same structure as the parent table, including the dropped columns. Partition creation callback is invoked if set (see pathman_config_params
).
detach_range_partition(partition_relid REGCLASS)
Detach partition from the existing RANGE-partitioned relation.
disable_pathman_for(parent_relid REGCLASS)
Permanently disable pg_pathman
partitioning mechanism for the specified parent table and remove the insert trigger if it exists. All partitions and data remain unchanged.
drop_partitions(parent_relid REGCLASS,
delete_data BOOLEAN DEFAULT FALSE)
Drop partitions of the parent
table (both foreign and local relations). If delete_data
is false
, the data is copied to the parent table first. Default is false
.
To remove partitioned table along with all partitions fully, use conventional
DROP TABLE relation CASCADE
. However, care should be taken in somewhat rare
case when you are running logical replication and DROP
was executed by
replication apply worker, e.g. via trigger on replicated table. pg_pathman
uses pathman_ddl_trigger
event trigger to remove the record about dropped
table from pathman_config
, and this trigger by default won't fire on replica,
leading to inconsistent state when pg_pathman
thinks that the table still
exists, but in fact it doesn't. If this is the case, configure this trigger to
fire on replica too:
ALTER EVENT TRIGGER pathman_ddl_trigger ENABLE ALWAYS;
Physical replication doesn't have this problem since DDL as well as
pathman_config
table is replicated too; master and slave PostgreSQL instances
are basically identical, and it is only harmful to keep this trigger in ALWAYS
mode.
set_interval(relation REGCLASS, value ANYELEMENT)
Update RANGE partitioned table interval. Note that interval must not be negative and it must not be trivial, i.e. its value should be greater than zero for numeric types, at least 1 microsecond for TIMESTAMP
and at least 1 day for DATE
.
set_enable_parent(relation REGCLASS, value BOOLEAN)
Include/exclude parent table into/from query plan. In original PostgreSQL planner parent table is always included into query plan even if it's empty which can lead to additional overhead. You can use disable_parent()
if you are never going to use parent table as a storage. Default value depends on the partition_data
parameter that was specified during initial partitioning in create_range_partitions()
function. If the partition_data
parameter was true
then all data have already been migrated to partitions and parent table disabled. Otherwise it is enabled.
set_auto(relation REGCLASS, value BOOLEAN)
Enable/disable auto partition propagation (only for RANGE partitioning). It is enabled by default.
set_init_callback(relation REGCLASS, callback REGPROC DEFAULT 0)
Set partition creation callback to be invoked for each attached or created partition (both HASH and RANGE). If callback is marked with SECURITY INVOKER, it's executed with the privileges of the user that produced a statement which has led to creation of a new partition (e.g. INSERT INTO partitioned_table VALUES (-5)
). The callback must have the following signature: part_init_callback(args JSONB) RETURNS VOID
. Parameter arg
consists of several fields whose presence depends on partitioning type:
/* RANGE-partitioned table abc (child abc_4) */
{
"parent": "abc",
"parent_schema": "public",
"parttype": "2",
"partition": "abc_4",
"partition_schema": "public",
"range_max": "401",
"range_min": "301"
}
/* HASH-partitioned table abc (child abc_0) */
{
"parent": "abc",
"parent_schema": "public",
"parttype": "1",
"partition": "abc_0",
"partition_schema": "public"
}
set_set_spawn_using_bgw(relation REGCLASS, value BOOLEAN)
When INSERTing new data beyond the partitioning range, use SpawnPartitionsWorker to create new partitions in a separate transaction.
CREATE TABLE IF NOT EXISTS pathman_config (
partrel REGCLASS NOT NULL PRIMARY KEY,
expr TEXT NOT NULL,
parttype INTEGER NOT NULL,
range_interval TEXT,
cooked_expr TEXT);
This table stores a list of partitioned tables.
CREATE TABLE IF NOT EXISTS pathman_config_params (
partrel REGCLASS NOT NULL PRIMARY KEY,
enable_parent BOOLEAN NOT NULL DEFAULT TRUE,
auto BOOLEAN NOT NULL DEFAULT TRUE,
init_callback TEXT DEFAULT NULL,
spawn_using_bgw BOOLEAN NOT NULL DEFAULT FALSE);
This table stores optional parameters which override standard behavior.
-- helper SRF function
CREATE OR REPLACE FUNCTION show_concurrent_part_tasks()
RETURNS TABLE (
userid REGROLE,
pid INT,
dbid OID,
relid REGCLASS,
processed INT,
status TEXT)
AS 'pg_pathman', 'show_concurrent_part_tasks_internal'
LANGUAGE C STRICT;
CREATE OR REPLACE VIEW pathman_concurrent_part_tasks
AS SELECT * FROM show_concurrent_part_tasks();