Percona Distribution for PostgreSQL 16.9
Documentation
16.9 (May 29, 2025)
Table of Contents
Home
Get started
Quickstart guide
1. Install
Via apt
Via yum
From tarballs
Run in Docker
Repositories overview
2. Connect to PostgreSQL
3. Manipulate data in PostgreSQL
4. What's next
Extensions
Extensions
Percona-authored extensions
Third-party components
Solutions
Overview
High availability
High availability
pgBackRest setup
Overview
Deploying backup and disaster recovery solution in Percona Distribution for PostgreSQL
Overview
Deployment
LDAP authentication
LDAP authentication
Upgrade
Major upgrade
Troubleshooting guide
Uninstall
Release Notes
Reference
Telemetry
Licensing
Trademark policy
Percona Distribution for PostgreSQL 16
Documentation
Percona Distribution for PostgreSQL is a suite of open source software, tools and services required to deploy
and maintain a reliable production cluster for PostgreSQL.
Percona Distribution for PostgreSQL includes the PostgreSQL server, packaged with extensions from the open
source community that are certified and tested to work together for high availability, backups, security, and
monitoring, helping ensure the cluster’s peak performance.
Part of the solution, Percona Operator for PostgreSQL, makes it easy to orchestrate the cluster reliably and
repeatably in Kubernetes.
- No guesswork on finding the right version of a component: they all undergo thorough testing to ensure compatibility
- Freely available reference architectures for solutions like high availability, backups and disaster recovery
- Monitoring of the database health, performance and infrastructure usage via open source Percona Monitoring and Management with PostgreSQL-specific dashboards
- Run PostgreSQL on Kubernetes using the open source Percona Operator for PostgreSQL. It not only automates deployment and management of PostgreSQL clusters on Kubernetes, but also includes enterprise-ready features for high availability, backup and restore, replication, logging, and more
Our comprehensive resources will help you overcome challenges, from everyday issues to specific doubts.
Learn about the releases and changes in the Distribution.
Get help from Percona
Our documentation guides are packed with information, but they can’t cover everything you need to know about
Percona Distribution for PostgreSQL. They also won’t cover every scenario you might come across. Don’t be
afraid to try things out and ask questions when you get stuck.
We recommend visiting our Community Forum. It’s an excellent place for discussions, technical insights, and
support around Percona database software. If you’re new and feeling a bit unsure, our FAQ and Guide for New
Users will ease you in.
If you have thoughts, feedback, or ideas, the community team would like to hear from you in the Any ideas on how to
make the forum better? thread. We’re always excited to connect and improve everyone’s experience.
Percona experts
Percona experts bring years of experience in tackling tough database performance issues and design
challenges.
We understand your challenges when managing complex database environments. That’s why we offer various
services to help you simplify your operations and achieve your goals.
- 24/7 Expert Support: our dedicated team of database experts is available 24/7 to assist you with any database issues. We provide flexible support plans tailored to your specific needs.
- Hands-On Database Management: our managed services team can take over the day-to-day management of your database infrastructure, freeing up your time to focus on other priorities.
- Expert Consulting: our experienced consultants provide guidance on database topics like architecture design, migration planning, performance optimization, and security best practices.
- Comprehensive Training: our training programs help your team develop skills to manage databases effectively, offering virtual and in-person courses.
We’re here to help you every step of the way. Whether you need a quick fix or a long-term partnership, we’re ready
to provide our expertise and support.
Get started
Quickstart guide
Percona Distribution for PostgreSQL is the PostgreSQL server with a collection of tools from the PostgreSQL
community that are tested to work together and serve to assist you in deploying and managing PostgreSQL.
Read more.
This document aims to guide database application developers and DevOps engineers in getting started with
Percona Distribution for PostgreSQL. Upon completion of this guide, you’ll have Percona Distribution for
PostgreSQL installed and operational, and you’ll be able to:
Understand the next steps you can take as a database application developer or administrator to expand your
knowledge of Percona Distribution for PostgreSQL
Package manager
Percona provides installation packages in DEB and RPM format for 64-bit Linux distributions. Find the full list of
supported platforms and versions on the Percona Software and Platform Lifecycle page.
If you are on Red Hat Enterprise Linux or compatible derivatives, use yum.
Docker
Get our image from Docker Hub and spin up a cluster on a Docker container for quick evaluation.
Run in Docker
Kubernetes
Percona Operator for Kubernetes is a controller introduced to simplify complex deployments that require
meticulous and secure database expertise.
Note

If installing the package (the recommended method for a safe, secure, and reliable setup) is not an option, refer
to the link below for step-by-step instructions on installing from tarballs using the provided download links.
In this scenario, you must ensure that all dependencies are met. Failure to do so may result in errors or crashes.
1. Install
Install Percona Distribution for PostgreSQL on
Debian and Ubuntu
This document describes how to install Percona Distribution for PostgreSQL from Percona repositories on DEB-based
distributions such as Debian and Ubuntu. Read more about Percona repositories.
Preconditions
1. Debian and other systems that use the apt package manager include the upstream PostgreSQL server
package postgresql-16 by default. The components of Percona Distribution for PostgreSQL 16 can only
be installed together with the PostgreSQL server shipped by Percona ( percona-postgresql-16 ). If you
wish to use Percona Distribution for PostgreSQL, uninstall the postgresql-16 package provided by your
distribution and then install the chosen components from Percona Distribution for PostgreSQL.
2. Install curl for Telemetry. We use it to better understand the use of our products and improve them.
Procedure
Run all the commands in the following sections as root or using the sudo command:
$ wget https://repo.percona.com/apt/percona-release_latest.$(lsb_release -sc)_all.deb
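The downloaded package still needs to be installed and the package index refreshed before the repository can be used. A minimal sketch using standard Debian tooling (the file name matches the wget command above):

```shell
# Install the percona-release repository package downloaded above
sudo dpkg -i percona-release_latest.$(lsb_release -sc)_all.deb
# Refresh the package index so the Percona repository becomes visible
sudo apt update
```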
Percona provides two repositories for Percona Distribution for PostgreSQL. We recommend enabling the Major
release repository to timely receive the latest updates.
$ sudo percona-release setup ppg-16
Install packages
Install using meta-package (deprecated)
The meta package enables you to install several components of the distribution in one go.
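For example, assuming the meta-package name listed later under Repository contents, the one-step installation can be sketched as:

```shell
# Install the PostgreSQL server and bundled components in one go
sudo apt install percona-ppg-server-16
```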
Install pg_repack :
Install pgAudit :
Install pgBackRest :
Install Patroni :
Install pgAudit-set_user :
Install pgBadger :
$ sudo apt install percona-pgbadger
Install wal2json :
Install HAProxy
Install pgpool2
Install pg_gather
Install pgvector
Some extensions require additional setup in order to use them with Percona Distribution for PostgreSQL.
For more information, refer to Enabling extensions.
The installation process automatically initializes and starts the default database. You can check the database
status using the following command:
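The exact check is not shown above; on systemd-based Debian and Ubuntu systems it can be sketched as:

```shell
# Verify that the default cluster initialized at install time is running
sudo systemctl status postgresql
```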
Next steps
Enable extensions
Connect to PostgreSQL
Install Percona Distribution for PostgreSQL on Red
Hat Enterprise Linux and derivatives
This document describes how to install Percona Distribution for PostgreSQL from Percona repositories on RPM-
based distributions such as Red Hat Enterprise Linux and compatible derivatives. Read more about Percona
repositories.
CentOS 7
You may need to install the percona-postgresql16-devel package when working with some extensions or
creating programs that interface with the PostgreSQL database. This package requires dependencies that are not
part of the Distribution, but can be installed from specific repositories:
RHEL8
Rocky Linux 8
Oracle Linux 8
Rocky Linux 9
Oracle Linux 9
To install Patroni on Red Hat Enterprise Linux 9 and compatible derivatives, enable the epel repository.
To install pgpool2 on Red Hat Enterprise Linux and compatible derivatives, first enable the codeready builder
repository to resolve dependency conflicts for pgpool2 .
The following are commands for Red Hat Enterprise Linux 9 and derivatives. For Red Hat Enterprise Linux 8,
replace the operating system version in the commands accordingly.
RHEL 9
Rocky Linux 9
Oracle Linux 9
For PostGIS
For Red Hat Enterprise Linux 8 and derivatives, replace the operating system version in the following commands
accordingly.
RHEL 9
Rocky Linux 9
Oracle Linux 9
RHEL UBI 9
/etc/yum.repos.d/oracle-linux-ol9.repo
[ol9_baseos_latest]
name=Oracle Linux 9 BaseOS Latest ($basearch)
baseurl=https://yum.oracle.com/repo/OracleLinux/OL9/baseos/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1
[ol9_appstream]
name=Oracle Linux 9 Application Stream ($basearch)
baseurl=https://yum.oracle.com/repo/OracleLinux/OL9/appstream/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1
[ol9_codeready_builder]
name=Oracle Linux 9 CodeReady Builder ($basearch) - Unsupported
baseurl=https://yum.oracle.com/repo/OracleLinux/OL9/codeready/builder/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1
2. Download the right GPG key for the Oracle Yum Repository:
Procedure
Run all the commands in the following sections as root or using the sudo command:
Install dependencies
Install curl for Telemetry. We use it to better understand the use of our products and improve them.
Percona provides two repositories for Percona Distribution for PostgreSQL. We recommend enabling the Major
release repository to timely receive the latest updates.
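The repository setup commands are not shown here; mirroring the apt instructions, a sketch for RPM-based systems (verify the percona-release package URL for your platform) is:

```shell
# Install the percona-release repository management tool
sudo yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm
# Enable the Major release repository for PostgreSQL 16
sudo percona-release setup ppg-16
```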
Install packages
Install using meta-package (deprecated)
The meta package enables you to install several components of the distribution in one go.
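For example, assuming the meta-package name listed later under Repository contents, the installation can be sketched as:

```shell
# Install the PostgreSQL server and bundled components in one go
sudo yum install percona-ppg-server16
```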
Install pgaudit :
Install pgBackRest :
Install Patroni :
Install pgAudit-set_user :
Install pgBadger :
$ sudo yum install percona-pgbadger
Install wal2json :
Install HAProxy
Install pg_gather
Install pgpool2
Some extensions require additional setup in order to use them with Percona Distribution for PostgreSQL.
For more information, refer to Enabling extensions.
After the installation, the default database storage is not automatically initialized. To complete the installation
and start Percona Distribution for PostgreSQL, initialize the database using the following command:
$ /usr/pgsql-16/bin/postgresql-16-setup initdb
Start the PostgreSQL service:
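The start command is not shown above; assuming the service name matches the /usr/pgsql-16 layout used by the setup command, it can be sketched as:

```shell
# Start the service now and enable it at boot
sudo systemctl enable --now postgresql-16
```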
Next steps
Enable extensions
Connect to PostgreSQL
Install Percona Distribution for PostgreSQL from
binary tarballs
You can download the tarballs using the links below.
Note
Unlike package managers, a tarball installation does not provide mechanisms to ensure that all dependencies are resolved to
the correct library versions. There is no built-in method to verify that required libraries are present or to prevent them from
being removed. As a result, unresolved or broken dependencies may lead to errors, crashes, or even data corruption.
For this reason, tarball installations are not recommended for environments where safety, security, reliability, or mission-critical
stability are required.
The following tarballs are available for the x86_64 and ARM64 architectures:
To check what OpenSSL version you have, run the following command:
$ openssl version
Tarball contents
The tarballs include the following components:

- percona-postgresql16: the latest version of the PostgreSQL server and the following extensions: pgaudit, pgAudit_set_user, pg_repack, pg_stat_monitor, pg_gather, wal2json, PostGIS, pgvector, and the set of contrib extensions
- percona-pgbadger: PostgreSQL log analyzer with fully detailed reports and graphs
- percona-pgpool-II: a middleware between the PostgreSQL server and client for high availability, connection pooling and load balancing
- percona-perl: a Perl module required to create the plperl extension, a procedural language handler for PostgreSQL that allows writing functions in the Perl programming language
- percona-python3: a Python3 module required to create the plpython extension, a procedural language handler for PostgreSQL that allows writing functions in the Python programming language. Python is also required by Patroni
- percona-tcl: Tcl development libraries required to create the pltcl extension, a loadable procedural language for the PostgreSQL database system that enables the creation of functions and trigger procedures in the Tcl language
- percona-etcd: a key-value distributed store that stores the state of the PostgreSQL cluster
Preconditions
Debian and Ubuntu
2. Ensure that libreadline is present on the system, as it is required for the tarballs to work correctly.
3. Create the user to own the PostgreSQL process. For example, mypguser . Run the following command:
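The command itself is not shown above; a common form (the user name is the example from the text) is:

```shell
# Create a system user to own the PostgreSQL process
sudo useradd -m mypguser
```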
Procedure
The steps below install the tarballs for OpenSSL 3.x on the x86_64 architecture. Use another tarball if your operating
system has OpenSSL version 1.x and/or the ARM64 architecture.
1. Create the directory where you will store the binaries. For example, /opt/pgdistro
2. Grant access to this directory for the mypguser user.
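Steps 1 and 2 can be sketched as follows (assuming the example directory and user from this guide; `useradd` creates a matching group by default):

```shell
# Create the target directory and hand it over to the PostgreSQL user
sudo mkdir -p /opt/pgdistro
sudo chown -R mypguser:mypguser /opt/pgdistro
```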
3. Fetch the binary tarball.
$ wget https://downloads.percona.com/downloads/postgresql-distribution-16/16.9/binary/tarball/percona-postgresql-16.9-ssl3-linux-x86_64.tar.gz
4. Extract the tarball to the directory for binaries that you created in step 1.
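The extraction command is not shown above; a sketch for the example directory follows. Whether `--strip-components` is appropriate depends on the archive's internal layout, so this is an assumption — inspect the tarball with `tar -tzf` first:

```shell
# Unpack the downloaded tarball into /opt/pgdistro,
# dropping the archive's top-level directory
sudo tar -xzf percona-postgresql-16.9-ssl3-linux-x86_64.tar.gz \
  -C /opt/pgdistro --strip-components=1
```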
5. If you extracted the tarball in a directory other than /opt , copy percona-python3 , percona-tcl and
percona-perl to the /opt directory. This is required for the libraries that depend on those modules to run
correctly.
6. Add the location of the components’ binaries to the PATH variable:

$ export PATH=/opt/pgdistro/percona-haproxy/sbin/:/opt/pgdistro/percona-patroni/bin/:/opt/pgdistro/percona-pgbackrest/bin/:/opt/pgdistro/percona-pgbadger/:/opt/pgdistro/percona-pgbouncer/bin/:/opt/pgdistro/percona-pgpool-II/bin/:/opt/pgdistro/percona-postgresql16/bin/:/opt/pgdistro/percona-etcd/bin/:/opt/percona-perl/bin/:/opt/percona-tcl/bin/:/opt/percona-python3/bin/:$PATH
7. Create the data directory for PostgreSQL server. For example, /usr/local/pgsql/data .
8. Grant access to this directory for the mypguser user.
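Steps 7 and 8 can be sketched as follows (example paths and user name from this guide):

```shell
# Create the data directory and grant it to the PostgreSQL user
sudo mkdir -p /usr/local/pgsql/data
sudo chown -R mypguser:mypguser /usr/local/pgsql/data
```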
9. Switch to the user that owns the Postgres process. In our example, mypguser :
$ su - mypguser
$ /opt/pgdistro/percona-postgresql16/bin/initdb -D /usr/local/pgsql/data
Sample output
11. Start the PostgreSQL server:
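The start command is not shown above; using the pg_ctl utility shipped in the tarball (the log file location is an example):

```shell
# Start the server against the data directory initialized earlier,
# logging to a file in the user's home directory
/opt/pgdistro/percona-postgresql16/bin/pg_ctl \
  -D /usr/local/pgsql/data -l ~/pg.log start
```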
Sample output
$ /opt/pgdistro/percona-postgresql16/bin/psql -d postgres
Sample output
psql (16.9)
Type "help" for help.
postgres=#
After you unpacked the tarball and added the location of the components’ binaries to the $PATH variable, the
components are available for use. You can invoke a component by running its command-line tool.
$ haproxy -v
Some components require additional setup. Check the Enabling extensions page for details.
Run Percona Distribution for PostgreSQL in a
Docker container
Docker images of Percona Distribution for PostgreSQL are hosted publicly on Docker Hub.
For more information about using Docker, see the Docker Docs.
Make sure that you are using the latest version of Docker. The ones provided via apt and yum may be outdated and cause
errors.
By default, Docker pulls the image from Docker Hub if it is not available locally.
Docker image contents
The Docker image of Percona Distribution for PostgreSQL includes the following components:
- percona-pgaudit16: provides detailed session or object audit logging via the standard PostgreSQL logging facility.
- percona-pgaudit16_set_user: provides an additional layer of logging and control when unprivileged users must escalate themselves to superuser or object owner roles in order to perform needed maintenance tasks.
$ docker run --name container-name -e POSTGRES_PASSWORD=secret -d percona/percona-distribution-postgresql:16.9
Where:
16.9 is the tag specifying the version you need. Docker identifies the architecture (x86_64 or ARM64)
and pulls the respective image. See the full list of tags.
Tip
You can secure the password by exporting it to the environment file and using that to start the container.
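A sketch of that approach, using Docker's standard `--env-file` option (file name is an example):

```shell
# Keep the password out of the command line: store it in an env file...
echo "POSTGRES_PASSWORD=secret" > postgres.env
chmod 600 postgres.env
# ...and pass the file to the container instead of a literal -e value
docker run --name container-name --env-file postgres.env -d \
  percona/percona-distribution-postgresql:16.9
```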
The container-name is the name of the container that you started in the previous step.
where:
container name is the name of your Percona Distribution for PostgreSQL container, and
app-that-uses-postgresql is the name of your PostgreSQL client.
Where:
16.9 is the tag specifying the version you need. Docker identifies the architecture (x86_64 or ARM64) and
pulls the respective image.
address is the network address where your database container is running. Use 127.0.0.1 if the database
container is running on the local machine/host.
Enable pg_stat_monitor
To enable the pg_stat_monitor extension after launching the container, do the following:
select the desired database and enable the pg_stat_monitor view for that database:
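The enabling command is not shown above; pg_stat_monitor is enabled per database with CREATE EXTENSION, which can be sketched as follows (the database name is a placeholder):

```shell
# Connect to the target database and create the extension there
psql -U postgres -d <database> -c "CREATE EXTENSION pg_stat_monitor;"
```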
\d pg_stat_monitor;
Output
View "public.pg_stat_monitor"
Column | Type | Collation | Nullable | Default
---------------------+--------------------------+-----------+----------+---------
bucket | integer | | |
bucket_start_time | timestamp with time zone | | |
userid | oid | | |
dbid | oid | | |
queryid | text | | |
query | text | | |
plan_calls | bigint | | |
plan_total_time | numeric | | |
plan_min_time | numeric | | |
plan_max_time | numeric | | |
plan_mean_time | numeric | | |
plan_stddev_time | numeric | | |
plan_rows | bigint | | |
calls | bigint | | |
total_time | numeric | | |
min_time | numeric | | |
max_time | numeric | | |
mean_time | numeric | | |
stddev_time | numeric | | |
rows | bigint | | |
shared_blks_hit | bigint | | |
shared_blks_read | bigint | | |
shared_blks_dirtied | bigint | | |
shared_blks_written | bigint | | |
local_blks_hit | bigint | | |
local_blks_read | bigint | | |
local_blks_dirtied | bigint | | |
local_blks_written | bigint | | |
temp_blks_read | bigint | | |
temp_blks_written | bigint | | |
blk_read_time | double precision | | |
blk_write_time | double precision | | |
host | bigint | | |
client_ip | inet | | |
resp_calls | text[] | | |
cpu_user_time | double precision | | |
cpu_sys_time | double precision | | |
tables_names | text[] | | |
wait_event | text | | |
wait_event_type | text | | |
Note that the pg_stat_monitor view is available only for the databases where you enabled it. If you create a
new database, make sure to create the view for it to see its statistics data.
Enable Percona Distribution for PostgreSQL
components
Some components require additional configuration before using them with Percona Distribution for PostgreSQL.
This section provides configuration instructions per component.
Patroni
Patroni is the high availability solution for PostgreSQL. The High Availability in PostgreSQL with Patroni chapter
provides the solution overview and the architecture deployment details.
While setting up a high availability PostgreSQL cluster with Patroni, you will also need supporting components
such as etcd and HAProxy.
If you install the software from packages, all required dependencies and service unit files are included. If you
install the software from the tarballs, you must first enable etcd . See the steps in the etcd section in this
document.
See the configuration guidelines for Debian and Ubuntu and RHEL and CentOS.
etcd
If you installed etcd from binary tarballs, you need to create the etcd.service file. This file allows systemd to
start, stop, restart, and manage the etcd service. This includes handling dependencies, monitoring the service,
and ensuring it runs as expected.
/etc/systemd/system/etcd.service
[Unit]
After=network.target
Description=etcd - highly-available key value store
[Service]
LimitNOFILE=65536
Restart=on-failure
Type=notify
ExecStart=/usr/bin/etcd --config-file /etc/etcd/etcd.conf.yaml
User=etcd
[Install]
WantedBy=multi-user.target
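After creating the unit file, systemd must reload its configuration before it can manage the service. A sketch using standard systemd commands:

```shell
# Make systemd pick up the new unit file,
# then start etcd now and enable it at boot
sudo systemctl daemon-reload
sudo systemctl enable --now etcd
```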
pgBadger
Enable the following options in postgresql.conf configuration file before starting the service:
log_min_duration_statement = 0
log_line_prefix = '%t [%p]: '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
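With these options enabled and some traffic logged, a report can then be generated. A sketch (the log path is an example; it varies by platform and configuration):

```shell
# Generate an HTML report from the PostgreSQL log
pgbadger /var/log/postgresql/postgresql-16-main.log -o report.html
```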
pgaudit
Add pgaudit to shared_preload_libraries in postgresql.conf . The recommended way is to use the
ALTER SYSTEM command. Connect to psql and use the following command:
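The command itself is not shown above; based on the surrounding text it is presumably of this form. Note that ALTER SYSTEM SET replaces the whole list, so include any previously configured libraries, and restart the server for the change to take effect:

```shell
psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pgaudit';"
# Restart for shared_preload_libraries to take effect
# (service name varies by platform)
sudo systemctl restart postgresql
```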
To configure pgaudit , you must have the privileges of a superuser. You can specify the settings in one of these
ways:
at the role level (using ALTER ROLE … SET). Note that settings are not inherited through normal role
inheritance and SET ROLE will not alter a user’s pgAudit settings. This is a limitation of the roles system and
not inherent to pgAudit.
Refer to the pgaudit documentation for details about available settings.
To enable pgaudit , connect to psql and run the CREATE EXTENSION command:
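The CREATE EXTENSION command can be sketched as follows (the database name is a placeholder):

```shell
# Create the extension in the database where auditing is needed
psql -d <database> -c "CREATE EXTENSION pgaudit;"
```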
pgaudit set-user
Add set_user to shared_preload_libraries in postgresql.conf . The recommended way is to use the
ALTER SYSTEM command. Connect to psql and use the following command:
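The command itself is not shown above; it is presumably of this form. ALTER SYSTEM SET replaces the whole list, so include any previously configured libraries (such as pgaudit), and restart the server afterwards:

```shell
psql -c "ALTER SYSTEM SET shared_preload_libraries = 'set_user';"
# Restart for the change to take effect (service name varies by platform)
sudo systemctl restart postgresql
```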
psql <database>
CREATE EXTENSION set_user;
You can fine-tune user behavior with the custom parameters supplied with the extension.
pgbouncer
pgbouncer requires the pgbouncer.ini configuration file to start. The default path is
/etc/pgbouncer/pgbouncer.ini . When installing pgbouncer from a tarball, the path is
percona-pgbouncer/etc/pgbouncer.ini .
Find detailed information about configuration file options in the pgbouncer documentation .
pgpool2
pgpool-II requires the configuration file to start. When you install pgpool from a package, the configuration file
is automatically created for you at the path /etc/pgpool2/pgpool.conf on Debian and Ubuntu and
/etc/pgpool-II/pgpool.conf on RHEL and derivatives.
When you install pgpool from tarballs, you can use the sample configuration file <tarballsdir>/percona-
pgpool-II/etc/pgpool2/pgpool.conf.sample :
$ cp <tarballsdir>/percona-pgpool-II/etc/pgpool2/pgpool.conf.sample <config-file-path>/pgpool.conf
$ pgpool -f <config-file-path>/pgpool.conf
pg_stat_monitor
Please refer to pg_stat_monitor for setup steps.
wal2json
After the installation, enable the following option in postgresql.conf configuration file before starting the
service:
wal_level = logical
pgvector
To get started, enable the extension for the database where you want to use it:
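The command is not shown above; pgvector's extension is named vector, so enabling it can be sketched as follows (the database name is a placeholder):

```shell
psql -d <database> -c "CREATE EXTENSION vector;"
```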
Next steps
Connect to PostgreSQL
Repositories overview
Percona provides two repositories for Percona Distribution for PostgreSQL.
Major Release repository ( ppg-16 ): includes the latest version packages. Whenever a package is updated, the
package manager of your operating system detects that and prompts you to update. As long as you update all
Distribution packages at the same time, you can ensure that the packages you’re using have been tested and
verified by Percona. We recommend installing Percona Distribution for PostgreSQL from the Major Release
repository.

Minor Release repository (e.g. ppg-16.0 ): includes a particular minor release of the database and all of the
packages that were tested and verified to work with that minor release. You may choose to install Percona
Distribution for PostgreSQL from the Minor Release repository if you have decided to standardize on a particular
release which has passed rigorous testing procedures and which has been verified to work with your
applications. This allows you to deploy to a new host and ensure that you’ll be using the same version of all the
Distribution packages, even if newer releases exist in other repositories.

The disadvantage of using a Minor Release repository is that you are locked in to this particular release. When
potentially critical fixes are released in a later minor version of the database, you will not be prompted for an
upgrade by the package manager of your operating system. You would need to change the configured repository
in order to install the upgrade.
Repository contents
Percona Distribution for PostgreSQL provides individual packages for its components. It also includes two meta-
packages: percona-ppg-server and percona-ppg-server-ha .
Using a meta-package, you can install all components it contains in one go.
Note
percona-ppg-server
percona-ppg-server-16
percona-ppg-server16
The percona-ppg-server meta-package installs the PostgreSQL server with the following packages:
- percona-postgresql-common: PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time.
- percona-pgaudit16: provides detailed session or object audit logging via the standard PostgreSQL logging facility.
percona-ppg-server-ha
percona-ppg-server-ha-16
percona-ppg-server-16
Package contents Description
Connect to the PostgreSQL server
With PostgreSQL server up and running, let’s connect to it.
By default, the postgres user and the postgres database are created in PostgreSQL upon its installation and
initialization. This allows you to connect to the database as the postgres user.
$ sudo su postgres
$ psql
Hint: You can connect to psql as the postgres user in one go:
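The one-step command is not shown above; a common form is:

```shell
# Switch to the postgres user and open psql in a single command
sudo -u postgres psql
```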
1. List databases:

\l

2. List tables in the current database:

\dt

3. Describe a table:

\d <table_name>

4. Switch databases:

\c <database_name>
5. List users and roles:

\du

6. Quit psql:

\q
Congratulations! You have connected to PostgreSQL and learned some essential psql commands.
Next steps
Manipulate data in PostgreSQL
Manipulate data in PostgreSQL
In the previous step, you connected to PostgreSQL as the superuser postgres . Now, let’s insert some
sample data and operate with it in PostgreSQL.
Create a database
Let’s create the database test . Use the CREATE DATABASE command:
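The command is not shown above; run it as the postgres user, for example:

```shell
psql -c "CREATE DATABASE test;"
```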
Create a table
Let’s create a sample table Customers in the test database using the following command:
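The original command block is missing here; a table definition consistent with the UPDATE and DELETE examples later in this section (column names and sample rows are inferred from those examples, and the unquoted identifier folds to lowercase customers) could look like:

```shell
psql -d test <<'SQL'
-- Columns inferred from the UPDATE/DELETE examples below
CREATE TABLE customers (
    id         serial PRIMARY KEY,
    first_name text,
    last_name  text,
    email      text
);
-- Sample rows referenced by the later examples
INSERT INTO customers (first_name, last_name, email) VALUES
    ('John',  'Doe',   'john.doe@example.com'),
    ('Alice', 'Smith', 'alice.smith@example.com');
SQL
```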
Hint: Having issues with table creation? Check our Troubleshooting guide
Query data
Let’s verify the data insertion by querying it:
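The query is not shown above; a straightforward check is:

```shell
psql -d test -c "SELECT * FROM customers;"
```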
Expected output
Update data
Let’s update John Doe’s record with a new email address.
UPDATE customers
SET email = 'john.doe@myemail.com'
WHERE first_name = 'John' AND last_name = 'Doe';
Expected output
Delete data
Use the DELETE command to delete rows. For example, delete the record of Alice Smith:
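The command is not shown above; matching the record created earlier, it can be sketched as:

```shell
psql -d test -c "DELETE FROM customers WHERE first_name = 'Alice' AND last_name = 'Smith';"
```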
If you wish to delete the whole table, use the DROP TABLE command instead as follows:
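The command is not shown above; it can be sketched as:

```shell
psql -d test -c "DROP TABLE customers;"
```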
To delete the whole database, use the DROP DATABASE command:

DROP DATABASE test;
Congratulations! You have used basic create, read, update and delete (CRUD) operations to manipulate data in
Percona Distribution for PostgreSQL. To deepen your knowledge, see the data manipulation section in
PostgreSQL documentation.
Next steps
What’s next?
What’s next?
You’ve just had your first hands-on experience with PostgreSQL! That’s a great start.
To become more confident and proficient in developing database applications, let’s expand your knowledge and
skills in using PostgreSQL. Dive deeper into these key topics to solidify your PostgreSQL skills:
SQL Syntax
Data definition
Queries
Indexes
Also, check out our solutions to help you meet the requirements of your organization.
Solutions
Extensions
Percona Distribution for PostgreSQL is not only the PostgreSQL server. It also includes extensions: add-ons
that enhance the functionality of the PostgreSQL database. By installing these extensions, you can modify and
extend your database server with new features, functions, and data types.
Percona Distribution for PostgreSQL includes the extensions that have been tested to work together. These
extensions encompass the following:
Third-party components
Percona also supports extra modules, not included in Percona Distribution for PostgreSQL but tested to work
with it.
Install an extension
To use an extension, install it. Run the CREATE EXTENSION command on the PostgreSQL node where you want
the extension to be available.
The user should be a superuser or have the CREATE privilege on the current database to be able to run the
CREATE EXTENSION command. Some extensions may require additional privileges depending on their
functionality. To learn more, check the documentation for the desired extension.
PostgreSQL contrib modules and utilities
Find the list of contrib modules and extensions included in Percona Distribution for PostgreSQL.
adminpack Required Support toolpack for pgAdmin to provide additional functionality like remote
management of server log files.
amcheck Required Provides functions to verify the logical consistency of the structure of indexes,
such as B-trees. It’s useful for detecting system catalog corruption and index
corruption.
auth_delay Required Causes the server to pause briefly before reporting authentication failure, to
make brute-force attacks on database passwords more difficult.
auto_explain Required Automatically logs execution plans of slow SQL statements. It helps in
performance analysis by tracking down un-optimized queries in large
applications that exceed a specified time threshold.
basebackup_to_shell Adds a custom basebackup target called shell . This enables an administrator
to make a base backup of a running PostgreSQL server to a shell archive.
basic_archive Required An archive module that copies completed WAL segment files to the specified
directory. Can be used as a starting point for developing your own archive module.
btree_gin Required Provides GIN index operator classes with B-tree-like behavior. This allows you to
use GIN indexes, which are typically used for full-text search, in situations where
you might otherwise use a B-tree index, such as with integer or text data.
btree_gist Required Provides GiST (Generalized Search Tree) index operator classes that implement
B-tree-like behavior. This allows you to use GiST indexes, which are typically
used for multidimensional and non-scalar data, in situations where you might
otherwise use a B-tree index, such as with integer or text data.
Name | Database superuser | Description
dblink Required Provides functions to connect to other PostgreSQL databases from within a
database session. This allows for queries to be run across multiple databases
as if they were on the same server.
dict_int An example of an add-on dictionary template for full-text search. It’s used to
demonstrate how to create custom dictionaries in PostgreSQL.
dict_xsyn Required Example synonym full-text search dictionary. This dictionary type replaces
words with groups of their synonyms, and so makes it possible to search for a
word using any of its synonyms.
earthdistance Required This module provides two different approaches to calculating great circle
distances on the surface of the Earth. The first one depends on the cube
module. The second one is based on the built-in point data type, using
longitude and latitude for the coordinates.
hstore Implements the hstore data type for storing sets of key/value pairs within a
single PostgreSQL value.
intarray Provides a number of useful functions and operators for manipulating null-free
arrays of integers.
isn Provides data types for the following international product numbering
standards: EAN13, UPC, ISBN (books), ISMN (music), and ISSN (serials).
lo Provides support for managing Large Objects (also called LOs or BLOBs). This
includes a data type lo and a trigger lo_manage.
ltree Implements a data type ltree for representing labels of data stored in a
hierarchical tree-like structure. Extensive facilities for searching through label
trees are provided.
oldsnapshot Required Allows inspection of the server state that is used to implement
old_snapshot_threshold.
pageinspect Required Provides functions that allow you to inspect the contents of database pages at
a low level, which is useful for debugging purposes.
passwordcheck Checks users’ passwords whenever they are set with CREATE ROLE or ALTER
ROLE. If a password is considered too weak, it will be rejected and the
command will terminate with an error.
pg_buffercache Required Provides a set of functions for examining what’s happening in the shared
buffer cache in real time.
pg_freespacemap Required Provides a means of examining the free space map (FSM), which PostgreSQL
uses to track the locations of available space in tables and indexes. This can be
useful for understanding space utilization and planning for maintenance
operations.
pg_prewarm Provides a convenient way to load relation data into either the operating system
buffer cache or the PostgreSQL buffer cache. This can be useful for reducing
the time needed for a newly started database to reach its full performance
potential by preloading frequently accessed data.
pgrowlocks Required Provides a function to show row locking information for a specified table.
pg_stat_statements Required A module for tracking planning and execution statistics of all SQL statements
executed by a server. Consider using pg_stat_monitor, an advanced version of
pg_stat_statements.
pgstattuple Required Provides various functions to obtain tuple-level statistics. It offers detailed
information about tables and indexes, such as the amount of free space and the
number of live and dead tuples.
pg_surgery Required Provides various functions to perform surgery on a damaged relation. These
functions are unsafe by design and using them may corrupt (or further corrupt)
your database. Use them with caution and only as a last resort.
pg_trgm Provides functions and operators for determining the similarity of alphanumeric
text based on trigram matching. A trigram is a contiguous sequence of three
characters. The extension can be used for text search and pattern matching
operations.
pg_visibility Required Provides a way to examine the visibility map (VM) and the page-level visibility
information of a table. It also provides functions to check the integrity of a
visibility map and to force it to be rebuilt.
pg_walinspect Required Provides SQL functions that allow you to inspect the contents of the write-ahead
log of a running PostgreSQL database cluster at a low level, which is useful for
debugging, analytical, reporting or educational purposes.
postgres_fdw Required Provides a Foreign Data Wrapper (FDW) for accessing data in remote
PostgreSQL servers. It allows a PostgreSQL database to interact with remote
tables as if they were local.
seg Implements a data type seg for representing line segments, or floating point
intervals. seg can represent uncertainty in the interval endpoints, making it
especially useful for representing laboratory measurements.
sepgsql An SELinux-based, label-based mandatory access control (MAC) security module.
It can only be used on Linux 2.6.28 or higher with SELinux enabled.
spi Required Provides several workable examples of using the Server Programming Interface
(SPI) and triggers.
sslinfo Required Provides information about the SSL certificate that the current client provided
when connecting to PostgreSQL.
tablefunc Includes various functions that return tables (that is, multiple rows). These
functions are useful both in their own right and as examples of how to write C
functions that return multiple rows.
tcn Provides a trigger function that notifies listeners of changes to any table to
which it is attached.
tsm_system_rows Provides the table sampling method SYSTEM_ROWS, which can be used in the
TABLESAMPLE clause of a SELECT command.
tsm_system_time Provides the table sampling method SYSTEM_TIME, which can be used in the
TABLESAMPLE clause of a SELECT command.
unaccent A text search dictionary that removes accents (diacritic signs) from lexemes.
It’s a filtering dictionary, which means its output is always passed to the next
dictionary (if any). This allows accent-insensitive processing for full text search.
uuid-ossp Required Provides functions to generate universally unique identifiers (UUIDs) using one
of several standard algorithms.
Percona-authored extensions
pg_stat_monitor pg_tde
Third-party components
Percona Distribution for PostgreSQL is supplied with a set of third-party open source components and tools
that provide additional functionality, such as high availability or disaster recovery, without the need to modify
the PostgreSQL core code. These components are included in the Percona Distribution for PostgreSQL repository
and are tested to work together.
etcd Required A distributed, reliable key-value store for setting up highly available Patroni clusters
pgAudit Required Provides detailed session or object audit logging via the standard PostgreSQL
logging facility
pgAudit set_user Required The set_user part of the pgAudit extension provides an additional layer of
logging and control when unprivileged users must escalate themselves to
superuser or object owner roles in order to perform needed maintenance tasks
pg_gather Required An SQL script to assess the health of a PostgreSQL cluster by gathering
performance and configuration data from PostgreSQL databases
pgpool2 Required A middleware between PostgreSQL server and client for high availability,
connection pooling and load balancing
pg_stat_monitor Required Collects and aggregates statistics for PostgreSQL and provides histogram
information
pgvector Required An extension that enables you to use PostgreSQL as a vector database
Solutions
Percona Distribution for PostgreSQL solutions
Find the right solution to help you achieve your organization’s goals.
Check out how you can ensure continuous access to your database.
Dealing with spatial data? Learn how you can store and manipulate it.
High availability
High Availability in PostgreSQL with Patroni
PostgreSQL has been widely adopted as a modern, high-performance transactional database. A highly available
PostgreSQL cluster can withstand failures caused by network outages, resource saturation, hardware failures,
operating system crashes or unexpected reboots. Such a cluster is often a critical component of the enterprise
application landscape, where four nines of availability is a minimum requirement.
There are several methods to achieve high availability in PostgreSQL. This solution document covers Patroni,
an open-source tool that facilitates and manages the deployment of high availability in PostgreSQL.
There are several native methods for achieving high availability with PostgreSQL:
trigger-based replication,
statement-based replication,
logical replication,
streaming replication
Streaming replication
Streaming replication is part of Write-Ahead Log shipping, where changes to the WALs are immediately made available to
standby replicas. With this approach, a standby instance is always up-to-date with changes from the primary node and can
assume the role of primary in case of a failover.
Although the native streaming replication in PostgreSQL supports failing over to a standby node, it lacks some key features
expected from a truly highly available solution. These include:
No automated way to bring back the failed primary node to the cluster
To address these shortcomings, there are a multitude of third-party, open-source extensions for PostgreSQL. The challenge for
a database administrator here is to select the right utility for the current scenario.
Percona Distribution for PostgreSQL solves this challenge by providing Patroni for achieving PostgreSQL high
availability.
Patroni
Patroni is an open-source tool that helps you deploy, manage, and monitor highly available
PostgreSQL clusters using physical streaming replication. Patroni relies on a distributed configuration store like
ZooKeeper, etcd, Consul or Kubernetes to store the cluster configuration.
etcd
As stated before, Patroni uses a distributed configuration store to store the cluster configuration, health and
status. The most popular implementation of the distributed configuration store is etcd due to its simplicity,
consistency and reliability. etcd not only stores the cluster data but also handles the election of a new primary
node (called a leader in etcd terminology).
etcd is deployed as a cluster for fault-tolerance. An etcd cluster needs a majority of nodes, a quorum, to agree on
updates to the cluster state.
The recommended approach is to deploy an odd-sized cluster (e.g. 3, 5 or 7 nodes). The odd number of nodes
ensures that there is always a majority of nodes available to make decisions and keep the cluster running
smoothly. This majority is crucial for maintaining consistency and availability, even if one node fails. For a cluster
with n members, the majority is (n/2)+1, where the division rounds down.
To better illustrate this concept, let’s take the example of clusters with 3 nodes and 4 nodes.
In a 3-node cluster, the majority is 2 nodes. If one node fails, the remaining 2 nodes still form a majority and the
cluster continues to operate.
In a 4-node cluster, the majority is 3 nodes. If one node fails, the cluster still operates; but if two nodes fail, the
remaining 2 nodes are no longer a majority and the cluster stops functioning. A 4-node cluster therefore tolerates
only one failure, the same as a 3-node cluster, while requiring an extra node.
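The majority formula above can be checked with a tiny shell helper (a toy calculation for illustration, not part of the deployment):

```shell
# Quorum (majority) size for an n-member etcd cluster: floor(n/2) + 1.
# POSIX shell arithmetic truncates, so $(( n / 2 )) is already the floor.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 3   # prints 2: two of three nodes must agree
quorum 4   # prints 3: three of four nodes must agree
quorum 5   # prints 3: three of five nodes must agree
```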
In this solution we use a 3-node etcd cluster that resides on the same hosts as PostgreSQL and Patroni.
See also
Patroni documentation
Percona Blog:
Architecture layout
The following diagram shows the architecture of a three-node PostgreSQL cluster with a single-leader node.
Components
PostgreSQL nodes
etcd - a Distributed Configuration store that stores the state of the PostgreSQL cluster.
HAProxy - the load balancer for the cluster and is the single point of entry to client applications.
Percona Monitoring and Management (PMM) - the solution to monitor the health of your cluster
Each PostgreSQL instance in the cluster maintains consistency with other members through streaming
replication. Each instance hosts Patroni - a cluster manager that monitors the cluster health. Patroni relies on
the operational etcd cluster to store the cluster configuration and sensitive data about the cluster health there.
Patroni periodically sends heartbeat requests with the cluster status to etcd. etcd writes this information to disk
and sends the response back to Patroni. If the current primary fails to renew its status as leader within the
specified timeout, Patroni updates the state change in etcd, which uses this information to elect the new primary
and keep the cluster up and running.
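The timeout-based leader renewal described above can be sketched as a simple timestamp check (an illustration only; Patroni's real implementation is more involved):

```shell
# Toy leader-lease check: the lease is lost when more than TTL seconds
# have passed since the leader last renewed its key in etcd.
# Arguments: current time (s), last renewal time (s), TTL (s).
lease_expired() {
  if [ $(( $1 - $2 )) -gt "$3" ]; then
    echo "expired: elect a new leader"
  else
    echo "valid: leader keeps its role"
  fi
}

lease_expired 140 100 30   # 40s since renewal > 30s TTL
lease_expired 110 100 30   # 10s since renewal, lease still valid
```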
The connections to the cluster do not happen directly to the database nodes but are routed via a connection
proxy like HAProxy. This proxy determines the active node by querying the Patroni REST API.
Next steps
Deploy on Debian or Ubuntu Deploy on RHEL or derivatives
Deploying PostgreSQL for high availability with
Patroni on Debian or Ubuntu
This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni on Debian or
Ubuntu.
Preconditions
1. This is an example deployment where etcd runs on the same host machines as Patroni and PostgreSQL,
and there is a single dedicated HAProxy host. Alternatively, etcd can run on a different set of nodes.
If etcd is deployed on the same host machine as Patroni and PostgreSQL, a separate disk system for etcd and
PostgreSQL is recommended for performance reasons.
2. For this setup, we use nodes running Ubuntu 22.04 as the base operating system:
Note
We recommend not exposing the hosts where Patroni, etcd, and PostgreSQL are running to public networks due to
security risks. Use firewalls, virtual networks, subnets or the like to protect the database hosts from any kind of attack.
Initial setup
Configure every node.
It’s not necessary to have name resolution, but it makes the whole setup more readable and less error prone.
Here, instead of configuring a DNS, we use a local name resolution by updating the file /etc/hosts . By
resolving their hostnames to their IP addresses, we make the nodes aware of each other’s names and allow their
seamless communication.
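For reference, the node names and addresses used throughout this guide map as follows; a hypothetical /etc/hosts fragment (adjust the IP addresses to your environment) might look like this:

```
# /etc/hosts additions (example addresses from this guide)
10.104.0.1 node1
10.104.0.2 node2
10.104.0.3 node3
```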
node1
2. Modify the /etc/hosts file to include the hostnames and IP addresses of the remaining nodes. Add the
following at the end of the /etc/hosts file on all nodes:
node2
2. Modify the /etc/hosts file to include the hostnames and IP addresses of the remaining nodes. Add the
following at the end of the /etc/hosts file on all nodes:
node3
2. Modify the /etc/hosts file to include the hostnames and IP addresses of the remaining nodes. Add the
following at the end of the /etc/hosts file on all nodes:
HAproxy-demo
2. Modify the /etc/hosts file. The HAProxy instance should have the name resolution for all the three nodes
in its /etc/hosts file. Add the following lines at the end of the file:
$ curl -O https://repo.percona.com/apt/percona-release_latest.generic_all.deb
c. Install the downloaded repository package and its dependencies using apt:
$ sudo percona-release setup ppg16
2. Install some Python and auxiliary packages to help with Patroni and etcd:
5. Even though Patroni can use an existing Postgres installation, remove the data directory to force it to
initialize a new Postgres cluster instance.
Note
If you installed the software from tarballs, you must first enable etcd before configuring it.
To get started with an etcd cluster, you need to bootstrap it. This means setting up the initial configuration and
starting the etcd nodes so they can form a cluster. There are the following bootstrapping mechanisms:
Static - used when the IP addresses of the cluster nodes are known
Discovery service - for cases when the IP addresses of the cluster are not known ahead of time.
Since we know the IP addresses of the nodes, we will use the static method. For using the discovery service,
please refer to the etcd documentation.
We will configure and start all etcd nodes in parallel. This can be done either by modifying each node’s
configuration or by using command line options. Use whichever method you prefer.
1. Create the etcd configuration file on every node. You can edit the sample configuration file
/etc/etcd/etcd.conf.yaml or create your own one. Replace the node names and IP addresses with the
actual names and IP addresses of your nodes.
node1
/etc/etcd/etcd.conf.yaml
name: 'node1'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.1:2380
listen-peer-urls: http://10.104.0.1:2380
advertise-client-urls: http://10.104.0.1:2379
listen-client-urls: http://10.104.0.1:2379
node2
/etc/etcd/etcd.conf.yaml
name: 'node2'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.2:2380
listen-peer-urls: http://10.104.0.2:2380
advertise-client-urls: http://10.104.0.2:2379
listen-client-urls: http://10.104.0.2:2379
node3
/etc/etcd/etcd.conf.yaml
name: 'node3'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.3:2380
listen-peer-urls: http://10.104.0.3:2380
advertise-client-urls: http://10.104.0.3:2379
listen-client-urls: http://10.104.0.3:2379
During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes
are not yet running, the start may fail with a quorum timeout. This is expected behavior. Try starting all nodes
again at the same time for the etcd cluster to be created.
3. Check the etcd cluster members. Use etcdctl for this purpose. Ensure that etcdctl interacts with etcd
using API version 3 and knows which nodes, or endpoints, to communicate with. For this, we will define the
required information as environment variables. Run the following commands on one of the nodes:
export ETCDCTL_API=3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
4. Now, list the cluster members and output the result as a table (for example, by running etcdctl member list --endpoints=$ENDPOINTS -w table):
Sample output
+------------------+---------+-------+------------------------+----------------------------+------------+
|        ID        | STATUS  | NAME  |       PEER ADDRS       |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+------------------------+----------------------------+------------+
| 4788684035f976d3 | started | node2 | http://10.104.0.2:2380 | http://192.168.56.102:2379 |      false |
| 67684e355c833ffa | started | node3 | http://10.104.0.3:2380 | http://192.168.56.103:2379 |      false |
| 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://192.168.56.101:2379 |      false |
+------------------+---------+-------+------------------------+----------------------------+------------+
5. To check which node is currently the leader, use the endpoint status command (for example, etcdctl endpoint status --endpoints=$ENDPOINTS -w table):
Sample output
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 10.104.0.1:2379 | 9d2e318af9306c67 |  3.5.16 |   20 kB |      true |      false |         2 |         10 |                 10 |        |
| 10.104.0.2:2379 | 4788684035f976d3 |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
| 10.104.0.3:2379 | 67684e355c833ffa |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
1. On each etcd node, set the environment variables for the cluster members, the cluster token and state:
TOKEN=PostgreSQL_HA_Cluster_1
CLUSTER_STATE=new
NAME_1=node1
NAME_2=node2
NAME_3=node3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
node1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
  --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
  --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
  --initial-cluster ${CLUSTER} \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
node2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
  --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
  --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
  --initial-cluster ${CLUSTER} \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
node3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
  --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
  --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
  --initial-cluster ${CLUSTER} \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
3. Check the etcd cluster members. Use etcdctl for this purpose. Ensure that etcdctl interacts with etcd
using API version 3 and knows which nodes, or endpoints, to communicate with. For this, we will define the
required information as environment variables. Run the following commands on one of the nodes:
export ETCDCTL_API=3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
4. Now, list the cluster members and output the result as a table (for example, by running etcdctl member list --endpoints=$ENDPOINTS -w table):
Sample output
+------------------+---------+-------+------------------------+----------------------------+------------+
|        ID        | STATUS  | NAME  |       PEER ADDRS       |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+------------------------+----------------------------+------------+
| 4788684035f976d3 | started | node2 | http://10.104.0.2:2380 | http://192.168.56.102:2379 |      false |
| 67684e355c833ffa | started | node3 | http://10.104.0.3:2380 | http://192.168.56.103:2379 |      false |
| 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://192.168.56.101:2379 |      false |
+------------------+---------+-------+------------------------+----------------------------+------------+
5. To check which node is currently the leader, use the endpoint status command (for example, etcdctl endpoint status --endpoints=$ENDPOINTS -w table):
Sample output
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 10.104.0.1:2379 | 9d2e318af9306c67 |  3.5.16 |   20 kB |      true |      false |         2 |         10 |                 10 |        |
| 10.104.0.2:2379 | 4788684035f976d3 |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
| 10.104.0.3:2379 | 67684e355c833ffa |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Configure Patroni
Run the following commands on all nodes. You can do this in parallel:
1. Export and create environment variables to simplify the config file creation:
Node name (for example, export NODE_NAME=$(hostname -f)):
Node IP:
$ export NODE_IP=`hostname -i | awk '{print $1}'`
DATA_DIR="/var/lib/postgresql/16/main"
PG_BIN_DIR="/usr/lib/postgresql/16/bin"
NOTE: Check the path to the data and bin folders on your operating system and change it for the variables
accordingly.
Patroni information:
NAMESPACE="percona_lab"
SCOPE="cluster_1"
2. Use the following command to create the /etc/patroni/patroni.yml configuration file and add the
following configuration for node1:
echo "
namespace: ${NAMESPACE}
scope: ${SCOPE}
name: ${NODE_NAME}

restapi:
  listen: 0.0.0.0:8008
  connect_address: ${NODE_IP}:8008

etcd3:
  host: ${NODE_IP}:2379

bootstrap:
  # this section will be written into Etcd:/<namespace>/<scope>/config after initializing the new cluster
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: \"on\"
        wal_keep_segments: 10
        max_wal_senders: 5
        max_replication_slots: 10
        wal_log_hints: \"on\"
        logging_collector: 'on'
        max_wal_size: '10GB'
        archive_mode: \"on\"
        archive_timeout: 600s
        archive_command: \"cp -f %p /home/postgres/archived/%f\"
      pg_hba:
        - local all all peer
        - host replication replicator 127.0.0.1/32 trust
        - host replication replicator 192.0.0.0/8 scram-sha-256
        - host all all 0.0.0.0/0 scram-sha-256
      recovery_conf:
        restore_command: cp /home/postgres/archived/%f %p

postgresql:
  cluster_name: cluster_1
  listen: 0.0.0.0:5432
  connect_address: ${NODE_IP}:5432
  data_dir: ${DATA_DIR}
  bin_dir: ${PG_BIN_DIR}
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: replPasswd
    superuser:
      username: postgres
      password: qaz123
  parameters:
    unix_socket_directories: \"/var/run/postgresql/\"
  create_replica_methods:
    - basebackup
  basebackup:
    checkpoint: 'fast'

watchdog:
  mode: required # Allowed values: off, automatic, required
  device: /dev/watchdog
  safety_margin: 5

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
" | sudo tee -a /etc/patroni/patroni.yml
Following these, there is a bootstrap section that contains the PostgreSQL configurations and the steps to run once the
database is initialized. The pg_hba.conf entries specify all the other nodes that can connect to this node and their
authentication mechanism.
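The archive_command in the configuration above is a plain file copy, where PostgreSQL substitutes %p with the path of the completed WAL file and %f with its file name. Its behavior can be sketched with temporary paths (a toy stand-in, not the real archiving setup):

```shell
# Mimic archive_command "cp -f %p /home/postgres/archived/%f" using a
# temporary directory instead of the real archive location.
archive_dir=$(mktemp -d)
wal_file=$(mktemp)                 # stands in for %p
printf 'fake WAL contents' > "$wal_file"

# %f is the file name part of %p; the command must exit 0 only on success.
cp -f "$wal_file" "$archive_dir/$(basename "$wal_file")" && echo "archived"
```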
If the systemd unit file is not created automatically, create it manually and specify the following contents within:
/etc/systemd/system/percona-patroni.service

[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
# only kill the patroni process, not its children, so it will gracefully stop postgres
KillMode=process
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=30

[Install]
WantedBy=multi-user.target
2. Repeat steps 1-4 on the remaining nodes. In the end you must have the configuration file and the systemd
unit file created on every node.
3. Now it’s time to start Patroni. Run the following commands on all nodes, but not in parallel (for example,
sudo systemctl enable --now percona-patroni, matching the unit file above). Start with node1 first, wait for the
service to come to life, and then proceed with the other nodes one by one, always waiting for them to sync with
the primary node:
When Patroni starts, it initializes PostgreSQL (because the service is not currently running and the data directory
is empty) following the directives in the bootstrap section of the configuration file.
1. Check the service to see if there are errors:
A common error is Patroni complaining about the lack of proper entries in the pg_hba.conf file. If you see
such errors, you must manually add or fix the entries in that file and then restart the service.
Changing the patroni.yml file and restarting the service will not have any effect here because the bootstrap
section specifies the configuration to apply when PostgreSQL is first started in the node. It will not repeat
the process even if the Patroni configuration file is modified and the service is restarted.
If Patroni has started properly, you should be able to connect locally to a PostgreSQL node (for example, by
running sudo -u postgres psql):
psql (16)
Type "help" for help.
postgres=#
Configure HAProxy
HAProxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A
client application accesses the HAProxy URL and sends its read/write requests there. Behind the scenes,
HAProxy routes write requests to the primary node and read requests to the secondaries in a round-robin
fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in
the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads to port 5001.
This way, a client application doesn’t know what node in the underlying cluster is the current primary. HAProxy
sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that
client application requests are never rejected.
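The round-robin read balancing described above can be illustrated with a small shell function (hypothetical node names matching this guide; HAProxy itself does the real balancing):

```shell
# Pick a standby for the Nth read request, cycling through the replicas.
# With two standbys (node2, node3), consecutive requests alternate.
pick_standby() {
  case $(( $1 % 2 )) in
    0) echo node2 ;;
    *) echo node3 ;;
  esac
}

pick_standby 0   # node2
pick_standby 1   # node3
pick_standby 2   # node2 again
```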
2. The HAProxy configuration file path is /etc/haproxy/haproxy.cfg. Specify the following configuration in
this file:
global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen primary
    bind *:5000
    option httpchk /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008

listen standbys
    balance roundrobin
    bind *:5001
    option httpchk /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008
HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and
route the requests appropriately.
3. Restart HAProxy (for example, sudo systemctl restart haproxy):
Next steps
Configure pgBackRest
Deploying PostgreSQL for high availability with
Patroni on RHEL or CentOS
This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni on Red Hat
Enterprise Linux or CentOS.
Considerations
1. This is an example deployment where etcd runs on the same host machines as Patroni and PostgreSQL,
and there is a single dedicated HAProxy host. Alternatively, etcd can run on a different set of nodes.
If etcd is deployed on the same host machine as Patroni and PostgreSQL, a separate disk system for etcd and
PostgreSQL is recommended for performance reasons.
2. For this setup, we use nodes running Red Hat Enterprise Linux 8 as the base operating system:
Note
We recommend not exposing the hosts where Patroni, etcd, and PostgreSQL are running to public networks due to
security risks. Use firewalls, virtual networks, subnets or the like to protect the database hosts from any kind of attack.
Initial setup
Set up hostnames in the /etc/hosts file
It’s not necessary to have name resolution, but it makes the whole setup more readable and less error prone.
Here, instead of configuring a DNS, we use a local name resolution by updating the file /etc/hosts . By
resolving their hostnames to their IP addresses, we make the nodes aware of each other’s names and allow their
seamless communication.
node1
2. Modify the /etc/hosts file to include the hostnames and IP addresses of the remaining nodes. Add the
following at the end of the /etc/hosts file on all nodes:
node2
2. Modify the /etc/hosts file to include the hostnames and IP addresses of the remaining nodes. Add the
following at the end of the /etc/hosts file on all nodes:
node3
2. Modify the /etc/hosts file to include the hostnames and IP addresses of the remaining nodes. Add the
following at the end of the /etc/hosts file on all nodes:
HAproxy-demo
2. Modify the /etc/hosts file. The HAProxy instance should have the name resolution for all the three nodes
in its /etc/hosts file. Add the following lines at the end of the file:
Important
Don’t initialize the cluster or start the postgresql service. The cluster initialization and setup are handled by Patroni
during the bootstrapping stage.
2. Install some Python and auxiliary packages to help with Patroni and etcd:
$ sudo yum install python3-pip python3-devel binutils
3. Install etcd, Patroni, pgBackRest packages. Check platform specific notes for Patroni:
Note
If you installed the software from tarballs, you must first enable etcd before configuring it.
To get started with an etcd cluster, you need to bootstrap it. This means setting up the initial configuration and
starting the etcd nodes so they can form a cluster. There are the following bootstrapping mechanisms:
Static - used when the IP addresses of the cluster nodes are known
Discovery service - for cases when the IP addresses of the cluster are not known ahead of time.
Since we know the IP addresses of the nodes, we will use the static method. For using the discovery service,
please refer to the etcd documentation.
We will configure and start all etcd nodes in parallel. This can be done either by modifying each node’s
configuration or by using command line options. Use whichever method you prefer.
1. Create the etcd configuration file on every node. You can edit the sample configuration file /etc/etcd/etcd.conf.yaml or create your own. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
node1
/etc/etcd/etcd.conf.yaml
name: 'node1'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.1:2380
listen-peer-urls: http://10.104.0.1:2380
advertise-client-urls: http://10.104.0.1:2379
listen-client-urls: http://10.104.0.1:2379
node2
/etc/etcd/etcd.conf.yaml
name: 'node2'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.2:2380
listen-peer-urls: http://10.104.0.2:2380
advertise-client-urls: http://10.104.0.2:2379
listen-client-urls: http://10.104.0.2:2379
node3
/etc/etcd/etcd.conf.yaml
name: 'node3'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://10.104.0.3:2380
listen-peer-urls: http://10.104.0.3:2380
advertise-client-urls: http://10.104.0.3:2379
listen-client-urls: http://10.104.0.3:2379
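Since the three files above differ only in the node name and the advertised IP addresses, they can be generated from a single template. The sketch below rehearses this in a scratch directory; on real nodes the target path is /etc/etcd/etcd.conf.yaml:

```shell
# Generate the per-node etcd configs from one template (illustrative sketch;
# only the name and the peer/client URLs vary between nodes).
OUT=$(mktemp -d)
CLUSTER="node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380"
i=1
for ip in 10.104.0.1 10.104.0.2 10.104.0.3; do
  cat > "$OUT/node$i.yaml" <<EOF
name: 'node$i'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: new
initial-cluster: $CLUSTER
data-dir: /var/lib/etcd
initial-advertise-peer-urls: http://$ip:2380
listen-peer-urls: http://$ip:2380
advertise-client-urls: http://$ip:2379
listen-client-urls: http://$ip:2379
EOF
  i=$((i+1))
done
ls "$OUT"   # node1.yaml node2.yaml node3.yaml
```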
2. Enable and start the etcd service on all nodes:
$ sudo systemctl enable --now etcd
During the node start, etcd searches for the other cluster nodes defined in the configuration. If the other nodes are not running yet, the start may fail with a quorum timeout. This is expected behavior. Try starting all nodes again at the same time so that the etcd cluster is created.
3. Check the etcd cluster members. Use etcdctl for this purpose. Ensure that etcdctl interacts with etcd
using API version 3 and knows which nodes, or endpoints, to communicate with. For this, we will define the
required information as environment variables. Run the following commands on one of the nodes:
export ETCDCTL_API=3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
4. Now, list the cluster members and output the result as a table:
$ sudo etcdctl member list -w table --endpoints=$ENDPOINTS
Sample output
+------------------+---------+-------+------------------------+----------------------------+------------+
|        ID        | STATUS  | NAME  |       PEER ADDRS       |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+------------------------+----------------------------+------------+
| 4788684035f976d3 | started | node2 | http://10.104.0.2:2380 | http://192.168.56.102:2379 |      false |
| 67684e355c833ffa | started | node3 | http://10.104.0.3:2380 | http://192.168.56.103:2379 |      false |
| 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://192.168.56.101:2379 |      false |
+------------------+---------+-------+------------------------+----------------------------+------------+
5. To check which node is currently the leader, use the following command:
$ sudo etcdctl endpoint status -w table --endpoints=$ENDPOINTS
Sample output
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 10.104.0.1:2379 | 9d2e318af9306c67 |  3.5.16 |   20 kB |      true |      false |         2 |         10 |                 10 |        |
| 10.104.0.2:2379 | 4788684035f976d3 |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
| 10.104.0.3:2379 | 67684e355c833ffa |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
1. On each etcd node, set the environment variables for the cluster members, the cluster token and state:
TOKEN=PostgreSQL_HA_Cluster_1
CLUSTER_STATE=new
NAME_1=node1
NAME_2=node2
NAME_3=node3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
node1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
    --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
    --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
node2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
    --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
    --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
node3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
etcd --data-dir=data.etcd --name ${THIS_NAME} \
    --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
    --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}
3. Check the etcd cluster members. Use etcdctl for this purpose. Ensure that etcdctl interacts with etcd
using API version 3 and knows which nodes, or endpoints, to communicate with. For this, we will define the
required information as environment variables. Run the following commands on one of the nodes:
export ETCDCTL_API=3
HOST_1=10.104.0.1
HOST_2=10.104.0.2
HOST_3=10.104.0.3
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
4. Now, list the cluster members and output the result as a table:
$ sudo etcdctl member list -w table --endpoints=$ENDPOINTS
Sample output
+------------------+---------+-------+------------------------+----------------------------+------------+
|        ID        | STATUS  | NAME  |       PEER ADDRS       |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+------------------------+----------------------------+------------+
| 4788684035f976d3 | started | node2 | http://10.104.0.2:2380 | http://192.168.56.102:2379 |      false |
| 67684e355c833ffa | started | node3 | http://10.104.0.3:2380 | http://192.168.56.103:2379 |      false |
| 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://192.168.56.101:2379 |      false |
+------------------+---------+-------+------------------------+----------------------------+------------+
5. To check which node is currently the leader, use the following command:
$ sudo etcdctl endpoint status -w table --endpoints=$ENDPOINTS
Sample output
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 10.104.0.1:2379 | 9d2e318af9306c67 |  3.5.16 |   20 kB |      true |      false |         2 |         10 |                 10 |        |
| 10.104.0.2:2379 | 4788684035f976d3 |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
| 10.104.0.3:2379 | 67684e355c833ffa |  3.5.16 |   20 kB |     false |      false |         2 |         10 |                 10 |        |
+-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Configure Patroni
Run the following commands on all nodes. You can do this in parallel:
1. Export and create environment variables to simplify the config file creation:
Node name:
$ export NODE_NAME=`hostname -f`
Node IP:
$ export NODE_IP=`hostname -i | awk '{print $1}'`
Create variables to store the paths to the data and bin directories:
DATA_DIR="/var/lib/pgsql/data/"
PG_BIN_DIR="/usr/pgsql-16/bin"
NOTE: Check the path to the data and bin folders on your operating system and change it for the variables
accordingly.
Patroni information:
NAMESPACE="percona_lab"
SCOPE="cluster_1"
2. Create the directory to store the configuration file and make it owned by the postgres user:
$ sudo mkdir -p /etc/patroni/
$ sudo chown -R postgres:postgres /etc/patroni/
Create the data directory to store PostgreSQL data. Change its ownership to the postgres user and restrict the access to it:
$ sudo mkdir /var/lib/pgsql/data
$ sudo chown -R postgres:postgres /var/lib/pgsql/data/
$ sudo chmod 700 /var/lib/pgsql/data
3. Use the following command to create the /etc/patroni/patroni.yml configuration file and add the following configuration in it:
echo "
namespace: ${NAMESPACE}
scope: ${SCOPE}
name: ${NODE_NAME}

restapi:
    listen: 0.0.0.0:8008
    connect_address: ${NODE_IP}:8008

etcd3:
    host: ${NODE_IP}:2379

bootstrap:
  # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
  dcs:
      ttl: 30
      loop_wait: 10
      retry_timeout: 10
      maximum_lag_on_failover: 1048576

      postgresql:
          use_pg_rewind: true
          use_slots: true
          parameters:
              wal_level: replica
              hot_standby: "on"
              wal_keep_segments: 10
              max_wal_senders: 5
              max_replication_slots: 10
              wal_log_hints: "on"
              logging_collector: 'on'
              max_wal_size: '10GB'
              archive_mode: "on"
              archive_timeout: 600s
              archive_command: "cp -f %p /home/postgres/archived/%f"

          pg_hba:
          - local all all peer
          - host replication replicator 127.0.0.1/32 trust
          - host replication replicator 192.0.0.0/8 scram-sha-256
          - host all all 0.0.0.0/0 scram-sha-256

          recovery_conf:
              restore_command: cp /home/postgres/archived/%f %p

postgresql:
    cluster_name: cluster_1
    listen: 0.0.0.0:5432
    connect_address: ${NODE_IP}:5432
    data_dir: ${DATA_DIR}
    bin_dir: ${PG_BIN_DIR}
    pgpass: /tmp/pgpass0
    authentication:
        replication:
            username: replicator
            password: replPasswd
        superuser:
            username: postgres
            password: qaz123
    parameters:
        unix_socket_directories: "/var/run/postgresql/"
    create_replica_methods:
    - basebackup
    basebackup:
        checkpoint: 'fast'

watchdog:
    mode: required # Allowed values: off, automatic, required
    device: /dev/watchdog
    safety_margin: 5

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false
" | sudo tee -a /etc/patroni/patroni.yml
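Note that this heredoc-style echo relies on shell variable expansion, so NODE_NAME, NODE_IP, and the other variables must be set in the same session before running it. A quick rehearsal of that expansion against a scratch file:

```shell
# Sanity check (sketch): variables are expanded by the shell when the config
# is generated, so the file must contain real values, not literal ${...}.
export NAMESPACE="percona_lab" SCOPE="cluster_1" NODE_NAME="node1" NODE_IP="10.104.0.1"
TMP=$(mktemp)
echo "
namespace: ${NAMESPACE}
scope: ${SCOPE}
name: ${NODE_NAME}
restapi:
    connect_address: ${NODE_IP}:8008
" > "$TMP"
grep "connect_address" "$TMP"   # shows the expanded address
```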
4. Check that the systemd unit file patroni.service is created in /etc/systemd/system. If it’s not created, create it manually and specify the following contents within:
[Service]
Type=simple
User=postgres
Group=postgres
# only kill the patroni process, not its children, so it will gracefully stop postgres
KillMode=process
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=30
# Do not restart the service if it crashes, we want to manually inspect database on failure
Restart=no
5. Make systemd aware of the new service:
$ sudo systemctl daemon-reload
6. Repeat steps 1-5 on the remaining nodes. In the end you must have the configuration file and the systemd
unit file created on every node.
7. Now it’s time to start Patroni. Run the following command on all nodes, but not in parallel. Start with node1 first, wait for the service to come to life, and then proceed with the other nodes one by one, always waiting for them to sync with the primary node:
$ sudo systemctl enable --now patroni
When Patroni starts, it initializes PostgreSQL (because the service is not currently running and the data directory
is empty) following the directives in the bootstrap section of the configuration file.
A common error is Patroni complaining about the lack of proper entries in the pg_hba.conf file. If you see
such errors, you must manually add or fix the entries in that file and then restart the service.
Changing the patroni.yml file and restarting the service will not have any effect here because the bootstrap
section specifies the configuration to apply when PostgreSQL is first started in the node. It will not repeat
the process even if the Patroni configuration file is modified and the service is restarted.
If Patroni has started properly, you should be able to locally connect to a PostgreSQL node using the following command:
$ sudo psql -U postgres
psql (16.0)
Type "help" for help.
postgres=#
2. When all nodes are up and running, you can check the cluster status using the following command:
$ sudo patronictl -c /etc/patroni/patroni.yml list
+ Cluster: cluster_1 (7440127629342136675) -----+----+-------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+------------+---------+-----------+----+-----------+
| node1 | 10.0.100.1 | Leader | running | 1 | |
| node2 | 10.0.100.2 | Replica | streaming | 1 | 0 |
| node3 | 10.0.100.3 | Replica | streaming | 1 | 0 |
+--------+------------+---------+-----------+----+-----------+
Configure HAProxy
HAProxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application accesses the HAProxy URL and sends its read/write requests there. Behind the scenes, HAProxy routes write requests to the primary node and read requests to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads to port 5001.
This way, a client application doesn’t know what node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected.
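This routing rests entirely on Patroni's REST API: /primary returns HTTP 200 only on the current leader, and /replica only on a healthy streaming replica; HAProxy keeps a node in a pool only while its health check returns 200. A minimal sketch of that contract, with the real HTTP call against port 8008 replaced by a local function for illustration:

```shell
# Simulated Patroni health-check endpoints as HAProxy sees them.
# In the real setup HAProxy runs "option httpchk /primary" (or /replica)
# against port 8008 on every node.
patroni_check() {           # args: <actual role of node> <endpoint checked>
  role=$1; endpoint=$2
  case "$endpoint" in
    /primary) [ "$role" = "primary" ] && echo 200 || echo 503 ;;
    /replica) [ "$role" = "replica" ] && echo 200 || echo 503 ;;
  esac
}
patroni_check primary /primary   # 200 -> node stays in the port-5000 (writer) pool
patroni_check replica /primary   # 503 -> node excluded from the writer pool
patroni_check replica /replica   # 200 -> node serves reads on port 5001
```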
2. The HAProxy configuration file path is: /etc/haproxy/haproxy.cfg . Specify the following configuration in
this file.
global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen primary
    bind *:5000
    option httpchk /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008

listen standbys
    balance roundrobin
    bind *:5001
    option httpchk /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
    server node3 node3:5432 maxconn 100 check port 8008
HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and
route the requests appropriately.
4. Restart HAProxy:
$ sudo systemctl restart haproxy
5. Check the HAProxy logs to verify that it has started:
$ sudo journalctl -u haproxy.service -n 100 -f
Next steps
Configure pgBackRest
pgBackRest setup
pgBackRest is a backup tool used to perform PostgreSQL database backup, archiving, restoration, and point-
in-time recovery. While it can be used for local backups, this procedure shows how to deploy a pgBackRest
server running on a dedicated host and how to configure PostgreSQL servers to use it for backups and
archiving.
You also need backup storage to store the backups. It can be either remote storage, such as AWS S3, an S3-compatible storage, or Azure Blob Storage, or a filesystem-based one.
$ sudo su -
Install pgBackRest
On Debian/Ubuntu
On RHEL/derivatives
export SRV_NAME="bkp-srv"
export NODE1_NAME="node-1"
export NODE2_NAME="node-2"
export NODE3_NAME="node-3"
export CA_PATH="/etc/ssl/certs/pg_ha"
2. Create the pgBackRest repository, if necessary.
A repository is where pgBackRest stores backups. In this example, the backups will be saved to /var/lib/pgbackrest.
This directory is usually created during pgBackRest’s installation process. If it’s not there already, create it as follows:
$ mkdir -p /var/lib/pgbackrest
$ chmod 750 /var/lib/pgbackrest
$ chown postgres:postgres /var/lib/pgbackrest
Access the file’s parent directory (either cd /etc/ or cd /etc/pgbackrest/ ), and make a backup copy of
it:
$ cp pgbackrest.conf pgbackrest.conf.bak
Then use the following command to create a basic configuration file using the environment variables we
created in a previous step:
On Debian/Ubuntu
cat <<EOF > pgbackrest.conf
[global]

# repo1-retention-archive
#  - Number of backups worth of continuous WAL to retain
#  - NOTE: WAL segments required to make a backup consistent are always retained until the backup is expired regardless of how this option is configured
#  - If this value is not set and repo-retention-full-type is count (default), then the archive to expire will default to the repo-retention-full
# repo1-retention-archive=2

# repo1-retention-full
#  - Full backup retention count/time.
#  - When a full backup expires, all differential and incremental backups associated with the full backup will also expire.
#  - When the option is not defined a warning will be issued.
#  - If indefinite retention is desired then set the option to the max value.
repo1-retention-full=4
pg1-host=${NODE1_NAME}
pg1-host-port=8432
pg1-port=5432
pg1-path=/var/lib/postgresql/16/main
pg1-host-type=tls
pg1-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg1-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg1-host-ca-file=${CA_PATH}/ca.crt
pg1-socket-path=/var/run/postgresql
pg2-host=${NODE2_NAME}
pg2-host-port=8432
pg2-port=5432
pg2-path=/var/lib/postgresql/16/main
pg2-host-type=tls
pg2-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg2-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg2-host-ca-file=${CA_PATH}/ca.crt
pg2-socket-path=/var/run/postgresql
pg3-host=${NODE3_NAME}
pg3-host-port=8432
pg3-port=5432
pg3-path=/var/lib/postgresql/16/main
pg3-host-type=tls
pg3-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg3-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg3-host-ca-file=${CA_PATH}/ca.crt
pg3-socket-path=/var/run/postgresql
EOF
On RHEL/derivatives
cat <<EOF > pgbackrest.conf
[global]

# repo1-retention-archive
#  - Number of backups worth of continuous WAL to retain
#  - NOTE: WAL segments required to make a backup consistent are always retained until the backup is expired regardless of how this option is configured
#  - If this value is not set and repo-retention-full-type is count (default), then the archive to expire will default to the repo-retention-full
# repo1-retention-archive=2

# repo1-retention-full
#  - Full backup retention count/time.
#  - When a full backup expires, all differential and incremental backups associated with the full backup will also expire.
#  - When the option is not defined a warning will be issued.
#  - If indefinite retention is desired then set the option to the max value.
repo1-retention-full=4
pg1-host=${NODE1_NAME}
pg1-host-port=8432
pg1-port=5432
pg1-path=/var/lib/pgsql/16/data
pg1-host-type=tls
pg1-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg1-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg1-host-ca-file=${CA_PATH}/ca.crt
pg1-socket-path=/var/run/postgresql
pg2-host=${NODE2_NAME}
pg2-host-port=8432
pg2-port=5432
pg2-path=/var/lib/pgsql/16/data
pg2-host-type=tls
pg2-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg2-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg2-host-ca-file=${CA_PATH}/ca.crt
pg2-socket-path=/var/run/postgresql
pg3-host=${NODE3_NAME}
pg3-host-port=8432
pg3-port=5432
pg3-path=/var/lib/pgsql/16/data
pg3-host-type=tls
pg3-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg3-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg3-host-ca-file=${CA_PATH}/ca.crt
pg3-socket-path=/var/run/postgresql
EOF
NOTE: The option backup-standby=y indicates that backups should be taken from a standby server. If you are operating with a primary only, or if your secondaries are not configured with pgBackRest, set this option to n.
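The retention semantics of repo1-retention-full=4 in the configuration above can be sketched as follows: once a fifth full backup completes, the oldest full backup expires together with the differential and incremental backups that depend on it (backup labels below are made up):

```shell
# Sketch of repo1-retention-full=4: keep the four most recent full backups.
RETENTION=4
backups="F1 F2 F3 F4 F5"                       # full backups, oldest first
count=$(echo "$backups" | wc -w)
drop=$((count - RETENTION))                    # how many oldest fulls expire
kept=$(echo "$backups" | cut -d' ' -f$((drop + 1))-)
echo "kept: $kept"                             # F1 (and its dependents) expires
```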
2. Create the root certificate authority (CA):
$ mkdir -p ${CA_PATH}
$ openssl req -new -x509 -days 365 -nodes -out ${CA_PATH}/ca.crt -keyout ${CA_PATH}/ca.key -subj "/CN=root-ca"
3. Create the certificate requests for the backup server and the PostgreSQL nodes:
$ for node in ${SRV_NAME} ${NODE1_NAME} ${NODE2_NAME} ${NODE3_NAME}
do
openssl req -new -nodes -out ${CA_PATH}/$node.csr -keyout ${CA_PATH}/$node.key -subj "/CN=$node";
done
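Step 4, signing these CSRs with the CA, did not survive extraction. A sketch of how it is typically done, rehearsed here in a scratch directory with one hypothetical node name; in the real setup you would loop over ${SRV_NAME} and the node names and keep the files in ${CA_PATH}:

```shell
# Sign a node CSR with the root CA (illustrative sketch; paths and the
# node name are placeholders rehearsed in a temporary directory).
CA_PATH=$(mktemp -d)
openssl req -new -x509 -days 365 -nodes -subj "/CN=root-ca" \
  -out ${CA_PATH}/ca.crt -keyout ${CA_PATH}/ca.key
node=node-1
openssl req -new -nodes -subj "/CN=$node" \
  -out ${CA_PATH}/$node.csr -keyout ${CA_PATH}/$node.key
openssl x509 -req -in ${CA_PATH}/$node.csr \
  -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial \
  -days 365 -out ${CA_PATH}/$node.crt
openssl verify -CAfile ${CA_PATH}/ca.crt ${CA_PATH}/$node.crt
```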
5. Remove temporary files, set ownership of the remaining files to the postgres user, and restrict their
access:
$ rm -f ${CA_PATH}/*.csr
$ chown postgres:postgres -R ${CA_PATH}
$ chmod 0600 ${CA_PATH}/*
/etc/systemd/system/pgbackrest.service
[Unit]
Description=pgBackRest Server
After=network.target
[Service]
Type=simple
User=postgres
Restart=always
RestartSec=1
ExecStart=/usr/bin/pgbackrest server
#ExecStartPost=/bin/sleep 3
#ExecStartPost=/bin/bash -c "[ ! -z $MAINPID ]"
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
$ systemctl daemon-reload
$ systemctl start pgbackrest.service
$ systemctl enable pgbackrest.service
On Debian/Ubuntu
On RHEL/derivatives
$ mkdir -p ${CA_PATH}
4. Copy the .crt , .key certificate files and the ca.crt file from the backup server where they were created
to every respective node. Then change the ownership to the postgres user and restrict their access. Use
the following commands to achieve this:
5. Edit or create the configuration file which, as explained above, can be either at the
/etc/pgbackrest/pgbackrest.conf or /etc/pgbackrest.conf path:
On Debian/Ubuntu
pgbackrest.conf
cat <<EOF > pgbackrest.conf
[global]
repo1-host=${SRV_NAME}
repo1-host-user=postgres
repo1-host-type=tls
repo1-host-cert-file=${CA_PATH}/${NODE_NAME}.crt
repo1-host-key-file=${CA_PATH}/${NODE_NAME}.key
repo1-host-ca-file=${CA_PATH}/ca.crt

# general options
process-max=16
log-level-console=info
log-level-file=debug

[cluster_1]
pg1-path=/var/lib/postgresql/16/main
EOF
On RHEL/derivatives
pgbackrest.conf
cat <<EOF > pgbackrest.conf
[global]
repo1-host=${SRV_NAME}
repo1-host-user=postgres
repo1-host-type=tls
repo1-host-cert-file=${CA_PATH}/${NODE_NAME}.crt
repo1-host-key-file=${CA_PATH}/${NODE_NAME}.key
repo1-host-ca-file=${CA_PATH}/ca.crt
# general options
process-max=16
log-level-console=info
log-level-file=debug
[cluster_1]
pg1-path=/var/lib/pgsql/16/data
EOF
/etc/systemd/system/pgbackrest.service
[Unit]
Description=pgBackRest Server
After=network.target
[Service]
Type=simple
User=postgres
Restart=always
RestartSec=1
ExecStart=/usr/bin/pgbackrest server
#ExecStartPost=/bin/sleep 3
#ExecStartPost=/bin/bash -c "[ ! -z $MAINPID ]"
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
$ systemctl daemon-reload
$ systemctl start pgbackrest
$ systemctl enable pgbackrest
The pgBackRest daemon listens on port 8432 by default:
$ netstat -taunp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address      Foreign Address    State    PID/Program name
tcp        0      0 0.0.0.0:111        0.0.0.0:*          LISTEN   1/systemd
tcp        0      0 0.0.0.0:8432       0.0.0.0:*          LISTEN   40224/pgbackrest
On Debian/Ubuntu
/etc/patroni/patroni.yml
postgresql:
  (...)
  parameters:
    (...)
    archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/postgresql/16/main/pg_wal/%f
    (...)
  recovery_conf:
    (...)
    restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f %p
    (...)
On RHEL/derivatives
/etc/patroni/patroni.yml
postgresql:
  (...)
  parameters:
    archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/pgsql/16/data/pg_wal/%f
  (...)
  recovery_conf:
    restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f %p
  (...)
Reload the changed configuration. Specify either the cluster name or a node name for the following command:
$ sudo patronictl -c /etc/patroni/patroni.yml reload cluster_1
Note: When configuring a PostgreSQL server that is not managed by Patroni to archive/restore WALs from the pgBackRest server, edit the server’s main configuration file directly and adjust the archive_command and restore_command variables as shown above.
Create backups
Run the following commands on the backup server:
1. Create the stanza. A stanza is the configuration for a PostgreSQL database cluster that defines where it is located, how it will be backed up, archiving options, and so on:
$ sudo -u postgres pgbackrest --stanza=cluster_1 stanza-create
Testing the Patroni PostgreSQL Cluster
This document covers the following scenarios to test the PostgreSQL cluster:
replication,
connectivity,
failover, and
manual switchover.
Testing replication
1. Connect to the cluster and establish the psql session from a client machine that can connect to the
HAProxy node. Use the HAProxy-demo node’s public IP address:
2. Run the following commands to create a table and insert a few rows:
3. To ensure that the replication is working, we can log in to each PostgreSQL node and run a simple SQL
statement against the locally running instance:
name | age
--------+-----
john | 30
dawson | 35
(2 rows)
Testing failover
In a proper setup, client applications won’t have issues connecting to the cluster, even if one or two of the nodes go down. We will test the cluster for failover in the following scenarios:
Scenario 1. Intentionally stop PostgreSQL on the primary node and verify access to PostgreSQL.
1. Run the following command on any node to check the current cluster status:
$ sudo patronictl -c /etc/patroni/patroni.yml list
Output:
2. node1 is the current leader. Stop Patroni on node1 to see how the cluster changes:
$ sudo systemctl stop patroni
3. Once the service stops on node1, check the logs on node2 and node3 using the following command:
$ sudo journalctl -u patroni.service -n 100 -f
Output
The logs in node3 show that the requests to node1 are failing, the watchdog is coming into action, and
node3 is promoting itself as the leader:
4. Verify that you can still access the cluster through the HAProxy instance and read data:
name | age
--------+-----
john | 30
dawson | 35
(2 rows)
Output:
As we see, node3 remains the leader and the rest are replicas.
To emulate the power outage, let’s kill the service in node3 and see what happens in node1 and node2 .
postgres 10042 0.1 2.1 647132 43948 ? Ssl 12:50 0:09 /usr/bin/python3 /usr/bin/patroni /etc/patroni/patroni.yml
$ sudo journalctl -u patroni.service -n 100 -f
Output
node2 realizes that the leader is dead and promotes itself to the leader.
3. Try accessing the cluster using the HAProxy endpoint at any point in time between these operations. The
cluster is still accepting connections.
Manual switchover
Typically, a manual switchover is needed for planned downtime to perform maintenance activity on the leader
node. Patroni provides the switchover command to manually switch over from the leader node.
Run the switchover command as follows:
$ sudo patronictl -c /etc/patroni/patroni.yml switchover
Patroni asks for the name of the current primary node and then for the node that should take over as the new primary. You can also specify the time at which the switchover should happen. To trigger the process immediately, specify the value now:
primary [node2]: node2
Candidate ['node1', 'node3'] []: node1
When should the switchover take place (e.g. 2021-09-23T15:56 ) [now]: now
Current cluster topology
+ Cluster: stampede1 (7011110722654005156) -----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+-------+---------+---------+----+-----------+
| node1 | node1 | Replica | running | 3 | 0 |
| node2 | node2 | Leader | running | 3 | |
| node3 | node3 | Replica | stopped | | unknown |
+--------+-------+---------+---------+----+-----------+
Are you sure you want to switchover cluster stampede1, demoting current primary node2?
[y/N]: y
2021-09-23 14:56:40.54009 Successfully switched over to "node1"
+ Cluster: stampede1 (7011110722654005156) -----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+-------+---------+---------+----+-----------+
| node1 | node1 | Leader | running | 3 | |
| node2 | node2 | Replica | stopped | | unknown |
| node3 | node3 | Replica | stopped | | unknown |
+--------+-------+---------+---------+----+-----------+
Restart the Patroni service in node2 (after the “planned maintenance”). The node rejoins the cluster as a
secondary.
Backup and disaster recovery
Backup and disaster recovery in Percona
Distribution for PostgreSQL
Summary
Overview
Architecture
Deployment
Testing
Overview
A Disaster Recovery (DR) solution ensures that a system can be quickly restored to a normal operational state if
something unexpected happens. When operating a database, you would back up the data as frequently as
possible and have a mechanism to restore that data when needed. Disaster Recovery is often mistaken for high
availability (HA), but they are two different concepts altogether:
High availability ensures guaranteed service levels at all times. This solution involves configuring one or more
standby systems to an active database, and the ability to switch seamlessly to that standby when the primary
database becomes unavailable, for example, during a power outage or a server crash. To learn more about
high-availability solutions with Percona Distribution for PostgreSQL, refer to High Availability in PostgreSQL
with Patroni.
Disaster Recovery protects the database instance against accidental or malicious data loss or data
corruption. Disaster recovery can be achieved by using either the options provided by PostgreSQL, or external
extensions.
PostgreSQL disaster recovery options
pg_dump / pg_dumpall
This is the basic backup approach. These tools can generate a backup of one or more PostgreSQL databases (either just the structure, or both the structure and data), which can then be restored through the pg_restore command.
File-based backup and restore
This method involves backing up the PostgreSQL data directory to a different location, and restoring it when needed.
Advantages: a consistent snapshot of the data directory or the whole data disk volume.
Disadvantages:
1. Requires stopping PostgreSQL in order to copy the files. This is not practical for most production setups.
2. No backup of individual databases or tables.
PostgreSQL pg_basebackup
This backup tool is provided by PostgreSQL. It is used to back up data while the database instance is running. pg_basebackup makes a binary copy of the database cluster files, while making sure the system is put in and out of backup mode automatically.
To achieve a production grade PostgreSQL disaster recovery solution, you need something that can take full or
incremental database backups from a running instance, and restore from those backups at any point in time.
Percona Distribution for PostgreSQL is supplied with pgBackRest: a reliable, open-source backup and recovery
solution for PostgreSQL.
This document focuses on the disaster recovery solution in Percona Distribution for PostgreSQL. The Deploying backup and disaster recovery solution in Percona Distribution for PostgreSQL tutorial provides guidelines on how to set up and test this solution.
pgBackRest
pgBackRest is an easy-to-use, open-source solution that can reliably back up even the largest of PostgreSQL
databases. pgBackRest supports the following backup types:
full backup - a complete copy of the entire data set
differential backup - includes all data that has changed since the last full backup. While this means the backup takes slightly longer, it enables a faster restore.
incremental backup - only backs up the files that have changed since the last full or differential backup, resulting in a quick backup time. To restore to a point in time, however, you will need to restore each incremental backup in the order it was taken.
When it comes to restoring, pgBackRest can do a full or a delta restore. A full restore needs an empty
PostgreSQL target directory. A delta restore is intelligent enough to recognize already-existing files in the
PostgreSQL data directory, and update only the ones the backup contains.
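The delta-restore behavior can be sketched as follows: files already present and identical in the data directory are left in place, and only files that differ from the backup copy are rewritten (file names and contents below are made up):

```shell
# Sketch of a delta restore: compare each file against the backup copy and
# rewrite only the ones that differ.
BACKUP=$(mktemp -d); DATA=$(mktemp -d)
echo "new page data" > "$BACKUP/a";  echo "unchanged" > "$BACKUP/b"
echo "old page data" > "$DATA/a";    echo "unchanged" > "$DATA/b"
rewritten=""
for f in a b; do
  cmp -s "$BACKUP/$f" "$DATA/$f" || { cp "$BACKUP/$f" "$DATA/$f"; rewritten="$rewritten $f"; }
done
echo "rewritten:$rewritten"   # only the differing file is copied back
```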
pgBackRest supports remote repository hosting and can even use cloud-based services like AWS S3, Google Cloud Storage, and Azure Blob Storage for saving backup files. It supports parallel backup through multi-core processing and compression. By default, backup integrity is verified through checksums, and saved files can be encrypted for enhanced security.
pgBackRest can restore a database to a specific point in time in the past. This is useful when a database is accessible but contains corrupted data. Using point-in-time recovery, a database administrator can restore the database to the last known good state.
Finally, pgBackRest also supports restoring PostgreSQL databases to a different PostgreSQL instance or a
separate data directory.
Setup overview
This section describes the architecture of the backup and disaster recovery solution. For the configuration steps,
refer to the Deploying backup and disaster recovery solution in Percona Distribution for PostgreSQL.
System architecture
As the configuration example, we will use a three-server architecture where pgBackRest resides on a dedicated remote host. The servers communicate with each other via passwordless SSH.
Important
Passwordless SSH may not be an ideal solution for your environment. In this case, consider using other methods, for example,
TLS with client certificates .
Components:
pg-primary hosts the primary PostgreSQL server. Note that "primary" here means the main database
instance and does not refer to the primary node of a PostgreSQL replication cluster or an HA setup.
pg-repo is the remote backup repository and hosts pgBackRest. It's important to host the backup
repository on a physically separate instance, so that it remains accessible when the target goes down.
pg-secondary is the secondary PostgreSQL node. Don’t confuse it with a hot standby. “Secondary” in this
context means a PostgreSQL instance that’s idle. We will restore the database backup to this instance when
the primary PostgreSQL instance goes down.
Note
For simplicity, we use a single-node PostgreSQL instance as the primary database server. In a production scenario, you will use
some form of high-availability solution to protect the primary instance. When you are using a high-availability setup, we
recommend configuring pgBackRest to back up the hot standby server so the primary node is not unnecessarily loaded.
Deployment
Refer to the Deploying backup and disaster recovery solution in Percona Distribution for PostgreSQL tutorial.
Deploying backup and disaster recovery solution
in Percona Distribution for PostgreSQL
This document provides instructions on how to set up and test the backup and disaster recovery solution in
Percona Distribution for PostgreSQL with pgBackRest . For technical overview and architecture description of
this solution, refer to Backup and disaster recovery in Percona Distribution for PostgreSQL.
Deployment
As the example configuration, we will use the nodes with the following IP addresses:
pg-primary 10.104.0.3
pg-repo 10.104.0.5
pg-secondary 10.104.0.4
Set up hostnames
1. Define the hostname for every server in the /etc/hostname file. The following examples show what the
/etc/hostname file looks like on each of the three nodes:
cat /etc/hostname
pg-primary
cat /etc/hostname
pg-repo
cat /etc/hostname
pg-secondary
2. For the nodes to communicate seamlessly across the network, resolve their hostnames to their IP
addresses in the /etc/hosts file. (Alternatively, you can make appropriate entries in your internal DNS
servers)
The /etc/hosts files for the pg-primary , pg-repo , and pg-secondary nodes look like this, respectively:
```
127.0.1.1 pg-primary pg-primary
127.0.0.1 localhost
10.104.0.5 pg-repo
```
```
127.0.1.1 pg-repo pg-repo
127.0.0.1 localhost
10.104.0.3 pg-primary
10.104.0.4 pg-secondary
```
```
127.0.1.1 pg-secondary pg-secondary
127.0.0.1 localhost
10.104.0.3 pg-primary
10.104.0.5 pg-repo
```
Set up passwordless SSH
Before setting up passwordless SSH, ensure that the postgres user in all three instances has a password.
1. To set or change the password, run the following command as the root user:
$ passwd postgres
2. Ensure that password authentication is enabled in the /etc/ssh/sshd_config file of the pg-repo node:
PasswordAuthentication yes
4. In the pg-repo node, restart the sshd service. Without the restart, the SSH server will not allow you to
connect to it using a password while adding the keys.
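The restart command itself is not shown in this excerpt; on a systemd-based distribution it is along these lines (the service may be named `ssh` on Debian/Ubuntu):

```shell
# Restart the SSH daemon so the PasswordAuthentication change takes effect
sudo systemctl restart sshd
```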
5. In the pg-primary node, generate an SSH key pair and add the public key to the pg-repo node.
Important
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa
Your public key has been saved in /root/.ssh/id_rsa.pub
The key fingerprint is:
...
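The command for copying the public key is not included in this excerpt; a typical way to do it, assuming you run it as the postgres user on pg-primary , is with ssh-copy-id :

```shell
# Copy the postgres user's public key to the backup repository host
ssh-copy-id postgres@pg-repo
```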
6. To verify everything has worked as expected, run the following command from the pg-primary node.
$ ssh postgres@pg-repo
Install Percona Distribution for PostgreSQL in the primary and the secondary nodes from Percona repository.
1. Install percona-release .
At this step, configure the PostgreSQL instance on the pg-primary node for continuous archiving of the WAL
files.
Note
Install pgBackRest
Install pgBackRest in all three instances from Percona repository. Use the following command:
On Debian / Ubuntu
On RHEL / derivatives
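The install commands themselves do not appear in this excerpt. Assuming the percona-pgbackrest package name used in Percona repositories, they look along these lines:

```shell
# Debian / Ubuntu
sudo apt update
sudo apt install percona-pgbackrest

# RHEL / derivatives
sudo yum install percona-pgbackrest
```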
Run the following commands on all three nodes to set up the required configuration file for pgBackRest .
2. Configure the location and permissions for the pgBackRest configuration file:
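The exact commands are elided here; a sketch of creating the configuration file with restricted permissions, assuming the default /etc/pgbackrest location, is:

```shell
# Create the configuration and log directories and restrict access to the postgres user
sudo mkdir -p /etc/pgbackrest /var/log/pgbackrest
sudo touch /etc/pgbackrest/pgbackrest.conf
sudo chown -R postgres:postgres /etc/pgbackrest /var/log/pgbackrest
sudo chmod 640 /etc/pgbackrest/pgbackrest.conf
```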
Configure pgBackRest on the pg-primary node by setting up a stanza. A stanza is a set of configuration
parameters that tells pgBackRest where to back up its files. Edit the /etc/pgbackrest/pgbackrest.conf file
in the pg-primary node to include the following lines:
[global]
repo1-host=pg-repo
repo1-host-user=postgres
process-max=2
log-level-console=info
log-level-file=debug
[prod_backup]
pg1-path=/var/lib/postgresql/14/main
You can see the pg1-path attribute for the prod_backup stanza has been set to the PostgreSQL data folder.
Update pgBackRest configuration file in the remote backup repository node
Add a stanza for the pgBackRest in the pg-repo node. Edit the /etc/pgbackrest/pgbackrest.conf
configuration file to include the following lines:
[global]
repo1-path=/home/pgbackrest/pg_backup
repo1-retention-full=2
process-max=2
log-level-console=info
log-level-file=debug
start-fast=y
stop-auto=y
[prod_backup]
pg1-path=/var/lib/postgresql/14/main
pg1-host=pg-primary
pg1-host-user=postgres
pg1-port = 5432
After the configuration files are set up, it’s now time to initialize the pgBackRest stanza. Run the following
command in the remote backup repository node ( pg-repo ).
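The initialization command is not included in this excerpt; with the prod_backup stanza defined above, it is along these lines (run as the postgres user on pg-repo ):

```shell
# Create the stanza and verify that backups and archiving are configured correctly
pgbackrest --stanza=prod_backup stanza-create
pgbackrest --stanza=prod_backup check
```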
Once the stanza is created successfully, you can try out the different use cases for disaster recovery.
1. To start our testing, let’s create a table in the postgres database in the pg-primary node and add some
data.
CREATE TABLE CUSTOMER (id integer, name text);
INSERT INTO CUSTOMER VALUES (1,'john');
INSERT INTO CUSTOMER VALUES (2,'martha');
INSERT INTO CUSTOMER VALUES (3,'mary');
2. Take a full backup of the database instance. Run the following commands from the pg-repo node:
If you want an incremental backup, you can omit the type attribute. By default, pgBackRest always takes an
incremental backup, except for the first backup of the cluster, which is always a full backup.
If you need a differential backup, use diff for the type field:
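The backup commands themselves are elided in this excerpt; hedged examples for the backup types, run as the postgres user on the pg-repo node, look like this:

```shell
# Full backup
pgbackrest --stanza=prod_backup --type=full backup

# Differential backup
pgbackrest --stanza=prod_backup --type=diff backup

# Incremental backup (the default when --type is omitted)
pgbackrest --stanza=prod_backup backup
```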
1. Run the following command in the pg-primary node to delete the main data directory.
$ rm -rf /var/lib/postgresql/14/main/*
3. After the command executes successfully, you can access PostgreSQL from the psql command line tool
and check if the table and data rows have been restored.
If your target PostgreSQL instance has an already existing data directory, the full restore option will fail. You will
get an error message stating there are existing data files. In this case, you can use the --delta option to
restore only the corrupted files.
For example, let’s say one of your developers mistakenly deleted a few rows from a table. You can use
pgBackRest to revert your database to a previous point in time to recover the lost rows.
1. Take a timestamp when the database is stable and error-free. Run the following command from the
psql prompt.
SELECT CURRENT_TIMESTAMP;
current_timestamp
-------------------------------
2021-11-07 11:55:47.952405+00
(1 row)
Note down the above timestamp since we will use this time in the restore command. Note that in a real life
scenario, finding the correct point in time when the database was error-free may require extensive
investigation. It is also important to note that all changes after the selected point will be lost after the roll
back.
3. To recover the data, run a command with the noted timestamp as an argument. Run the commands below to
recover the database up to that time.
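The exact restore invocation is not shown in this excerpt. A point-in-time restore with pgBackRest is sketched below, using the timestamp captured earlier; stop PostgreSQL before restoring and start it again afterwards:

```shell
# Stop the server, restore up to the noted timestamp, then start it again
sudo systemctl stop postgresql
pgbackrest --stanza=prod_backup --delta \
  --type=time "--target=2021-11-07 11:55:47.952405+00" \
  --target-action=promote restore
sudo systemctl start postgresql
```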
4. Check the database table to see if the record has been restored.
SELECT * FROM customer;
id | name
----+--------
1 | john
2 | martha
3 | mary
(3 rows)
Sometimes a PostgreSQL server may encounter hardware issues and become completely inaccessible. In such
cases, we will need to recover the database to a separate instance where pgBackRest is not initially configured.
To restore the instance to a separate host, you have to first install both PostgreSQL and pgBackRest in this
host.
In our test setup, we already have PostgreSQL and pgBackRest installed in the third node, pg-secondary .
Change the pgBackRest configuration file in the pg-secondary node as shown below.
[global]
repo1-host=pg-repo
repo1-host-user=postgres
process-max=2
log-level-console=info
log-level-file=debug
[prod_backup]
pg1-path=/var/lib/postgresql/14/main
There should be bidirectional passwordless SSH communication between pg-repo and pg-secondary . Refer
to the Set up passwordless SSH section for the steps, if you haven’t configured it.
$ pgbackrest -u postgres --stanza=prod_backup --delta restore
2021-11-07 13:34:08.897 P00 INFO: restore command begin 2.36: --delta --exec-
id=109728-d81c7b0b --log-level-console=info --log-level-file=debug --pg1-
path=/var/lib/postgresql/14/main --process-max=2 --repo1-host=pg-repo --repo1-host-
user=postgres --stanza=prod_backup
2021-11-07 13:34:09.784 P00 INFO: repo1: restore backup set 20211107-
111534F_20211107-131807I, recovery will start at 2021-11-07 13:18:07
2021-11-07 13:34:09.786 P00 INFO: remove invalid files/links/paths from
'/var/lib/postgresql/14/main'
2021-11-07 13:34:11.803 P00 INFO: write updated
/var/lib/postgresql/14/main/postgresql.auto.conf
2021-11-07 13:34:11.819 P00 INFO: restore global/pg_control (performed last to ensure
aborted restores cannot be started)
2021-11-07 13:34:11.819 P00 INFO: restore size = 23.2MB, file total = 937
2021-11-07 13:34:11.820 P00 INFO: restore command end: completed successfully
(2924ms)
Spatial data handling
Spatial data manipulation
Version added: 15.3
Organizations dealing with spatial data need to store it somewhere and manipulate it. PostGIS is the open
source extension for PostgreSQL that allows doing just that. It adds support for storing the spatial data types
such as:
Geographical data like points, lines, polygons, GPS coordinates that can be mapped on a sphere.
Geometrical data. This is also points, lines and polygons but they apply to a 2D surface.
To operate with spatial data inside SQL queries, PostGIS supports spatial functions like distance, area, union,
and intersection. It uses spatial indexes such as R-Tree and Quadtree for efficient processing of database
operations. Read more about supported spatial functions and indexes in the PostGIS documentation.
By deploying PostGIS with Percona Distribution for PostgreSQL, you receive the open-source spatial database
that you can use in various areas without vendor lock-in.
To store and manage spatial data, create and store spatial shapes, calculate areas and distances
To work with raster data, such as satellite imagery or digital elevation models.
To integrate spatial and non-spatial data such as demographic or economic data in a database
You need to store only a couple of map locations. Consider using the built-in geometric functions and
operations of PostgreSQL
You need real-time data analysis. While PostGIS can handle real-time spatial data, it may not be the best
option for real-time data analysis on large volumes of data.
You need to acquire spatial data. Use other tools for this purpose and import spatial data into PostGIS to
manipulate it.
Next steps:
Deployment
Deploy spatial data with PostgreSQL
The following document provides guidelines on how to install PostGIS and how to run basic queries.
Considerations
1. We assume that you have the basic knowledge of spatial data, GIS (Geographical Information System) and
of shapefiles.
2. For uploading the spatial data and querying the database, we use the same data set as is used in PostGIS
tutorial .
Install PostGIS
On Debian and Ubuntu
As other components of Percona Distribution for PostgreSQL, PostGIS is available from Percona
repositories. Use the percona-release repository management tool to enable the repository.
3. The command in the previous step installs the set of PostGIS extensions. To check what extensions are
available, run the following query from the psql terminal:
Note
To enable the postgis_sfcgal-3 extension on Ubuntu 18.04, you need to manually install the required dependency:
1. Check the Platform specific notes and enable required repositories and modules for the dependencies
relevant to your operating system.
As other components of Percona Distribution for PostgreSQL, PostGIS is available from Percona
repositories. Use the percona-release repository management tool to enable the repository.
This installs the set of PostGIS extensions. To check what extensions are available, run the following query from
the psql terminal:
SELECT name, default_version,installed_version
FROM pg_available_extensions WHERE name LIKE 'postgis%' or name LIKE 'address%';
From tarballs
PostGIS is included in the binary tarball and is part of the percona-postgresql16 binary. Use the install from
tarballs tutorial to install it.
1. Create a database and a schema for this database to store your data. A schema is a container that logically
segments objects (tables, functions, views, and so on) for better management. Run the following
commands from the psql terminal:
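The commands are not shown in this excerpt; a sketch follows. The database name nyc matches the one used with shp2pgsql later in this tutorial, and the schema name is illustrative:

```sql
-- Run from psql as a superuser
CREATE DATABASE nyc;
\c nyc
CREATE SCHEMA gis;
```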
2. To make PostGIS functions and operations work, you need to enable the postgis extension. Make sure you
are connected to the database you created earlier and run the following command:
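The command is elided in this excerpt; enabling the extension is done with CREATE EXTENSION :

```sql
-- Enable PostGIS in the current database
CREATE EXTENSION postgis;
```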
SELECT postgis_full_version();
postgis_full_version
------------------------------------------------------------------------------------
-----------------------------------------------------------------------------
POSTGIS="3.3.3" [EXTENSION] PGSQL="140" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"
LIBXML="2.9.13" LIBJSON="0.15" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)"
$ curl -LO https://s3.amazonaws.com/s3.cleverelephant.ca/postgis-workshop-2020.zip
2. Unzip the archive and from the folder where the .shp files are located, execute the following command and
replace the dbname value with the name of your database:
shp2pgsql \
-D \
-I \
-s 26918 \
nyc_streets.shp \
nyc_streets \
| psql -U postgres dbname=nyc
\d nyc_streets;
Table "public.nyc_streets"
Column | Type | Collation | Nullable |
Default
--------+---------------------------------+-----------+----------+---------------------
---------------------
gid | integer | | not null |
nextval('nyc_streets_gid_seq'::regclass)
id | double precision | | |
name | character varying(200) | | |
oneway | character varying(10) | | |
type | character varying(50) | | |
geom | geometry(MultiLineString,26918) | | |
Indexes:
"nyc_streets_pkey" PRIMARY KEY, btree (gid)
"nyc_streets_geom_idx" gist (geom)
1. Repeat the command to upload other shapefiles in the data set: nyc_census_blocks ,
nyc_neighborhoods , nyc_subway_stations
Query spatial data
After you installed and set up PostGIS, let’s find answers to the following questions by querying the database:
Output:
population
------------
8175032
(1 row)
Output:
st_area
--------------------
3.5198365965413293
(1 row)
By default, the output is given in square meters. To get the value in square kilometers, divide it by 1 000 000.
SELECT ST_Length(geom)
FROM nyc_streets
WHERE name = 'Columbus Cir';
Output:
st_length
-------------------
308.3419936909855
(1 row)
Spatial database upgrade
When using PostgreSQL and PostGIS for some time, you may eventually come to the decision to upgrade your
spatial database. There can be different reasons for that: to receive improvements and/or bug fixes that come
with a minor version of the database/extension, reaching the end of life of the currently used software and
others.
Important
Upgrade PostGIS
Each version of PostGIS is compatible with several versions of PostgreSQL and vice versa. The best practice is
to first upgrade the PostGIS extension on the source cluster to match the compatible version on the target
cluster and then upgrade PostgreSQL. See the PostGIS Support matrix for version compatibility.
PostGIS is enabled on the database level. This means that the upgrade is also done on the database level.
PostGIS 3 and above
Connect to the database where it is enabled and run the PostGIS_Extensions_Upgrade() function:
SELECT postgis_extensions_upgrade();
PostGIS 2.5
Connect to the database with the enabled extension and run the following commands:
Starting with version 3, vector and raster functionalities have been separated into two individual extensions. Thus,
to upgrade both, you need to run postgis_extensions_upgrade() twice.
SELECT postgis_extensions_upgrade();
TIP: If you don’t need the raster functionality, you can drop the postgis_raster extension after the upgrade.
Upgrade PostgreSQL
Upgrade PostgreSQL either to the latest minor or to the major version.
If you are using long deprecated views and functions and / or need the expertise in upgrading your spatial
database, contact Percona Managed Services for an individual upgrade scenario development.
LDAP Authentication
When a client application or a user that runs the client application connects to the database, it must identify
themselves. The process of validating the client’s identity and determining whether this client is permitted to
access the database it has requested is called authentication.
Percona Distribution for PostgreSQL supports several authentication methods, including LDAP
authentication. LDAP provides a central place for authentication: the LDAP server
stores usernames and passwords and their resource permissions.
The LDAP authentication in Percona Distribution for PostgreSQL is implemented the same way as in upstream
PostgreSQL.
LDAP authentication
Upgrade
Upgrading Percona Distribution for PostgreSQL
from 15 to 16
This document describes the in-place upgrade of Percona Distribution for PostgreSQL using the pg_upgrade
tool.
Upgrade to the latest minor version within your current major version (e.g., from 15.11 to 15.13).
Then, perform the major upgrade to your desired version (e.g., from 15.13 to 16.9).
Note
When running a major upgrade for RHEL 8 and compatible derivatives, consider the following:
Percona Distribution for PostgreSQL 16.3, 15.7, 14.12, 13.15 and 12.18 include llvm packages 16.0.6, while its previous
versions 16.2, 15.6, 14.11, 13.14, and 12.17 include llvm 12.0.1. Since llvm libraries differ and are not compatible, the direct
major version upgrade from 15.6 to 16.3 may cause issues.
The in-place upgrade means installing a new version without removing the old version and keeping the data files
on the server.
See also
pg_upgrade Documentation
Similar to installing, we recommend you to upgrade Percona Distribution for PostgreSQL from Percona
repositories.
Important
A major upgrade is a risky process because of many changes between versions and issues that might occur during or after the
upgrade. Therefore, make sure to back up your data first. The backup tools are out of scope of this document. Use the backup
tool of your choice.
The general in-place upgrade flow for Percona Distribution for PostgreSQL is the following:
6. Execute the analyze_new_cluster.sh script to generate statistics so the system is usable.
The exact steps may differ depending on the package manager of your operating system.
Note
When installing version 16, if prompted via a pop-up to upgrade to the latest available version, select No.
→ Install percona-release . If you have installed it before, update it to the latest version
$ sudo su postgres
$ pg_upgradecluster 15 main --check
# Sample output: pg_upgradecluster pre-upgrade checks ok
The --check flag here instructs pg_upgradecluster to only check the upgrade without changing any data.
$ pg_upgradecluster 15 main
Sample output:
Upgrading cluster 15/main to 16/main ...
Stopping old cluster...
Restarting old cluster with restricted connections...
...
Success. Please check that the upgraded cluster works. If it does,
you can remove the old cluster with:
pg_dropcluster 15 main
Sample output:
Upgrading cluster 15/main to 16/main ...
Stopping old cluster...
Restarting old cluster with restricted connections...
Notice: extra pg_ctl/postgres options given, bypassing systemctl for start operation
Creating new PostgreSQL cluster 16/main ...
/usr/lib/postgresql/16/bin/initdb -D /var/lib/postgresql/16/main --auth-local peer --auth-host
scram-sha-256 --no-instructions --encoding UTF8 --lc-collate C.UTF-8 --lc-ctype C.UTF-8 --
locale-provider libc
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
(1 row)
set_config
------------
(1 row)
Stopping target cluster...
Stopping old cluster...
Disabling automatic startup of old cluster...
Starting upgraded cluster on port 5432...
Running finish phase upgrade hook scripts ...
vacuumdb: processing database "postgres": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "template1": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "postgres": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "template1": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "postgres": Generating default (full) optimizer statistics
vacuumdb: processing database "template1": Generating default (full) optimizer statistics
$ sudo su postgres
Note
Before deleting the old cluster, verify that the newly upgraded cluster is fully operational. Keeping the old cluster does not
negatively affect the functionality or performance of your upgraded cluster.
$ pg_dropcluster 15 main
On Red Hat Enterprise Linux and CentOS using yum
Run all commands as root or via sudo:
→ Install percona-release
$ sudo su postgres
export LC_ALL="en_US.UTF-8"
export LC_CTYPE="en_US.UTF-8"
$ /usr/pgsql-16/bin/initdb -D /var/lib/pgsql/16/data
$ sudo su postgres
→ Check the ability to upgrade Percona Distribution for PostgreSQL from 15 to 16:
$ /usr/pgsql-16/bin/pg_upgrade \
--old-bindir /usr/pgsql-15/bin \
--new-bindir /usr/pgsql-16/bin \
--old-datadir /var/lib/pgsql/15/data \
--new-datadir /var/lib/pgsql/16/data \
--check
The --check flag here instructs pg_upgrade to only check the upgrade without changing any data.
Sample output
$ /usr/pgsql-16/bin/pg_upgrade \
--old-bindir /usr/pgsql-15/bin \
--new-bindir /usr/pgsql-16/bin \
--old-datadir /var/lib/pgsql/15/data \
--new-datadir /var/lib/pgsql/16/data \
--link
The --link flag creates hard links to the files on the old version cluster so you don’t need to copy data. If
you don’t wish to use the --link option, make sure that you have enough disk space to store 2 copies of
files for both old version and new version clusters.
$ systemctl status postgresql-16
10. After the upgrade, the optimizer statistics are not transferred to the new cluster. Run the vacuumdb
command to analyze the new cluster:
$ sudo su postgres
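The vacuumdb invocation itself is not shown in this excerpt; a typical form, run as the postgres user, is:

```shell
# Rebuild optimizer statistics on the upgraded cluster in stages
/usr/pgsql-16/bin/vacuumdb --all --analyze-in-stages
```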
$ ./delete_old_cluster.sh
$ rm -rf /var/lib/pgsql/15/data
Minor Upgrade of Percona Distribution for
PostgreSQL
Minor releases of PostgreSQL include bug fixes and feature enhancements. We recommend that you keep your
Percona Distribution for PostgreSQL updated to the latest minor version.
Though minor upgrades do not change the behavior, we recommend you to back up your data first, in order to be
on the safe side.
Minor upgrade of Percona Distribution for PostgreSQL includes the following steps:
Note
These steps apply if you installed Percona Distribution for PostgreSQL from the Major Release repository. In this case, you are
always upgraded to the latest available release.
If you installed Percona Distribution for PostgreSQL from the Minor Release repository, you will need to enable a new version
repository to upgrade.
For more information about Percona repositories, refer to Installing Percona Distribution for PostgreSQL.
Before the upgrade, update the percona-release utility to the latest version. This is required to install the new version
packages of Percona Distribution for PostgreSQL.
On Debian / Ubuntu
3. Install new version packages. See Installing Percona Distribution for PostgreSQL.
4. Restart the postgresql service.
On Debian / Ubuntu
If you wish to upgrade Percona Distribution for PostgreSQL to the major version, refer to Upgrading Percona
Distribution for PostgreSQL from 15 to 16.
Migrate from PostgreSQL to Percona Distribution
for PostgreSQL
Percona Distribution for PostgreSQL includes the PostgreSQL database and additional extensions that have
been selected to cover the needs of the enterprise and are guaranteed to work together. Percona Distribution for
PostgreSQL is available as a software collection that is easy to deploy.
We encourage users to migrate from their PostgreSQL deployments based on community binaries to Percona
Distribution for PostgreSQL. This document provides the migration instructions.
Depending on your business requirements, you may migrate to Percona Distribution for PostgreSQL either on the
same server or onto a different server.
On Debian and Ubuntu Linux
To ensure that your data is safe during the migration, we recommend making a backup of your data and all
configuration files (such as pg_hba.conf , postgresql.conf , postgresql.auto.conf ) using the tool of
your choice. The backup process is out of scope of this document. You can use pg_dumpall or other tools of
your choice. For more information, see the blog post PostgreSQL Upgrade Using pg_dumpall by Avinash
Vallarapu, Fernando Laudares Camargos, Jobin Augustine and Nickolay Ihalainen.
3. Install percona-release
7. Start the postgresql service. The installation process starts and initializes the default cluster
automatically. You can check its status with:
To ensure that your data is safe during the migration, we recommend making a backup of your data and all
configuration files (such as pg_hba.conf , postgresql.conf , postgresql.auto.conf ) using the tool of
your choice. The backup process is out of scope of this document. You can use pg_dumpall or other tools of
your choice.
3. Install percona-release
To migrate from PostgreSQL Community to Percona Distribution for PostgreSQL on a different server, do the
following:
1. Back up your data and all configuration files (such as pg_hba.conf , postgresql.conf ,
postgresql.auto.conf ) using the tool of your choice.
On Debian and Ubuntu
1. Install percona-release
Troubleshooting guide
Cannot create a table. Permission denied in schema public
Every database in PostgreSQL has a default schema called public . A schema stores database objects like
tables, views, indexes and allows organizing them into logical groups.
When you create a table without specifying a schema name, it ends up in the public schema by default.
Starting with PostgreSQL 15, non-database owners cannot access the public schema. Therefore, you can
either grant privileges to the database for your user using the GRANT command or create your own schema to
insert the data.
To ensure all tables end up in your newly created schema, use the following statement to set the schema:
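For example (schema and role names are illustrative):

```sql
-- Option 1: grant privileges on the public schema to the user
GRANT ALL ON SCHEMA public TO myuser;

-- Option 2: create a dedicated schema and make it the default
CREATE SCHEMA myschema AUTHORIZATION myuser;
SET search_path TO myschema;
```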
Uninstalling Percona Distribution for PostgreSQL
To uninstall Percona Distribution for PostgreSQL, remove all the installed packages and data / configuration
files.
NOTE: Should you need the data files later, back up your data before uninstalling Percona Distribution for
PostgreSQL.
On Debian and Ubuntu using apt
To uninstall Percona Distribution for PostgreSQL on platforms that use apt package manager such as Debian or
Ubuntu, complete the following steps.
$ rm -rf /etc/postgresql/16/main
To uninstall Percona Distribution for PostgreSQL on platforms that use yum package manager such as Red Hat
Enterprise Linux or CentOS, complete the following steps.
$ rm -rf /var/lib/pgsql/16/data
Uninstall from tarballs
If you installed Percona Distribution for PostgreSQL from binary tarballs, stop the PostgreSQL server and remove
the folder with the binary tarballs.
Sample output
Release Notes
Percona Distribution for PostgreSQL release notes
Percona Distribution for PostgreSQL 16.9 (2025-05-29)
Percona Distribution for PostgreSQL 16.9 (2025-05-29)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
Release Highlights
Updated Major upgrade topic in documentation
The Upgrading Percona Distribution for PostgreSQL from 15 to 16 guide has been updated with revised steps for
the On Debian and Ubuntu using apt section, improving clarity and reliability of the upgrade process.
The following is the list of extensions available in Percona Distribution for PostgreSQL.

| Extension | Version | Description |
|-----------|---------|-------------|
| etcd | 3.5.21 | A distributed, reliable key-value store for setting up highly available Patroni clusters |
| pgAudit | 16.0 | Provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL |
| pgAudit set_user | 4.1.0 | Provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks |
| pg_gather | v30 | An SQL script for running the diagnostics of the health of a PostgreSQL cluster |
| pgpool2 | 4.6.0 | A middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing |
| PostgreSQL Common | 277 | PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time |
For Red Hat Enterprise Linux 8 and compatible derivatives, Percona Distribution for PostgreSQL also includes
the supplemental python3-etcd 0.4.5 packages, which are used for setting up Patroni clusters.
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library functions
that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these
queries.”
Percona Distribution for PostgreSQL 16.8 (2025-02-27)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
This release of Percona Distribution for PostgreSQL is based on PostgreSQL 16.7 and PostgreSQL 16.8.
Release Highlights
This release fixes CVE-2025-1094, which closed a vulnerability in the libpq PostgreSQL client library but
introduced a regression related to string handling for non-null terminated strings. The error would be visible
based on how a PostgreSQL client implemented this behavior. This regression affects versions 17.3, 16.7, 15.11,
14.16, and 13.19. For this reason, version 16.7 was skipped.
Percona Distribution for PostgreSQL Docker image is now based on Universal Base Image (UBI) version 9,
which includes the latest security fixes. This makes the image compliant with the Red Hat certification and
ensures the seamless work of containers on Red Hat OpenShift Container Platform.
You no longer have to specify the16.9-multi tag when you run Percona Distribution for PostgreSQL in
Docker. Instead, use the percona/percona-distribution-postgresql:16.9 . Docker automatically
identifies the architecture of your operating system and pulls the corresponding image. Refer to Run in Docker
for how to get started.
We have extended Percona Distribution for PostgreSQL tarballs with PostGIS, an open source extension for handling spatial data. This way you can install and run PostgreSQL as a geospatial database on hosts without direct access to the Internet. Learn more about installing from tarballs and Spatial data manipulation.
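For illustration, once PostGIS is installed from the tarball, enabling the extension and running a simple spatial query might look like this sketch. The table and column names are hypothetical:

```sql
-- Enable PostGIS in the current database
CREATE EXTENSION postgis;

-- A hypothetical table with a geometry column (WGS 84 coordinates)
CREATE TABLE landmarks (
    id   serial PRIMARY KEY,
    name text,
    geom geometry(Point, 4326)
);

INSERT INTO landmarks (name, geom)
VALUES ('Sample point', ST_SetSRID(ST_MakePoint(-73.97, 40.78), 4326));

-- Find landmarks within roughly one degree of a location
SELECT name
FROM landmarks
WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(-74.0, 40.7), 4326), 1);
```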
Meta-packages for Percona Distribution for PostgreSQL are deprecated and will be removed in future releases.
Supplied third-party extensions
Review each extension’s release notes for what’s new, improvements, or bug fixes. The following is the list of extensions available in Percona Distribution for PostgreSQL:
etcd 3.5.18: a distributed, reliable key-value store for setting up highly available Patroni clusters
PgAudit 16: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.1.0: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v29: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.5.5: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.1.1: collects and aggregates statistics for PostgreSQL and provides histogram information
PostgreSQL Common 267: PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time
For Red Hat Enterprise Linux 8 and compatible derivatives, Percona Distribution for PostgreSQL also includes
the supplemental python3-etcd 0.4.5 packages, which are used for setting up Patroni clusters.
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library functions
that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these
queries.”
Percona Distribution for PostgreSQL 16.6 (2024-12-03)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
Release Highlights
This release includes fixes for CVE-2024-10978 and for certain PostgreSQL extensions that break because
they depend on the modified Application Binary Interface (ABI). These regressions were introduced in
PostgreSQL 17.1, 16.5, 15.9, 14.14, 13.17, and 12.21. For this reason, the release of Percona Distribution for
PostgreSQL 16.5 has been skipped.
Percona Distribution for PostgreSQL includes pgvector, an open source extension that enables you to use PostgreSQL as a vector database. It brings the vector data type and vector operations (mainly similarity search) to PostgreSQL. You can install pgvector from repositories and tarballs, and it is also available as a Docker image.
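A minimal sketch of what pgvector enables, assuming the extension is installed; the table name and embedding values are hypothetical:

```sql
-- Enable the extension
CREATE EXTENSION vector;

-- A hypothetical table storing 3-dimensional embeddings
CREATE TABLE items (
    id        bigserial PRIMARY KEY,
    embedding vector(3)
);

INSERT INTO items (embedding)
VALUES ('[1,2,3]'), ('[4,5,6]');

-- Nearest neighbor by Euclidean distance (the <-> operator)
SELECT id
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 1;
```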
Percona Distribution for PostgreSQL now statically links llvmjit.so library for Red Hat Enterprise Linux 8
and 9 and compatible derivatives. This resolves the conflict between the LLVM version required by Percona
Distribution for PostgreSQL and the one supplied with the operating system. This also enables you to use the
LLVM modules supplied with the operating system for other software you require.
The following is the list of extensions available in Percona Distribution for PostgreSQL:
etcd 3.5.16: a distributed, reliable key-value store for setting up highly available Patroni clusters
PgAudit 16: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.1.0: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v28: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.5.4: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.1.1: collects and aggregates statistics for PostgreSQL and provides histogram information
PostgreSQL Common 266: PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time
For Red Hat Enterprise Linux 8 and 9 and compatible derivatives, Percona Distribution for PostgreSQL also
includes the supplemental python3-etcd 0.4.5 packages, which are used for setting up Patroni clusters.
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library functions
that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these
queries.”
Percona Distribution for PostgreSQL 16.4 (2024-09-10)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
Release Highlights
This release of Percona Distribution for PostgreSQL fixes security vulnerability CVE-2024-7348.
Percona Distribution for PostgreSQL packages and tarballs are now also available for ARM64 architectures.
Thus, users can not only run Percona Distribution for PostgreSQL in Docker containers on ARM-based
workstations but also install the packages on those workstations. The ARM64 packages are available for the
following operating systems:
Debian 11
Debian 12
Percona Distribution for PostgreSQL includes the enhanced telemetry feature and provides comprehensive information about how telemetry works, its components and metrics, as well as updated methods for disabling telemetry. Read more in Telemetry and data collection.
Percona Distribution for PostgreSQL includes pg_stat_monitor 2.1.0 that provides the ability to disable the
application name tracking for a query. This way you can optimize query execution performance.
Packaging Changes
Percona Distribution for PostgreSQL is no longer supported on Debian 10 and Red Hat Enterprise Linux 7 and
compatible derivatives.
The following is the list of extensions available in Percona Distribution for PostgreSQL:
etcd 3.5.15: a distributed, reliable key-value store for setting up highly available Patroni clusters
PgAudit 16: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.0.1: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v27: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.5.2: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.1.1: collects and aggregates statistics for PostgreSQL and provides histogram information
PostgreSQL Common 261: PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time
Percona Distribution for PostgreSQL for Red Hat Enterprise Linux 8 and compatible derivatives also includes the following packages:
llvm 17.0.6 packages. This fixes compatibility issues with LLVM from upstream.
supplemental python3-etcd packages, which can be used for setting up Patroni clusters.
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library functions
that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these
queries.”
Percona Distribution for PostgreSQL 16.3 (2024-06-06)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
Release Highlights
Percona Distribution for PostgreSQL now includes the etcd distributed configuration store version 3.5.x for all
supported operating systems. This enhancement simplifies deploying high-availability solutions because you
can install all necessary components from a single source, ensuring their seamless compatibility.
Percona Distribution for PostgreSQL is now available on Ubuntu 24.04 LTS Noble Numbat.
Percona Distribution for PostgreSQL on Red Hat Enterprise Linux 8 and compatible derivatives is now fully
compatible with upstream llvm packages and includes the latest version 16.0.6 of them.
To ensure a smooth upgrade process, the recommended approach is to upgrade to the latest minor version
within your current major version before going to the next major version. For example, if you’re currently on
15.6, upgrade to 15.7 first, then you can upgrade to 16.3. This two-step approach avoids any potential
conflicts caused by differing llvm versions on Red Hat Enterprise Linux 8 and compatible derivatives.
The following is the list of extensions available in Percona Distribution for PostgreSQL:
etcd 3.5.13: a distributed, reliable key-value store for setting up highly available Patroni clusters
PgAudit 16: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.0.1: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v26: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.5.1: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.0.4: collects and aggregates statistics for PostgreSQL and provides histogram information
PostgreSQL Common 259: PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time
Percona Distribution for PostgreSQL for Red Hat Enterprise Linux 8 and compatible derivatives also includes the following packages:
llvm 16.0.6 packages. This fixes compatibility issues with LLVM from upstream.
supplemental python3-etcd packages, which can be used for setting up Patroni clusters.
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library functions
that allow client programs to pass queries to the PostgreSQL backend server and to receive the results of these
queries.”
Percona Distribution for PostgreSQL 16.2 (2024-02-27)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
Release Highlights
A Docker image for Percona Distribution for PostgreSQL is now available for ARM architectures. This
improves the user experience with the Distribution for developers with ARM-based workstations.
The following is the list of extensions available in Percona Distribution for PostgreSQL:
PgAudit 16: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.0.1: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v25: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.5.0: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.0.4: collects and aggregates statistics for PostgreSQL and provides histogram information
llvm 12.0.1 packages for Red Hat Enterprise Linux 8 and compatible derivatives. This fixes compatibility
issues with LLVM from upstream.
supplemental etcd packages which can be used for setting up Patroni clusters. These packages are
available for the following operating systems:
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library
functions that allow client programs to pass queries to the PostgreSQL backend server and to receive the results
of these queries.”
Percona Distribution for PostgreSQL 16.1 Update (2024-01-18)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
This update of Percona Distribution for PostgreSQL includes the new version of pg_stat_monitor 2.0.4 that
fixes the issue with the extension causing the deadlock in the Percona Operator for PostgreSQL when executing
the pgsm_store function.
Percona Distribution for PostgreSQL 16.1 (2023-11-29)
Installation
Percona Distribution for PostgreSQL is a solution that includes PostgreSQL server and the collection of tools
from PostgreSQL community. These tools are tested to work together and serve to assist you in deploying and
managing PostgreSQL.
The aim of Percona Distribution for PostgreSQL is to address the operational issues like High-Availability,
Disaster Recovery, Security, Observability, Spatial data handling, Performance and Scalability, and others that
enterprises are facing.
Release Highlights
Telemetry is now enabled in Percona Distribution for PostgreSQL to fill in the gaps in our understanding of how you use it and help us improve our products. Participation in this anonymous program is optional. You can opt out if you prefer not to share this information. Find more information in the Telemetry on Percona Distribution for PostgreSQL document.
The percona-postgis33 and percona-pgaudit packages on YUM-based operating systems are renamed to percona-postgis33_16 and percona-pgaudit16 respectively.
The following is the list of extensions available in Percona Distribution for PostgreSQL:
PgAudit 16.1: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.0.1: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v23: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.4.4: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.0.3: collects and aggregates statistics for PostgreSQL and provides histogram information
llvm 12.0.1 packages for Red Hat Enterprise Linux 8 and compatible derivatives. This fixes compatibility
issues with LLVM from upstream.
supplemental etcd packages which can be used for setting up Patroni clusters. These packages are
available for the following operating systems:
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library
functions that allow client programs to pass queries to the PostgreSQL backend server and to receive the results
of these queries.”
Percona Distribution for PostgreSQL 16.0 Update (2023-11-02)
Installation Upgrade
This update to the release of Percona Distribution for PostgreSQL 16.0 includes the Docker images for x86_64
architectures. It aims to simplify the developers’ experience with the Distribution. Refer to the Docker guide for
how to run Percona Distribution for PostgreSQL in Docker.
Percona Distribution for PostgreSQL 16.0 (2023-09-19)
Installation Upgrade
We are pleased to announce the launch of Percona Distribution for PostgreSQL 16.0 - a solution with the
collection of tools from PostgreSQL community that are tested to work together and serve to assist you in
deploying and managing PostgreSQL. The aim of Percona Distribution for PostgreSQL is to address the
operational issues like High-Availability, Disaster Recovery, Security, Observability, Spatial data handling,
Performance and Scalability and others that enterprises are facing.
Release Highlights
Percona Distribution for PostgreSQL 16 features a lot of new functionalities and enhancements to performance,
replication, monitoring, developer experience and more. Among them are the following:
Performance improvements:
Support for CPU acceleration using the SIMD (Single Instruction/Multiple Data) computing method for both x86 and ARM architectures
Optimized processing of ASCII and JSON strings, as well as array and subtransaction searches, improves query processing performance.
Support for concurrent bulk data loading using COPY boosts performance by up to 300%.
Load balancing for libpq client simplifies scaling read queries.
The use of lz4 and zstd compression methods for pg_dump results in better compression performance
compared to the default gzip compression method.
Logical replication improvements include the ability to apply large transactions in parallel and the ability to
perform logical decoding on standbys. This enables users to distribute their workloads among nodes.
Developer experience:
The standard for manipulating JSON data now includes the SQL/JSON constructors and the ANY_VALUE
aggregate function, which returns any arbitrary value from the aggregate set.
Developers can now specify non-decimal integers such as 0xff and 0o777.
Added support for the extended query protocol to the psql client.
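A hedged sketch of some of these developer-facing features; the table and column names in the aggregate example are hypothetical:

```sql
-- SQL/JSON constructors
SELECT JSON_OBJECT('name': 'Percona', 'major': 16);
SELECT JSON_ARRAY(1, 2, 3);

-- ANY_VALUE returns an arbitrary value from each group
SELECT dept, ANY_VALUE(employee_name)
FROM staff          -- hypothetical table
GROUP BY dept;

-- Non-decimal integer literals
SELECT 0xff AS hex, 0o777 AS octal, 0b1010 AS binary;
```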
Privilege administration improvements:
Logical replication subscribers now execute transactions on a table as the table owner, not the superuser.
Now only the users that have the ADMIN OPTION on a role can grant membership in that role; the CREATEROLE attribute alone is no longer sufficient. This allows fine-tuning of privilege administration.
New predefined role pg_maintain enables non-superusers to run VACUUM, ANALYZE, CLUSTER,
REFRESH MATERIALIZED VIEW, REINDEX, and LOCK TABLE on all tables.
Database administrators can now view insights on I/O statistics in the separate pg_stat_io view.
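The new role and view might be used as in this sketch; the grantee role name is hypothetical:

```sql
-- Allow a non-superuser to run maintenance commands on all tables
GRANT pg_maintain TO maintenance_user;  -- hypothetical role name

-- Inspect cluster-wide I/O statistics
SELECT backend_type, object, context, reads, writes
FROM pg_stat_io
LIMIT 5;
```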
The following is the list of extensions available in Percona Distribution for PostgreSQL:
PgAudit 16.0: provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL
pgAudit set_user 4.0.1: provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks
pg_gather v22: an SQL script for running the diagnostics of the health of a PostgreSQL cluster
pgpool2 4.4.4: a middleware between PostgreSQL server and client for high availability, connection pooling, and load balancing
pg_stat_monitor 2.0.2: collects and aggregates statistics for PostgreSQL and provides histogram information
llvm 12.0.1 packages for Red Hat Enterprise Linux 8 / CentOS 8. This fixes compatibility issues with LLVM
from upstream.
supplemental etcd packages which can be used for setting up Patroni clusters. These packages are
available for the following operating systems:
Percona Distribution for PostgreSQL is also shipped with the libpq library. It contains “a set of library
functions that allow client programs to pass queries to the PostgreSQL backend server and to receive the results
of these queries.”
Reference
Telemetry and data collection
Percona collects usage data to improve its software. The telemetry feature helps us identify popular features,
detect problems, and plan future improvements. All collected data is anonymized so that it can’t be traced back
to any individual user.
Currently, telemetry is added only to the Percona packages and Docker images, and it is enabled by default. Make sure you are running the latest version of percona-release when you install Percona Distribution for PostgreSQL packages, or update percona-release to the latest version. Percona collects the following information:
The information about the installation environment when you install the software.
The information about the operating system such as name, architecture, the list of Percona packages. See
more in the Telemetry Agent section.
The metrics from the database instance. See more in the percona_pg_telemetry section.
All collected data is anonymous, meaning it can’t be traced back to any individual user. To learn more about how
Percona handles your data, read the Percona Privacy statement.
You control whether to share this information. Participation in this program is completely voluntary. If you don’t want to share anonymous data, you can disable telemetry.
Telemetry offers the following advantages:
See how people use your software: Telemetry collects anonymous data on how users interact with our software. This tells developers which features are popular, which ones are confusing, and if anything is causing crashes.
Identify issues early: Telemetry can catch bugs or performance problems before they become widespread.
Faster bug fixes: With telemetry data, developers can pinpoint issues affecting specific use cases and prioritize fixing them quickly.
Improved features: Telemetry helps developers understand user needs and preferences. This allows them to focus on features that will be genuinely useful and improve your overall experience.
Improved user experience: By identifying and resolving issues early, telemetry helps create a more stable and reliable software experience for everyone.
Telemetry components
Percona collects information using the following components:
Telemetry script that sends the information about the software and the environment where it is installed. This
information is collected only once during the installation.
The percona_pg_telemetry extension collects the necessary metrics directly from the database and stores
them in a Metrics File.
The Metrics File stores the metrics and is a standalone file located on the database host’s file system.
The Telemetry Agent is an independent process running on your database host’s operating system and
carries out the following tasks:
Collects the list of installed Percona packages using the local package manager
The telemetry also uses the Percona Platform with the following components:
Telemetry Service - offers an API endpoint for sending telemetry. The service handles incoming requests. This
service saves the data into Telemetry Storage.
Telemetry Storage - stores all telemetry data for the long term.
percona_pg_telemetry
percona_pg_telemetry is an extension to collect telemetry data in PostgreSQL. It is added to Percona
Distribution for PostgreSQL and is automatically loaded when you install a PostgreSQL server.
percona_pg_telemetry collects metrics from the database instance daily and writes them to the Metrics File. It creates a new Metrics File for each collection. You can inspect the Metrics File in its location to see what data is collected.
Before generating a new file, percona_pg_telemetry deletes Metrics Files that are older than seven days. This process ensures that only the most recent week’s data is kept.
The percona_pg_telemetry extension creates a file in the local file system using a timestamp and a randomly
generated token as the name with a .json extension.
Metrics File
The Metrics File is a JSON file with the metrics collected by the percona_pg_telemetry extension.
Locations
Percona stores the Metrics File in one of the following directories on the local file system. The location depends
on the product.
The Metrics File uses the Javascript Object Notation (JSON) format. Percona reserves the right to extend the
current set of JSON structure attributes in the future.
The following is an example of the collected data generated by the percona_pg_telemetry extension:
{
"db_instance_id": "7310358902660071382",
"pillar_version": "16.3",
"uptime": "36",
"databases_count": "2",
"settings": [
{
"key": "setting",
"value": [
{
"key": "name",
"value": "allow_in_place_tablespaces"
},
{
"key": "unit",
"value": "NULL"
},
{
"key": "setting",
"value": "off"
},
{
"key": "reset_val",
"value": "off"
},
{
"key": "boot_val",
"value": "off"
}
]
},
...
],
"databases": [
{
"key": "database",
"value": [
{
"key": "database_oid",
"value": "5"
},
{
"key": "database_size",
"value": "7820895"
},
{
"key": "active_extensions",
"value": [
{
"key": "extension_name",
"value": "plpgsql"
},
{
"key": "extension_name",
"value": "pg_tde"
},
{
"key": "extension_name",
"value": "percona_pg_telemetry"
}
]
}
]
}
]
}
Telemetry Agent
The Percona Telemetry Agent runs as a dedicated OS daemon process, percona-telemetry-agent. It creates, reads, writes, and deletes JSON files in the ${telemetry root path} directory. You can find the agent’s log file at /var/log/percona/telemetry-agent.log.
The agent does not send anything if there are no Percona-specific files in the target directory. The following is an example of a telemetry report:
{
"reports": [
{
"id": "B5BDC47B-B717-4EF5-AEDF-41A17C9C18BB",
"createTime": "2023-09-01T10:56:49Z",
"instanceId": "B5BDC47B-B717-4EF5-AEDF-41A17C9C18BA",
"productFamily": "PRODUCT_FAMILY_POSTGRESQL",
"metrics": [
{
"key": "OS",
"value": "Ubuntu"
},
{
"key": "pillar_version",
"value": "16.3"
}
]
}
]
}
The following operating system-level metrics are sent with each check:
Package name
Package version (in the same format as Red Hat Enterprise Linux or Debian)
These are reported for the installed packages whose names match the following patterns: percona-*, Percona-*, proxysql*, pmm, etcd*, haproxy, patroni, pg*, postgis, wal2json.
Disable telemetry
Telemetry is enabled by default when you install the software. It is shipped in the software packages as the Telemetry Subsystem and the Telemetry Agent.
If you decide not to send usage data to Percona when you install the software, you can set the PERCONA_TELEMETRY_DISABLE=1 environment variable, either for the root user or system-wide, before the installation process.
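For example, on an APT-based system this might look like the following sketch; the package name is illustrative:

```shell
# Set the variable for the installation session, then install as usual.
export PERCONA_TELEMETRY_DISABLE=1
sudo -E apt install percona-postgresql-16
```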
Debian-derived distribution
Docker
You can either disable the Telemetry Agent temporarily or permanently.
Disable temporarily
Turn off Telemetry Agent temporarily until the next server restart with this command:
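On systemd-based hosts, this is likely the standard service invocation; shown here as an assumption, since the exact command may vary by platform:

```shell
# Stop the agent until the next restart of the service or host
sudo systemctl stop percona-telemetry-agent
```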
Disable permanently
Even after stopping the Telemetry Agent service, a different part of the software ( percona_pg_telemetry )
continues to create the Metrics File related to telemetry every day and saves that file for seven days.
If you decide to remove the Telemetry Agent, this also removes the database. That’s because the Telemetry
Agent is a mandatory dependency for Percona Distribution for PostgreSQL.
On YUM-based systems, the system removes the Telemetry Agent package when you remove the last
dependency package.
On APT-based systems, you must use the --autoremove option to remove all dependencies, as the system doesn’t automatically remove the Telemetry Agent when you remove the database package.
The --autoremove option only removes unnecessary dependencies. It doesn’t remove dependencies required by other packages or guarantee the removal of all package-associated dependencies.
To disable the Metrics File creation, stop and drop the percona_pg_telemetry extension. Here’s how to do it:
1. Stop the extension and reapply the configuration for the changes to take effect:
3. Remove percona_pg_telemetry from the shared_preload_libraries configuration parameter:
Important
If the shared_preload_libraries parameter includes other modules, specify them all for the ALTER SYSTEM SET
command to keep using them.
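The steps above might look like the following sketch, assuming percona_pg_telemetry is the only module listed in shared_preload_libraries; otherwise, list the remaining modules in the ALTER SYSTEM SET command:

```sql
-- Drop the extension from the database where it is installed
DROP EXTENSION percona_pg_telemetry;

-- Remove it from shared_preload_libraries (keep any other modules)
ALTER SYSTEM SET shared_preload_libraries = '';

-- Reload the configuration; note that a server restart is required
-- for shared_preload_libraries changes to take effect
SELECT pg_reload_conf();
```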
Tip
If you wish to re-enable the Telemetry Subsystem, complete the above steps in the reverse order:
Copyright and licensing information
Percona Distribution for PostgreSQL is licensed under the PostgreSQL license and licenses of all components
included in the Distribution.
Documentation licensing
Percona Distribution for PostgreSQL documentation is (C)2009-2023 Percona LLC and/or its affiliates and is
distributed under the Creative Commons Attribution 4.0 International License .
Trademark Policy
This Trademark Policy is to ensure that users of Percona-branded products or services know that what they
receive has really been developed, approved, tested and maintained by Percona. Trademarks help to prevent
confusion in the marketplace, by distinguishing one company’s or person’s products and services from another’s.
Percona owns a number of marks, including but not limited to Percona, XtraDB, Percona XtraDB, XtraBackup,
Percona XtraBackup, Percona Server, and Percona Live, plus the distinctive visual icons and logos associated
with these marks. Both the unregistered and registered marks of Percona are protected.
Use of any Percona trademark in the name, URL, or other identifying characteristic of any product, service,
website, or other use is not permitted without Percona’s written permission with the following three limited
exceptions.
First, you may use the appropriate Percona mark when making a nominative fair use reference to a bona fide
Percona product.
Second, when Percona has released a product under a version of the GNU General Public License (“GPL”), you
may use the appropriate Percona mark when distributing a verbatim copy of that product in accordance with the
terms and conditions of the GPL.
Third, you may use the appropriate Percona mark to refer to a distribution of GPL-released Percona software that
has been modified with minor changes for the sole purpose of allowing the software to operate on an operating
system or hardware platform for which Percona has not yet released the software, provided that those third
party changes do not affect the behavior, functionality, features, design or performance of the software. Users
who acquire this Percona-branded software receive substantially exact implementations of the Percona
software.
Percona reserves the right to revoke this authorization at any time in its sole discretion. For example, if Percona
believes that your modification is beyond the scope of the limited license granted in this Policy or that your use
of the Percona mark is detrimental to Percona, Percona will revoke this authorization. Upon revocation, you must
immediately cease using the applicable Percona mark. If you do not immediately cease using the Percona mark
upon revocation, Percona may take action to protect its rights and interests in the Percona mark. Percona does
not grant any license to use any Percona mark for any other modified versions of Percona software; such use
will require our prior written permission.
Neither trademark law nor any of the exceptions set forth in this Trademark Policy permit you to truncate, modify
or otherwise use any Percona mark as part of your own brand. For example, if XYZ creates a modified version of
the Percona Server, XYZ may not brand that modification as “XYZ Percona Server” or “Percona XYZ Server”, even
if that modification otherwise complies with the third exception noted above.
In all cases, you must comply with applicable law, the underlying license, and this Trademark Policy, as amended
from time to time. For instance, any mention of Percona trademarks should include the full trademarked name,
with proper spelling and capitalization, along with attribution of ownership to Percona Inc. For example, the full
proper name for XtraBackup is Percona XtraBackup. However, it is acceptable to omit the word “Percona” for
brevity on the second and subsequent uses, where such omission does not cause confusion.
In the event of doubt as to any of the conditions or exceptions outlined in this Trademark Policy, please contact
trademarks@percona.com for assistance and we will do our very best to be helpful.