PG Logical Replication Explained

This article explains the principles and best practices of logical replication in PostgreSQL 13.

Logical Replication

Logical Replication is a method of replicating data objects and their changes based on the Replica Identity (typically primary keys) of data objects.

The term Logical Replication contrasts with Physical Replication: physical replication uses exact block addresses and byte-by-byte copying, while logical replication allows fine-grained control over what is replicated and how.

Logical replication is based on a Publication and Subscription model:

  • A Publisher can have multiple publications, and a Subscriber can have multiple subscriptions.
  • A publication can be subscribed to by multiple subscribers, while a subscription can only subscribe to one publisher, but can subscribe to multiple different publications from the same publisher.

Logical replication for a table typically works like this: The subscriber takes a snapshot of the publisher’s database and copies the existing data in the table. Once the data copy is complete, changes (inserts, updates, deletes, truncates) on the publisher are sent to the subscriber in real-time. The subscriber applies these changes in the same order, ensuring transactional consistency in logical replication. This approach is sometimes called transactional replication.

Typical use cases for logical replication include:

  • Migration: Replication across different PostgreSQL versions and operating system platforms.
  • CDC (Change Data Capture): Collecting incremental changes in a database (or a subset of it) and triggering custom logic on subscribers for these changes.
  • Data Integration: Combining multiple databases into one, or splitting one database into multiple, for fine-grained integration and access control.

A logical subscriber behaves like a normal PostgreSQL instance (primary), and can also create its own publications and have its own subscribers.

If the logical subscriber is read-only, there will be no conflicts. However, if writes are performed on the subscriber’s subscription set, conflicts may occur.


Publication

A Publication can be defined on a physical replication primary. The node that creates the publication is called the Publisher.

A Publication is a collection of changes from a set of tables. It can also be viewed as a change set or replication set. Each publication can only exist in one Database.

Publications are different from Schemas and don’t affect how tables are accessed. (Whether a table is included in a publication or not doesn’t affect its access)

Currently, publications can only contain tables (i.e., indexes, sequences, materialized views are not published), and each table can be added to multiple publications.

Unless creating a publication for ALL TABLES, objects (tables) in a publication can only be explicitly added (via ALTER PUBLICATION ADD TABLE).

Publications can filter the types of changes required: any combination of INSERT, UPDATE, DELETE, and TRUNCATE, similar to trigger events. By default, all changes are published.
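For illustration, a minimal sketch of change-type filtering; the publication and table names below are hypothetical:

-- Publish only INSERT and UPDATE changes for two hypothetical tables.
CREATE PUBLICATION pub_ins_upd
    FOR TABLE public.orders, public.order_items
    WITH (publish = 'insert, update');

-- Later widen the change set to also include DELETE.
ALTER PUBLICATION pub_ins_upd SET (publish = 'insert, update, delete');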

Replica Identity

A table included in a publication must have a Replica Identity, which is necessary to locate the rows that need to be updated on the subscriber side for UPDATE and DELETE operations.

By default, the Primary Key is the table’s replica identity. A unique index over NOT NULL columns can also be used as the replica identity.

If there is no suitable key, the replica identity can be set to FULL, meaning the entire row is used as the identity. (An interesting consequence: even multiple identical rows can be handled correctly, as shown in later examples.) Using FULL replica identity is inefficient, because every row modification requires a full table scan on the subscriber, which can easily overwhelm it, so this configuration should only be used as a last resort. FULL mode also has a restriction: the columns that form the replica identity on the subscriber’s table must match those on the publisher, or be a subset of them.

INSERT operations can always proceed regardless of the replica identity, because inserting a new record does not require locating an existing one on the subscriber, whereas updates and deletes do. If a table without a replica identity is added to a publication that publishes UPDATE and DELETE, subsequent UPDATE and DELETE operations on it will raise errors on the publisher.

The replica identity mode of a table can be checked in pg_class.relreplident and modified via ALTER TABLE.

ALTER TABLE tbl REPLICA IDENTITY 
{ DEFAULT | USING INDEX index_name | FULL | NOTHING };

Although various combinations are possible, in practice only three configurations are viable:

  • The table has a primary key and uses the default replica identity.
  • The table has no primary key but has a unique index over NOT NULL columns, and is explicitly configured with index replica identity.
  • The table has neither a primary key nor such a unique index, and is explicitly configured with full replica identity (very inefficient, last resort only).

In all other combinations logical replication cannot work properly: the replica identity does not carry enough information, and UPDATE or DELETE on such tables will raise errors on the publisher. In particular, if a table with nothing replica identity is included in a publication, performing UPDATE or DELETE on it will cause errors on the publisher! The table below summarizes these cases, followed by an example of the three viable configurations.
 Replica Identity \ Constraint | Primary Key (p) | Unique NOT NULL Index (u) | Neither (n)
-------------------------------+-----------------+---------------------------+-------------
 default                       | valid           | x                         | x
 index                         | x               | valid                     | x
 full                          | inefficient     | inefficient               | inefficient
 nothing                       | x               | x                         | x
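A sketch of the three viable configurations; the table names echo those used later in this article, but their column definitions here are assumptions for illustration:

-- Case 1: primary key present, the default replica identity is sufficient.
CREATE TABLE t_normal (id bigint PRIMARY KEY, v text);

-- Case 2: no primary key, but a unique index over NOT NULL columns:
-- point the replica identity at that index explicitly.
CREATE TABLE t_unique (uid bigint NOT NULL, v text);
CREATE UNIQUE INDEX t_unique_uid_idx ON t_unique (uid);
ALTER TABLE t_unique REPLICA IDENTITY USING INDEX t_unique_uid_idx;

-- Case 3: neither primary key nor a usable unique index:
-- fall back to FULL (inefficient, last resort only).
CREATE TABLE t_tricky (a int, b int);
ALTER TABLE t_tricky REPLICA IDENTITY FULL;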

Managing Publications

CREATE PUBLICATION is used to create a publication, DROP PUBLICATION to remove it, and ALTER PUBLICATION to modify it.

After a publication is created, tables can be dynamically added to or removed from it using ALTER PUBLICATION, and these operations are transactional.

CREATE PUBLICATION name
    [ FOR TABLE [ ONLY ] table_name [ * ] [, ...]
      | FOR ALL TABLES ]
    [ WITH ( publication_parameter [= value] [, ... ] ) ]

ALTER PUBLICATION name ADD TABLE [ ONLY ] table_name [ * ] [, ...]
ALTER PUBLICATION name SET TABLE [ ONLY ] table_name [ * ] [, ...]
ALTER PUBLICATION name DROP TABLE [ ONLY ] table_name [ * ] [, ...]
ALTER PUBLICATION name SET ( publication_parameter [= value] [, ... ] )
ALTER PUBLICATION name OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
ALTER PUBLICATION name RENAME TO new_name

DROP PUBLICATION [ IF EXISTS ] name [, ...];

publication_parameter mainly includes two options:

  • publish: Defines the types of change operations to publish, a comma-separated string, defaulting to insert, update, delete, truncate.
  • publish_via_partition_root: New option in PostgreSQL 13. If true, changes to partitions are published using the identity and schema of the root partitioned table rather than of the individual leaf partitions (see the sketch below).
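A minimal sketch of publishing a partitioned table via its root, assuming a hypothetical measurements table; the subscriber then only needs a table named measurements, which does not have to be partitioned the same way:

-- Hypothetical range-partitioned table (PostgreSQL 13+).
CREATE TABLE measurements (
    ts  timestamptz NOT NULL,
    val double precision
) PARTITION BY RANGE (ts);

CREATE TABLE measurements_2021m01 PARTITION OF measurements
    FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');

-- Changes on any leaf partition are published as if they came from the root.
CREATE PUBLICATION pub_measurements
    FOR TABLE measurements
    WITH (publish_via_partition_root = true);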

Querying Publications

Publications can be queried using the psql meta-command \dRp.

# \dRp
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | t          | t       | t       | t       | t         | f

pg_publication Publication Definition Table

pg_publication contains the original publication definitions, with each record corresponding to a publication.

# table pg_publication;
oid          | 20453
pubname      | pg_meta_pub
pubowner     | 10
puballtables | t
pubinsert    | t
pubupdate    | t
pubdelete    | t
pubtruncate  | t
pubviaroot   | f
  • puballtables: Whether it includes all tables
  • pubinsert|update|delete|truncate: Whether these operations are published
  • pubviaroot: If this option is set, any partitioned table (leaf table) will use the top-level partitioned table’s replica identity. This allows treating the entire partitioned table as one table rather than a series of tables for publication.

pg_publication_tables Publication Content Table

pg_publication_tables is a view composed of pg_publication, pg_class, and pg_namespace, recording the table information included in publications.

postgres@meta:5432/meta=# table pg_publication_tables;
   pubname   | schemaname |    tablename
-------------+------------+-----------------
 pg_meta_pub | public     | spatial_ref_sys
 pg_meta_pub | public     | t_normal
 pg_meta_pub | public     | t_unique
 pg_meta_pub | public     | t_tricky

Use pg_get_publication_tables to get the OIDs of the published tables based on the publication name:

SELECT * FROM pg_get_publication_tables('pg_meta_pub');
SELECT p.pubname,
       n.nspname AS schemaname,
       c.relname AS tablename
FROM pg_publication p,
     LATERAL pg_get_publication_tables(p.pubname::text) gpt(relid),
     pg_class c
         JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = gpt.relid;

pg_publication_rel provides similar information, but as raw many-to-many OID mappings between publications and relations.

  oid  | prpubid | prrelid
-------+---------+---------
 20414 |   20413 |   20397
 20415 |   20413 |   20400
 20416 |   20413 |   20391
 20417 |   20413 |   20394

It’s important to note the difference between these two: When publishing for ALL TABLES, pg_publication_rel won’t have specific table OIDs, but pg_publication_tables can query the actual list of tables included in logical replication. Therefore, pg_publication_tables should typically be used as the reference.
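A quick way to see the difference, assuming a FOR ALL TABLES publication already exists in the current database:

-- Empty for FOR ALL TABLES publications: no per-table rows are recorded.
SELECT count(*) FROM pg_publication_rel;

-- Still lists every table actually included in logical replication.
SELECT pubname, schemaname, tablename FROM pg_publication_tables;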

When a publication is created, the database first modifies the pg_publication catalog, and then records the published tables in pg_publication_rel.


Subscription

A Subscription is the downstream of logical replication. The node that defines the subscription is called the Subscriber.

A subscription defines: how to connect to another database, and which publications from the target publisher to subscribe to.

A logical subscriber behaves like a normal PostgreSQL instance (primary), and can also create its own publications and have its own subscribers.

Each subscriber receives changes through a Replication Slot, and during the initial data replication phase, additional temporary replication slots may be required.

A logical replication subscription can serve as a synchronous replication standby, with the standby’s name defaulting to the subscription name, or a different name can be used by setting application_name in the connection information.
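A minimal sketch of using a subscription as a synchronous standby, assuming the hypothetical names and hosts below; the standby name is set via application_name in the connection string and then listed in synchronous_standby_names on the publisher:

-- On the subscriber: name the apply worker via application_name.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=10.10.10.10 dbname=meta user=replicator application_name=sub_sync'
    PUBLICATION mypub;

-- On the publisher: make commits wait for that standby name.
ALTER SYSTEM SET synchronous_standby_names = 'sub_sync';
SELECT pg_reload_conf();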

Only superusers can dump subscription definitions with pg_dump, because only superusers can access the pg_subscription catalog. If pg_dump is run by a regular user, subscriptions are skipped and a warning is printed.

Logical replication doesn’t replicate DDL changes, so tables in the publication set must already exist on the subscriber side. Only changes to regular tables are replicated; views, materialized views, sequences, and indexes are not replicated.

Tables on the publisher and subscriber are matched by their fully qualified names (e.g., public.table), and replicating changes to a table with a different name is not supported.

Columns on the publisher and subscriber are matched by name. Column order doesn’t matter, and data types don’t have to be identical, as long as the text representation of the two columns is compatible, meaning the text representation of the data can be converted to the target column’s type. The subscriber’s table can contain columns not present on the publisher, and these new columns will be filled with default values.
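A sketch of the matching rules, with hypothetical table definitions: same qualified name on both sides, different column order, a compatible type for id, and an extra column on the subscriber that is filled from its default:

-- Publisher side:
CREATE TABLE public.users (id bigint PRIMARY KEY, name text);

-- Subscriber side:
CREATE TABLE public.users (
    name       text,
    id         bigint PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now()   -- not present on the publisher
);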

Managing Subscriptions

CREATE SUBSCRIPTION is used to create a subscription, DROP SUBSCRIPTION to remove it, and ALTER SUBSCRIPTION to modify it.

After a subscription is created, it can be paused and resumed at any time using ALTER SUBSCRIPTION.

Removing and recreating a subscription will result in loss of synchronization information, meaning the relevant data needs to be resynchronized.

CREATE SUBSCRIPTION subscription_name
    CONNECTION 'conninfo'
    PUBLICATION publication_name [, ...]
    [ WITH ( subscription_parameter [= value] [, ... ] ) ]

ALTER SUBSCRIPTION name CONNECTION 'conninfo'
ALTER SUBSCRIPTION name SET PUBLICATION publication_name [, ...] [ WITH ( set_publication_option [= value] [, ... ] ) ]
ALTER SUBSCRIPTION name REFRESH PUBLICATION [ WITH ( refresh_option [= value] [, ... ] ) ]
ALTER SUBSCRIPTION name ENABLE
ALTER SUBSCRIPTION name DISABLE
ALTER SUBSCRIPTION name SET ( subscription_parameter [= value] [, ... ] )
ALTER SUBSCRIPTION name OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
ALTER SUBSCRIPTION name RENAME TO new_name

DROP SUBSCRIPTION [ IF EXISTS ] name;

subscription_parameter defines options for the subscription (a combined example follows the list), including:

  • copy_data(bool): Whether to copy data after replication starts, defaults to true
  • create_slot(bool): Whether to create a replication slot on the publisher, defaults to true
  • enabled(bool): Whether to enable the subscription, defaults to true
  • connect(bool): Whether to connect to the publisher at all, defaults to true. Setting it to false forces copy_data, create_slot, and enabled to false as well.
  • synchronous_commit: The synchronous_commit setting used by the subscription’s apply worker, defaults to off; it controls how durably changes are flushed before progress is reported back to the publisher.
  • slot_name: The name of the replication slot on the publisher used by this subscription. Setting it to NONE disassociates the subscription from its replication slot.
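A combined sketch, reusing names from this article; it assumes a slot named mysub has already been created on the publisher:

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=10.10.10.10 dbname=meta user=replicator'
    PUBLICATION pg_meta_pub
    WITH (
        create_slot = false,   -- reuse the pre-created slot instead of creating one
        slot_name   = 'mysub',
        copy_data   = false,   -- skip the initial snapshot copy
        enabled     = false    -- start paused, enable explicitly later
    );

ALTER SUBSCRIPTION mysub ENABLE;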

Managing Replication Slots

Each active subscription receives changes from the remote publisher through a replication slot.

Typically, this remote replication slot is automatically managed, created automatically during CREATE SUBSCRIPTION and deleted during DROP SUBSCRIPTION.

In specific scenarios, it may be necessary to operate on the subscription and the underlying replication slot separately:

  • When creating a subscription, if the required replication slot already exists. In this case, you can associate with the existing replication slot using create_slot = false.

  • When creating a subscription, if the remote host is unreachable or its state is unclear, you can avoid accessing the remote host using connect = false. This is what pg_dump does. In this case, you must manually create the replication slot on the remote side before enabling the subscription locally.

  • When removing a subscription, if you need to retain the replication slot. This typically happens when the subscriber is being moved to another machine where you want to restart the subscription. In this case, you need to first disassociate the subscription from the replication slot using ALTER SUBSCRIPTION.

  • When removing a subscription, if the remote host is unreachable. In this case, you need to disassociate the replication slot from the subscription before deleting the subscription.

    If the remote instance no longer exists, nothing more needs to be done. If it is only temporarily unreachable, however, its replication slot should still be dropped manually later; otherwise it will keep retaining WAL and may eventually fill up the disk. The sketch below shows these manual steps.
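A sketch of managing the slot and the subscription separately, using the names from the sandbox example later in this article:

-- On the publisher: create the logical slot by hand (pgoutput is the decoding plugin).
SELECT pg_create_logical_replication_slot('pg_test_sub', 'pgoutput');

-- On the subscriber: attach to the existing slot without creating a new one.
CREATE SUBSCRIPTION pg_test_sub
    CONNECTION 'host=10.10.10.10 dbname=meta user=replicator'
    PUBLICATION pg_meta_pub
    WITH (create_slot = false, slot_name = 'pg_test_sub');

-- To drop the subscription while keeping (or without reaching) the remote slot:
-- detach the slot first, then drop the subscription.
ALTER SUBSCRIPTION pg_test_sub DISABLE;
ALTER SUBSCRIPTION pg_test_sub SET (slot_name = NONE);
DROP SUBSCRIPTION pg_test_sub;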

Querying Subscriptions

Subscriptions can be queried using the psql meta-command \dRs.

# \dRs
     Name     |  Owner   | Enabled |  Publication
--------------+----------+---------+----------------
 pg_bench_sub | postgres | t       | {pg_bench_pub}

pg_subscription Subscription Definition Table

Each logical subscription has one record. Note that this view is cluster-wide, and each database can see the subscription information for the entire cluster.

Only superusers can access this view because it contains plaintext passwords (connection information).

oid             | 20421
subdbid         | 19356
subname         | pg_test_sub
subowner        | 10
subenabled      | t
subconninfo     | host=10.10.10.10 user=replicator password=DBUser.Replicator dbname=meta
subslotname     | pg_test_sub
subsynccommit   | off
subpublications | {pg_meta_pub}
  • subenabled: Whether the subscription is enabled
  • subconninfo: Hidden from regular users because it contains sensitive information.
  • subslotname: The name of the replication slot used by the subscription, also used as the logical replication origin name for deduplication.
  • subpublications: List of publication names subscribed to.
  • Other status information: Whether synchronous commit is enabled, etc.

pg_subscription_rel Subscription Content Table

pg_subscription_rel records per-table information for each subscription, including state and progress (see the query sketch below).

  • srrelid: OID of the relation in the subscription
  • srsubstate: State of the relation in the subscription: i initializing, d copying data, s synchronization completed, r normal replication.
  • srsublsn: When in i|d state, it’s empty. When in s|r state, it’s the LSN position on the remote side.
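A small query sketch to watch per-table state and progress on the subscriber:

SELECT s.subname,
       sr.srrelid::regclass AS tablename,
       sr.srsubstate,                    -- i / d / s / r
       sr.srsublsn
FROM pg_subscription_rel sr
JOIN pg_subscription s ON s.oid = sr.srsubid;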

When Creating a Subscription

When a new subscription is created, the following operations are performed in sequence:

  • Store the publication information in the pg_subscription catalog, including connection information, replication slot, publication names, and some configuration options.
  • Connect to the publisher and check replication permissions (note that it does not verify that the corresponding publication exists).
  • Create a logical replication slot: pg_create_logical_replication_slot(name, 'pgoutput')
  • Register the tables in the replication set to the subscriber’s pg_subscription_rel catalog.
  • Execute the initial snapshot synchronization. Note that existing data in the subscriber’s tables is not deleted.

Replication Conflicts

Logical replication applies changes like normal DML operations, overwriting data even if it has been modified locally on the subscriber. If an incoming change violates any constraint, replication stops; this is called a conflict.

When replicating UPDATE or DELETE operations, missing data (i.e., data to be updated/deleted no longer exists) doesn’t cause conflicts, and such operations are simply skipped.

Conflicts cause errors and abort logical replication. The logical replication management process will retry at 5-second intervals. Conflicts don’t block SQL operations on tables in the subscription set on the subscriber side. Details about conflicts can be found in the user’s server logs, and conflicts must be manually resolved by the user.

Possible Conflicts in Logs

 Conflict                                            | Replication Process | Log Output
-----------------------------------------------------+---------------------+------------
 Missing UPDATE/DELETE target row                    | continues           | none
 Waiting on a table/row lock                         | waits               | none
 Primary key / unique / check constraint violation   | aborts              | yes
 Target table or column missing                      | aborts              | yes
 Data cannot be converted to the target column type  | aborts              | yes

To resolve conflicts, you can either modify the data on the subscriber side to avoid conflicts with incoming changes, or skip transactions that conflict with existing data.

Use the subscription’s node_name and LSN position to call the pg_replication_origin_advance() function to skip transactions. The current ORIGIN position can be seen in the pg_replication_origin_status system view.
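A sketch of skipping a conflicting transaction; the origin name and LSN below are placeholders taken from the examples in this article, and the subscription typically needs to be disabled first so the origin is not in use:

-- Find the replication origin of the subscription (external_id is usually
-- "pg_" plus the subscription OID).
SELECT * FROM pg_replication_origin_status;

-- Advance the origin past the conflicting transaction's commit LSN.
ALTER SUBSCRIPTION pg_test_sub DISABLE;
SELECT pg_replication_origin_advance('pg_19378', '0/2A4F6B8'::pg_lsn);
ALTER SUBSCRIPTION pg_test_sub ENABLE;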

Limitations

Logical replication currently has the following limitations or missing features. These issues may be resolved in future versions.

Database schemas and DDL commands are not replicated. Existing schemas can be manually replicated using pg_dump --schema-only, and incremental schema changes need to be manually kept in sync (the schemas on both publisher and subscriber don’t need to be absolutely identical). Logical replication remains reliable for online DDL changes: after executing DDL changes in the publisher database, replicated data reaches the subscriber but replication stops due to table schema mismatch. After updating the subscriber’s schema, replication continues. In many cases, executing changes on the subscriber first can avoid intermediate errors.

Sequence data is not replicated. The data in identity columns served by sequences and SERIAL types is replicated as part of the table, but the sequences themselves remain at their initial values on the subscriber. If the subscriber is used as a read-only database, this is usually fine. However, if you plan to perform some form of switchover or failover to the subscriber database, you need to update the sequences to their latest values, either by copying the current data from the publisher (perhaps using pg_dump -t *seq*), or by determining a sufficiently high value from the table’s data content (e.g., max(id)+1000000). Otherwise, if you perform operations that obtain sequence values as identities on the new database, conflicts are likely to occur.
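For example, a sketch of bumping a sequence after switching over to the subscriber, along the lines described above; the sequence and table names are hypothetical:

-- Push the sequence safely past the replicated data, with a generous margin.
SELECT setval('users_id_seq', (SELECT max(id) FROM users) + 1000000);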

Logical replication supports replicating TRUNCATE commands, but special care is needed when TRUNCATE involves a group of tables linked by foreign keys. When executing a TRUNCATE operation, the group of associated tables on the publisher (through explicit listing or cascade association) will all be TRUNCATEd, but on the subscriber, tables not in the subscription set won’t be TRUNCATEd. This is logically reasonable because logical replication shouldn’t affect tables outside the replication set. But if there are tables not in the subscription set that reference tables in the subscription set through foreign keys, the TRUNCATE operation will fail.

Large objects are not replicated.

Only tables can be replicated (including partitioned tables). Attempting to replicate other types of tables will result in errors (views, materialized views, foreign tables, unlogged tables). Specifically, only tables with pg_class.relkind = 'r' can participate in logical replication.

When replicating partitioned tables, replication is done by default at the child table level. By default, changes are triggered according to the leaf partitions of the partitioned table, meaning that every partition child table on the publisher needs to exist on the subscriber (of course, this partition child table on the subscriber doesn’t have to be a partition child table, it could be a partition parent table itself, or a regular table). The publication can declare whether to use the replica identity from the partition root table instead of the replica identity from the partition leaf table. This is a new feature in PostgreSQL 13 and can be specified through the publish_via_partition_root option when creating the publication.

Trigger behavior is different. Row-level triggers fire, but UPDATE OF cols type triggers don’t. Statement-level triggers only fire during initial data copying.

Logging behavior is different. Even with log_statement = 'all', SQL statements generated by replication won’t be logged.

Bidirectional replication requires extreme caution: It’s possible to have mutual publication and subscription as long as the table sets on both sides don’t overlap. But once there’s an intersection of tables, WAL infinite loops will occur.

Replication within the same instance: Logical replication within the same instance requires special caution: you must create the logical replication slot manually and reference the existing slot when creating the subscription, otherwise CREATE SUBSCRIPTION will hang.

Only possible on primary: Currently, logical decoding from physical replication standbys is not supported, and replication slots cannot be created on standbys, so standbys cannot be publishers. But this issue may be resolved in the future.


Architecture

Logical replication begins by taking a snapshot of the publisher’s database and copying the existing data in tables based on this snapshot. Once the copy is complete, changes (inserts, updates, deletes, etc.) on the publisher are sent to the subscriber in real-time.

Logical replication uses an architecture similar to physical replication, implemented through a walsender and apply process. The publisher’s walsender process loads the logical decoding plugin (pgoutput) and begins logical decoding of WAL logs. The Logical Decoding Plugin reads changes from WAL, filters changes according to the publication definition, transforms changes into a specific format, and transmits them using the logical replication protocol. Data is transmitted to the subscriber’s apply process using the streaming replication protocol. This process maps changes to local tables when received and reapplies these changes in transaction order.

Initial Snapshot

During initialization and data copying, tables on the subscriber side are handled by a special apply process. This process creates its own temporary replication slot and copies the existing data in tables.

Once data copying is complete, the table enters synchronization mode (pg_subscription_rel.srsubstate = 's'), which ensures that the main apply process can apply changes that occurred during the data copying period using standard logical replication methods. Once synchronization is complete, control of table replication is transferred back to the main apply process, returning to normal replication mode.

Process Structure

The publisher creates a corresponding walsender process for each connection from the subscriber, sending decoded WAL logs. On the subscriber side, it creates an apply process to receive and apply changes.

Replication Slots

When creating a subscription, a logical replication slot is created on the publisher. This slot ensures that WAL logs are retained until they are successfully applied on the subscriber.

Logical Decoding

Logical decoding is the process of converting WAL records into a format that can be understood by logical replication. The pgoutput plugin is the default logical decoding plugin in PostgreSQL.

Synchronous Commit

Synchronous commit in logical replication is completed through SIGUSR1 communication between Backend and Walsender.

Temporary Data

Temporary data from logical decoding is written to disk as local log snapshots. When the walsender receives a SIGUSR1 signal from the walwriter, it reads WAL logs and generates corresponding logical decoding snapshots. These snapshots are deleted when transmission ends.

The file location is: $PGDATA/pg_logical/snapshots/{LSN Upper}-{LSN Lower}.snap

Monitoring

Logical replication uses an architecture similar to physical stream replication, so monitoring a logical replication publisher node is not much different from monitoring a physical replication primary.

Subscriber monitoring information can be obtained through the pg_stat_subscription view.

pg_stat_subscription Subscription Statistics Table

Each active subscription will have at least one record in this view, representing the Main Worker (responsible for applying logical logs).

The Main Worker has relid = NULL. If there are processes responsible for initial data copying, they will also have a record here, with relid being the table being copied.

subid                 | 20421
subname               | pg_test_sub
pid                   | 5261
relid                 | NULL
received_lsn          | 0/2A4F6B8
last_msg_send_time    | 2021-02-22 17:05:06.578574+08
last_msg_receipt_time | 2021-02-22 17:05:06.583326+08
latest_end_lsn        | 0/2A4F6B8
latest_end_time       | 2021-02-22 17:05:06.578574+08
  • received_lsn: The most recently received log position.
  • latest_end_lsn: The last LSN position reported to the walsender, i.e., the confirmed_flush_lsn on the primary. However, this value is not updated very frequently.

Typically, an active subscription will have an apply process running, while disabled or crashed subscriptions won’t have records in this view. During initial synchronization, synchronized tables will have additional worker process records.

pg_replication_slots Replication Slots

postgres@meta:5432/meta=# table pg_replication_slots ;
-[ RECORD 1 ]-------+------------
slot_name           | pg_test_sub
plugin              | pgoutput
slot_type           | logical
datoid              | 19355
database            | meta
temporary           | f
active              | t
active_pid          | 89367
xmin                | NULL
catalog_xmin        | 1524
restart_lsn         | 0/2A08D40
confirmed_flush_lsn | 0/2A097F8
wal_status          | reserved
safe_wal_size       | NULL

The replication slots view contains both logical and physical replication slots. The main characteristics of logical replication slots are:

  • plugin field is not empty, identifying the logical decoding plugin used. Logical replication defaults to using the pgoutput plugin.
  • slot_type = logical, while physical replication slots are of type physical.
  • datoid and database fields are not empty because physical replication is associated with the cluster, while logical replication is associated with the database.

Logical subscribers also appear as standard replication standbys in the pg_stat_replication view.

pg_replication_origin Replication Origin

Replication origin

table pg_replication_origin_status;
-[ RECORD 1 ]-----------
local_id    | 1
external_id | pg_19378
remote_lsn  | 0/0
local_lsn   | 0/6BB53640
  • local_id: The local ID of the replication origin, represented efficiently in 2 bytes.
  • external_id: The ID of the replication origin, which can be referenced across nodes.
  • remote_lsn: The most recent commit position on the source.
  • local_lsn: The LSN of locally persisted commit records.

Detecting Replication Conflicts

The most reliable way to detect conflicts is to check the logs on both the publisher and the subscriber. When a replication conflict occurs, you can see the replication connection being interrupted on the publisher:

LOG:  terminating walsender process due to replication timeout
LOG:  starting logical decoding for slot "pg_test_sub"
DETAIL:  streaming transactions committing after 0/xxxxx, reading WAL from 0/xxxx

While on the subscriber side, you can see the specific cause of the replication conflict, for example:

logical replication worker PID 4585 exited with exit code 1
ERROR: duplicate key value violates unique constraint "pgbench_tellers_pkey","Key (tid)=(9) already exists.",,,,"COPY pgbench_tellers, line 31",,,,"","logical replication worker"

Additionally, some monitoring metrics can reflect the state of logical replication:

For example: pg_replication_slots.confirmed_flush_lsn consistently lagging behind pg_current_wal_lsn(), or significant growth of pg_stat_replication.flush_lag / write_lag.
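A small monitoring sketch on the publisher along those lines:

-- WAL not yet confirmed by each logical subscriber, in bytes.
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_type = 'logical';

-- Send/flush/replay lag as seen by the walsender.
SELECT application_name, write_lag, flush_lag, replay_lag
FROM pg_stat_replication;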

Security

Tables participating in subscriptions must have their Ownership and Trigger permissions controlled by roles trusted by the superuser (otherwise, modifying these tables could cause logical replication to stop).

On the publisher node, if untrusted users have table creation permissions, publications should explicitly specify table names rather than using the wildcard ALL TABLES. That is, FOR ALL TABLES should only be used when the superuser trusts all users who have permission to create tables (non-temporary) on either the publisher or subscriber side.

The user used for replication connections must have the REPLICATION attribute (or be a SUPERUSER). If the role lacks SUPERUSER and BYPASSRLS, row security policies on the publisher may be enforced against it. If the table owner adds a row-level security policy after replication has started, this may cause replication to stop rather than having the policy applied. The user must have LOGIN permission, and HBA rules must allow access.

To be able to replicate initial table data, the role used for replication connections must have SELECT permission on the published tables (or be a superuser).

Creating a publication requires CREATE permission in the database, and creating a FOR ALL TABLES publication requires superuser permission.

Adding tables to a publication requires owner permission on the tables.

Creating a subscription requires superuser permission because the subscription’s apply process runs with superuser privileges in the local database.

Permissions are only checked when establishing the replication connection, not when reading each change record on the publisher side, nor when applying each record on the subscriber side.

Configuration Options

Logical replication requires some configuration options to work properly.

On the publisher side, wal_level must be set to logical, max_replication_slots needs to be at least the number of subscriptions + the number used for table data synchronization. max_wal_senders needs to be at least max_replication_slots + the number reserved for physical replication.

On the subscriber side, max_replication_slots also needs to be set, with max_replication_slots needing to be at least the number of subscriptions.

max_logical_replication_workers needs to be configured to at least the number of subscriptions, plus some for data synchronization worker processes.

Additionally, max_worker_processes needs to be adjusted accordingly, to at least max_logical_replication_workers + 1. Note that some extensions and parallel queries also draw workers from this pool.

Configuration Parameter Example

For a 64-core machine with 1-2 publications and subscriptions, up to 6 synchronization worker processes, and up to 8 physical standbys, a sample configuration might look like this:

First, determine the number of slots: 2 subscriptions, 6 synchronization worker processes, 8 physical standbys, so configure for 16. Sender = Slot + Physical Replica = 24.

Limit synchronization worker processes to 6, 2 subscriptions, so set the total logical replication worker processes to 8.

wal_level: logical                      # logical	
max_worker_processes: 64                # default 8 -> 64, set to CPU CORE 64
max_parallel_workers: 32                # default 8 -> 32, limit by max_worker_processes
max_parallel_maintenance_workers: 16    # default 2 -> 16, limit by parallel worker
max_parallel_workers_per_gather: 0      # default 2 -> 0,  disable parallel query on OLTP instance
# max_parallel_workers_per_gather: 16   # default 2 -> 16, enable parallel query on OLAP instance

max_wal_senders: 24                     # 10 -> 24
max_replication_slots: 16               # 10 -> 16 
max_logical_replication_workers: 8      # 4 -> 8, 6 sync worker + 1~2 apply worker
max_sync_workers_per_subscription: 6    # 2 -> 6, 6 sync worker

Quick Setup

First, set the configuration option wal_level = logical on the publisher side. This parameter requires a restart to take effect. Other parameters’ default values don’t affect usage.

Then create a replication user and add pg_hba.conf configuration items to allow external access. A typical configuration is:

CREATE USER replicator REPLICATION BYPASSRLS PASSWORD 'DBUser.Replicator';

Note that logical replication users need SELECT permission. In Pigsty, replicator has already been granted the dbrole_readonly role.

host     all          replicator     0.0.0.0/0     md5
host     replicator   replicator     0.0.0.0/0     md5

Then execute in the publisher’s database:

CREATE PUBLICATION mypub FOR TABLE <tablename>;

Then execute in the subscriber’s database:

CREATE SUBSCRIPTION mysub CONNECTION 'dbname=<pub_db> host=<pub_host> user=replicator' PUBLICATION mypub;

The above configuration will start replication, first copying the initial data of the tables, then beginning to synchronize incremental changes.

Sandbox Example

Using the Pigsty standard 4-node two-cluster sandbox as an example, there are two database clusters pg-meta and pg-test. Now we’ll use pg-meta-1 as the publisher and pg-test-1 as the subscriber.

PGSRC='postgres://dbuser_admin@meta-1/meta'           # Publisher
PGDST='postgres://dbuser_admin@node-1/test'           # Subscriber
pgbench -is100 ${PGSRC}                               # Initialize Pgbench on publisher
pg_dump -Oscx -t 'pgbench*' ${PGSRC} | psql ${PGDST}  # Sync table structure to subscriber

# Create a **publication** on the publisher, adding default `pgbench` related tables to the publication set.
psql ${PGSRC} -AXwt <<-'EOF'
CREATE PUBLICATION "pg_meta_pub" FOR TABLE
  pgbench_accounts,pgbench_branches,pgbench_history,pgbench_tellers;
EOF

# Create a **subscription** on the subscriber, subscribing to the publisher's publication.
psql ${PGDST} <<-'EOF'
CREATE SUBSCRIPTION pg_test_sub
  CONNECTION 'host=10.10.10.10 dbname=meta user=replicator' 
  PUBLICATION pg_meta_pub;
EOF

Replication Process

After the subscription is created, if everything is normal, logical replication starts automatically and runs the replication state machine for each table in the subscription set.

As shown in the following figure.

[*]  --> init  : Table is added to the subscription set
init --> data  : Start copying the table's initial snapshot
data --> sync  : Copy complete; apply incremental changes made during the copy
sync --> ready : Catch-up complete; the table enters normal replication

When all tables have reached the r (ready) state, the initial data synchronization stage of logical replication is complete, and the publisher and subscriber are in sync as a whole.

Logically, there are two state machines: a per-table replication state machine and a global replication state machine. Each Sync Worker drives the small state machine for one table, while the Apply Worker drives the overall logical replication state machine.

Logical Replication State Machine

Logical replication uses two kinds of workers: Sync Workers and Apply Workers. A Sync Worker performs the initial copy of a single table, while the Apply Worker applies the ongoing stream of changes.

Accordingly, logical replication proceeds in two parts: each table is first replicated independently by a Sync Worker, and once its replication progress catches up to the latest position, responsibility for the table is handed over to the Apply Worker.

When a subscription is created or refreshed, its tables are added to the subscription set, and each table in the set gets a record in the pg_subscription_rel catalog showing its current replication state. A newly added table starts in state i, i.e. the initialize state.

If the subscription’s copy_data option is true (the default) and an idle worker is available in the worker pool, PostgreSQL assigns a synchronization worker to the table and begins copying its existing data; the table then enters state d, i.e. copying data. Copying a table’s data is similar to taking a basebackup of a cluster: the Sync Worker creates a temporary replication slot on the publisher, takes a snapshot of the table via COPY, and completes the baseline data synchronization.

When the baseline copy is complete, the table enters sync mode, state s, i.e. data synchronization, in which the Sync Worker catches up on the incremental changes made during the copy. Once caught up, it marks the table as r (ready) and hands management of further changes over to the main Apply process, indicating that the table is now in normal replication.

Waiting for Logical Replication Synchronization

After creating a subscription, first monitor the database logs on both the publisher and the subscriber to make sure no errors are being generated.


Synchronization Progress Tracking

The data synchronization (d) stage may take a while, depending on the network cards, network, disks, table sizes and their distribution, and the number of logical replication synchronization workers.

As a reference, a 1 TB database with 20 tables, including one 250 GB large table, over dual 10G network cards and with 6 data synchronization workers, takes roughly 6~8 hours to complete the copy.

During data synchronization, each table’s sync task creates a temporary replication slot on the source. Make sure the initial synchronization does not put unnecessary write pressure on the source primary, so that accumulating WAL does not fill up its disk.

The publisher’s pg_stat_replication and pg_replication_slots, together with the subscriber’s pg_stat_subscription and pg_subscription_rel, provide the relevant logical replication status information and deserve attention.
