Streaming 1

In PostgreSQL, streaming replication continuously copies changes from a primary server (historically called the master) to one or more secondary servers (also called standby servers) in near real time. The secondary servers stay nearly up to date with the primary server, which supports high availability and disaster recovery.

Here’s how it works, step-by-step:

Key Components:

  • Primary Server (Master): The main PostgreSQL server where data is actively written.
  • Secondary Server (Standby): The replica server(s) that maintain a copy of the primary server’s data.
  • WAL (Write-Ahead Log) Files: PostgreSQL uses WAL to log every change made to the database. These logs are central to the replication process.

How Streaming Replication Works:

  1. Write-Ahead Logging (WAL):
    • Every time a change (like an INSERT, UPDATE, or DELETE) is made to the primary server’s database, it’s first written to a WAL file.
    • WAL ensures data consistency by logging the changes before they are applied to the actual database files.
  2. Sending WAL to Secondary Servers:
    • The primary server continuously streams WAL records to the secondary servers as they are generated, without waiting for a WAL file to fill up.
    • A WAL sender process on the primary server transmits the WAL entries to the secondary servers.
    • The secondary server, running a WAL receiver process, receives the stream of WAL data in near real-time.
  3. Replaying WAL on Secondary Servers:
    • The secondary servers apply (or “replay”) the changes from the WAL files to their own copies of the database, thus staying in sync with the primary.
    • This ensures that the secondary server has the same data as the primary server, except for the small delay caused by network transmission and processing time.
  4. Synchronous vs. Asynchronous Replication:
    • Asynchronous Replication: WAL data is sent to the secondary server after it’s written to the WAL on the primary server, but the primary server does not wait for confirmation from the secondary before acknowledging transactions to the client. This keeps commit latency low, but the most recently committed transactions can be lost if the primary fails before their WAL reaches a secondary.
    • Synchronous Replication: The primary server waits for acknowledgment from a synchronous secondary server before confirming a transaction to the client. With the default settings, an acknowledged transaction has been flushed to disk on both servers, so it survives the loss of the primary, at the expense of higher commit latency.
  5. Failover:
    • In the event of a primary server failure, a secondary server can be promoted to the primary role. This process is called failover.
    • After the failover, the new primary server can begin accepting write transactions, and other secondary servers can start replicating from this new primary.
  6. Hot Standby:
    • PostgreSQL’s streaming replication also supports hot standby, which means the secondary servers can be used for read-only queries while still replicating from the primary server. This helps distribute the read workload and ensures better availability.
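
The steps above can be sketched as a toy in-memory model. This is only an illustration of the ordering (log first, stream, replay, promote), not how PostgreSQL is implemented; the Server class, the stream function and the node names are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy stand-in for a PostgreSQL node (hypothetical, for illustration)."""
    name: str
    role: str = "standby"                      # "primary" or "standby"
    wal: list = field(default_factory=list)    # ordered change records
    data: dict = field(default_factory=dict)   # stands in for the data files
    replayed: int = 0                          # WAL records already applied

    def write(self, key, value):
        """Log the change first, then apply it (the write-ahead rule)."""
        if self.role != "primary":
            raise RuntimeError(f"{self.name} is read-only")
        self.wal.append((key, value))
        self.replay()

    def receive(self, records):
        """WAL receiver side: append streamed records to the local WAL."""
        self.wal.extend(records)

    def replay(self):
        """Apply every WAL record that has not been applied yet."""
        for key, value in self.wal[self.replayed:]:
            self.data[key] = value
        self.replayed = len(self.wal)

    def promote(self):
        """Failover: finish replaying, then start accepting writes."""
        self.replay()
        self.role = "primary"


def stream(primary, standby):
    """WAL sender side: ship whatever the standby has not received yet."""
    standby.receive(primary.wal[len(standby.wal):])


primary = Server("node1", role="primary")
standby = Server("node2")

primary.write("a", 1)
primary.write("b", 2)
stream(primary, standby)
standby.replay()
in_sync = standby.data == primary.data

primary.write("c", 3)    # never streamed: the primary "fails" right here
standby.promote()        # failover
standby.write("d", 4)    # the new primary accepts writes
print(in_sync, standby.role, standby.data)
```

In this run the change to "c" never reaches the standby, which is the asynchronous data-loss window described in step 4.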

Flow of Streaming Replication:

  1. Primary server writes all changes to WAL files.
  2. WAL sender on the primary transmits WAL data to secondary servers.
  3. WAL receiver on secondary servers receives the WAL data and replays it to keep the database in sync.
  4. Secondary servers are continuously updated, with the ability to be promoted in case of primary failure.
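
How far a secondary is behind in this flow is usually expressed as a difference between WAL positions (LSNs). PostgreSQL prints an LSN as two hexadecimal numbers separated by a slash, the high and low 32 bits of a byte position. The sketch below does that arithmetic in plain Python; the function names and the sample LSN values are made up, and on a live system the positions would come from functions such as pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on the secondary.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Bytes of WAL the standby has not yet reached."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)


# Sample values, invented for the example.
lag = replication_lag_bytes("0/3000060", "0/3000000")
print(lag)
```

The server-side function pg_wal_lsn_diff() performs the same subtraction in SQL.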

Summary:

  • WAL files record all changes made to the primary server’s database.
  • The primary server streams WAL records to secondary servers in near real time.
  • The secondary servers apply the WAL data to keep their copy of the database up-to-date.
  • The replication can be either synchronous (no data loss, higher latency) or asynchronous (lower latency, potential for minimal data loss).

Streaming replication ensures that the secondary servers are ready to take over in case of failure, providing a robust high-availability solution.

Streaming 2

How Streaming Replication Works in PostgreSQL

Streaming replication in PostgreSQL is a powerful feature that enables the creation of a highly available system capable of continuing operation even in the event of a failure. It involves the real-time transfer of updated information from a primary server to one or more standby servers, keeping the databases in sync across these servers. This mechanism leverages the Write-Ahead Log (WAL), a transaction log that records changes made to the database, to facilitate replication.

Key Components:

  • Primary Server: The main database server where transactions occur and changes are initially recorded in the WAL.
  • Standby Servers: Secondary servers that receive the WAL records from the primary server and apply them to replicate the database. They can serve read-only queries and take over as the primary server in case of a failure.
  • WAL Files: The Write-Ahead Log files contain all changes made to the database. These files are crucial for both crash recovery and replication.

Process Overview:

  1. WAL Generation: Whenever a change is made to the database on the primary server, PostgreSQL writes a record of the change to the WAL before the change is written to the data files. This ensures data integrity and durability.
  2. WAL Shipping: The WAL records are streamed from the primary server to the standby servers in near real time. This process is managed by the WAL sender on the primary server and the WAL receiver on the standby server.
  3. Applying WAL Records: Upon receiving the WAL records, the standby servers apply these changes to their own databases, thus maintaining consistency with the primary server.

Configuration and Setup:

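  • Primary Server Configuration: Requires adjustments to postgresql.conf and pg_hba.conf. Key settings are the WAL level (wal_level = replica, the default since PostgreSQL 10), the number of WAL senders (max_wal_senders), and the maximum number of concurrent connections (max_connections), which a hot standby must match or exceed. pg_hba.conf needs an entry that allows the standby to make replication connections. WAL archiving (archive_mode = on) is optional for streaming replication; it gives standbys a fallback source of WAL if they fall too far behind.
  • Standby Server Configuration: Also involves modifications to postgresql.conf, including hot standby mode (hot_standby = on, the default since PostgreSQL 10), the connection information for the primary server (primary_conninfo), and optionally a restore command (restore_command) for retrieving WAL archives if needed. From PostgreSQL 12 onward, an empty standby.signal file in the data directory marks the server as a standby; earlier versions keep these settings in recovery.conf.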
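
As a rough illustration of what these settings look like on disk, the sketch below renders minimal configuration fragments as text. The render_settings helper is hypothetical, and the host addresses, the role name replicator and the numeric values are placeholders rather than recommendations.

```python
def render_settings(settings):
    """Format a dict as postgresql.conf lines; string values are single-quoted."""
    lines = []
    for name, value in settings.items():
        if isinstance(value, str):
            value = "'" + value.replace("'", "''") + "'"
        lines.append(f"{name} = {value}")
    return "\n".join(lines)


# All hosts, the role name and the numbers below are placeholders.
primary_conf = render_settings({
    "wal_level": "replica",     # log enough detail for a standby to replay
    "max_wal_senders": 5,       # upper bound on concurrent WAL sender processes
})

standby_conf = render_settings({
    "hot_standby": "on",        # allow read-only queries during recovery
    "primary_conninfo": "host=192.0.2.10 port=5432 user=replicator",
})

# pg_hba.conf entry on the primary that lets the standby connect for replication.
hba_line = "host  replication  replicator  192.0.2.20/32  scram-sha-256"

print(primary_conf)
print(standby_conf)
print(hba_line)
```

The fragments are deliberately minimal; a real deployment would also decide on WAL archiving, replication slots and authentication details that are outside this sketch.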

Benefits and Considerations:

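  • High Availability: Streaming replication provides a robust basis for high availability by allowing a standby server to be promoted when the primary server fails. PostgreSQL itself does not detect the failure or promote a standby automatically, so failover is usually triggered by an operator or by external tooling.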
  • Load Balancing: Standby servers can handle read-only queries, distributing the load and improving query performance.
  • Disaster Recovery: By replicating data to geographically distant locations, streaming replication supports disaster recovery strategies.

Synchronous vs. Asynchronous Replication:

  • Synchronous Replication: Transactions are acknowledged to the client only after a synchronous standby server confirms that it has received their WAL, providing strong durability guarantees but potentially impacting performance.
  • Asynchronous Replication: Transactions are committed on the primary server without waiting for confirmation from the standby servers, offering better performance at the risk of potential data loss in case of a failure.
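
The trade-off can be made concrete with a toy model of a primary that crashes partway through a run of commits. Everything here is invented for illustration, including the shipped_before_crash knob; it models only which commits were acknowledged and which reached the standby, not timing.

```python
def run_commits(n_commits, synchronous, shipped_before_crash):
    """Return (acknowledged, on_standby) for a primary that crashes mid-run.

    shipped_before_crash is how many commits' WAL reached the standby
    before the primary failed (a made-up knob for this toy model).
    """
    acknowledged = []
    on_standby = []
    for txn in range(1, n_commits + 1):
        reached_standby = txn <= shipped_before_crash
        if reached_standby:
            on_standby.append(txn)
        if synchronous and not reached_standby:
            break    # the commit blocks; the client never gets an ack
        acknowledged.append(txn)
    return acknowledged, on_standby


def lost_after_failover(acknowledged, on_standby):
    """Commits the client saw succeed that the promoted standby lacks."""
    return [t for t in acknowledged if t not in on_standby]


async_lost = lost_after_failover(*run_commits(5, False, 3))
sync_lost = lost_after_failover(*run_commits(5, True, 3))
print(async_lost, sync_lost)
```

In the asynchronous run the client was told that commits 4 and 5 succeeded, yet the promoted standby never received them; in the synchronous run those commits were never acknowledged, so nothing the client relied on is missing.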

In summary, streaming replication in PostgreSQL leverages WAL files to replicate changes from a primary server to one or more standby servers, enhancing system availability, supporting load balancing, and facilitating disaster recovery. Proper configuration of both primary and standby servers is essential for effective replication.

Streaming 3

Streaming replication in PostgreSQL is a mechanism that allows a primary PostgreSQL server to continuously replicate its data to one or more standby servers in real time. This ensures high availability and redundancy by keeping the standby servers synchronized with the primary.

Key Components:

  • Primary Server: The primary server is the main source of data. It processes transactions and writes changes to the Write-Ahead Log (WAL).
  • Secondary Server: Also known as a standby server, it receives WAL records from the primary server and applies them to its own database, keeping it in sync.
  • Write-Ahead Log (WAL): A journal file that records changes made to the database. When a transaction commits, PostgreSQL writes a record to the WAL before acknowledging the transaction as successful.

How It Works:

  1. WAL Generation: The primary server generates WAL records as changes are made, followed by a commit record when each transaction commits.
  2. WAL Streaming: The primary server streams the WAL records to the secondary server(s) over a network connection.
  3. WAL Application: The secondary server receives the WAL records and applies them to its own database, ensuring that the data on the secondary server remains consistent with the primary.

Benefits of Streaming Replication:

  • High Availability: If the primary server fails, a standby server can take over its role, ensuring minimal downtime.
  • Disaster Recovery: Streaming replication provides a mechanism for recovering data in case of a disaster.
  • Read-Only Load Balancing: Read-only queries can be directed to standby servers to reduce load on the primary server.
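  • Asynchronous or Synchronous Replication: PostgreSQL supports both asynchronous and synchronous replication modes. In asynchronous mode, the primary server acknowledges a transaction as successful as soon as the WAL record is flushed to its own disk. In synchronous mode, the primary server waits for a standby server to confirm receipt of the WAL record before acknowledging the transaction. By default (synchronous_commit = on) that means the record has been flushed to the standby's disk; with synchronous_commit = remote_apply it means the record has also been applied on the standby.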
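
What exactly the primary waits for in synchronous mode is controlled by the synchronous_commit setting. The table below summarizes the documented levels as a small Python lookup; the setting values are PostgreSQL's, while the WAIT_POINTS name and the helper functions are made up for this sketch. It assumes at least one synchronous standby is configured through synchronous_standby_names.

```python
# What the primary waits for before acknowledging a commit, per
# synchronous_commit level, when a synchronous standby is configured.
WAIT_POINTS = {
    "off": "nothing, not even the local WAL flush",
    "local": "local WAL flush only",
    "remote_write": "standby has written the WAL to its operating system",
    "on": "standby has flushed the WAL to durable storage",
    "remote_apply": "standby has replayed the WAL",
}


def waits_for_standby(level):
    """True if a commit at this level blocks on a standby's confirmation."""
    return level in {"remote_write", "on", "remote_apply"}


def visible_on_standby_at_ack(level):
    """True if a read on the standby is guaranteed to see the commit."""
    return level == "remote_apply"


for level, wait_point in WAIT_POINTS.items():
    print(f"{level:13} waits for: {wait_point}")
```

Only remote_apply guarantees that a read-only query on the standby sees the transaction as soon as the client has its acknowledgment; the other levels trade that guarantee for lower commit latency.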

Streaming replication is a powerful feature of PostgreSQL that provides high availability, disaster recovery, and performance benefits. It is essential for mission-critical applications that require a high level of data reliability and availability.

By Rudy