PostgreSQL: Backup, Recovery, and Replication in PostgreSQL

Logical vs Physical Backups

Backups. The thing nobody cares about—until they desperately need it. Imagine your database crashes, and you’re left staring at a blank screen, contemplating life choices. Let’s make sure that never happens.

Differences Between Logical and Physical Backups

  • Logical backups: Export data as SQL statements. Think of it as a “copy-paste” version of your database.
  • Physical backups: Copy raw database files. More like a “clone everything” approach.

When to Use Logical vs. Physical Backups

  • Use logical backups when migrating databases, copying subsets of data, or running away from bad schema decisions.
  • Use physical backups when you need a full, bit-perfect copy of your database, including indexes and transaction logs.

Advantages and Limitations of Each Approach

Backup TypeProsCons
LogicalPortable, human-readableSlow for large databases
PhysicalFast, completeNot portable between versions

pg_dump and pg_restore

If you’re not backing up your database with pg_dump, you might as well keep your life savings in Monopoly money.

Using pg_dump for Logical Backups

pg_dump -U postgres -d mydatabase -F c -f backup.dump

Restoring Databases with pg_restore

pg_restore -U postgres -d mydatabase -F c backup.dump

Exporting and Importing Data Using psql

pg_dump -U postgres -d mydatabase -f backup.sql
psql -U postgres -d newdatabase -f backup.sql

Strategies for Large Database Backups

  • Use parallel dumps: pg_dump -j 4
  • Compress backups: pg_dump -Fc | gzip > backup.gz
  • Automate with cron jobs—because manual backups are for masochists.

Continuous Archiving and Point-In-Time Recovery (PITR)

Because sometimes, “undo” is the greatest feature of all time.

Understanding Write-Ahead Logging (WAL)

PostgreSQL writes every change to a log file before actually applying it. This means you can replay history, like a time traveler fixing past mistakes.

Enabling Continuous Archiving

wal_level = replica
archive_mode = on
archive_command = 'cp %p /backup_location/%f'

Performing a Point-In-Time Recovery (PITR)

  1. Restore base backup.
  2. Copy WAL files back.
  3. Use recovery.conf to tell PostgreSQL when to stop.
recovery_target_time = '2024-03-21 12:00:00'

Automating Backup and Recovery Processes

  • Schedule backups using cron.
  • Use pgBackRest for automated disaster recovery.
  • Monitor logs—because blind faith never ends well.

Replication Strategies

Replication: Because a single point of failure is a terrible idea.

Overview of Database Replication

Replication means having a second (or third, or fourth) copy of your database running elsewhere, ready to take over if disaster strikes.

Setting Up Streaming Replication

wal_level = replica
max_wal_senders = 3
hot_standby = on

On the standby server:

pg_basebackup -h primary_host -D /var/lib/postgresql/data -P -U replication_user

Configuring Logical Replication

CREATE PUBLICATION my_pub FOR TABLE users;
CREATE SUBSCRIPTION my_sub CONNECTION 'host=primary dbname=mydb' PUBLICATION my_pub;

Using Third-Party Replication Tools: Slony-I, Bucardo

If built-in replication doesn’t cut it, try:

  • Slony-I: Table-based replication.
  • Bucardo: Multi-master replication.

High Availability and Failover

Because uptime is money, and downtime is just embarrassing.

Importance of High Availability in PostgreSQL

Nobody wants to explain why the database was down for five hours.

Implementing Failover Mechanisms

  • Manual failover: Old-school but dangerous.
  • Automatic failover: Smart, scalable, and saves your sanity.

Tools for Automatic Failover: Patroni, repmgr

  • Patroni: Automates failover with Etcd/Consul.
  • repmgr: A simpler but effective option.

Monitoring and Maintaining Replication Setups

SELECT * FROM pg_stat_replication;

Hands-On Exercise

  1. Perform a logical backup using pg_dump and restore it with pg_restore.
  2. Configure continuous archiving and simulate a Point-In-Time Recovery (PITR).
  3. Set up streaming replication between two PostgreSQL instances.
  4. Test logical replication by replicating specific tables.
  5. Implement a failover mechanism using Patroni or repmgr.

Backup before it’s too late. Replicate before disaster strikes. Automate unless you enjoy all-nighters. Your future self will be grateful.