Database Backups with pgBackRest

The Identity Operations Platform includes a built-in PostgreSQL backup and recovery solution based on pgBackRest.

This documentation is intended for customers, system integrators, and operators who deploy the platform using Docker Compose and want a reliable, production-ready backup strategy.

Purpose

The backup system is designed to:

  • create consistent PostgreSQL backups
  • support Point in Time Recovery
  • detect backup failures early
  • operate fully automatically once deployed

No manual backup steps are required during normal operation.

High Level Architecture

The backup solution consists of the following components:

  • PostgreSQL with WAL archiving enabled
  • pgBackRest core service
  • pgBackRest backup worker
  • pgBackRest check worker
  • Persistent backup repository

Each component has a clearly defined responsibility.

How Backups Work

Write Ahead Log Archiving

PostgreSQL records every change in Write Ahead Log (WAL) files before the change is applied to the data files.

With WAL archiving enabled:

  • completed WAL segments are automatically forwarded to pgBackRest
  • every database change can be replayed during recovery
  • Point in Time Recovery becomes possible

WAL archiving runs continuously in the background.
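As an illustration, the sketch below prints the PostgreSQL settings that a pgBackRest setup commonly uses to forward completed WAL segments. The helper name wal_archive_settings is hypothetical, and the exact values shipped with the platform may differ; only the stanza default "iop" is taken from the Compose example in this documentation.

```shell
#!/bin/sh
# Sketch only: prints PostgreSQL settings commonly used to archive WAL through
# pgBackRest. The values the platform actually ships may differ.
wal_archive_settings() {
    stanza="$1"
    printf '%s\n' \
        "wal_level = replica" \
        "archive_mode = on" \
        "archive_command = 'pgbackrest --stanza=${stanza} archive-push %p'"
}

# "iop" is the default stanza name from the Compose example in this document.
wal_archive_settings "${PGBACKREST_STANZA:-iop}"
```

PostgreSQL replaces %p with the path of the completed WAL segment each time it runs the archive command.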

Backup Types

The system uses two backup types:

  • Full backups
    Complete database snapshots

  • Differential backups
    Contain only changes since the last full backup

Differential backups reduce storage usage and backup time while maintaining restore flexibility.
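For orientation, the two backup types map to the --type option of the pgBackRest backup command. The sketch below only builds and prints the command lines; the helper name backup_command is hypothetical, and how backup.sh invokes pgBackRest in the platform is not shown here.

```shell
#!/bin/sh
# Build the pgBackRest command line for a given backup type.
# Accepts "full" or "diff"; anything else is rejected.
backup_command() {
    stanza="$1"
    btype="$2"
    case "$btype" in
        full|diff)
            printf 'pgbackrest --stanza=%s --type=%s backup\n' "$stanza" "$btype"
            ;;
        *)
            echo "unsupported backup type: $btype" >&2
            return 1
            ;;
    esac
}

# Print both variants for the example stanza.
backup_command iop full
backup_command iop diff
```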

Backup Schedule

Backups are executed automatically by a dedicated backup worker.

  • The backup worker runs continuously
  • Backup frequency is controlled by a simple schedule loop
  • The decision whether to create a full or differential backup is handled internally

No external scheduler is required.
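The following is a minimal sketch of what such a schedule loop could look like. It is not the platform's backup.sh: the policy (a full backup when none exists or the last one is older than a fixed interval), the function names, and the variables LAST_FULL_EPOCH and FULL_INTERVAL are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical policy: take a full backup when none exists yet or when the
# last full backup is older than the given interval; otherwise take a diff.
# Arguments: current epoch, epoch of last full backup (may be empty),
# full-backup interval in seconds.
choose_backup_type() {
    if [ -z "$2" ] || [ $(( $1 - $2 )) -ge "$3" ]; then
        echo full
    else
        echo diff
    fi
}

# Loop sketch (defined but not started here). A real worker would determine
# the time of the last full backup itself and run the backup instead of
# printing it.
run_schedule_loop() {
    while true; do
        btype="$(choose_backup_type "$(date +%s)" "${LAST_FULL_EPOCH:-}" "${FULL_INTERVAL:-604800}")"
        echo "would run: pgbackrest --stanza=${PGBACKREST_STANZA:-iop} --type=${btype} backup"
        sleep "${PGBACKREST_FREQUENCY:-3600}"
    done
}
```

Keeping the decision in a small pure function makes the policy easy to test separately from the loop.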

Backup Repository

All backups and WAL archives are stored in a persistent repository volume.

The repository contains:

  • full backups
  • differential backups
  • WAL archives
  • metadata required for restore operations

The repository must be preserved to guarantee recoverability.

In Docker Compose, the repository and spool directories are persisted via pg_backrest_repo and pg_backrest_spool volumes.

Retention Policy

Backup retention is enforced automatically.

The default policy defines:

  • how many full backups are kept
  • how many differential backups are kept
  • how long WAL archives are retained

When retention limits are reached, backups and WAL archives that are no longer needed are removed automatically; the backups that remain stay fully restorable.

Retention values can be adjusted to match storage capacity and recovery requirements.
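In pgBackRest, retention for full and differential backups is typically expressed with the repo1-retention-full and repo1-retention-diff options. The sketch below prints a pgbackrest.conf fragment for given values; the helper name retention_fragment and the numbers 2 and 6 are placeholders, not the platform defaults.

```shell
#!/bin/sh
# Print a pgbackrest.conf fragment with retention settings.
# $1 = number of full backups to keep, $2 = number of differential backups.
retention_fragment() {
    keep_full="$1"
    keep_diff="$2"
    cat <<EOF
[global]
repo1-retention-full=${keep_full}
repo1-retention-diff=${keep_diff}
EOF
}

# Placeholder values, chosen only for illustration.
retention_fragment 2 6
```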

Health Checks and Validation

A dedicated check worker periodically validates the backup system.

The check process verifies:

  • backup repository accessibility
  • PostgreSQL connectivity
  • WAL archiving functionality
  • basic configuration consistency

These checks are non-destructive and do not modify any data.

They are intended to detect issues early, before backups become unusable.
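A check of this kind can be built on the pgBackRest check command, which validates the repository and WAL archiving for a stanza. The sketch below is an assumption about how a wrapper might report the result, not the platform's check.sh; PGBACKREST_BIN is a hypothetical override that lets the function be exercised without pgBackRest installed.

```shell
#!/bin/sh
# Run "pgbackrest check" for a stanza and report the result.
# PGBACKREST_BIN is a hypothetical override for the binary name.
run_check() {
    stanza="$1"
    if "${PGBACKREST_BIN:-pgbackrest}" --stanza="$stanza" check; then
        echo "check ok: stanza=$stanza"
    else
        echo "check FAILED: stanza=$stanza" >&2
        return 1
    fi
}

# Example call (not executed here): run_check "${PGBACKREST_STANZA:-iop}"
```

A non-zero exit status lets a supervising process or log monitor raise an alert.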

Customizing Backup Cycles via Environment Variables

The backup worker supports cycle customization through environment variables.
This allows integrators to change backup frequency without editing scripts.

Configuration in Docker Compose

The backup worker exposes the following variables:

  • PGBACKREST_FREQUENCY
    Controls how often backup.sh is executed.
  • DATABASE_BACKUP_FREQUENCY
    Convenience variable used in Docker Compose to set PGBACKREST_FREQUENCY.

Example configuration:

environment:
  PGBACKREST_STANZA: ${PGBACKREST_STANZA:-iop}
  PGBACKREST_FREQUENCY: ${DATABASE_BACKUP_FREQUENCY:-3600}
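
On the worker side, a script could resolve and validate the value roughly as follows. This is a sketch under assumptions: the helper name resolve_frequency is hypothetical, and treating the value as a number of seconds is inferred from the default of 3600, not stated by the platform.

```shell
#!/bin/sh
# Resolve the backup interval, falling back to 3600 when no value is given.
# Rejects anything that is not a positive integer.
resolve_frequency() {
    value="${1:-3600}"
    case "$value" in
        ''|*[!0-9]*)
            echo "invalid frequency: $value" >&2
            return 1
            ;;
    esac
    if [ "$value" -eq 0 ]; then
        echo "invalid frequency: $value" >&2
        return 1
    fi
    echo "$value"
}

# Uses PGBACKREST_FREQUENCY from the environment when it is set.
resolve_frequency "${PGBACKREST_FREQUENCY:-}"
```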

Restore and Recovery

The platform supports two restore modes: Full Restore and Point in Time Recovery.

Info: All restore operations are destructive and protected by an explicit confirmation step.

Full Restore

A full restore recreates the database from the most recent valid backup.

Typical use cases include:

  • infrastructure failures
  • corrupted database files
  • complete system recovery

A full restore replaces the entire database state with the latest available backup.

Point in Time Recovery

Point in Time Recovery restores the database to a specific timestamp.

This is useful for:

  • accidental data deletion
  • application-level errors
  • failed deployments

Point in Time Recovery is possible as long as the required WAL archives are still within the configured retention window.
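
At the pgBackRest level, a time-based restore is requested with --type=time and a --target timestamp. The sketch below builds such a command line after a simple format check. The helper name pitr_command is hypothetical, the check is stricter than what pgBackRest itself accepts, and how control.sh maps its arguments to pgBackRest is an assumption.

```shell
#!/bin/sh
# Build a pgBackRest point-in-time restore command.
# The target must look like "YYYY-MM-DD HH:MM:SS".
pitr_command() {
    stanza="$1"
    target="$2"
    case "$target" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\ [0-9][0-9]:[0-9][0-9]:[0-9][0-9]) ;;
        *)
            echo "expected target as YYYY-MM-DD HH:MM:SS, got: $target" >&2
            return 1
            ;;
    esac
    printf 'pgbackrest --stanza=%s --type=time --target="%s" restore\n' "$stanza" "$target"
}

# Print the command for the timestamp used in this documentation.
pitr_command iop "2026-01-15 21:30:00"
```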

Restore Safety Confirmation

All restore operations are protected by a mandatory confirmation token.

This mechanism prevents accidental execution of destructive restore commands.

During a restore, the following actions are performed:

  1. the database service is stopped
  2. the PostgreSQL data directory volume is cleared
  3. data is restored from the pgBackRest repository
  4. the database service is started again

Because this permanently deletes existing data, a confirmation token is required.

Default Confirmation Token

By default, the required confirmation token is:

I_UNDERSTAND_THIS_WILL_DELETE_DATABASE_DATA

The token must be passed exactly as the first argument to the restore command.

How the Confirmation Works

  • if the token is missing, the restore is blocked
  • if the token does not match exactly, the restore is blocked
  • the restore workflow continues only after an exact match

This applies to all restore types.
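
The rules above amount to an exact string comparison on the first argument. A minimal sketch, assuming a guard function at the start of the restore script (the names REQUIRED_TOKEN and confirm_restore are hypothetical):

```shell
#!/bin/sh
# Default confirmation token as documented above.
REQUIRED_TOKEN="I_UNDERSTAND_THIS_WILL_DELETE_DATABASE_DATA"

# Succeeds only when the first argument matches the token exactly.
confirm_restore() {
    if [ "${1:-}" != "$REQUIRED_TOKEN" ]; then
        echo "restore blocked: confirmation token missing or incorrect" >&2
        return 1
    fi
}

# Example call (not executed here): confirm_restore "$1" || exit 1
```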

The Docker Compose stack provides two helper services for this workflow:

  • database_wipe clears the PostgreSQL data directory
  • pgbackrest_restore runs the restore operation
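
Put together, the restore sequence could look like the dry run below, which only prints the Docker Compose commands. The service names database_wipe and pgbackrest_restore come from this documentation; the database service name "database" and the helper name restore_plan are assumptions, and the real control.sh may differ.

```shell
#!/bin/sh
# Dry run: print the restore steps instead of executing them.
# $1 = name of the database service (assumed to be "database").
restore_plan() {
    db_service="${1:-database}"
    echo "docker compose stop ${db_service}"
    echo "docker compose run --rm database_wipe"
    echo "docker compose run --rm pgbackrest_restore"
    echo "docker compose start ${db_service}"
}

restore_plan
```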

Usage Examples

Restore the latest backup:

./control.sh restore "I_UNDERSTAND_THIS_WILL_DELETE_DATABASE_DATA"

Restore to a specific point in time:

./control.sh restore "I_UNDERSTAND_THIS_WILL_DELETE_DATABASE_DATA" time "2026-01-15 21:30:00"

Operational Responsibilities

To ensure reliable backups, operators should:

  • ensure backup volumes are persistent and not deleted
  • keep pgbackrest.conf, backup.sh, and check.sh available in the mounted ./pgbackrest directory
  • monitor backup and check logs
  • verify that WAL archiving is active
  • perform restore tests periodically in a non-production environment

Backups should not be considered reliable until a restore has been tested at least once.

Common Failure Scenarios

The backup system is designed to detect common issues such as:

  • unreachable backup repository
  • broken WAL archiving
  • permission or ownership problems
  • insufficient disk space

Ignoring these issues can lead to incomplete or unusable backups.
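
Operators who want an additional, independent safeguard against the disk space scenario can add a simple check of their own. The sketch below is not part of the platform; the function names, the REPO_PATH variable, and the 1 GiB threshold are placeholders.

```shell
#!/bin/sh
# Print the free space, in KiB, of the filesystem that holds the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 KiB are free under path $1.
has_min_free_kb() {
    avail="$(free_kb "$1")"
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Example: warn when less than 1 GiB is free. REPO_PATH is a placeholder
# for wherever the repository volume is mounted on the host.
if ! has_min_free_kb "${REPO_PATH:-/}" 1048576; then
    echo "warning: low disk space on ${REPO_PATH:-/}" >&2
fi
```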

Summary

The pgBackRest-based backup system provides:

  • automated PostgreSQL backups
  • Point in Time Recovery support
  • early detection of backup failures
  • a clear operational model for customers and integrators

It is suitable for production deployments and can be adapted to different environments and retention requirements.