High Availability
Enterprise-grade high availability with automatic failover and self-healing capabilities
Pigsty uses Patroni to achieve high availability for PostgreSQL, ensuring automatic failover.
Primary failure: RTO ≈ 30s, RPO < 1MB; replica failure: RTO ≈ 0 (current connections are reset)
Overview
Pigsty's PostgreSQL clusters come with batteries-included high availability, powered by Patroni, Etcd, and HAProxy.
When you have two or more instances in a PostgreSQL cluster, it can self-heal from hardware failures without any further configuration: as long as any instance in the cluster survives, the cluster can continue to provide full service. Clients only need to connect to any node in the cluster to get the complete service, without worrying about replication topology changes.
By default, the recovery time objective (RTO) for primary failure is approximately 30s ~ 60s, and the data recovery point objective (RPO) is < 1MB; for standby failure, RPO = 0, RTO ≈ 0 (instantaneous). In consistency-first mode, zero data loss during failover is guaranteed: RPO = 0. These metrics can be configured as needed based on your actual hardware conditions and reliability requirements.
Pigsty ships an HAProxy load balancer for automatic traffic switching and offers clients multiple access methods such as DNS, VIP, and LVS. Failover and switchover are almost imperceptible to the application side apart from sporadic interruptions, meaning applications need no connection string changes or restarts.
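For example, a three-node HA cluster needs nothing more than a short inventory definition. The snippet below is an illustrative sketch; the cluster name and IP addresses are placeholders:

```yaml
# Illustrative 3-node PostgreSQL HA cluster definition
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }   # initial primary
    10.10.10.12: { pg_seq: 2, pg_role: replica }   # hot standby
    10.10.10.13: { pg_seq: 3, pg_role: replica }   # hot standby
  vars:
    pg_cluster: pg-test                            # cluster name
```

Any of the three members can accept client connections; the local HAProxy forwards read-write traffic to the current primary and read-only traffic to the replicas.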
Key Metrics
By default: primary failure RTO ≈ 30s ~ 60s, RPO < 1MB; replica failure RTO ≈ 0, RPO = 0. Both can be tuned, down to RPO = 0 with synchronous (consistency-first) mode.
What High Availability Solves
High availability addresses critical operational challenges:
Data Safety
Elevated availability: RPO ≈ 0, RTO < 30s for enhanced data protection
Rolling Maintenance
Seamless maintenance: Minimize maintenance windows for operational convenience
Hardware Failures
Self-healing: Automatic recovery from hardware failures without human intervention
Load Distribution
Read scaling: Distribute read-only queries across standby instances
Specific Benefits
- Enhanced data safety: Improves the availability aspect of data safety (the C/I/A triad) to a new level
- Rolling maintenance capabilities: Enables seamless maintenance with minimal downtime
- Hardware failure recovery: Self-healing from hardware failures without human intervention
- Load sharing: Read-only requests can be distributed across standby instances
Costs of High Availability
Implementing HA introduces certain trade-offs and requirements:
Infrastructure Requirements: HA requires at least 3 nodes and additional infrastructure dependencies.
Resource Requirements
- Minimum cluster size: At least 3 nodes for proper consensus
- Additional infrastructure: Requires consensus store (Etcd) and load balancer
- Resource overhead: Additional CPU, memory, and network resources
- Operational complexity: Increased monitoring and management requirements
Limitations
High availability cannot prevent:
- Human errors and operational mistakes
- Software defects causing data corruption
- Logical data deletion or corruption
For these scenarios, additional recovery strategies are needed:
- Delayed clusters for protection against logical corruption (see the example after this list)
- Point-in-time recovery for fine-grained data restoration
- Regular backups for disaster recovery scenarios
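As an illustration of the first option, a delayed standby cluster can be attached to an existing cluster so that accidentally deleted data can still be recovered from it. The sketch below assumes Pigsty's pg_upstream and pg_delay parameters; names and addresses are placeholders:

```yaml
# Illustrative delayed standby cluster, replaying WAL one day behind pg-test
pg-testdelay:
  hosts:
    10.10.10.14: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.11, pg_delay: 1d }
  vars:
    pg_cluster: pg-testdelay
```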
Architecture
Pigsty's HA architecture leverages a multi-component design that eliminates single points of failure:
Patroni
Cluster Management: Orchestrates PostgreSQL processes and handles automatic failover
Etcd
Consensus Store: Provides distributed configuration and leader election
HAProxy
Load Balancer: Routes traffic and provides service discovery
VIP Manager
Virtual IP: Optional Layer 2 VIP binding for seamless connectivity
Component Roles
Cluster Orchestrator
- Manages PostgreSQL server processes
- Handles automatic failover and switchover
- Monitors cluster health and topology
- Configures streaming replication
- Provides REST API for cluster management
```yaml
# Patroni configuration example (illustrative values)
scope: pg-test            # cluster name
name: pg-test-1           # instance name
bootstrap:
  dcs:
    ttl: 30               # leader lease TTL, in seconds
    loop_wait: 10         # seconds between HA loop iterations
    retry_timeout: 30     # retry timeout for DCS and PostgreSQL operations
```
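Note that in Pigsty these Patroni timing values are normally not edited by hand; they are typically derived from the cluster-level pg_rto parameter described in the Trade-Offs section below. A minimal sketch with the default value:

```yaml
# pg_rto ultimately drives Patroni's ttl / loop_wait / retry_timeout
pg_rto: 30   # failover decision window, in seconds
```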
Distributed Configuration Store
- Stores cluster configuration and state
- Provides leader election mechanism
- Ensures consistent view across all nodes
- Handles network partitions gracefully
- Maintains cluster membership information
```yaml
# Etcd cluster configuration
etcd_cluster: etcd        # etcd cluster name
etcd_safeguard: false     # safeguard to prevent purging a running etcd cluster
etcd_clean: true          # allow cleaning existing etcd data during initialization
```
Traffic Router and Load Balancer
- Routes read/write traffic to appropriate nodes
- Provides health checking for database instances
- Offers multiple service endpoints
- Handles connection pooling and load distribution
- Supports SSL termination and connection limits
```
# HAProxy service endpoints
primary:5433   # Read-write traffic to primary
replica:5434   # Read-only traffic to replicas
default:5436   # Failover-aware connection
offline:5438   # Dedicated offline queries
```
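These endpoints are rendered from service definitions. The snippet below is a hedged sketch of how the primary and replica services might be declared via the pg_default_services parameter; the field names are assumptions and may differ across Pigsty versions:

```yaml
# Sketch of service definitions (field names assumed; check your Pigsty version)
pg_default_services:
  - { name: primary , port: 5433 , dest: default , check: /primary   , selector: "[]" }
  - { name: replica , port: 5434 , dest: default , check: /read-only , selector: "[]" }
```

The check paths correspond to Patroni REST health-check endpoints, which is how HAProxy tells the current primary apart from read-only replicas.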
Virtual IP Management
- Manages Layer 2 Virtual IP addresses
- Provides seamless client connectivity
- Handles VIP migration during failover
- Supports multiple VIP interfaces
- Optional component for simplified client access
```yaml
# VIP configuration
vip_enabled: true             # enable the L2 VIP for this cluster
vip_address: 10.10.10.99/24   # virtual IP address with netmask
vip_interface: eth0           # network interface to bind the VIP to
```
Implementation
Pigsty's HA implementation follows proven patterns for PostgreSQL clustering:
Replication Architecture
Streaming Replication
PostgreSQL uses built-in streaming replication for data synchronization between primary and standby nodes.
Consensus-Based Leadership
Patroni uses Etcd for distributed consensus to elect cluster leader and manage topology changes.
Automatic Failover
When primary fails, Patroni automatically promotes the most up-to-date standby to become the new primary.
Traffic Rerouting
HAProxy detects topology changes and automatically routes traffic to the new primary instance.
Failure Scenarios
Primary Node Failure Process
- Detection: Patroni detects primary node failure (15-30 seconds)
- Leader Election: Etcd coordinates new leader selection
- Promotion: Most up-to-date standby is promoted to primary
- Reconfiguration: Remaining standbys reconfigure to new primary
- Traffic Switch: HAProxy redirects traffic to new primary
Write Service Interruption: 15-30 seconds during failover process
Standby Node Failure Process
- Detection: Immediate detection of standby failure
- Traffic Rerouting: HAProxy removes failed node from pool
- Service Continuity: Read queries continue on remaining standbys
- Automatic Recovery: Node rejoins cluster when restored
Minimal Impact: Read-only queries experience brief interruption only
Network Partition Handling
- Split-Brain Prevention: Etcd consensus prevents multiple primaries
- Quorum Requirements: Majority of nodes required for operations
- Graceful Degradation: Read-only mode in minority partitions
- Automatic Recovery: Normal operations resume when partition heals
Quorum Dependency: Requires a majority of consensus nodes to remain operational (see the example below)
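As a concrete illustration, a three-member etcd cluster keeps quorum through the loss of any single member (a five-member cluster tolerates two). A minimal sketch of such a DCS cluster, with placeholder addresses:

```yaml
# Illustrative 3-member etcd DCS cluster: a majority (2 of 3) must stay online
etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }
  vars:
    etcd_cluster: etcd
```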
Trade-Offs
Pigsty provides configurable parameters to balance between recovery speed and data consistency:
Recovery Time Objective (RTO)
The pg_rto parameter controls failover timing and sensitivity:
```yaml
# RTO Configuration
pg_rto: 30   # Default: 30 seconds
```
Lower RTO values:
- ✅ Faster failover response
- ✅ Reduced service interruption
- ❌ Higher risk of false positives
- ❌ May cause unnecessary failovers
Higher RTO values:
- ✅ More stable, fewer false alarms
- ✅ Better tolerance for network hiccups
- ❌ Longer service interruption
- ❌ Delayed response to real failures
Recovery Point Objective (RPO)
The pg_rpo parameter limits potential data loss during failover:
```yaml
# RPO Configuration
pg_rpo: 1048576   # Default: 1MB (in bytes)
```
Lower RPO values:
- ✅ Better data consistency
- ✅ Minimal data loss risk
- ❌ May delay failover
- ❌ Could impact availability
Higher RPO values:
- ✅ Faster failover process
- ✅ Better availability
- ❌ Potential for more data loss
- ❌ Consistency trade-offs
Configuration Examples
```yaml
# Zero data loss configuration
synchronous_mode: true
synchronous_mode_strict: true
pg_rpo: 0
pg_rto: 60
synchronous_standby_names: 'ANY 1 (*)'
```
Use Case: Financial systems, critical transactional data
```yaml
# Fast failover configuration
synchronous_mode: false
pg_rpo: 16777216   # 16MB
pg_rto: 15
max_replication_slots: 16
```
Use Case: High-traffic applications, read-heavy workloads
```yaml
# Default balanced configuration
synchronous_mode: false
pg_rpo: 1048576   # 1MB
pg_rto: 30
max_replication_slots: 8
```
Use Case: Most production environments
Network Quality Impact
Network conditions significantly affect HA behavior:
- High-quality networks: Can use lower RTO values safely
- Unstable networks: Require higher RTO values to prevent false positives (see the example after this list)
- WAN deployments: Need careful tuning of timeout parameters
- Local networks: Can optimize for faster failover
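For instance, on a less reliable network one might trade recovery speed for stability. The values below are illustrative, not a recommendation:

```yaml
# Illustrative conservative tuning for an unstable or WAN environment
pg_rto: 60        # wider failure-detection window to avoid spurious failovers
pg_rpo: 1048576   # keep the default 1MB replication-lag tolerance
```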
Monitoring and Observability
Pigsty provides comprehensive monitoring for HA cluster health:
Key Metrics
Cluster State
Monitor cluster topology, leader status, and member health
Replication Lag
Track replication lag and sync status across all replicas
Failover Events
Log and analyze failover events and their impact
Performance
Monitor query performance and connection health
Dashboard Integration
Pigsty includes pre-built Grafana dashboards for HA monitoring:
- Cluster Overview: Real-time cluster topology and health
- Replication Monitoring: Lag metrics and sync status
- Failover Analysis: Historical failover events and timing
- Performance Metrics: Query performance during normal and failover scenarios
Best Practices
Deployment Recommendations
Anti-Affinity: Deploy cluster nodes across different physical hosts, racks, or availability zones (see the sketch after this list).
- Hardware diversity: Use different hardware configurations to avoid common failure modes
- Network redundancy: Ensure multiple network paths between cluster nodes
- Storage considerations: Use local storage for best performance, shared storage for specific use cases
- Monitoring setup: Implement comprehensive monitoring before going to production
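As a sketch of the anti-affinity recommendation above, the members of a single cluster can be placed on hosts in different racks or availability zones. The placement comments and addresses below are purely illustrative:

```yaml
# Illustrative placement: one cluster member per rack / availability zone
pg-prod:
  hosts:
    10.10.10.21: { pg_seq: 1, pg_role: primary }   # rack A / zone 1
    10.10.10.22: { pg_seq: 2, pg_role: replica }   # rack B / zone 2
    10.10.10.23: { pg_seq: 3, pg_role: replica }   # rack C / zone 3
  vars:
    pg_cluster: pg-prod
```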
Operational Guidelines
- Regular testing: Perform controlled failover tests in non-production environments
- Capacity planning: Size cluster nodes appropriately for failover scenarios
- Backup strategy: Maintain regular backups independent of HA setup
- Documentation: Keep runbooks updated for emergency procedures
Common Pitfalls
Avoid These Common Mistakes:
- Insufficient network bandwidth between nodes
- Inadequate monitoring of replication lag
- Not testing failover procedures regularly
- Incorrect firewall configurations
Summary
Pigsty's high availability solution provides:
- Automatic failover with sub-minute RTO
- Configurable consistency with RPO control
- Self-healing capabilities for hardware failures
- Load balancing for read scaling
- Minimal operational overhead with automated management
The combination of Patroni, Etcd, and HAProxy creates a robust, production-ready HA solution that handles the majority of failure scenarios automatically while providing the flexibility to tune behavior based on specific requirements.
High availability is not just about technology—it's about building resilient systems that your business can depend on.