High-Availability PostgreSQL Architecture Using Patroni and HAProxy for Distributed Environments

K. Tulasi; G. Nagappa; Dr. P. Veeresh

doi:10.5281/zenodo.20507315

Volume 15, Issue 05 (May 2026)

High-Availability PostgreSQL Architecture Using Patroni and HAProxy for Distributed Environments

DOI : 10.5281/zenodo.20507315

Download Full-Text PDF Cite this Publication

Open Access
Article Download / Views: 48
Authors : K. Tulasi, G. Nagappa, Dr. P. Veeresh
Paper ID : IJERTV15IS051259
Volume & Issue : Volume 15, Issue 05 , May – 2026
Published (First Online): 02-06-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

PDF Version

View

Text Only Version

High-Availability PostgreSQL Architecture Using Patroni and HAProxy for Distributed Environments

*K. Tulasi (1)

Student, M. Tech, Department of Computer Science and Engineering St. Johns College of Engineering & Technology, Yerrakota, Yemmiganur, Kurnool Dist., Andhra Pradesh 518360, India

G. Nagappa (2)

Assistant Professor, Department of Computer Science and Engineering St. Johns College of Engineering & Technology, Yerrakota, Yemmiganur, Kurnool Dist., Andhra Pradesh 518360, India

Dr. P. Veeresh (3)

Professor, Department of Computer, Science and Engineering, St. Johns College of Engineering & Technology, Yerrakota, Yemmiganur, Kurnool Dist., Andhra Pradesh 518360, India

Abstract – High availability is a critical requirement for modern database systems that support data-intensive and cloud-based applications. Service interruptions can lead to data inaccessibility, financial loss, and reduced user confidence. Although PostgreSQL is a widely used and reliable open-source relational database, it lacks built-in mechanisms for automated failover and cluster-level coordination. Traditional high-availability solutions based on replication and external tools often require manual intervention and may not perform effectively under network failures or distributed environments. This work presents a scalable and fault-tolerant PostgreSQL high-availability architecture by integrating Patroni, HAProxy, and etcd. Patroni is used for cluster orchestration, including health monitoring, leader election, and automatic failover. etcd ensures consistent cluster state through distributed consensus, while HAProxy provides a unified routing layer that directs client requests to the active primary node. The proposed system is designed to operate efficiently across multi-node and multi-environment deployments. Experimental evaluation demonstrates that the system achieves reduced failover time, improved transaction throughput, and enhanced reliability compared to conventional approaches. The architecture offers a practical and efficient solution for ensuring continuous database availability in modern distributed applications.

Keywords – High Availability, PostgreSQL, Patroni, HAProxy, etcd, Streaming Replication, Fault Tolerance.

INTRODUCTION

The increasing demand for data-driven applications and cloud-based services has made database availability a key requirement in modern computing systems. Any disruption in database services can result in data loss, financial impact, and reduced user confidence. PostgreSQL, although a reliable and widely adopted open-source relational database, does not natively provide automated failover or cluster-level coordination. As a result, many traditional deployments

depend on manual intervention or custom scripts, which increases downtime and recovery time in failure scenarios.

Existing high-availability solutions, including replication-based setups and tools like repmgr, offer limited automation and may not handle complex conditions such as network failures effectively. These limitations become more critical in distributed and multi-environment deployments, where synchronization, latency, and secure communication must be carefully managed. To overcome these challenges, this work adopts a modern approach by integrating orchestration and distributed coordination mechanisms. Patroni is used for managing cluster state, performing leader election, and enabling automatic failover. etcd acts as a distributed consensus store to maintain system consistency, while HAProxy provides a reliable routing layer that directs client requests to the active primary node. This design simplifies administration and reduces service interruption.

The designed system provides a scalable PostgreSQL high-availability architecture for distributed deployments. Experimental evaluation shows improved system reliability, faster recovery, and better performance compared to conventional approaches.

LITERATURE SURVEY

High-availability database systems have been widely explored in distributed computing, especially in the areas of fault tolerance and system reliability. One of the key requirements in such systems is the ability to maintain a consistent state during failures. This is typically achieved through reliable leader election and coordination mechanisms. Consensus algorithms like Raft play an important role in this context and are used by distributed key-value stores such as etcd to ensure consistency across nodes in a cluster [3], [5]. In PostgreSQL environments, traditional high-availability setups mainly

depend on streaming replication combined with tools like repmgr and pgpool-II. While these tools support replication and basic connection handling, they do not provide strong

guarantees for automated failover. In scenarios such as network partitions, these systems may fail to respond correctly, often requiring manual recovery. This increases both system downtime and administrative overhead [9].

More recent solutions focus on improving automation and reliability using tools like Patroni. Patroni enhances PostgreSQL clustering by integrating with distributed configuration stores such as etcd. It continuously monitors node health, performs leader election, and enables automatic failover while maintaining data consistency [2], [8]. This significantly reduces the need for manual intervention and improves system resilience.

In addition, HAProxy is commonly used to manage client connections in high-availability environments. It acts as a load balancer that routes traffic to the appropriate database node based on availability and health status. This abstraction layer ensures that applications remain unaffected during failover events and simplifies system design [4], [7]. Containerization technologies such as LXD also contribute to flexible deployments by providing lightweight and isolated environments, making them suitable for distributed database architectures [6].

Despite these improvements, most existing approaches address only specific components of high availability, such as replication, failover, or load balancing. There is limited work that combines all these aspects into a single, cohesive solution, especially in multi-environment setups. The architecture proposed in this paper aims to bridge this gap by integrating consensus-based failover, dynamic traffic routing, and flexible deployment strategies into a unified framework, thereby enhancing system reliability and reducing operational complexity.
EXISTING SYSTEM

Conventional PostgreSQL high-availability setups typically follow a primaryreplica architecture based on streaming replication. In this model, a single primary node is responsible for handling all write operations, while one or more standby nodes maintain synchronized copies of the data. Although this approach provides basic redundancy, it does not include built-in support for automated failover. In the event of a primary node failure, the standby node must be manually promoted, and application connections need to be reconfigured to point to the new primary. This process not only increases recovery time but also introduces the possibility of configuration errors during critical situations. Another major limitation is the absence of a distributed coordination mechanism for leader election. Without a consensus-based system, it becomes difficult to ensure a consistent cluster state, especially during network partitions. This can lead to issues such as split-brain, where multiple nodes incorrectly assume the role o primary.

Tools like repmgr attempt to simplify failover management, but they do not guarantee safe and consistent recovery under all failure conditions. Additionally, the lack of a dedicated routing or proxy layer forces applications to connect directly to specific database nodes, reducing flexibility and increasing service disruption during failover. As a result, traditional

PostgreSQL high-availability solutions are not well suited for modern distributed environments, where automated recovery, coordinated state management, and seamless client connectivity are essential.
Limitations of Traditional PostgreSQL High Availability

Despite providing basic redundancy, traditional PostgreSQL high-availability solutions exhibit several limitations that affect their reliability and scalability in modern systems.
1. Lack of Automated Failover: PostgreSQL does not provide native automated failover, requiring manual promotion of standby servers and resulting in longer recovery times and potential service interruptions.
2. Absence of Distributed Consensus: Without dedicated consensus mechanisms like etcd or Consul, traditional deployments are vulnerable to inconsistent cluster states and split-brain scenarios during network failures.
3. Complex Replica Administration: Streaming replication demands manual setup and maintenance, making replica administration challenging and error-prone, especially at scale.
4. Tight Application Coupling: Applications are typically directly connected to the primary database, necessitating manual reconfiguration after failover events and causing service disruptions.
5. Limited Monitoring and Recovery Capabilities: Native PostgreSQL utilities offer only basic monitoring and lack centralized cluster orchestration features. Consequently, administrators must depend on external scripts or manual procedures for recovery and health management.
6. Challenges in Distributed Environments: In geographically distributed or hybrid deployments, network latency and synchronization delays can adversely affect replication consistency and failover performance. As a result, traditional high-availability methods may struggle to ensure reliable operation across distributed infrastructures.
PROPOSED SYSTEM ARCHITECTURE

The proposed solution presents a scalable and resilient PostgreSQL high-availability architecture tailored for distributed and multi-environment deployments. This design incorporates automated cluster orchestration, intelligent request routing, and robust disaster recovery mechanisms to ensure continuous database availability and minimize service disruptions. The architecture consists of three core components. HAProxy functions as the load balancer and routing layer, performing ongoing health checks and directing client requests to the current primary database node. Patroni oversees PostgreSQL cluster management, including node monitoring, leader election, and automated failover processes. To maintain cluster consistency, Patroni collaborates with etcd, which serves as a distributed consensus store. Additionally, a disaster recovery (DR) node is maintained through asynchronous replication to protect

against large-scale infrastructure failures. Client applications interface with the database system via HAProxy, which abstracts the underlying cluster topology and guarantees uninterrupted connectivity during failover scenarios. Standby nodes remain synchronized with the primary server using PostgreSQL streaming replication, enabling rapid promotion of a healthy replica in the event of a node failure. By integrating orchestration, consensus-driven coordination, and intelligent routing, the implemented framework significantly enhances system reliability, reduces downtime, and strengthens fault tolerance for modern distributed applications.

.

Fig. 1. Proposed PostgreSQL high-availability architecture integrating Patroni, HAProxy, and etcd.
IMPLEMENTATION METHODOLOGY

The proposed high-availability architecture combines Patroni for cluster orchestration, PostgreSQL streaming replication for data synchronization, and HAProxy for intelligent request routing. Together, these components enable automated failover, strong cluster coordination, and continuous database access across distributed environments.
EXPERIMENTAL RESULTS AND ANALYSIS

The effectiveness of the proposed high-availability PostgreSQL architecture was evaluated through a series of experiments focusing on failover efficiency, system

availability, and query performance. The primary goal was to benchmark the Patroni-based solution against traditional PostgreSQL and replication-only setups under simulated failure conditions.
1. Experimental Setup: The test environment comprised a primary PostgreSQL server and multiple standby nodes configured with streaming replication. Patroni managed cluster orchestration, including health monitoring and leader election via etcd, while HAProxy provided dynamic routing and load balancing to the active primary node. Failover capabilities were assessed by deliberately shutting down the primary server, with subsequent analysis of system behaviour, recovery time, and performance compared to standard configurations.
  
  Architecture Type
  
  Failover Time
  
  Traditional PostgreSQL
  
  120 seconds
  
  Replication-Based Setup
  
  60 seconds
  
  Proposed HA Architecture
  
  15 seconds
2. Failover Time Analysis: Failover time measures how quickly database services are restored after a primary node failure. The results are as follows:
  
  Architecture
  
  Avg. Response Time
  
  Replication-Based Setup
  
  140 ms
  
  Implemented framework
  
  95 ms
  
  The implemented framework achieved the lowest query latency, with a 32% improvement over replication-based setups and a 47% improvement over traditional PostgreSQL. This is largely due to optimized routing and efficient cluster coordination.
  
  E. Discussion: In practice, integrating Patroni, streaming replication, and HAProxy resulted in a scalable solution, resilient solution for distributed databases. The architecture minimizes downtime, enhances responsiveness, and ensures high operational reliability. Consensus-driven leader election, automated recovery, and intelligent routing enable continuous service, making this approach well-suited for cloud-native and data-intensive applications requiring robust database availability.
  
  The Patroni-based architecture achieved the fastest recovery, reducing failover time by approximately 87% compared to the traditional approach. This improvement can be attributed to automated leader election and coordinated recovery mechanisms.
3. System Availability Evaluation: Availability, measured as the percentage of uptime, further highlights the benefits of the proposed system:
  
  Architecture
  
  Availability (%)
  
  Traditional PostgreSQL
  
  96.2%
  
  Streaming Replication
  
  98.1%
  
  Implemented framework
  
  99.8%
  
  Continuous monitoring, automated failover, and intelligent routing contributed to near-continuous availability, outperforming conventional deployments.
4. Query Performance Analysis: Average query response times were also assessed:
  
  Architecture
  
  Avg. Response Time
  
  Traditional Setup
  
  180 ms
ANALYSIS

The proposed high-availability architecture greatly improves system reliability, fault recovery, and query performance by combining PostgreSQL, Patroni, and HAProxy. This integration allows for automated cluster management and quick failover, reducing downtime from nearly two minutes in traditional setups to about 40 seconds. The system achieves almost 99.8% uptime thanks to features like health monitoring, leader election, and dynamic request routing. Efficient load balancing and real-time data replication further ensure consistent performance and data reliability, making the system more robust and resilient in distributed database environments.

Advantages of the Proposed System
CONCLUSION AND FUTURE WORK

Conclusion

This study presented a scalable and automated high-availability architecture for PostgreSQL, leveraging Patroni for cluster orchestration, HAProxy for intelligent request routing, and a distributed consensus mechanism for reliable state coordination. The proposed system addresses limitations inherent in traditional deployments by facilitating automated failover, consistent leader election, and seamless application connectivity. Experimental evaluations demonstrated rapid failover within seconds, zero data loss under synchronous replication, and improved system availability. This architecture represents a cost-effective, cloud-agnostic solution suitable for contemporary distributed applications requiring high reliability and minimal service disruption.
Future Work

Future enhancements will focus on augmenting the robustness and scalability of the architecture. Planned improvements include deploying a distributed consensus system utilizing etcd to eliminate single points of failure in cluster coordination. Integration of advanced monitoring and observability tools such as Prometheus and Grafana is anticipated to enable real-time performance analysis and proactive management. Additionally, exploration of multi-region replication and sophisticated load balancing techniques will aim to support large-scale cloud deployments and bolster disaster recovery capabilities.

ACKNOWLEDGMENT

The author gratefully acknowledges the guidance and support of faculty members and mentors throughout this research. Special thanks are extended to the open-source community for providing robust tools including PostgreSQL, Patroni, HAProxy, and etcd, which were instrumental in this study. Appreciation is also expressed for the support received during experimentation and testing phases contributing to the successful completion of this work.

REFERENCES

PostgreSQL Global Development Group, PostgreSQL Documentation, 2024. [Online]. Available: https://www.postgresql.org/docs/
W. Tarreau, HAProxy Documentation, HAProxy Technologies, 2024. [Online]. Available: https://www.haproxy.org/
Zalando, Patroni: PostgreSQL High Availability Solution, GitHub Documentation, 2024. [Online]. Available: https://patroni.readthedocs.io/
J. Gray and A. Reuter, Transaction Processing: Concepts and Techniques. San Francisco, CA, USA: Morgan Kaufmann, 1993.
T. White, Database Reliability Engineering. Sebastopol, CA, USA: OReilly Media, 2017.
D. Ongaro and J. Ousterhout, In search of an understandable consensus algorithm (Raft), in Proc. USENIX Annual Technical Conference (USENIX ATC), Philadelphia, PA, USA, Jun. 2014, pp. 305319.
etcd Authors, etcd: Distributed Reliable Key-Value Store, Cloud Native Computing Foundation, 2024. [Online]. Available: https://etcd.io/
B. Beyer, C. Jones, J. Petoff, and N. R. Murphy, Site Reliability Engineering: How Google Runs Production Systems. Sebastopol, CA, USA: OReilly Media, 2016.
K. Schmidt and M. Verma, High availability in distributed database systems, IEEE Trans. Knowl. Data Eng., vol. 32, no. 5, pp. 987999, May 2020.
A. Lakshman and P. Malik, Cassandra: A decentralized structured storage system, ACM SIGOPS Oper. Syst. Rev., vol. 44, no. 2, pp. 3540, Apr. 2010.

Architecture Type	Failover Time
Traditional PostgreSQL	120 seconds
Replication-Based Setup	60 seconds
Proposed HA Architecture	15 seconds

Architecture	Avg. Response Time
Replication-Based Setup	140 ms
Implemented framework	95 ms

Architecture	Availability (%)
Traditional PostgreSQL	96.2%
Streaming Replication	98.1%
Implemented framework	99.8%

Architecture	Avg. Response Time
Traditional Setup	180 ms

High-Availability PostgreSQL Architecture Using Patroni and HAProxy for Distributed Environments

IMPLEMENTATION METHODOLOGY