Part 1. Introduction.
Part 2. Replication. Repmgr.
Part 3. Auto-failover. Repmgr daemons.
Part 4. HA. Backup. Repmgr daemons. Barman.
Part 5. Testing. Ansible. Final thoughts.
At this part, we will create two repmgr daemons on backup server and setup auto-failover.
Why not place daemons on db servers? Because in this case we don’t have voting majority, which dispose to split-brain. If db1 losing connection with db2, we get unnecessary switchover and two masters. Since we are using backup server as the only entry point, we can switch servers only if we actually need it. It’s not a split-brainless solution, but much better.
First, we need to create passwordless ssh connection to db1 and db2 servers, letting repmgr daemons automatically switchover them. ssh-copy-id will ask you for a postgres user password we changed at Part 1.
Execute this on backup server.
su - postgres ssh-keygen -b 2048 -t rsa -N "" -C "postgres@10.8.3.1" ssh-copy-id postgres@10.8.1.1 ssh-copy-id postgres@10.8.2.1
Let’s allow repmgr to connect to our PostgreSQL instances without password. Add this line to beginning of pg_hba.conf on db1 and db2./etc/postgresql/11/main/pg_hba.conf
host repmgr repmgr 10.8.3.1/32 md5
Restart PostgreSQL.
systemctl restart postgresql
Add pgpass file on backup server, so repmgr daemons can connect to database without a password.
su - postgres echo "*:*:*:repmgr:hdf456hw" > ~/.pgpass chmod 600 ~/.pgpass
Disable OpenVPN and PostgreSQL autostart on db1 and db2 servers. Why? If, after reboot, OpenVPN will start automatically, repmgr daemon will wait forever until PostgreSQL start, which means no auto switchover. If PostgreSQL will start automatically, you will get an instant split-brain. We will learn how to properly return PostgreSQL instance back to cluster in Part 5.
update-rc.d openvpn disable update-rc.d postgresql disable
Let’s create two repmgr configs on backup server./etc/repmgrd_db1.conf
node_id=1 node_name=db1 conninfo='host=10.8.1.1 user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/11/main' failover=automatic promote_command='ssh postgres@10.8.1.1 /usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file' follow_command='ssh postgres@10.8.1.1 /usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n' service_promote_command='ssh postgres@10.8.1.1 /usr/lib/postgresql/11/bin/pg_ctl -w -D /var/lib/postgresql/11/main promote' monitoring_history=yes monitor_interval_secs=2 log_facility=LOCAL0
/etc/repmgrd_db2.conf
node_id=2 node_name=db2 conninfo='host=10.8.2.1 user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/11/main' failover=automatic promote_command='ssh postgres@10.8.2.1 /usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file' follow_command='ssh postgres@10.8.2.1 /usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n' service_promote_command='ssh postgres@10.8.2.1 /usr/lib/postgresql/11/bin/pg_ctl -w -D /var/lib/postgresql/11/main promote' monitoring_history=yes monitor_interval_secs=2 log_facility=LOCAL0
Create two systemd daemon configs./etc/systemd/system/repmgrd_db1.service
[Unit] Description=repmgrd_db1 After=syslog.target After=network.target [Service] Type=forking Restart=on-failure RestartSec=10 User=postgres Group=postgres TimeoutStopSec=5 ExecStart=/usr/lib/postgresql/11/bin/repmgrd -d --config-file /etc/repmgrd_db1.conf --pid-file=/var/run/repmgr/repmgrd_db1.pid StandardOutput=syslog StandardError=syslog PIDFile=/var/run/repmgr/repmgrd_db1.pid [Install] WantedBy=multi-user.target
/etc/systemd/system/repmgrd_db2.service
[Unit] Description=repmgrd db2 After=syslog.target After=network.target [Service] Type=forking Restart=on-failure RestartSec=10 User=postgres Group=postgres TimeoutStopSec=5 ExecStart=/usr/lib/postgresql/11/bin/repmgrd -d --config-file /etc/repmgrd_db2.conf --pid-file=/var/run/repmgr/repmgrd_db2.pid StandardOutput=syslog StandardError=syslog PIDFile=/var/run/repmgr/repmgrd_db2.pid [Install] WantedBy=multi-user.target
Now, create PID directory config./usr/lib/tmpfiles.d/repmgr.conf
d /run/repmgr 2775 postgres postgres -
Create PID directory.
systemd-tmpfiles --create
Enable and restart both daemons.
systemctl enable repmgrd_db1 systemctl restart repmgrd_db1 systemctl enable repmgrd_db2 systemctl restart repmgrd_db2
Check daemons running.
ps aux | grep repmgrd
Check cluster is running using postgres user.
su - postgres repmgr -f /etc/repmgrd_db1.conf cluster show repmgr -f /etc/repmgrd_db2.conf cluster show
At the Part 4 we will set up barman and make cluster high available.