Fault tolerant PostgreSQL cluster. Part 2. Replication. Repmgr.


Part 1. Introduction.
Part 2. Replication. Repmgr.
Part 3. Auto-failover. Repmgr daemons.
Part 4. HA. Backup. Repmgr daemons. Barman.
Part 5. Testing. Ansible. Final thoughts.

At this part, we will setup master (db1) and standby (db2) replication servers using repmgr.

First, create repmgr user and database on db1 server.

su - postgres
createuser -s repmgr
psql -c "alter user repmgr with password 'hdf456hw';"
createdb repmgr -O repmgr
psql -c 'alter user repmgr set search_path TO repmgr, "$user", public;'

Prepare PostgreSQL instances for replication. Add this lines to PostgreSQL config on db1 and db2 servers.
/etc/postgresql/11/main/postgresql.conf

listen_addresses = '*'
max_wal_senders = 5
wal_keep_segments = 5
wal_level = replica
fsync = on
wal_log_hints = on
archive_mode = on
archive_command = '/bin/true'
hot_standby = on
shared_preload_libraries = 'repmgr'

listen_addresses = ‘*’ will allow you to connect to PostgreSQL using all server addresses, including localhost and OpenVPN local address (10.8.X.1). It means, server socket will be open for all connections from internet. It’s not a critical issue for a testing purposes, because pg_hba file will restrict such connections by IP range. But for a production environment it’s just a new DDOS entry point, so please, always use firewall (like iptables) and keep your listening interfaces strict.
max_wal_senders = 5 specify how much WAL sender processes can spawn. We need just two – one for standby and one for backup server. I set 5 in case we will want to fast expand our cluster, or one of processes will hang smells illogical.
wal_keep_segments = 5 sets how much old, currently needless WAL files will guaranteed to keep. Standby server use master base backup and WAL files (retrieved online) for deploying. If your system is under load, it’s necessary to keep some.
wal_level = replica – write enough WAL information to enable replication.
fsync = on – make sure data actually written to disk.
wal_log_hints = on – among others, allow to use pg_rewind.
archive_mode = on and archive_command = ‘/bin/true’ – enable WAL archiving and simply remove unnecessary WALs. We don’t need to archive them, because we will implement barman streaming backup soon.
hot_standby = on – enable hot standby if server is actually in standby mode.

Now, we should allow repmgr access PostgreSQL instances without password. Add the following lines to beginning of pg_hba.conf on db1 and db2.
/etc/postgresql/11/main/pg_hba.conf

local replication repmgr md5
host replication repmgr 127.0.0.1/32 md5
host replication repmgr 10.8.1.1/32 md5
host replication repmgr 10.8.2.1/32 md5

local repmgr repmgr md5
host repmgr repmgr 127.0.0.1/32 md5
host repmgr repmgr 10.8.1.1/32 md5
host repmgr repmgr 10.8.2.1/32 md5

IMPORTANT. Repmgr is a superuser! Don’t use trust auth method, which is used on many how-to’s over the internet. Is not safe, because any OS user can login to database without a password from specified machines.

Restart PostgreSQL.

systemctl restart postgresql

Add pgpass file on both servers, so repmgr can connect to database without a password.

su - postgres
echo "*:*:*:repmgr:hdf456hw" > ~/.pgpass
chmod 600 ~/.pgpass

Create db1 repmgr config.
/etc/repmgr.conf

node_id=1
node_name=db1
conninfo='host=10.8.1.1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/11/main'
pg_bindir=/usr/lib/postgresql/11/bin/

service_start_command='sudo pg_ctlcluster 11 main start'
service_stop_command='sudo pg_ctlcluster 11 main stop'
service_restart_command='sudo pg_ctlcluster 11 main restart'
service_reload_command='sudo pg_ctlcluster 11 main reload'

log_facility=LOCAL0

Create db2 repmgr config.
/etc/repmgr.conf

node_id=2
node_name=db2
conninfo='host=10.8.2.1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/11/main'
pg_bindir=/usr/lib/postgresql/11/bin/

service_start_command='sudo pg_ctlcluster 11 main start'
service_stop_command='sudo pg_ctlcluster 11 main stop'
service_restart_command='sudo pg_ctlcluster 11 main restart'
service_reload_command='sudo pg_ctlcluster 11 main reload'

log_facility=LOCAL0

Debian-based distros controls PostgreSQL using pg_ctlcluster util. We should allow postgres user to run it without a password. Add this to sudoers.
/etc/sudoers.d/pg_sudoers

Defaults:postgres !requiretty
postgres ALL = NOPASSWD: /usr/bin/pg_ctlcluster

Turn db1 to master server.

su - postgres
repmgr -f /etc/repmgr.conf primary register

Check db1 act as primary.

repmgr -f /etc/repmgr.conf cluster show

Turn db2 to standby server.

su - postgres
sudo pg_ctlcluster 11 main stop
rm -r /var/lib/postgresql/11/main #move, but not remove, if you need to keep old data
repmgr -h 10.8.1.1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone -F
sudo pg_ctlcluster 11 main start
repmgr standby register

Check replication cluster works.

repmgr -f /etc/repmgr.conf cluster show

For now, replication is asynchronous. We will make it synchronous, once barman will be setted up. Regardless, you can now promote and follow nodes using repmgr manually, but we will automate this process.

Leave a Comment

Your email address will not be published. Required fields are marked *