Part 1. Introduction.
Part 2. Replication. Repmgr.
Part 3. Auto-failover. Repmgr daemons.
Part 4. HA. Backup. Repmgr daemons. Barman.
Part 5. Testing. Ansible. Final thoughts.
At this part, we will setup master (db1) and standby (db2) replication servers using repmgr.
First, create repmgr user and database on db1 server.
su - postgres createuser -s repmgr psql -c "alter user repmgr with password 'hdf456hw';" createdb repmgr -O repmgr psql -c 'alter user repmgr set search_path TO repmgr, "$user", public;'
Prepare PostgreSQL instances for replication. Add this lines to PostgreSQL config on db1 and db2 servers.
listen_addresses = '*' max_wal_senders = 5 wal_keep_segments = 5 wal_level = replica fsync = on wal_log_hints = on archive_mode = on archive_command = '/bin/true' hot_standby = on shared_preload_libraries = 'repmgr'
listen_addresses = ‘*’ will allow you to connect to PostgreSQL using all server addresses, including localhost and OpenVPN local address (10.8.X.1). It means, server socket will be open for all connections from internet. It’s not a critical issue for a testing purposes, because pg_hba file will restrict such connections by IP range. But for a production environment it’s just a new DDOS entry point, so please, always use firewall (like iptables) and keep your listening interfaces strict.
max_wal_senders = 5 specify how much WAL sender processes can spawn. We need just two – one for standby and one for backup server. I set 5 in case we will want to fast expand our cluster, or one of processes will hang
wal_keep_segments = 5 sets how much old, currently needless WAL files will guaranteed to keep. Standby server use master base backup and WAL files (retrieved online) for deploying. If your system is under load, it’s necessary to keep some.
wal_level = replica – write enough WAL information to enable replication.
fsync = on – make sure data actually written to disk.
wal_log_hints = on – among others, allow to use pg_rewind.
archive_mode = on and archive_command = ‘/bin/true’ – enable WAL archiving and simply remove unnecessary WALs. We don’t need to archive them, because we will implement barman streaming backup soon.
hot_standby = on – enable hot standby if server is actually in standby mode.
Now, we should allow repmgr access PostgreSQL instances without password. Add the following lines to beginning of pg_hba.conf on db1 and db2.
local replication repmgr md5 host replication repmgr 127.0.0.1/32 md5 host replication repmgr 10.8.1.1/32 md5 host replication repmgr 10.8.2.1/32 md5 local repmgr repmgr md5 host repmgr repmgr 127.0.0.1/32 md5 host repmgr repmgr 10.8.1.1/32 md5 host repmgr repmgr 10.8.2.1/32 md5
IMPORTANT. Repmgr is a superuser! Don’t use trust auth method, which is used on many how-to’s over the internet. Is not safe, because any OS user can login to database without a password from specified machines.
systemctl restart postgresql
Add pgpass file on both servers, so repmgr can connect to database without a password.
su - postgres echo "*:*:*:repmgr:hdf456hw" > ~/.pgpass chmod 600 ~/.pgpass
Create db1 repmgr config.
node_id=1 node_name=db1 conninfo='host=10.8.1.1 user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/11/main' pg_bindir=/usr/lib/postgresql/11/bin/ service_start_command='sudo pg_ctlcluster 11 main start' service_stop_command='sudo pg_ctlcluster 11 main stop' service_restart_command='sudo pg_ctlcluster 11 main restart' service_reload_command='sudo pg_ctlcluster 11 main reload' log_facility=LOCAL0
Create db2 repmgr config.
node_id=2 node_name=db2 conninfo='host=10.8.2.1 user=repmgr dbname=repmgr connect_timeout=2' data_directory='/var/lib/postgresql/11/main' pg_bindir=/usr/lib/postgresql/11/bin/ service_start_command='sudo pg_ctlcluster 11 main start' service_stop_command='sudo pg_ctlcluster 11 main stop' service_restart_command='sudo pg_ctlcluster 11 main restart' service_reload_command='sudo pg_ctlcluster 11 main reload' log_facility=LOCAL0
Debian-based distros controls PostgreSQL using pg_ctlcluster util. We should allow postgres user to run it without a password. Add this to sudoers.
Defaults:postgres !requiretty postgres ALL = NOPASSWD: /usr/bin/pg_ctlcluster
Turn db1 to master server.
su - postgres repmgr -f /etc/repmgr.conf primary register
Check db1 act as primary.
repmgr -f /etc/repmgr.conf cluster show
Turn db2 to standby server.
su - postgres sudo pg_ctlcluster 11 main stop rm -r /var/lib/postgresql/11/main #move, but not remove, if you need to keep old data repmgr -h 10.8.1.1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone -F sudo pg_ctlcluster 11 main start repmgr standby register
Check replication cluster works.
repmgr -f /etc/repmgr.conf cluster show
For now, replication is asynchronous. We will make it synchronous, once barman will be setted up. Regardless, you can now promote and follow nodes using repmgr manually, but we will automate this process.