Fault tolerant PostgreSQL cluster. Part 5. Testing. Ansible. Final thoughts.


Part 1. Introduction.
Part 2. Replication. Repmgr.
Part 3. Auto-failover. Repmgr daemons.
Part 4. HA. Backup. Repmgr daemons. Barman.
Part 5. Testing. Ansible. Final thoughts.

Manual switchover

On the backup server, run:

su - postgres
repmgr standby switchover -f /etc/repmgrd_db2.conf --log-to-file
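
Recent repmgr versions also support a --dry-run mode that checks the switchover prerequisites without actually performing it; it is worth running before the real thing:

repmgr standby switchover -f /etc/repmgrd_db2.conf --dry-run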

Auto-failover

db2 is now the master. Gracefully shut down the db2 server.

shutdown -h now

Wait for db1 to be promoted. Run this on the backup server:

tail -50 /var/log/syslog | grep repmgr
repmgrd[6872]: unable to ping "host=10.8.2.1 user=repmgr dbname=repmgr connect_timeout=2"
repmgrd[6872]: unable to connect to upstream node "db2" (node ID: 2)
repmgrd[6872]: checking state of node 2, 1 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 2 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 3 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 4 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 5 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 6 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: unable to reconnect to node 2 after 6 attempts
repmgrd[6872]: primary and this node have the same location ("default")
repmgrd[6872]: no other sibling nodes - we win by default
repmgrd[6872]: this node is the only available candidate and will now promote itself
repmgrd[6872]: promote_command is: "ssh postgres@10.8.1.1 /usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file"
repmgrd[6872]: executing notification command for event "repmgrd_failover_promote"
repmgrd[6872]: 0 followers to notify
repmgrd[6872]: switching to primary monitoring mode
repmgrd[6872]: executing notification command for event "repmgrd_reload"
repmgrd[6872]: monitoring cluster primary "db1" (node ID: 1)

Check the cluster:

su - postgres
repmgr -f /etc/repmgrd_db1.conf cluster show
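
If the failover completed, db1 should now be listed as the primary and the dead db2 marked as failed. The output will look roughly like this (exact columns vary between repmgr versions; connection strings are omitted here):

 ID | Name | Role    | Status    | Upstream | Location
----+------+---------+-----------+----------+----------
 1  | db1  | primary | * running |          | default
 2  | db2  | primary | - failed  |          | default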

Return the ex-master to the cluster

db1 is the master again. Let's return db2 to the cluster.

First, start the db2 server.
Second, start OpenVPN.

systemctl start openvpn

Third, rejoin the db2 server.

su - postgres
repmgr -f /etc/repmgr.conf node rejoin -d "postgres://repmgr@10.8.1.1:5432/repmgr" --force-rewind

Repmgr will use the pg_rewind tool, so rejoining is much faster than a full clone.
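
Note that pg_rewind only works when the cluster was initialized with data checksums or when wal_log_hints is enabled. If it is not already set from the earlier parts of this series, verify it in postgresql.conf on both db servers:

wal_log_hints = on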

IMPORTANT. Make sure the repmgr daemon has reconnected to the freshly recovered server every time, before you degrade the cluster again.

tail -50 /var/log/syslog | grep repmgr
repmgrd[10061]: monitoring primary node "db2" (node ID: 2) in normal state
repmgrd[10061]: attempting to reconnect to node "db2" (ID: 2)
repmgrd[10061]: reconnected to node "db2" (ID: 2)
repmgrd[10061]: node is now a standby, switching to standby monitoring
repmgrd[10061]: executing notification command for event "repmgrd_standby_reconnect"

Return the corrupted ex-master to the cluster

OK, imagine the database files on the db2 server were corrupted, but the OS still works. Recovery is very similar to creating a standby server.
Start OpenVPN.

systemctl start openvpn

Perform the cloning and re-register the standby. Run the following commands:

su - postgres
rm -r /var/lib/postgresql/11/main/* # move the files instead of removing them if you need to keep them
repmgr -h 10.8.1.1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone -F
chmod 700 /var/lib/postgresql/11/main/
sudo pg_ctlcluster 11 main start
repmgr standby register --force
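
After registration, it is worth confirming that db2 shows up as a standby again:

repmgr -f /etc/repmgr.conf cluster show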

If your repmgr is configured to use barman (the optional setup from Part 4), the backup files will be retrieved from the backup server without putting any load on the master.

Check backup

Simulate some database load. Log in to the database as repmgr on the backup server:

su - postgres
psql -U repmgr -d repmgr -h 10.8.3.1 -p 54322

Create a test table and run the sample script below. Note that COMMIT inside a DO block requires PostgreSQL 11 or later.

create table t(n integer);

DO $$
begin
    for i in 1..100000 loop
        insert into t values(i);
        RAISE NOTICE 'Val: %', i;
        commit;
    end loop;
end;
$$;
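
As an aside, the same table could be filled with a single statement using generate_series, but the loop above, with its per-row commits, better simulates a steady stream of small transactions for the WAL streaming to pick up:

insert into t select generate_series(1, 100000);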

Right after that, hard-reset the db1 (master) server.

Wait for db2 to be promoted, as described above.

Log in to the new master (db2) and insert one more row into table t.

psql -U repmgr -d repmgr -h 10.8.3.1 -p 54322
insert into t values (0);

Stop the test instance on the backup server.

systemctl stop postgresql

Prepare the PostgreSQL data directory for the barman backup.

rm -r /var/lib/postgresql/11/main/*
chown barman:barman /var/lib/postgresql/11/main/

Recover the latest backup. Barman doesn't accept the latest keyword as a target time, so specify some point far in the future; we will override the recovery target shortly.

barman recover --target-time="2030-01-01 00:00:00" main_backup latest /var/lib/postgresql/11/main/
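
If you ever need to recover from a specific backup instead of the latest one, barman can list what it holds for the server (main_backup is the server name used throughout this series):

barman list-backup main_backup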

Copy the latest partial WAL files and rename them.

cp /var/lib/barman/main_backup/streaming/*.partial /var/lib/postgresql/11/main/pg_wal/
rename 's/.partial//' /var/lib/postgresql/11/main/pg_wal/*

Override the recovery target timeline set by barman.

echo "recovery_target_timeline = 'latest'" >> /var/lib/postgresql/11/main/recovery.conf

Change ownership back and start instance.

chown -R postgres:postgres /var/lib/postgresql/11/main/
systemctl start postgresql

A successful instance start does not mean it has recovered to the latest point in time. Check the database logs and wait until the database is ready to accept read-write connections.

tail -50 /var/log/postgresql/postgresql-11-main.log
...
[31625] LOG:  restored log file "0000000D000000000000002B" from archive
[31625] LOG:  restored log file "0000000E000000000000002C" from archive
...
[31624] LOG:  database system is ready to accept connections
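
You can also poll the recovery state from psql: pg_is_in_recovery() returns false once the instance has finished recovering and accepts read-write connections.

psql -U repmgr -d repmgr -h 127.0.0.1 -c "select pg_is_in_recovery();"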

Log in with psql and check that all the inserted rows are there.

su - postgres
psql -U repmgr -d repmgr -h 127.0.0.1
select count(*) from t;
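
If the recovery reached the latest point in time, the count should be 100001: the 100,000 rows inserted by the loop plus the single row inserted on db2 after the failover.

 count
--------
 100001
(1 row)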

I want to ride my Ansible! Ansible!


You can find the Ansible playbook for this series HERE.

Final thoughts

OK, what can I do next?

  • Install HAProxy to expose a port for read-only connections. Redirect all read-only queries to the standby, with failover to the master server. You can use an external check script with a SELECT against repmgr over port 54322 to determine each server's role (see the sketch after this list).
  • Run pgbouncer on port 54322.
  • Think twice about backup server security. It is the weakest server in our cluster, because it contains all of your backups and has passwordless SSH connections to both db servers.
  • Think twice about the security of both db servers. If you use barman as repmgr's recovery method, the db servers have passwordless SSH connections to the backup server.
  • Install a monitoring system that watches the repmgr daemons, barman, and the degraded cluster state after an auto-failover.
  • Thank me. That was not easy.
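
As an illustration of the first point above, here is a minimal sketch of such an external check script. It assumes the repmgr metadata is reachable over port 54322 as configured in this series; the script name, argument handling, and exit-code convention are hypothetical and must be adapted to your HAProxy external-check setup.

#!/bin/bash
# pgcheck.sh (hypothetical) -- exit 0 if the given node is the current
# primary according to the repmgr metadata, non-zero otherwise.
NODE="$1"
ROLE=$(psql -U repmgr -d repmgr -h 10.8.3.1 -p 54322 -At \
  -c "select type from repmgr.nodes where node_name = '${NODE}' and active")
[ "$ROLE" = "primary" ] && exit 0
exit 1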

15 thoughts on “Fault tolerant PostgreSQL cluster. Part 5. Testing. Ansible. Final thoughts.”

  1. Thank you very much for the excellent article!
    Could you please advise: when I perform a manual switchover of the master from db1 to db2, db2 becomes the new master, but db1 does not become a slave.
    The main cluster cannot start on db1.
    In the db1 logs I see that log file recovery is in progress; for example, the last file, 00000006.history, was restored successfully, but the next one, 00000007.history (which does not yet exist on the backup server itself), obviously fails to restore.
    The error in the logs:
    ERROR: WAL file '00000007.history' not found in server 'main_backup' (SSH host: 192.168.6.188)
    ERROR: Remote 'barman get-wal' command has failed!
    If I disable recovery from the barman backup and pull the state from the new master instead, there are no errors…

    1. Hello!
      The problem may be in the configuration: perhaps none of the data replication links was synchronous. If that is not the case, I can contact you by email to investigate the problem further.

  2. Overall, very well written! I got it working on a test setup.

  3. How do you monitor Postgres, if that's not a secret?
    I usually use postgres_exporter for Prometheus, this one:
    https://github.com/wrouesnel/postgres_exporter, but it requires shared_preload_libraries = 'pg_stat_statements', while
    with repmgr shared_preload_libraries = 'repmgr', so it doesn't work.

    1. The shared_preload_libraries parameter accepts several libraries separated by commas.

       shared_preload_libraries = 'repmgr, pg_stat_statements'
      
  4. Good afternoon.
    repmgr standby switchover -f /etc/repmgrd_db2.conf --log-to-file
    This step gives me an error:
    ERROR: ownership error for file "/var/lib/postgresql/11/main/recovery.conf"
    DETAIL: file owner is "root", data directory owner is "postgres"
    Where could I have missed something? Permissions look fine everywhere; recovery.conf is owned by postgres on all nodes, and switchover on the nodes themselves works perfectly. What could the problem be?

    1. Still, please post the output of ls -l /var/lib/postgresql/11/main/recovery.conf so we know for sure 🙂

      1. postgres@pgdbslave:~/11/main$ ls -la /var/lib/postgresql/11/main/recovery.conf
        -rw------- 1 postgres postgres 147 Aug 12 11:17 /var/lib/postgresql/11/main/recovery.conf

        1. Interesting.
          And if you log in manually as the postgres user and try to open the file, everything is fine and it opens, correct?
          If so, no other solution comes to mind.

          1. Yes, it opens without any problems under the postgres user.

  5. Hi.
    Thank you for your excellent work and documentation.
    I just want to clarify one thing that I tried to get from this howto and could not.
    You said that this can work without any HAProxy setup on the client side because, if I understood correctly, all the queries are routed through the backup server to the correct dbX server.
    But… what happens if the backup server dies?

    1. You understood it right. The backup server is the main entry point for such a cluster configuration. So, if it dies, all you can do is manually reconfigure the app to use the primary Postgres instance. The question is whether that is acceptable when you have no backups.

  6. Андрей

    Good afternoon!
    Could you explain the role of OpenVPN in this scheme? Is it
    possible to do without it?

    1. Hello.
      Using OpenVPN is optional; it goes back to the first article in the series.
      In these tests it serves to restore network connectivity, since testing was done on machines that were not connected by a local network.
      If your machines are on the same network, you can (most likely) ignore the OpenVPN instructions.
