Fault tolerant PostgreSQL cluster. Part 5. Testing. Ansible. Final thoughts.


Part 1. Introduction.
Part 2. Replication. Repmgr.
Part 3. Auto-failover. Repmgr daemons.
Part 4. HA. Backup. Repmgr daemons. Barman.
Part 5. Testing. Ansible. Final thoughts.

Manual switchover

On the backup server, run:

su - postgres
repmgr standby switchover -f /etc/repmgrd_db2.conf --log-to-file
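
Recent repmgr versions also support a --dry-run mode that checks the switchover prerequisites without actually performing it; it is worth running before the real thing:

repmgr standby switchover -f /etc/repmgrd_db2.conf --dry-run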

Auto-failover

db2 is now the master. Gracefully shut down the db2 server.

shutdown -h now

Wait for db1 to be promoted. Run this on the backup server:

tail -50 /var/log/syslog | grep repmgr
repmgrd[6872]: unable to ping "host=10.8.2.1 user=repmgr dbname=repmgr connect_timeout=2"
repmgrd[6872]: unable to connect to upstream node "db2" (node ID: 2)
repmgrd[6872]: checking state of node 2, 1 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 2 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 3 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 4 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 5 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: sleeping 10 seconds until next reconnection attempt
repmgrd[6872]: checking state of node 2, 6 of 6 attempts
repmgrd[6872]: unable to ping "user=repmgr connect_timeout=2 dbname=repmgr host=10.8.2.1 fallback_application_name=repmgr"
repmgrd[6872]: unable to reconnect to node 2 after 6 attempts
repmgrd[6872]: primary and this node have the same location ("default")
repmgrd[6872]: no other sibling nodes - we win by default
repmgrd[6872]: this node is the only available candidate and will now promote itself
repmgrd[6872]: promote_command is: "ssh postgres@10.8.1.1 /usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file"
repmgrd[6872]: executing notification command for event "repmgrd_failover_promote"
repmgrd[6872]: 0 followers to notify
repmgrd[6872]: switching to primary monitoring mode
repmgrd[6872]: executing notification command for event "repmgrd_reload"
repmgrd[6872]: monitoring cluster primary "db1" (node ID: 1)

Check the cluster:

su - postgres
repmgr -f /etc/repmgrd_db1.conf cluster show
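
If the failover completed, db1 should now be listed as the primary and the dead db2 marked as failed. The output will look roughly like this (exact columns vary between repmgr versions; connection strings are omitted here):

 ID | Name | Role    | Status    | Upstream | Location
----+------+---------+-----------+----------+----------
 1  | db1  | primary | * running |          | default
 2  | db2  | primary | - failed  |          | default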

Return the ex-master to the cluster

db1 is the master again. Let's return db2 to the cluster.

First, start the db2 server.
Second, start OpenVPN.

systemctl start openvpn

Third, rejoin the db2 server.

su - postgres
repmgr -f /etc/repmgr.conf node rejoin -d "postgres://repmgr@10.8.1.1:5432/repmgr" --force-rewind

Repmgr will use the pg_rewind tool, so rejoining is much faster than a full clone.
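
Note that pg_rewind only works when the cluster was initialized with data checksums or when wal_log_hints is enabled. If it is not already set from the earlier parts of this series, verify it in postgresql.conf on both db servers:

wal_log_hints = on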

IMPORTANT. Make sure the repmgr daemon has reconnected to the freshly recovered server every time, before you degrade the cluster again.

tail -50 /var/log/syslog | grep repmgr
repmgrd[10061]: monitoring primary node "db2" (node ID: 2) in normal state
repmgrd[10061]: attempting to reconnect to node "db2" (ID: 2)
repmgrd[10061]: reconnected to node "db2" (ID: 2)
repmgrd[10061]: node is now a standby, switching to standby monitoring
repmgrd[10061]: executing notification command for event "repmgrd_standby_reconnect"

Return the corrupted ex-master to the cluster

OK, imagine the database files on the db2 server were corrupted, but the OS still works. Recovery is very similar to creating a standby server.
Start OpenVPN.

systemctl start openvpn

Perform the cloning and re-register the standby. Run the following commands:

su - postgres
rm -r /var/lib/postgresql/11/main/* # move the files instead of removing them if you need to keep them
repmgr -h 10.8.1.1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone -F
chmod 700 /var/lib/postgresql/11/main/
sudo pg_ctlcluster 11 main start
repmgr standby register --force
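
After registration, it is worth confirming that db2 shows up as a standby again:

repmgr -f /etc/repmgr.conf cluster show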

If your repmgr is configured to use barman (the optional setup from Part 4), the backup files will be retrieved from the backup server without putting any load on the master.

Check backup

Simulate some database load. Log in to the database as repmgr on the backup server:

su - postgres
psql -U repmgr -d repmgr -h 10.8.3.1 -p 54322

Create a test table and run the sample script below. Note that COMMIT inside a DO block requires PostgreSQL 11 or later.

create table t(n integer);

DO $$
begin
    for i in 1..100000 loop
        insert into t values(i);
        RAISE NOTICE 'Val: %', i;
        commit;
    end loop;
end;
$$;
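
As an aside, the same table could be filled with a single statement using generate_series, but the loop above, with its per-row commits, better simulates a steady stream of small transactions for the WAL streaming to pick up:

insert into t select generate_series(1, 100000);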

Right after that, hard-reset the db1 (master) server.

Wait for db2 to be promoted, as described above.

Log in to the new master (db2) and insert one more row into table t.

psql -U repmgr -d repmgr -h 10.8.3.1 -p 54322
insert into t values (0);

Stop the test instance on the backup server.

systemctl stop postgresql

Prepare the PostgreSQL data directory for the barman backup.

rm -r /var/lib/postgresql/11/main/*
chown barman:barman /var/lib/postgresql/11/main/

Recover the latest backup. Barman doesn't accept the latest keyword as a target time, so specify some point far in the future; we will override the recovery target shortly.

barman recover --target-time="2030-01-01 00:00:00" main_backup latest /var/lib/postgresql/11/main/
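
If you ever need to recover from a specific backup instead of the latest one, barman can list what it holds for the server (main_backup is the server name used throughout this series):

barman list-backup main_backup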

Copy the latest partial WAL files and rename them.

cp /var/lib/barman/main_backup/streaming/*.partial /var/lib/postgresql/11/main/pg_wal/
rename 's/.partial//' /var/lib/postgresql/11/main/pg_wal/*

Override the recovery target timeline set by barman.

echo "recovery_target_timeline = 'latest'" >> /var/lib/postgresql/11/main/recovery.conf

Change ownership back and start instance.

chown -R postgres:postgres /var/lib/postgresql/11/main/
systemctl start postgresql

A successful instance start does not mean it has recovered to the latest point in time. Check the database logs and wait until the database is ready to accept read-write connections.

tail -50 /var/log/postgresql/postgresql-11-main.log
...
[31625] LOG:  restored log file "0000000D000000000000002B" from archive
[31625] LOG:  restored log file "0000000E000000000000002C" from archive
...
[31624] LOG:  database system is ready to accept connections
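
You can also poll the recovery state from psql: pg_is_in_recovery() returns false once the instance has finished recovering and accepts read-write connections.

psql -U repmgr -d repmgr -h 127.0.0.1 -c "select pg_is_in_recovery();"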

Log in with psql and check that all the inserted rows are there.

su - postgres
psql -U repmgr -d repmgr -h 127.0.0.1
select count(*) from t;
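
If the recovery reached the latest point in time, the count should be 100001: the 100,000 rows inserted by the loop plus the single row inserted on db2 after the failover.

 count
--------
 100001
(1 row)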

I want to ride my Ansible! Ansible!


You can find the Ansible playbook for this series HERE.

Final thoughts

OK, what can I do next?

  • Install HAProxy to expose a port for read-only connections. Redirect all read-only queries to the standby, with failover to the master server. You can use an external check script with a SELECT against repmgr over port 54322 to determine each server's role (see the sketch after this list).
  • Run pgbouncer on port 54322.
  • Think twice about backup server security. It is the weakest server in our cluster, because it contains all of your backups and has passwordless SSH connections to both db servers.
  • Think twice about the security of both db servers. If you use barman as repmgr's recovery method, the db servers have passwordless SSH connections to the backup server.
  • Install a monitoring system that watches the repmgr daemons, barman, and the degraded cluster state after an auto-failover.
  • Thank me. That was not easy.
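
As an illustration of the first point above, here is a minimal sketch of such an external check script. It assumes the repmgr metadata is reachable over port 54322 as configured in this series; the script name, argument handling, and exit-code convention are hypothetical and must be adapted to your HAProxy external-check setup.

#!/bin/bash
# pgcheck.sh (hypothetical) -- exit 0 if the given node is the current
# primary according to the repmgr metadata, non-zero otherwise.
NODE="$1"
ROLE=$(psql -U repmgr -d repmgr -h 10.8.3.1 -p 54322 -At \
  -c "select type from repmgr.nodes where node_name = '${NODE}' and active")
[ "$ROLE" = "primary" ] && exit 0
exit 1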

15 thoughts on “Fault tolerant PostgreSQL cluster. Part 5. Testing. Ansible. Final thoughts.”

  1. Thank you very much for the excellent article!
    Could you please advise: when I perform a manual switchover of the master from db1 to db2, db2 becomes the new master, but db1 does not become a slave.
    The main cluster cannot start on db1.
    In the db1 logs I see that log file recovery is in progress; for example, the last file, 00000006.history, was restored successfully, but the next one, 00000007.history (which does not yet exist on the backup server itself), obviously fails to restore.
    The error in the logs:
    ERROR: WAL file '00000007.history' not found in server 'main_backup' (SSH host: 192.168.6.188)
    ERROR: Remote 'barman get-wal' command has failed!
    If I disable recovery from the barman backup and pull the state from the new master instead, there are no errors…

    1. Hello!
      The problem may be in the configuration: perhaps none of the data replication links was synchronous. If that is not the case, I can contact you by email to investigate the problem further.

  2. Overall, very well written! I got it working on a test setup.

  3. How do you monitor Postgres, if that's not a secret?
    I usually use postgres_exporter for Prometheus, this one:
    https://github.com/wrouesnel/postgres_exporter, but it requires shared_preload_libraries = 'pg_stat_statements', while
    with repmgr shared_preload_libraries = 'repmgr', so it doesn't work.

    1. The shared_preload_libraries parameter accepts several libraries separated by commas.

       shared_preload_libraries = 'repmgr, pg_stat_statements'
      
  4. Good afternoon.
    repmgr standby switchover -f /etc/repmgrd_db2.conf --log-to-file
    This step gives me an error:
    ERROR: ownership error for file "/var/lib/postgresql/11/main/recovery.conf"
    DETAIL: file owner is "root", data directory owner is "postgres"
    Where could I have missed something? Permissions look fine everywhere; recovery.conf is owned by postgres on all nodes, and switchover on the nodes themselves works perfectly. What could the problem be?

    1. Still, please post the output of ls -l /var/lib/postgresql/11/main/recovery.conf so we know for sure 🙂

      1. postgres@pgdbslave:~/11/main$ ls -la /var/lib/postgresql/11/main/recovery.conf
        -rw------- 1 postgres postgres 147 Aug 12 11:17 /var/lib/postgresql/11/main/recovery.conf

        1. Interesting.
          And if you log in manually as the postgres user and try to open the file, everything is fine and it opens, correct?
          If so, no other solution comes to mind.

          1. Yes, it opens without any problems under the postgres user.

  5. Hi.
    Thank you for your excellent work and documentation.
    I just want to clarify one thing that I tried to get from this howto and could not.
    You said that this can work without any HAProxy setup on the client side because, if I understood correctly, all the queries are routed through the backup server to the correct dbX server.
    But… what happens if the backup server dies?

    1. You understood it right. The backup server is the main entry point for such a cluster configuration. So, if it dies, all you can do is manually reconfigure the app to use the primary Postgres instance. The question is whether that is acceptable when you have no backups.

  6. Андрей

    Good afternoon!
    Could you explain the role of OpenVPN in this scheme? Is it
    possible to do without it?

    1. Hello.
      Using OpenVPN is optional; it goes back to the first article in the series.
      In these tests it serves to restore network connectivity, since testing was done on machines that were not connected by a local network.
      If your machines are on the same network, you can (most likely) ignore the OpenVPN instructions.
