A daily backup ran regularly, so I can say with certainty that the problem with the dump appeared only after the restart. There was also no minor-version update; I verified that in the yum/dnf log. The systemd unit contains something like this:
ExecStartPre=/usr/pgsql-14/bin/postgresql-14-check-db-dir ${PGDATA}
ExecStart=/usr/pgsql-14/bin/postmaster -D ${PGDATA}
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT
# I added the following line myself
FinalKillSignal=SIGQUIT
# Do not set any timeout value, so that systemd will not kill postmaster
# during crash recovery.
TimeoutSec=0
# 0 is the same as infinity, but "infinity" needs systemd 229
TimeoutStartSec=0
TimeoutStopSec=1h
The timeouts are set, yet I still saw SIGKILL in the systemd log. With a client session like this:
lbstat=# SELECT pg_backend_pid();
pg_backend_pid
----------------
116
(1 řádka)
lbstat=# SELECT pg_sleep(10000);
then:
[root@ed46d0c1e0d4 /]# ps -elf | grep postgres
4 S postgres 48 1 0 80 0 - 210730 - 22:03 ? 00:00:00 /usr/pgsql-14/bin/postmaster -D /home/pgsql/data/
5 S postgres 58 48 0 80 0 - 70474 - 22:03 ? 00:00:00 postgres: logger
1 S postgres 60 48 0 80 0 - 210730 - 22:03 ? 00:00:00 postgres: checkpointer
1 S postgres 61 48 0 80 0 - 210730 - 22:03 ? 00:00:00 postgres: background writer
5 S postgres 62 48 0 80 0 - 210730 - 22:03 ? 00:00:00 postgres: walwriter
5 S postgres 63 48 0 80 0 - 210870 - 22:03 ? 00:00:00 postgres: autovacuum launcher
5 S postgres 64 48 0 80 0 - 70508 - 22:03 ? 00:00:00 postgres: stats collector
5 S postgres 65 48 0 80 0 - 210835 - 22:03 ? 00:00:00 postgres: logical replication launcher
4 S root 115 91 0 80 0 - 58389 - 22:03 pts/2 00:00:00 psql lbstat postgres
5 S postgres 116 48 0 80 0 - 211090 - 22:03 ? 00:00:00 postgres: postgres lbstat [local] SELECT
0 S root 150 67 0 80 0 - 55419 - 22:05 pts/1 00:00:00 grep --color=auto postgres
[root@ed46d0c1e0d4 /]# service postgresql-14 restart
Redirecting to /bin/systemctl restart postgresql-14.service
[root@ed46d0c1e0d4 /]# journalctl -u postgresql-14.service
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: Stopping PostgreSQL 14 database server...
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: postgresql-14.service: Killing process 58 (postmaster) with signal SIGKILL.
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: postgresql-14.service: Deactivated successfully.
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: postgresql-14.service: Unit process 58 (postmaster) remains running after unit stopped.
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: Stopped PostgreSQL 14 database server.
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: Starting PostgreSQL 14 database server...
úno 12 22:05:28 ed46d0c1e0d4 postmaster[169]: 2025-02-12 22:05:28.659 CET @:(169) LOG: redirecting log output to logging collector process
úno 12 22:05:28 ed46d0c1e0d4 postmaster[169]: 2025-02-12 22:05:28.659 CET @:(169) HINT: Future log output will appear in directory "log".
úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: Started PostgreSQL 14 database server.
After the restart, the systemd log showed that it used SIGKILL (if systemd reports this correctly); in any case, once I changed FinalKillSignal, it showed SIGQUIT instead.
This is a test in Docker with the systemd unit without the FinalKillSignal change to SIGQUIT. As you can see, systemd again sends SIGKILL to process 58, which is postgres: logger.
In the PostgreSQL log:
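To double-check which process and which signal systemd actually logged, the "Killing process" lines can be pulled out of the journal. This is only a minimal sketch: kill_events is a hypothetical helper name, and the sample line is copied from the journal output above. As far as the systemd.kill documentation describes it, with KillMode=mixed the KillSignal goes only to the main process and FinalKillSignal (SIGKILL by default) goes to whatever is still left in the control group afterwards. That would fit the logger being killed after the postmaster had already exited, but treat it as an assumption, not a confirmed diagnosis.

```shell
#!/bin/sh
# Print "<pid> <name> <signal>" for each "Killing process" line that
# systemd wrote to the journal text given on stdin.
# kill_events is a hypothetical helper name, not part of systemd.
kill_events() {
    grep -o 'Killing process [0-9]* ([^)]*) with signal SIG[A-Z0-9]*' |
        sed 's/Killing process \([0-9]*\) (\([^)]*\)) with signal \(SIG[A-Z0-9]*\)/\1 \2 \3/'
}

# Sample line copied from the journal output in this post.
sample='úno 12 22:05:28 ed46d0c1e0d4 systemd[1]: postgresql-14.service: Killing process 58 (postmaster) with signal SIGKILL.'
printf '%s\n' "$sample" | kill_events
```

On the real system the input would come from `journalctl -u postgresql-14.service | kill_events`, and `systemctl show postgresql-14.service -p KillMode -p KillSignal -p FinalKillSignal` should list the values systemd has actually loaded for the unit.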
2025-02-12 22:05:28.433 CET,,,48,,67ad0c9c.30,8,,2025-02-12 22:03:24 CET,,0,LOG,00000,"received fast shutdown request",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.439 CET,,,48,,67ad0c9c.30,9,,2025-02-12 22:03:24 CET,,0,LOG,00000,"aborting any active transactions",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.440 CET,"postgres","lbstat",116,"[local]",67ad0caf.74,4,"SELECT",2025-02-12 22:03:43 CET,3/3,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,"SELECT pg_sleep(10000);",,,"psql","client backend",,0
2025-02-12 22:05:28.440 CET,"postgres","lbstat",116,"[local]",67ad0caf.74,5,"SELECT",2025-02-12 22:03:43 CET,,0,LOG,00000,"disconnection: session time: 0:01:45.030 user=postgres database=lbstat host=[local]",,,,,,,,,"psql","client backend",,0
2025-02-12 22:05:28.443 CET,,,48,,67ad0c9c.30,10,,2025-02-12 22:03:24 CET,,0,LOG,00000,"background worker ""logical replication launcher"" (PID 65) exited with exit code 1",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.445 CET,,,163,"[local]",67ad0d18.a3,1,"",2025-02-12 22:05:28 CET,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,"","not initialized",,0
2025-02-12 22:05:28.445 CET,"postgres","lbstat",163,"[local]",67ad0d18.a3,2,"",2025-02-12 22:05:28 CET,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,"","client backend",,0
2025-02-12 22:05:28.447 CET,,,60,,67ad0c9c.3c,1,,2025-02-12 22:03:24 CET,,0,LOG,00000,"shutting down",,,,,,,,,"","checkpointer",,0
2025-02-12 22:05:28.455 CET,,,60,,67ad0c9c.3c,2,,2025-02-12 22:03:24 CET,,0,LOG,00000,"checkpoint starting: shutdown immediate",,,,,,,,,"","checkpointer",,0
2025-02-12 22:05:28.528 CET,,,60,,67ad0c9c.3c,3,,2025-02-12 22:03:24 CET,,0,LOG,00000,"checkpoint complete: wrote 4 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.035 s, sync=0.010 s, total=0.081 s; sync files=3, longest=0.006 s, average=0.004 s; distance=1 kB, estimate=1 kB",,,,,,,,,"","checkpointer",,0
2025-02-12 22:05:28.551 CET,,,48,,67ad0c9c.30,11,,2025-02-12 22:03:24 CET,,0,LOG,00000,"database system is shut down",,,,,,,,,"","postmaster",,0
At first glance this is a correct shutdown. I am not sure about the logical replication launcher exiting with code 1, but maybe that is normal behavior.
The new log after the restart:
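One way to make the "looks correct" judgment repeatable is to check that the three messages marking the orderly fast shutdown in the excerpt above are all present. A minimal sketch under that assumption; clean_shutdown_logged is a hypothetical helper name, and the list of messages is taken only from this log, not from any official definition of a clean shutdown. It does not say anything about the launcher's exit code 1, which reportedly shows up on ordinary fast shutdowns as well.

```shell
#!/bin/sh
# Return 0 if the PostgreSQL log text on stdin contains all three messages
# seen in the shutdown excerpt of this post, otherwise return 1.
# clean_shutdown_logged is a hypothetical helper name.
clean_shutdown_logged() {
    log=$(cat)
    for msg in 'received fast shutdown request' \
               'checkpoint starting: shutdown' \
               'database system is shut down'; do
        printf '%s\n' "$log" | grep -qF "$msg" || return 1
    done
    return 0
}

# Shortened sample built from the messages in the log above.
printf '%s\n' \
    'received fast shutdown request' \
    'checkpoint starting: shutdown immediate' \
    'database system is shut down' |
    clean_shutdown_logged && echo "orderly shutdown logged"
```

In practice the function would read the csvlog file of the stopped instance on stdin.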
2025-02-12 22:05:28.659 CET,,,169,,67ad0d18.a9,1,,2025-02-12 22:05:28 CET,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,"","postmaster",,0
2025-02-12 22:05:28.659 CET,,,169,,67ad0d18.a9,2,,2025-02-12 22:05:28 CET,,0,LOG,00000,"starting PostgreSQL 14.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2), 64-bit",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.660 CET,,,169,,67ad0d18.a9,3,,2025-02-12 22:05:28 CET,,0,LOG,00000,"listening on IPv4 address ""0.0.0.0"", port 5432",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.660 CET,,,169,,67ad0d18.a9,4,,2025-02-12 22:05:28 CET,,0,LOG,00000,"listening on IPv6 address ""::"", port 5432",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.666 CET,,,169,,67ad0d18.a9,5,,2025-02-12 22:05:28 CET,,0,LOG,00000,"listening on Unix socket ""/run/postgresql/.s.PGSQL.5432""",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.675 CET,,,169,,67ad0d18.a9,6,,2025-02-12 22:05:28 CET,,0,LOG,00000,"listening on Unix socket ""/tmp/.s.PGSQL.5432""",,,,,,,,,"","postmaster",,0
2025-02-12 22:05:28.684 CET,,,171,,67ad0d18.ab,1,,2025-02-12 22:05:28 CET,,0,LOG,00000,"database system was shut down at 2025-02-12 22:05:28 CET",,,,,,,,,"","startup",,0
2025-02-12 22:05:28.695 CET,,,169,,67ad0d18.a9,7,,2025-02-12 22:05:28 CET,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,"","postmaster",,0
2025-02-12 22:10:28.749 CET,,,172,,67ad0d18.ac,1,,2025-02-12 22:05:28 CET,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,"","checkpointer",,0
2025-02-12 22:10:28.787 CET,,,172,,67ad0d18.ac,2,,2025-02-12 22:05:28 CET,,0,LOG,00000,"checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.014 s, sync=0.004 s, total=0.039 s; sync files=2, longest=0.002 s, average=0.002 s; distance=0 kB, estimate=0 kB",,,,,,,,,"","checkpointer",,0
Nothing about recovery, so I cannot tell whether systemd is misreporting and the SIGKILL never happened, or whether it did happen and PostgreSQL did not notice because everything had already been written out. I do not have insight into systemd's behavior, so I am attaching the systemd version:
systemd-252-46.el9_5.2.alma.1.x86_64
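The "nothing about recovery" observation can also be checked mechanically on the startup side of the log. A minimal sketch, assuming the usual PostgreSQL wording: "database system was shut down at" after a clean stop and "automatic recovery in progress" after a crash. startup_kind is a hypothetical helper name, and the sample line is copied from the post-restart log above.

```shell
#!/bin/sh
# Classify the startup message found in PostgreSQL log text on stdin.
# Prints "recovery", "clean" or "unknown".
# startup_kind is a hypothetical helper name.
startup_kind() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -qF 'automatic recovery in progress'; then
        echo recovery
    elif printf '%s\n' "$log" | grep -qF 'database system was shut down at'; then
        echo clean
    else
        echo unknown
    fi
}

# Sample line copied from the post-restart log in this post.
sample='2025-02-12 22:05:28.684 CET,,,171,,67ad0d18.ab,1,,2025-02-12 22:05:28 CET,,0,LOG,00000,"database system was shut down at 2025-02-12 22:05:28 CET",,,,,,,,,"","startup",,0'
printf '%s\n' "$sample" | startup_kind
```

For the log shown here this prints "clean", which agrees with the absence of any recovery messages.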
David