PROBLEM:

I received the error below while starting the EFM agent on the standby server.



[root@dbhost41 edbdata]# systemctl start edb-efm-3.9
Job for edb-efm-3.9.service failed because the control process exited with error code. See "systemctl status edb-efm-3.9.service" and "journalctl -xe" for details.


cat /var/log/efm-3.9/startup-efm.log
2021-06-15 14:58:06 Trigger file validation failed. Could not start agent as standby. See logs for more details.




SOLUTION:

1. Check the promote_trigger_file parameter value:

postgres=# \x
Expanded display is on.
postgres=#  select * from pg_settings where name='promote_trigger_file';
-[ RECORD 1 ]---+-------------------------------------------------------------------
name            | promote_trigger_file
setting         |                    <-- blank: no value is set
unit            |
category        | Replication / Standby Servers
short_desc      | Specifies a file name whose presence ends recovery in the standby.
extra_desc      |
context         | sighup
vartype         | string
source          | default
min_val         |
max_val         |
enumvals        |
boot_val        |
reset_val       |
sourcefile      |
sourceline      |
pending_restart | f

The output above shows that the promote_trigger_file parameter is not set in the configuration file.
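The same check can be made against the file on disk. The following is a minimal sketch, not an EFM or PostgreSQL tool: `conf_trigger_file` is a hypothetical helper name, and it assumes the parameter lives directly in the file you pass in (include directives and postgresql.auto.conf are not followed) and that the path contains no spaces.

```shell
# Print the value of promote_trigger_file as set in a postgresql.conf file.
# Prints nothing if the parameter is commented out, empty, or absent.
# conf_trigger_file is a hypothetical helper name (assumption, not from EFM).
conf_trigger_file() {
    conf="$1"
    # Match only uncommented assignments and strip the optional quotes.
    # If the parameter appears more than once, the last line wins.
    sed -n -E "s/^[[:space:]]*promote_trigger_file[[:space:]]*[=[:space:]][[:space:]]*'?([^'#[:space:]]*)'?.*/\1/p" "$conf" | tail -n 1
}

# Usage (path taken from the sourcefile column shown later in this post):
#   conf_trigger_file /pgdata/edbdata/postgresql.conf
```

An empty result here matches the blank `setting` column in the pg_settings output.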

2. Uncomment and update the promote_trigger_file parameter in the postgresql.conf file:


vi postgresql.conf

promote_trigger_file='/pgdata/edbdata/trigger_file'
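If you prefer not to edit the file by hand, the same change can be scripted. This is only a sketch under stated assumptions: `set_trigger_file` is a hypothetical helper name, it relies on GNU sed (`sed -i`), and it assumes the trigger file path contains no `|` or `&` characters. It keeps a backup copy next to the original file.

```shell
# Set promote_trigger_file in a postgresql.conf file without opening an editor.
# set_trigger_file is a hypothetical helper name (assumption, not from EFM).
# Arguments: 1 = path to postgresql.conf, 2 = trigger file path to configure.
set_trigger_file() {
    conf="$1"
    path="$2"
    cp -p "$conf" "$conf.bak"    # keep a backup of the original file
    if grep -Eq "^[[:space:]]*#?[[:space:]]*promote_trigger_file[[:space:]]*=" "$conf"; then
        # Rewrite the existing line, dropping a leading '#' if present.
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*promote_trigger_file[[:space:]]*=.*|promote_trigger_file = '$path'|" "$conf"
    else
        # The parameter is not mentioned at all: append it.
        printf "promote_trigger_file = '%s'\n" "$path" >> "$conf"
    fi
}

# Usage (paths as they appear in the pg_settings output in this post):
#   set_trigger_file /pgdata/edbdata/postgresql.conf /pgdata/edbdata/trigger_file
```

Note that the whole matching line is replaced, so any trailing comment on that line is dropped.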

3. Reload the configuration and verify the new value:

postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)


postgres=# \x
Expanded display is on.
postgres=#  select * from pg_settings where name='promote_trigger_file';
-[ RECORD 1 ]---+-------------------------------------------------------------------
name            | promote_trigger_file
setting         | /pgdata/edbdata/trigger_file
unit            |
category        | Replication / Standby Servers
short_desc      | Specifies a file name whose presence ends recovery in the standby.
extra_desc      |
context         | sighup
vartype         | string
source          | configuration file
min_val         |
max_val         |
enumvals        |
boot_val        |
reset_val       | /pgdata/edbdata/trigger_file
sourcefile      | /pgdata/edbdata/postgresql.conf
sourceline      | 318
pending_restart | f

The value is now updated, so we can start EFM.
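Because promotion is signalled by creating this file, the directory that holds it has to exist and be writable by the database owner. The log above does not say what exactly the EFM validation checks, so treat the following as a separate sanity check rather than a reproduction of EFM's logic. `trigger_dir_ok` is a hypothetical helper name.

```shell
# Return 0 if the directory that would hold the trigger file exists and is
# writable by the user running this function.
# trigger_dir_ok is a hypothetical helper name (assumption, not from EFM).
trigger_dir_ok() {
    dir=$(dirname -- "$1")
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Usage: run it as the database owner (enterprisedb in this setup), e.g.
#   trigger_dir_ok /pgdata/edbdata/trigger_file && echo "directory is writable"
```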

4. Start EFM and check its status:

[root@dbhost41 edbdata]# systemctl start edb-efm-3.9


[root@dbhost41 ~]# systemctl status edb-efm-3.9
● edb-efm-3.9.service - EnterpriseDB Failover Manager 3.9
   Loaded: loaded (/usr/lib/systemd/system/edb-efm-3.9.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2021-06-15 15:08:20 +03; 3h 20min ago
  Process: 1660 ExecStart=/bin/bash -c /usr/edb/efm-3.9/bin/runefm.sh start ${CLUSTER} (code=exited, status=0/SUCCESS)
 Main PID: 1740 (java)
    Tasks: 27
   CGroup: /system.slice/edb-efm-3.9.service
           └─1740 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/jre/bin/java -cp /usr/edb/efm-3.9/lib/EFM-3.9.jar -Xmx128m com.enterprisedb.efm.main.ServiceCom...

Jun 15 15:08:16 dbhost41 systemd[1]: Starting EnterpriseDB Failover Manager 3.9...
Jun 15 15:08:17 dbhost41 sudo[1757]:      efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/edb/efm-3.9/bin/efm_root_functions validatedbowner efm
Jun 15 15:08:17 dbhost41 sudo[1777]:      efm : TTY=unknown ; PWD=/ ; USER=enterprisedb ; COMMAND=/usr/edb/efm-3.9/bin/efm_db_functions validaterecoveryconf efm
Jun 15 15:08:17 dbhost41 sudo[1795]:      efm : TTY=unknown ; PWD=/ ; USER=enterprisedb ; COMMAND=/usr/edb/efm-3.9/bin/efm_db_functions validatedbconf efm
Jun 15 15:08:17 dbhost41 sudo[1813]:      efm : TTY=unknown ; PWD=/ ; USER=enterprisedb ; COMMAND=/usr/edb/efm-3.9/bin/efm_db_functions validatepgbin efm
Jun 15 15:08:17 dbhost41 sudo[1849]:      efm : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/edb/efm-3.9/bin/efm_root_functions dbservicestatus efm
Jun 15 15:08:17 dbhost41 sudo[1873]:      efm : TTY=unknown ; PWD=/ ; USER=enterprisedb ; COMMAND=/usr/edb/efm-3.9/bin/efm_db_functions validatepromotetriggerfil...rigger_file
Jun 15 15:08:20 dbhost41 systemd[1]: Started EnterpriseDB Failover Manager 3.9.
Hint: Some lines were ellipsized, use -l to show in full.
[root@dbhost41 ~]#




[root@dbhost41 ~]# /usr/edb/efm-3.9/bin/efm cluster-status efm
Cluster Status: efm

        Agent Type  Address              Agent  DB       VIP
        -----------------------------------------------------------------------
        Master      10.20.30.40         UP     UP
        Standby     10.20.30.41         UP     UP

Allowed node host list:
        10.20.30.40 10.20.30.41

Membership coordinator: 10.20.30.40

Standby priority host list:
        10.20.30.41

Promote Status:

        DB Type     Address              WAL Received LSN   WAL Replayed LSN   Info
        ---------------------------------------------------------------------------
        Master      10.20.30.40                            0/70001C0
        Standby     10.20.30.41         0/7000000          0/70001C0

        Standby database(s) in sync with master. It is safe to promote.
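For scripted checks, the agent table in the cluster-status output can be parsed instead of read by eye. This is a rough sketch that assumes the column layout shown in the sample output above (agent type, address, agent status, DB status); other EFM versions or node types may print the table differently. `all_nodes_up` is a hypothetical helper name.

```shell
# Read "efm cluster-status" output on stdin and succeed only if every node in
# the agent table reports Agent=UP and DB=UP.
# all_nodes_up is a hypothetical helper name (assumption, not from EFM).
all_nodes_up() {
    awk '
        /Agent Type/    { in_table = 1; next }   # header row of the agent table
        /^Allowed node/ { in_table = 0 }         # first section after the table
        in_table && NF >= 4 {                    # skips blank and separator lines
            nodes++
            if ($3 != "UP" || $4 != "UP") bad++
        }
        END {
            if (nodes > 0 && bad == 0) exit 0
            exit 1
        }
    '
}

# Usage:
#   /usr/edb/efm-3.9/bin/efm cluster-status efm | all_nodes_up && echo "all nodes UP"
```

The function deliberately fails when no node rows are found, so an empty or unexpected output is not mistaken for a healthy cluster.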