PROBLEM:

Database response was very slow (almost frozen) with database spending most of the time in wait event “log file switch (checkpoint incomplete)’ Waits event”

ANALYSIS:

As per Oracle docs:

This event indicates that Oracle needs to reuse a redo log file, but the current checkpoint position is still in that log. In this case, Oracle must wait until the checkpoint position passes that log. When the database waits on checkpoints, redo generation is stopped until the log switch is done.

I will explain in  simple manner, 

Let’s say LGWR finished writing to log file 2 and ready to switch to log file 1 and start writing. However the DBWR is still writing checkpoint related redo information of logfile 1 to disk.  So it cannot write to logfile 1 unless that checkpoint is completed .

< To complete a checkpoint, dbwr must write every associated dirty buffer to disk and every datafile and controlfile should be updated to latest checkpoint number>

 

So I checked the status of redolog files:

SQL> @log_member.sql

GROUP# THREAD# MEMBER                                                                 ARCHIVED   STATUS     Size (MB)
------ ------- ---------------------------------------------------------------------- ---------- ---------- ---------
     1       1 /dataprod01/oradata/PRODB/redo01.log                                  YES        ACTIVE            50
     2       1 /dataprod01/oradata/PRODB/redo02.log                                  NO         CURRENT           50
     3       1 /dataprod01/oradata/PRODB/redo03.log                                  YES        ACTIVE            50

We can see two are in ACTIVE and one is CURRENT, no log is in either INACTIVE or UNUSED state. i.e it is trying to do a log switch. But not able to do see.

So I just add two additional redo log files:

alter database add logfile group 4 ('/dataprod01/oradata/PRODB/redo04.log') size 50M;
alter database add logfile group 5 ('/dataprod01/oradata/PRODB/redo05.log') size 50M;


GROUP# THREAD# MEMBER                                                                 ARCHIVED   STATUS     Size (MB)
------ ------- ---------------------------------------------------------------------- ---------- ---------- ---------
     1       1 /dataprod01/oradata/PRODB/redo01.log                                  YES        ACTIVE            50
     2       1 /dataprod01/oradata/PRODB/redo02.log                                  YES        ACTIVE            50
     3       1 /dataprod01/oradata/PRODB/redo03.log                                  YES        ACTIVE            50
     4       1 /dataprod01/oradata/PRODB/redo04.log                                  NO         CURRENT           50
     5       1 /dataprod01/oradata/PRODB/redo05.log                                  No         UNUSED            50


As soon as, the additional redo log is added, all the stock transactions are cleared and the wait event is gone completed.

CONCLUSION:

So, if you find log file switch (checkpoint incomplete)’ Wait events, then add additional redo log groups to fix it.  You can even add redologs of higher size. (and drop the old ones). If still issue is there, then generate an AWR report and analyze.