During Upgrade of CRS from 12c to 18c -> MGMTDB causing a faulty CRS state [UPGRADE FINAL]

During the upgrade/patch of our Grid Infrastructure from 12.2 to 18.8 with the command below, we got a failure during the “upgradeSchema -fromversion 12.2.0.1.0” step of executeConfigTools.

/u01/app/180/grid/gridSetup.sh -executeConfigTools -responseFile /setup/cluster_upgrade/grid_upgrade.rsp -silent

This failure left the cluster in a faulty state -> [UPGRADE FINAL], which can be seen with the command below:

[grid]$ crsctl query crs activeversion -f

Oracle Clusterware active version on the cluster is [18.0.0.0.0]. The cluster upgrade state is [UPGRADE FINAL]. The cluster active patch level is [118218107].

The GridSetupActions log file showed the following:

INFO:  [Jan 30, 2020 4:42:22 PM] Started Plugin named: rhpupgrade
 INFO:  [Jan 30, 2020 4:42:22 PM] Found associated job
 INFO:  [Jan 30, 2020 4:42:22 PM] Starting 'Upgrading RHP Repository'
 INFO:  [Jan 30, 2020 4:42:22 PM] Starting 'Upgrading RHP Repository'
 INFO:  [Jan 30, 2020 4:42:22 PM] Executing RHPUPGRADE
 INFO:  [Jan 30, 2020 4:42:22 PM] Command /u01/app/180/grid/bin/rhprepos upgradeSchema -fromversion 12.2.0.1.0
 INFO:  [Jan 30, 2020 4:42:22 PM] … GenericInternalPlugIn.handleProcess() entered.
 INFO:  [Jan 30, 2020 4:42:22 PM] … GenericInternalPlugIn: getting configAssistantParmas.
 INFO:  [Jan 30, 2020 4:42:22 PM] … GenericInternalPlugIn: checking secretArguments.
 INFO:  [Jan 30, 2020 4:42:22 PM] No arguments to pass to stdin
 INFO:  [Jan 30, 2020 4:42:22 PM] … GenericInternalPlugIn: starting read loop.
 INFO:  [Jan 30, 2020 4:42:23 PM] Completed Plugin named: rhpupgrade
 INFO:  [Jan 30, 2020 4:42:23 PM] ConfigClient.saveSession method called
 INFO:  [Jan 30, 2020 4:42:23 PM] Upgrading RHP Repository failed.
 INFO:  [Jan 30, 2020 4:42:23 PM] Upgrading RHP Repository failed.

CAUSE:
Having looked at all the logs, we realized that the failure was caused by the MGMTDB never being upgraded: it was not moved to 18.3 or 18.8 and stayed at version 12.2. The datapatch step of the patching routine, however, was looking for this database under the new Oracle home and failed to locate it, since it was still running from the old 12c home. This caused the upgrade/patch to fail.

This can be seen clearly in the following log (/u01/app/grid/cfgtoollogs/sqlpatch/sqlpatch_34673_2020_01_30_20_19_58/sqlpatch_invocation.log):

Connecting to database…
 Error: prereq checks failed!
 Database connect failed with: ORA-01034: ORACLE not available
 ORA-27101: shared memory realm does not exist
 Linux-x86_64 Error: 2: No such file or directory
 Additional information: 4150
 Additional information: -580491661 (DBD ERROR: OCISessionBegin)
 Please refer to MOS Note 1609718.1 and/or the invocation log
 /u01/app/grid/cfgtoollogs/sqlpatch/sqlpatch_34673_2020_01_30_20_19_58/sqlpatch_invocation.log
 for information on how to resolve the above errors.
 SQL Patching tool complete on Thu Jan 30 20:19:58 2020
 2020/01/30 20:19:58 CLSRSC-488: Patching the Grid Infrastructure Management Repository database failed. 
 After fixing the cause of failure Run opatchauto resume
 ]
 OPATCHAUTO-68061: The orchestration engine failed.
 OPATCHAUTO-68061: The orchestration engine failed with return code 1
 OPATCHAUTO-68061: Check the log for more details.
 OPatchAuto failed.
 OPatchauto session completed at Thu Jan 30 20:19:59 2020
 Time taken to complete the session 3 minutes, 19 seconds
 opatchauto failed with error code 42
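
A quick way to double-check which Oracle home the MGMTDB is registered against is srvctl; at this point it should still report the old 12.2 home (output trimmed, paths as in this environment):

[grid]$ srvctl config mgmtdb | grep -i "oracle home"

Oracle home: /u01/app/12.2.0.1/grid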

SOLUTION:
To resolve this, I decided to manually upgrade the MGMTDB to 18c and then re-initiate the executeConfigTools command.

First, set the old ORACLE_HOME and start up the MGMTDB in upgrade mode:

[grid]$ export ORACLE_SID=-MGMTDB
[grid]$ export ORACLE_HOME=/u01/app/12.2.0.1/grid 
[grid]$ sqlplus / as sysdba 
SQL> shutdown immediate;
SQL> startup upgrade;
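
Before kicking off the upgrade scripts, a quick sanity check (not part of the original procedure, but harmless) is to confirm that the instance really opened in upgrade mode:

SQL> select status from v$instance;

STATUS
------------
OPEN MIGRATE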

Then, initiate the upgrade manually as follows:

[grid]$ export ORACLE_HOME=/u01/app/180/grid
[grid]$ export ORACLE_SID=-MGMTDB
[grid]$ cd $ORACLE_HOME/rdbms/admin
[grid]$ $ORACLE_HOME/perl/bin/perl catctl.pl catupgrd.sql
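
Once catctl.pl completes, an optional sanity check is to verify the component versions in the registry and recompile invalid objects (standard post-upgrade housekeeping; restart the database first if catctl.pl left it shut down):

[grid]$ sqlplus / as sysdba
SQL> select comp_id, version, status from dba_registry;
SQL> @?/rdbms/admin/utlrp.sql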

When the upgrade is complete, we should also fix the cluster resource ora.mgmtdb so that it points to the correct ORACLE_HOME. We can check the current value and confirm that it still points to the old home with the command below:

[grid]$ crsctl stat res ora.mgmtdb -p|grep ORACLE_HOME

ORACLE_HOME=/u01/app/12.2.0.1/grid
ORACLE_HOME_OLD=

We can modify it as follows:

[grid]$ crsctl modify resource ora.mgmtdb -attr "ORACLE_HOME=/u01/app/180/grid" -unsupported
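
Re-running the earlier query should now show the 18c home:

[grid]$ crsctl stat res ora.mgmtdb -p|grep ORACLE_HOME

ORACLE_HOME=/u01/app/180/grid
ORACLE_HOME_OLD=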

The MGMTDB should now be at the correct version, and the resource points to the correct home, so we can re-initiate the executeConfigTools:

/u01/app/180/grid/gridSetup.sh -executeConfigTools -responseFile /setup/cluster_upgrade/grid_upgrade.rsp -silent

This should conclude the upgrade/patch:

[grid]$ crsctl query crs activeversion -f

Oracle Clusterware active version on the cluster is [18.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [118218107].
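
As an optional final check, srvctl should confirm that the MGMTDB is up and, mirroring the check from the cause analysis above, now registered against the 18c home:

[grid]$ srvctl status mgmtdb
[grid]$ srvctl config mgmtdb | grep -i "oracle home"

Oracle home: /u01/app/180/grid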

If you still want to perform a clean MGMTDB installation, you can drop and re-create the MGMTDB as follows:

# Delete the MGMTDB (on the server where it is running)

[grid]$ dbca -silent -deleteDatabase -sourceDB -MGMTDB

# Re-Create the database
At this stage you need to download mdbutil.pl from MOS Doc ID 2065175.1.
You can also change the ASM diskgroup and configure the MGMTDB on a different diskgroup (a best practice).

[grid]$ ./mdbutil.pl --addmdb --target=+MGMTDG
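
mdbutil.pl should take care of creating and starting the new repository; afterwards the placement can be double-checked, e.g. the spfile reported by srvctl should sit in the diskgroup chosen above (+MGMTDG in this example):

[grid]$ srvctl status mgmtdb
[grid]$ srvctl config mgmtdb | grep -i spfile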
