12.1.0.2 root.sh fails to start after deconfiguring clusterware (12.1.0.2, redhat 7.3)
12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668119] Wed, 07 February 2018 17:28
juniordbanewbie
Dear all,

Because the customer had additional requirements on bonding, the CRS daemon did not start. As a result, I deconfigured the whole clusterware according to

https://docs.oracle.com/database/121/CWLIN/rem_orcl.htm#CWLIN349

Unfortunately, I did not zero out the disk used to store the OCR and voting disk after deconfiguring.

When I reconfigured, ora.asm did not even start; in fact, the root.sh script did not finish on the first node.

So I tried to deconfigure again.

Unfortunately, this time the deconfigure itself failed.

Here's the output of the deconfigure:

PRCR-1068 : Failed to query resources
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1068 : Failed to query resources
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1070 : Failed to check if resource ora.net1.network is registered
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1070 : Failed to check if resource ora.helper is registered
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1070 : Failed to check if resource ora.ons is registered
CRS-0184 : Cannot communicate with the CRS daemon.

CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'dwhdb1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'dwhdb1'
CRS-2676: Start of 'ora.mdnsd' on 'dwhdb1' succeeded
CRS-2676: Start of 'ora.evmd' on 'dwhdb1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'dwhdb1'
CRS-2676: Start of 'ora.gpnpd' on 'dwhdb1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'dwhdb1'
CRS-2672: Attempting to start 'ora.gipcd' on 'dwhdb1'
CRS-2676: Start of 'ora.cssdmonitor' on 'dwhdb1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'dwhdb1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'dwhdb1'
CRS-2672: Attempting to start 'ora.diskmon' on 'dwhdb1'
CRS-2676: Start of 'ora.diskmon' on 'dwhdb1' succeeded
CRS-2676: Start of 'ora.cssd' on 'dwhdb1' succeeded
2018/02/07 16:15:32 CLSRSC-115: Start of resource 'ora.asm' failed

2018/02/07 16:15:32 CLSRSC-558: failed to deconfigure ASM

Died at /u01/app/12.1.0.2/grid/crs/install/crsdeconfig.pm line 1039.
The command '/u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib -I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -deconfig -force -lastnode' execution failed




here's the alert log

2018-02-07 16:41:17.993 [OCTSSD(22351)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 22351 is exiting
2018-02-07 16:41:19.107 [ORAROOTAGENT(22641)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 22641
2018-02-07 16:41:19.137 [OCTSSD(22654)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 22654
2018-02-07 16:41:20.220 [OCTSSD(22654)]CRS-2407: The new Cluster Time Synchronization Service reference node is host dwhdb1.
2018-02-07 16:41:20.220 [OCTSSD(22654)]CRS-2401: The Cluster Time Synchronization Service started on host dwhdb1.
2018-02-07 16:42:19.117 [ORAROOTAGENT(22641)]CRS-5818: Aborted command 'start' for resource 'ora.cluster_interconnect.haip'. Details at (:CRSAGF00113:) {0:9:4} in /u01/app/grid/diag/crs/dwhdb1/crs/trace/ohasd_orarootagent_root.trc.
2018-02-07 16:42:19.225 [ORAROOTAGENT(22641)]CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
2018-02-07 16:42:19.225+Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/dwhdb1/crs/trace/ohasd_orarootagent_root.trc".
2018-02-07 16:42:23.119 [OHASD(21751)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.cluster_interconnect.haip'. Details at (:CRSPE00163:) {0:9:4} in /u01/app/grid/diag/crs/dwhdb1/crs/trace/ohasd.trc.
2018-02-07 16:42:23.148 [OCTSSD(22654)]CRS-2405: The Cluster Time Synchronization Service on host dwhdb1 is shutdown by user
2018-02-07 16:42:23.148 [OCTSSD(22654)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 22654 is exiting
2018-02-07 16:42:24.153 [OHASD(21751)]CRS-2878: Failed to restart resource 'ora.asm'
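
With a long alert log it can help to reduce it to the names of the resources that failed. The following is a minimal sketch, not an Oracle tool: failed_resources is a hypothetical helper, and the three message codes it filters on (CRS-5818, CRS-2757, CRS-2878) are simply the ones that name a resource in the excerpt above.

```shell
# failed_resources: hypothetical helper, prints each resource named in a
# CRS-5818 / CRS-2757 / CRS-2878 line of the given alert log, once.
failed_resources() {
    # $1: path to the clusterware alert log (supplied by the caller)
    grep -E 'CRS-(5818|2757|2878)' "$1" |
        sed -n "s/.*resource '\([^']*\)'.*/\1/p" |
        sort -u
}
```

Called as failed_resources /path/to/alert.log, it prints one resource name per line. It only matches messages that quote the name as resource '...', so other message formats are ignored.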


here's the output of ohasd_orarootagent_root.trc


2018-02-07 16:42:23.148079 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] (:CLSN00108:) clsn_agent::stop {
2018-02-07 16:42:23.148118 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/12.1.0.2/grid
2018-02-07 16:42:23.148124 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] Utils::getOracleHomeAttrib oracle_home:/u01/app/12.1.0.2/grid
2018-02-07 16:42:23.148344 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] PID 22654 from /u01/app/12.1.0.2/grid/ctss/init/dwhdb1.pid
2018-02-07 16:42:23.148351 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] CLSDM Based stop action
2018-02-07 16:42:23.148365 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] Using Timeout value of 18000 for stop message
2018-02-07 16:42:23.148670 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] ClsdmClient::sendMessage clsdmc_respget return: status=0, ecode=0
2018-02-07 16:42:23.151393 : USRTHRD:2424530688: {0:9:4} Thread:[DaemonCheck:ctssd] Thread exiting
2018-02-07 16:42:23.151405 : USRTHRD:2424530688: {0:9:4} Thread:[DaemonCheck:ctssd] Skipping Agent Initiated a check action
2018-02-07 16:42:23.151410 : USRTHRD:2424530688: {0:9:4} Thread:[DaemonCheck:ctssd] isRunning is reset to false here
2018-02-07 16:42:24.149128 :GIPCXCPT:2439239424:  gipcInternalSend: connection not valid for send operation endp 0x7f1c7c051180 [000000000000046b] { gipcEndpoint : localAddr 'ipc', remoteAddr 'ipc://dwhdb1_DBG_CTSSD', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 22654, readyRef (nil), ready 0, wobj 0x7f1c7c05d7b0, sendp 0x7f1c7c05d570 status 0flags 0x2000a61e, flags-2 0x1, usrFlags 0x20020 }, ret gipcretConnectionLost (12)
2018-02-07 16:42:24.149160 :GIPCXCPT:2439239424:  gipcSendF [clsdmc_send : clsdmc.c : 728]: EXCEPTION[ ret gipcretConnectionLost (12) ]  failed to send on endp 0x7f1c7c051180 [000000000000046b] { gipcEndpoint : localAddr 'ipc', remoteAddr 'ipc://dwhdb1_DBG_CTSSD', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 22654, readyRef (nil), ready 0, wobj 0x7f1c7c05d7b0, sendp 0x7f1c7c05d570 status 0flags 0x2000a61e, flags-2 0x1, usrFlags 0x20020 }, addr 0000000000000000, buf 0x7f1c740232d0, len 65, cookie (nil), flags 0x0
  CLSDMC:2439239424: Failed to send dynamic control message to connection [ipc://dwhdb1_DBG_CTSSD][12]
2018-02-07 16:42:24.149186 :  CLSDMC:2439239424: gipcWait gets wrong msg from connection [ipc://dwhdb1_DBG_CTSSD][0] with type gipcreqtypeDisconnect
2018-02-07 16:42:24.149240 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-10 errbuf:CRS-02004: error 0 encountered when sending messages to CTSSD
2018-02-07 16:42:24.150469 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] (:CLSN00108:) clsn_agent::stop }
2018-02-07 16:42:24.150479 :    AGFW:2439239424: {0:9:4} Command: stop for resource: ora.ctssd 1 1 completed with status: SUCCESS
2018-02-07 16:42:24.150761 :    AGFW:2435036928: {0:9:4} Agent sending reply for: RESOURCE_STOP[ora.ctssd 1 1] ID 4099:868
2018-02-07 16:42:24.150908 :  CLSDMC:2439239424: Connecting to ipc://dwhdb1_DBG_CTSSD
2018-02-07 16:42:24.151086 :  CLSDMC:2439239424: Error: gipcWait for gipcConnect - ret_gipcreqinfo=gipcretConnectionRefused, type_gipcreqinfo=gipcreqtypeConnect
2018-02-07 16:42:24.151135 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:
2018-02-07 16:42:24.151163 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] Calling PID check for daemon
2018-02-07 16:42:24.151201 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] Process id 22654 translated to
2018-02-07 16:42:24.151229 :  CLSDMC:2439239424: Connecting to ipc://dwhdb1_DBG_CTSSD
2018-02-07 16:42:24.151363 :  CLSDMC:2439239424: Error: gipcWait for gipcConnect - ret_gipcreqinfo=gipcretConnectionRefused, type_gipcreqinfo=gipcreqtypeConnect
2018-02-07 16:42:24.151421 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:
2018-02-07 16:42:24.151463 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] Check return = 1, state detail = NULL
2018-02-07 16:42:24.151655 :    AGFW:2435036928: {0:9:4} ora.ctssd 1 1 state changed from: STOPPING to: OFFLINE
2018-02-07 16:42:24.151731 :    AGFW:2435036928: {0:9:4} Agent sending last reply for: RESOURCE_STOP[ora.ctssd 1 1] ID 4099:868
2018-02-07 16:42:24.151781 :    AGFW:2435036928: {0:9:4} Agent has no resources to be monitored, Shutting down ..
2018-02-07 16:42:24.151816 :    AGFW:2435036928: {0:9:4} Agent sending message to PE: AGENT_SHUTDOWN_REQUEST[Proxy] ID 20486:63
2018-02-07 16:42:24.152954 :    AGFW:2435036928: {0:9:4} Agent is shutting down.
2018-02-07 16:42:24.152963 :   AGENT:2435036928: {0:9:4} Agfw calling user exitCB, will exit on return
2018-02-07 16:42:24.152968 :   AGENT:2435036928: {0:9:4} returned from user exitCB, exiting
2018-02-07 16:42:24.152987 :    AGFW:2435036928: {0:9:4} Agent is exiting with exit code: 1


How should I proceed from here?

Should I delete the GPnP profiles, then zero out the ASM disk used for storing the OCR and voting disk?

many thanks in advance!
Re: 12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668120 is a reply to message #668119] Thu, 08 February 2018 00:23
Michel Cadot

I can't help as I'm blocked waiting for your feedback in your previous topic:
http://www.orafaq.com/forum/m/667953/#msg_667953

Re: 12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668133 is a reply to message #668119] Thu, 08 February 2018 07:36
John Watson
I always use -deinstall as well as -deconfig. It doesn't actually deinstall anything; it reverses the ownership and mode changes that root.sh makes on some files.
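
For reference, the deconfigure command from the first post with -deinstall added would look roughly as follows. This is a minimal sketch under stated assumptions: deconfig_cmd is a hypothetical helper, the GRID_HOME default is the path from the failed command in the first post, and the helper only prints the command line so it can be reviewed before anything is run as root. How -deinstall behaves on 12.1.0.2 is taken from the description above and is not verified here.

```shell
# Grid home as it appears in the failed command in the first post
# (assumption: same path on every node).
GRID_HOME=${GRID_HOME:-/u01/app/12.1.0.2/grid}

# deconfig_cmd: hypothetical helper, prints the rootcrs.pl command line
# and does not execute anything.
deconfig_cmd() {
    # $1: "last" when running on the final node of the cluster
    cmd="$GRID_HOME/perl/bin/perl -I$GRID_HOME/perl/lib -I$GRID_HOME/crs/install"
    cmd="$cmd $GRID_HOME/crs/install/rootcrs.pl -deconfig -force -deinstall"
    if [ "${1:-}" = "last" ]; then
        cmd="$cmd -lastnode"
    fi
    echo "$cmd"
}
```

deconfig_cmd last prints the variant with -lastnode for the final node; deconfig_cmd with no argument prints the one for the other nodes.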
Re: 12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668141 is a reply to message #668133] Thu, 08 February 2018 10:28
juniordbanewbie
Dear John,

https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=zwib2btsd_9

How to Deinstall Oracle Clusterware Home Manually (Doc ID 1364419.1)

I was not able to deconfigure or deinstall, nor could I detach the Oracle home from the central inventory,

so I renamed the following:
/u01/app/oraInventory to /u01/app/oraInventory_corrupted
/u01/app/12.1.0.2/grid to /u01/app/12.1.0.2/grid_corrupted

When I run a single-node cluster verification, it gives me the following error:

Pre-check for cluster services setup was unsuccessful on all the nodes.
[grid@dwhDB1 ~]$ ~/it_vendor/oracle/rdbms/oracle_12.1.0.2_linux_x64/grid/runcluvfy.sh stage -pre crsinst -n dwhdb1  -osdba asmdba -verbose

ERROR:
PRKC-1032 : Directory /u01/app/12.1.0.2/grid does not exist

ERROR:
PRKC-1032 : Directory /u01/app/12.1.0.2/grid does not exist

ERROR:
PRVG-1060 : Failed to retrieve the network interface classification information from an existing CRS home at path "/u01/app/12.1.0.2/grid" on the local node
PRCI-1113 : Directory /u01/app/12.1.0.2/grid/bin does not exist
Verification cannot proceed


Pre-check for cluster services setup was unsuccessful on all the nodes.


How should I proceed from here? I have simply run out of ideas.

many thanks in advance!
Re: 12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668142 is a reply to message #668141] Thu, 08 February 2018 10:41
John Watson
I've already told you what I would do.
Re: 12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668297 is a reply to message #668142] Sat, 17 February 2018 03:38
juniordbanewbie
Dear all,

This is how I resolved the issue.

Note, however, that this should only be used as a last resort; it was suggested by MOS.

# /bin/rm -rf /u01/app/12.1.0.2/grid
# /bin/rm -rf /u01/app/12.1.0.2/grid_corrupted
# /bin/rm -rf /u01/app/oraInventory
# /bin/rm -rf /u01/app/oraInventory_corrupted
# /bin/rm -f /etc/oraInst.loc
# /bin/rm -rf /etc/oracle
# /bin/rm -f /etc/oratab
# /bin/rm -rf /usr/tmp/.oracle
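
Since these are recursive deletes run as root, it may be worth printing the plan first and refusing any path that contains a space or is not absolute, because a stray space turns one rm target into two. The following is a minimal sketch, not part of the MOS note: plan_cleanup is a hypothetical helper that only prints commands and deletes nothing, and it prints -rf for every path for simplicity.

```shell
# plan_cleanup: hypothetical helper. Reads one path per line on stdin and
# prints the rm command for each acceptable path; nothing is deleted.
plan_cleanup() {
    while IFS= read -r p; do
        case "$p" in
            "")    continue ;;                              # ignore blank lines
            *" "*) echo "SKIP (space in path): $p" ;;       # likely a typo
            /*)    echo "/bin/rm -rf $p" ;;                 # absolute path: ok
            *)     echo "SKIP (not absolute): $p" ;;
        esac
    done
}
```

For example, printf '%s\n' /etc/oracle /etc/oratab | plan_cleanup prints the two rm commands for review. Only once the printed list looks right would the commands be run by hand.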

Install the binaries,

then run runcluvfy.sh from the grid subfolder of the unzipped grid installer location:


./runcluvfy.sh stage -pre crsinst -n dwhdb1,dwhdb2 -q /dev/oracleasm/ocr_vote_1 -osdba asmdba -asm -presence local -asmgrp asmadmin -crshome /u01/app/12.1.0.2/grid -networks bond0:10.10.30.0:public/bond1:192.168.2.24:cluster_interconnect -fixup -fixupnoexec -verbose -asmdev /dev/oracleasm/ocr_vote_1,/dev/oracleasm/ocr_mirror_1,/dev/oracleasm/data_1,/dev/oracleasm/fra_1


https://docs.oracle.com/database/121/CWLIN/crsunix.htm#CWLIN490

cd $GRID_HOME/crs/config

./config.sh -silent -responseFile <responsefile full path> -showprogress -executePrereqs

./config.sh -silent -responseFile <responsefile full path> -showprogress -debug -waitforcompletion
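
The two config.sh calls can be chained so that the configuration only starts when the prerequisite run succeeds. The following is a minimal sketch under several assumptions that are not confirmed in this thread: run_config and the CONFIG_SH variable are hypothetical names, config.sh is assumed to exit non-zero when the prerequisite checks fail, -waitforcompletion is added to the first call so the script waits for its result, and -responseFile (with the leading dash) is taken to be the intended option spelling.

```shell
# Path to config.sh; the default assumes the current directory is
# $GRID_HOME/crs/config, as in the cd step above.
CONFIG_SH=${CONFIG_SH:-./config.sh}

# run_config: hypothetical wrapper. Runs the prerequisite checks first and
# starts the real configuration only if they return success.
run_config() {
    # $1: full path to the response file
    rsp=$1
    if [ ! -r "$rsp" ]; then
        echo "response file not readable: $rsp" >&2
        return 2
    fi
    "$CONFIG_SH" -silent -responseFile "$rsp" -showprogress \
        -executePrereqs -waitforcompletion || {
        echo "prerequisite checks failed; not configuring" >&2
        return 1
    }
    "$CONFIG_SH" -silent -responseFile "$rsp" -showprogress -debug -waitforcompletion
}
```

Usage would be run_config followed by the full path of the response file. If the exit-code assumption does not hold on a given installer version, the prerequisite output has to be checked by hand before the second call.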