Monday, March 27, 2017

Repair Oracle Cluster Registry

REPAIR ORACLE CLUSTER REGISTRY (OCR)

Let's see how we can use the ocrconfig -repair command to repair the OCR configuration on a node that was down while the configuration was modified on the other nodes.

Current Setup:
2 node cluster
Nodes: rac1, rac2
Node rac1 is up
Node rac2 is down
OCR is stored on ASM diskgroup VOS
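Before making any changes, the node states can be confirmed from rac1 with olsnodes (a quick sanity check; the output below is illustrative for this setup):

[root@rac1 ~]# olsnodes -s
rac1    Active
rac2    Inactive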

Summary:
 1. Additionally store OCR on the DATA diskgroup
 2. This change is recorded in /etc/oracle/ocr.loc on node rac1, which is up
 3. This change is not recorded in /etc/oracle/ocr.loc on node rac2, which is down
 4. Start up node rac2
 5. Clusterware does not come up on rac2
 6. Check the alert log and crsd log on rac2
 7. Repair the OCR configuration on rac2 so that /etc/oracle/ocr.loc on rac2 gets updated
 8. Start clusterware on rac2 – succeeds

 1) Additionally store OCR on the DATA diskgroup


[root@rac1 ~]# ocrconfig -add +DATA
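To confirm the add has taken effect, ocrcheck can be run on rac1 (output trimmed and illustrative; actual sizes and IDs will differ):

[root@rac1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Device/File Name         :       +VOS
                                    Device/File integrity check succeeded
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded
         Cluster registry integrity check succeeded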



2) Check that the new OCR location has been added to /etc/oracle/ocr.loc on node rac1, which is up

[root@rac1 ~]# cat /etc/oracle/ocr.loc


#Device/file  getting replaced by device +DATA
ocrconfig_loc=+VOS
ocrmirrorconfig_loc=+DATA


3) Bring up node rac2


4) Check that the new OCR location has not been added to /etc/oracle/ocr.loc on node rac2, which was down

[root@rac2 ~]# cat /etc/oracle/ocr.loc


#Device/file +TEST getting replaced by device +VOS
ocrconfig_loc=+VOS
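To see the mismatch side by side, the two files can be compared from rac1 (a minimal sketch, assuming passwordless SSH as root between the nodes):

[root@rac1 ~]# ssh rac2 cat /etc/oracle/ocr.loc | diff /etc/oracle/ocr.loc -
1c1
< #Device/file  getting replaced by device +DATA
---
> #Device/file +TEST getting replaced by device +VOS
3d2
< ocrmirrorconfig_loc=+DATA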


5) Check that clusterware has not come up on rac2

[root@rac2 ~]# crsctl stat res -t


CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
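A crsctl check crs at this point narrows the problem down to CRSD – the lower stack is up but Cluster Ready Services is not (illustrative output for this scenario):

[root@rac2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online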


6) Check the alert log of rac2

[root@rac2 ~]# tailf /u01/app/11.2.0/grid/log/rac2/alertrac2.log


2017-03-27 06:10:22.716
[ohasd(3534)]CRS-2765:Resource 'ora.crsd' has failed on server 'rac2'.
2017-03-27 06:10:32.046
[ohasd(3534)]CRS-2765:Resource 'ora.crsd' has failed on server 'rac2'.
2017-03-27 06:10:34.128


7) Check the crsd log of rac2 – it indicates that the local and master copies of the OCR configuration do not match

[root@rac2 ~]# vi /u01/app/11.2.0/grid/log/rac2/crsd/crsd.log


2017-03-27 06:10:46.259: [  OCRSRV][1259493696]th_not_master_change: Master change callback not registered
2017-03-27 06:10:46.259: [  OCRMAS][1259493696]th_master:91: Comparing device hash ids between local and master failed
2017-03-27 06:10:46.259: [  OCRMAS][1259493696]th_master:91 Local dev (272089962, 1028247821, 0, 0, 0)
2017-03-27 06:10:46.259: [  OCRMAS][1259493696]th_master:91 Master dev (272089962, 1862408427, 0, 0, 0)
2017-03-27 06:10:46.259: [  OCRMAS][1259493696]th_master:9: Shutdown CacheLocal. my hash ids don't match
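Since crsd.log can be very large, the relevant lines can be pulled out directly with grep (the pattern simply matches the messages shown above):

[root@rac2 ~]# grep "hash ids" /u01/app/11.2.0/grid/log/rac2/crsd/crsd.log | tail -2
2017-03-27 06:10:46.259: [  OCRMAS][1259493696]th_master:91: Comparing device hash ids between local and master failed
2017-03-27 06:10:46.259: [  OCRMAS][1259493696]th_master:9: Shutdown CacheLocal. my hash ids don't match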


8) Repair the OCR configuration on rac2

[root@rac2 ~]# ocrconfig -repair -add +DATA
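Note that ocrconfig -repair only rewrites the local ocr.loc on the node where it is run; it does not touch the OCR contents themselves. The repair variants documented for 11.2 all follow the same pattern:

ocrconfig -repair -add <location>                          – add an OCR location to the local configuration
ocrconfig -repair -delete <location>                       – remove an OCR location from the local configuration
ocrconfig -repair -replace <current> -replacement <new>   – swap one OCR location for another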



9) Check that the new OCR location has been added to /etc/oracle/ocr.loc on node rac2

[root@rac2 ~]# cat /etc/oracle/ocr.loc


#Device/file  getting replaced by device +DATA
ocrconfig_loc=+VOS
ocrmirrorconfig_loc=+DATA


10) Shut down and restart clusterware on rac2

[root@rac2 crsd]# crsctl stop crs -f
[root@rac2 crsd]# crsctl start crs



11) Check that crsd has started on rac2

[root@rac2 ~]# tailf /u01/app/11.2.0/grid/log/rac2/alertrac2.log


2017-03-27 06:19:11.800
[crsd(5604)]CRS-1012:The OCR service started on node rac2.
2017-03-27 06:19:17.498
[ctssd(5470)]CRS-2408:The clock on host rac2 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2017-03-27 06:19:17.750
[crsd(5604)]CRS-1201:CRSD started on node rac2.


12) Verify Cluster Status

[root@rac2]# crsctl stat res -t
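A daemon-level check confirms the whole stack is back online (illustrative output):

[root@rac2 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online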


