Monday, March 15, 2010

CRS is not coming up on all nodes after adding voting disks on AIX

Adding voting disks to the cluster in AIX

Recently we added two more voting disks to our RAC cluster, and after that CRS would not come up on both nodes at the same time. Whichever node started CRS first kept it running, but CRS then refused to start on the other node. We were also unable to shut down CRS cleanly on that first node, and had to kill ocssd.bin or oprocd.bin to restart it. Our suspicion was that whichever node started CRS first was locking the voting disks, preventing CRS from coming up on the other node, even though the voting disks had been added correctly, as shown below:
crsctl query css votedisk
0. 0 /dev/crs/voting_disk01
1. 0 /dev/crs/voting_disk02
2. 0 /dev/crs/voting_disk03
oracle@tst01:/home/oracle>ls -ltr /dev/crs/voting_disk01 /dev/crs/voting_disk02 /dev/crs/voting_disk03
lrwxrwxrwx 1 oracle dba 12 Nov 29 2007 /dev/crs/voting_disk01 -> /dev/rhdisk100
lrwxrwxrwx 1 oracle dba 12 Jan 28 2010 /dev/crs/voting_disk02 -> /dev/rhdisk101
lrwxrwxrwx 1 oracle dba 12 Jan 28 2010 /dev/crs/voting_disk03 -> /dev/rhdisk102
oracle@tst01:/home/oracle>ls -ltr /dev/*hdisk100 /dev/*hdisk101 /dev/*hdisk102
brw------- 1 root system 38, 2 Nov 28 2007 /dev/hdisk100
crw-rw---- 1 oracle dba 38, 2 Nov 28 2007 /dev/rhdisk100
brw------- 1 root system 38, 2 Jan 18 2010 /dev/hdisk101
crw-rw---- 1 oracle dba 38, 2 Jan 18 2010 /dev/rhdisk101
brw------- 1 root system 38, 2 Jan 18 2010 /dev/hdisk102
crw-rw---- 1 oracle dba 38, 2 Jan 18 2010 /dev/rhdisk102
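
For reference, the character devices and /dev/crs symlinks above are typically set up along these lines (a sketch reconstructed from the listing, not our exact commands; run as root, device names as shown above):

# chown oracle:dba /dev/rhdisk101 /dev/rhdisk102
# chmod 660 /dev/rhdisk101 /dev/rhdisk102
# ln -s /dev/rhdisk101 /dev/crs/voting_disk02
# ln -s /dev/rhdisk102 /dev/crs/voting_disk03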

So we checked the disk permissions and disk attributes, and the culprit turned out to be the reserve_policy and algorithm settings on the two newly added disks.
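
To pull out just the relevant MPIO attributes, something along these lines can be run on each node (a sketch; the egrep filter is ours, disk names as above):

# lsattr -El hdisk101 | egrep 'algorithm|hcheck|queue_depth|reserve_policy|rw_timeout'
# lsattr -El hdisk102 | egrep 'algorithm|hcheck|queue_depth|reserve_policy|rw_timeout'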

hdisk101:
algorithm fail_over Algorithm True
hcheck_interval 10 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
queue_depth 16 Queue DEPTH True
reserve_policy single_path Reserve Policy True
rw_timeout 40 READ/WRITE time out value True
hdisk102:
algorithm fail_over Algorithm True
hcheck_interval 10 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
queue_depth 16 Queue DEPTH True
reserve_policy single_path Reserve Policy True
rw_timeout 40 READ/WRITE time out value True

The disk reserve_policy must be no_reserve and the algorithm should be round_robin, as set below. Some other attributes can be adjusted as well; the same changes apply to both hdisk101 and hdisk102.
# chdev -l hdisk101 -a reserve_policy=no_reserve (Mandatory)
# chdev -l hdisk101 -a algorithm='round_robin' (Mandatory)
# chdev -l hdisk101 -a hcheck_mode='enabled'
# chdev -l hdisk101 -a hcheck_interval='600'
# chdev -l hdisk101 -a queue_depth='2'
# chdev -l hdisk101 -a rw_timeout='60'
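
The same changes apply to hdisk102. Note that chdev on attributes like reserve_policy only succeeds while the device is not held open, so CRS should be stopped on the node first (alternatively, the -P flag records the change in the ODM and it takes effect at the next reboot). A sketch for the second disk:

# chdev -l hdisk102 -a reserve_policy=no_reserve (Mandatory)
# chdev -l hdisk102 -a algorithm='round_robin' (Mandatory)
# chdev -l hdisk102 -a hcheck_mode='enabled'
# chdev -l hdisk102 -a hcheck_interval='600'
# chdev -l hdisk102 -a queue_depth='2'
# chdev -l hdisk102 -a rw_timeout='60'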
The output below shows the changed attributes on both disks.
# lsattr -El hdisk101
PCM PCM/friend/MSYMM_RAIDS Path Control Module True
PR_key_value none Persistant Reserve Key Value True
algorithm round_robin Algorithm True
clr_q yes Device CLEARS its Queue on error True
hcheck_interval 600 Health Check Interval True
hcheck_mode enabled Health Check Mode True
location Location Label True
lun_id 0x15d000000000000 Logical Unit Number ID False
lun_reset_spt yes FC Forced Open LUN True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x50060482d53061b6 FC Node Name False
pvid none Physical volume identifier False
q_err no Use QERR bit True
q_type simple Queue TYPE True
queue_depth 2 Queue DEPTH True
reserve_policy no_reserve Reserve Policy True
rw_timeout 60 READ/WRITE time out value True
scsi_id 0x18100 SCSI ID False
start_timeout 180 START UNIT time out value True
ww_name 0x50060482d53061b6 FC World Wide Name False
# lsattr -El hdisk102
PCM PCM/friend/MSYMM_RAIDS Path Control Module True
PR_key_value none Persistant Reserve Key Value True
algorithm round_robin Algorithm True
clr_q yes Device CLEARS its Queue on error True
hcheck_interval 600 Health Check Interval True
hcheck_mode enabled Health Check Mode True
location Location Label True
lun_id 0x15e000000000000 Logical Unit Number ID False
lun_reset_spt yes FC Forced Open LUN True
max_transfer 0x40000 Maximum TRANSFER Size True
node_name 0x50060482d53061b6 FC Node Name False
pvid none Physical volume identifier False
q_err no Use QERR bit True
q_type simple Queue TYPE True
queue_depth 2 Queue DEPTH True
reserve_policy no_reserve Reserve Policy True
rw_timeout 60 READ/WRITE time out value True
scsi_id 0x18100 SCSI ID False
start_timeout 180 START UNIT time out value True
ww_name 0x50060482d53061b6 FC World Wide Name False

After changing the disk attributes, CRS came online on both nodes.
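
As a final sanity check, CRS status and the voting disk configuration can be verified on each node (a sketch; crsctl check crs reports the state of the CSS, CRS and EVM daemons):

crsctl check crs
crsctl query css votedisk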
