1. Download location for 12c cluvfy
http://www.oracle.com/technetwork/database/options/clustering/downloads/index.html
- Cluster Verification Utility Download for Oracle Grid Infrastructure 12c
- Always download the newest cluvfy version from the above link
- The latest CVU version (July 2013) can be used with all currently supported Oracle RAC versions, including Oracle RAC 10g, Oracle RAC 11g and Oracle RAC 12c.
Impact of latest Cluvfy version
There is nothing more annoying than debugging a RAC problem that finally turns out to be a Cluvfy bug. The latest download from January 2015 reports the following version:
[grid@gract1 ~/CLUVFY-JAN-2015]$ bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
whereas my current 12.1 installation reports the following version:
[grid@gract1 ~/CLUVFY-JAN-2015]$ cluvfy -version
12.1.0.1.0 Build 100213x8664
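A quick way to see whether the standalone download is newer than the bundled tool is to print both versions side by side. A minimal sketch, assuming the standalone CVU was unpacked to /home/grid/CLUVFY-JAN-2015 and $GRID_HOME points at the installed Grid home (both paths are assumptions from the examples above):
#!/bin/bash
# compare the installed cluvfy build with the freshly downloaded standalone CVU
STANDALONE=/home/grid/CLUVFY-JAN-2015
echo "Installed cluvfy : $($GRID_HOME/bin/cluvfy -version)"
echo "Standalone cluvfy: $($STANDALONE/bin/cluvfy -version)"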
Cluvfy trace Location
If you have installed cluvfy in /home/grid/CLUVFY-JAN-2015, the related cluvfy traces can be found in the cv/log subdirectory:
[root@gract1 CLUVFY-JAN-2015]# ls /home/grid/CLUVFY-JAN-2015/cv/log
cvutrace.log.0  cvutrace.log.0.lck
Note that some cluvfy commands, such as
# cluvfy comp dhcp -clustername gract -verbose
must be run as root! In that case the default trace location may not have the correct permissions. In that case use the script below to set the trace level and trace location.
Setting the Cluvfy trace file location and trace level in a bash script
The following bash script sets the cluvfy trace location and the cluvfy trace level
#!/bin/bash
# recreate an empty trace directory under /tmp so that even root-run cluvfy commands can write traces
rm -rf /tmp/cvutrace
mkdir /tmp/cvutrace
# point cluvfy to the new trace location and enable SRVM tracing
export CV_TRACELOC=/tmp/cvutrace
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=2
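Because the script only sets environment variables, it has to be sourced (not executed) so the exports reach the shell that runs cluvfy. A minimal usage sketch; the script name set_cvu_trace.sh is an assumption:
# source the script, then run the root-level check and look at the traces
. ./set_cvu_trace.sh
bin/cluvfy comp dhcp -clustername gract -verbose
ls -l /tmp/cvutrace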
Why the cluvfy version matters
Yesterday I debugged a DHCP problem starting with cluvfy:
[grid@gract1 ~]$ cluvfy -version
12.1.0.1.0 Build 100213x8664
[root@gract1 network-scripts]# cluvfy comp dhcp -clustername gract -verbose
Verifying DHCP Check
Checking if any DHCP server exists on the network...
<null>
At least one DHCP server exists on the network and is listening on port 67
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
<null>
Sending DHCP "REQUEST" packets for client ID "gract-scan1-vip"
<null>
..
DHCP server was able to provide sufficient number of IP addresses
The DHCP server response time is within acceptable limits
Verification of DHCP Check was unsuccessful on all the specified nodes.
--> As the verification was unsuccessful I started network tracing with tcpdump. The network traces looked fine, so I got a bad feeling about cluvfy. What to do next? Install the newest cluvfy version and rerun the test!
[grid@gract1 ~/CLUVFY-JAN-2015]$ bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
Now rerun the test:
[root@gract1 CLUVFY-JAN-2015]# bin/cluvfy comp dhcp -clustername gract -verbose
Verifying DHCP Check
Checking if any DHCP server exists on the network...
DHCP server returned server: 192.168.5.50, loan address: 192.168.5.150/255.255.255.0, lease time: 21600
At least one DHCP server exists on the network and is listening on port 67
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
DHCP server returned server: 192.168.5.50, loan address: 192.168.5.150/255.255.255.0, lease time: 21600
Sending DHCP "REQUEST" packets for client ID "gract-scan1-vip"
..
released DHCP server lease for client ID "gract-gract1-vip" on port "67"
DHCP server was able to provide sufficient number of IP addresses
The DHCP server response time is within acceptable limits
Verification of DHCP Check was successful.
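For the network-level cross-check, a tcpdump invocation along these lines can be used to watch the DHCP DISCOVER/OFFER/REQUEST/ACK exchange while cluvfy runs; the interface name eth1 is an assumption and needs to match your public interface:
# capture DHCP traffic (server port 67 / client port 68) on the public interface
tcpdump -i eth1 -n -vv port 67 or port 68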
Why you should always review your cluvfy logs
By default cluvfy logs are written to CV_HOME/cv/log.
[grid@gract1 ~/CLUVFY-JAN-2015]$ cluvfy stage -pre crsinst -n gract1
Performing pre-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "gract1"
Checking user equivalence...
User equivalence check passed for user "grid"
ERROR:
An error occurred in creating a TaskFactory object or in generating a task list
PRCT-1011 : Failed to run "oifcfg". Detailed error: []
PRCT-1011 : Failed to run "oifcfg". Detailed error: []
This error is not very helpful at all! Review the cluvfy logfiles for details:
[root@gract1 log]# cd $GRID_HOME/cv/log
Cluvfy log cvutrace.log.0:
[Thread-49] [ 2015-01-22 08:51:25.283 CET ] [StreamReader.run:65] OUTPUT>PRIF-10: failed to initialize the cluster registry
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:144] runCommand: process returns 1
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:161] RunTimeExec: output>
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:164] PRIF-10: failed to initialize the cluster registry
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:170] RunTimeExec: error>
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:192] Returning from RunTimeExec.runCommand
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:884] retval = 1
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:885] exitval = 1
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:886] rtErrLength = 0
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:892] Failed to execute command. Command = [/u01/app/121/grid/bin/oifcfg, getif, -from, gpnp] env = null error = []
[main] [ 2015-01-22 08:51:25.287 CET ] [ClusterNetworkInfo.getNetworkInfoFromOifcfg:152] INSTALLEXCEPTION: occured while getting cluster network info. messagePRCT-1011 : Failed to run "oifcfg". Detailed error: []
[main] [ 2015-01-22 08:51:25.287 CET ] [TaskFactory.getNetIfFromOifcfg:4352] Exception occured while getting network information. msg=PRCT-1011 : Failed to run "oifcfg". Detailed error: []
Here we get a better error message (PRIF-10: failed to initialize the cluster registry) and we can extract the failing command: /u01/app/121/grid/bin/oifcfg getif
Now we can retry that command at OS level:
[grid@gract1 ~/CLUVFY-JAN-2015]$ /u01/app/121/grid/bin/oifcfg getif
PRIF-10: failed to initialize the cluster registry
By the way, if you have installed the new cluvfy version you get a much better error output:
[grid@gract1 ~/CLUVFY-JAN-2015]$ bin/cluvfy stage -pre crsinst -n gract1
ERROR:
PRVG-1060 : Failed to retrieve the network interface classification information from an existing CRS home at path "/u01/app/121/grid" on the local node
PRCT-1011 : Failed to run "oifcfg". Detailed error: PRIF-10: failed to initialize the cluster registry
For fixing PRVG-1060, PRCT-1011 and PRIF-10 when running the above cluvfy commands, please read the following article: Common cluvfy errors and warnings
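A quick way to pull the underlying error and the failing command out of the trace files is a grep over the trace directory. A minimal sketch, assuming the standalone install path from the examples above (adjust the path to your CV_HOME/cv/log):
# scan the cluvfy traces for Oracle error codes and the commands cluvfy tried to execute
grep -E "PRIF-|PRCT-|PRVG-|Failed to execute command" /home/grid/CLUVFY-JAN-2015/cv/log/cvutrace.log*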
Run cluvfy before CRS installation by passing network connections for PUBLIC and CLUSTER_INTERCONNECT
$ ./bin/cluvfy stage -pre crsinst -n grac121,grac122 -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect
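The interface names and subnet values passed to -networks must match the actual OS network setup. A small sketch to list them before building the command; the interface names eth1/eth2 in the comment are assumptions taken from the example above:
# list interface names with their IPv4 addresses to derive the -networks argument
ip -4 -o addr show | awk '{print $2, $4}'
# e.g. eth1 192.168.1.121/24 -> eth1:192.168.1.0:PUBLIC
#      eth2 192.168.2.121/24 -> eth2:192.168.2.0:cluster_interconnect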
Run cluvfy before doing an UPGRADE
[grid@grac41 /]$ cluvfy stage -pre crsinst -upgrade -n grac41,grac42,grac43 -rolling -src_crshome $GRID_HOME
-dest_crshome /u01/app/grid_new -dest_version 12.1.0.1.0 -fixup -fixupdir /tmp -verbose
Run cluvfy 12.1 for preparing a 10gR2 CRS installation
Always install the newest cluvfy version, even for 10gR2 CRS validations!
[root@ract1 ~]$ ./bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
Verify the OS setup on ract1:
[root@ract1 ~]$ ./bin/cluvfy comp sys -p crs -r 10gR2 -n ract1 -verbose -fixup
--> Run required scripts
[root@ract1 ~]# /tmp/CVU_12.1.0.1.0_oracle/runfixup.sh
All Fix-up operations were completed successfully.
Repeat this step on ract2:
[root@ract2 ~]$ ./bin/cluvfy comp sys -p crs -r 10gR2 -n ract2 -verbose -fixup
--> Run required scripts
[root@ract2 ~]# /tmp/CVU_12.1.0.1.0_oracle/runfixup.sh
All Fix-up operations were completed successfully.
Now verify the system requirements on both nodes:
[oracle@ract1 cluvfy12]$ ./bin/cluvfy comp sys -p crs -r 10gR2 -n ract1 -verbose -fixup
Verifying system requirement
..
NOTE: No fixable verification failures to fix
Finally run cluvfy to test CRS installation readiness:
$ cluvfy12/bin/cluvfy stage -pre crsinst -r 10gR2 \
    -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect \
    -n ract1,ract2 -verbose
..
Pre-check for cluster services setup was successful.
Run cluvfy comp software to check file protections for GRID and RDBMS installations
- Note : Not all files are checked ( SHELL scripts like ohasd are missing ) – Bug 18407533 – CLUVFY DOES NOT VERIFY ALL FILES
- Config File : $GRID_HOME/cv/cvdata/ora_software_cfg.xml
Run cluvfy comp software to verify the GRID stack:
[grid@grac41 ~]$ cluvfy comp software -r 11gR2 -n grac41 -verbose
Verifying software
Check: Software
  1178 files verified
Software check passed
Verification of software was successful.
Run cluvfy comp software to verify the RDBMS stack:
[oracle@grac43 ~]$ cluvfy comp software -d $ORACLE_HOME -r 11gR2 -verbose
Verifying software
Check: Software
  1780 files verified
Software check passed
Verification of software was successful.
Run cluvfy before CRS installation on a single node and create a script for fixable errors
$ ./bin/cluvfy comp sys -p crs -n grac121 -verbose -fixup
Verifying system requirement
Check: Total memory
  Node Name     Available                 Required                  Status
  ------------  ------------------------  ------------------------  ----------
  grac121       3.7426GB (3924412.0KB)    4GB (4194304.0KB)         failed
Result: Total memory check failed
...
******************************************************************************************
Following is the list of fixable prerequisites selected to fix in this session
******************************************************************************************
--------------                 ---------------  ----------------
Check failed.                  Failed on nodes  Reboot required?
--------------                 ---------------  ----------------
Hard Limit: maximum open       grac121          no
file descriptors
Execute "/tmp/CVU_12.1.0.1.0_grid/runfixup.sh" as root user on nodes "grac121" to perform the fix up operations manually
--> Now run runfixup.sh as root on node "grac121"
Press ENTER key to continue after execution of "/tmp/CVU_12.1.0.1.0_grid/runfixup.sh" has completed on nodes "grac121"
Fix: Hard Limit: maximum open file descriptors
  Node Name                             Status
  ------------------------------------  ------------------------
  grac121                               successful
Result: "Hard Limit: maximum open file descriptors" was successfully fixed on all the applicable nodes
Fix up operations were successfully completed on all the applicable nodes
Verification of system requirement was unsuccessful on all the specified nodes.
Note: errors like too low memory/swap need manual intervention:
Check: Total memory
  Node Name     Available                 Required                  Status
  ------------  ------------------------  ------------------------  ----------
  grac121       3.7426GB (3924412.0KB)    4GB (4194304.0KB)         failed
Result: Total memory check failed
Fix that error at OS level and rerun the above cluvfy command.
Performing post-checks for hardware and operating system setup
- cluvfy stage -post hwos tests multicast communication with multicast group "230.0.1.0"
[grid@grac42 ~]$ cluvfy stage -post hwos -n grac42,grac43 -verbose Performing post-checks for hardware and operating system setup Checking node reachability... Check: Node reachability from node "grac42" Destination Node Reachable? ------------------------------------ ------------------------ grac42 yes grac43 yes Result: Node reachability check passed from node "grac42" Checking user equivalence... Check: User equivalence for user "grid" Node Name Status ------------------------------------ ------------------------ grac43 passed grac42 passed Result: User equivalence check passed for user "grid" Checking node connectivity... Checking hosts config file... Node Name Status ------------------------------------ ------------------------ grac43 passed grac42 passed Verification of the hosts config file successful Interface information for node "grac43" Name IP Address Subnet Gateway Def. Gateway HW Address MTU ------ --------------- --------------- --------------- --------------- ----------------- ------ eth0 10.0.2.15 10.0.2.0 0.0.0.0 10.0.2.2 08:00:27:38:10:76 1500 eth1 192.168.1.103 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:F6:18:43 1500 eth1 192.168.1.59 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:F6:18:43 1500 eth1 192.168.1.170 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:F6:18:43 1500 eth1 192.168.1.177 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:F6:18:43 1500 eth2 192.168.2.103 192.168.2.0 0.0.0.0 10.0.2.2 08:00:27:1C:30:DD 1500 eth2 169.254.125.13 169.254.0.0 0.0.0.0 10.0.2.2 08:00:27:1C:30:DD 1500 virbr0 192.168.122.1 192.168.122.0 0.0.0.0 10.0.2.2 52:54:00:ED:19:7C 1500 Interface information for node "grac42" Name IP Address Subnet Gateway Def. Gateway HW Address MTU ------ --------------- --------------- --------------- --------------- ----------------- ------ eth0 10.0.2.15 10.0.2.0 0.0.0.0 10.0.2.2 08:00:27:6C:89:27 1500 eth1 192.168.1.102 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth1 192.168.1.165 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth1 192.168.1.178 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth1 192.168.1.167 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth2 192.168.2.102 192.168.2.0 0.0.0.0 10.0.2.2 08:00:27:DF:79:B9 1500 eth2 169.254.96.101 169.254.0.0 0.0.0.0 10.0.2.2 08:00:27:DF:79:B9 1500 virbr0 192.168.122.1 192.168.122.0 0.0.0.0 10.0.2.2 52:54:00:ED:19:7C 1500 Check: Node connectivity for interface "eth1" Source Destination Connected? ------------------------------ ------------------------------ ---------------- grac43[192.168.1.103] grac43[192.168.1.59] yes grac43[192.168.1.103] grac43[192.168.1.170] yes .. grac42[192.168.1.165] grac42[192.168.1.167] yes grac42[192.168.1.178] grac42[192.168.1.167] yes Result: Node connectivity passed for interface "eth1" Check: TCP connectivity of subnet "192.168.1.0" Source Destination Connected? ------------------------------ ------------------------------ ---------------- grac42:192.168.1.102 grac43:192.168.1.103 passed grac42:192.168.1.102 grac43:192.168.1.59 passed grac42:192.168.1.102 grac43:192.168.1.170 passed grac42:192.168.1.102 grac43:192.168.1.177 passed grac42:192.168.1.102 grac42:192.168.1.165 passed grac42:192.168.1.102 grac42:192.168.1.178 passed grac42:192.168.1.102 grac42:192.168.1.167 passed Result: TCP connectivity check passed for subnet "192.168.1.0" Check: Node connectivity for interface "eth2" Source Destination Connected? 
------------------------------ ------------------------------ ---------------- grac43[192.168.2.103] grac42[192.168.2.102] yes Result: Node connectivity passed for interface "eth2" Check: TCP connectivity of subnet "192.168.2.0" Source Destination Connected? ------------------------------ ------------------------------ ---------------- grac42:192.168.2.102 grac43:192.168.2.103 passed Result: TCP connectivity check passed for subnet "192.168.2.0" Checking subnet mask consistency... Subnet mask consistency check passed for subnet "192.168.1.0". Subnet mask consistency check passed for subnet "192.168.2.0". Subnet mask consistency check passed. Result: Node connectivity check passed Checking multicast communication... Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"... Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed. Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"... Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed. Check of multicast communication passed. Checking for multiple users with UID value 0 Result: Check for multiple users with UID value 0 passed Check: Time zone consistency Result: Time zone consistency check passed Checking shared storage accessibility... Disk Sharing Nodes (2 in count) ------------------------------------ ------------------------ /dev/sdb grac43 /dev/sdk grac42 .. Disk Sharing Nodes (2 in count) ------------------------------------ ------------------------ /dev/sdp grac43 grac42 Shared storage check was successful on nodes "grac43,grac42" Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ... Checking if "hosts" entry in file "/etc/nsswitch.conf" is consistent across nodes... Checking file "/etc/nsswitch.conf" to make sure that only one "hosts" entry is defined More than one "hosts" entry does not exist in any "/etc/nsswitch.conf" file All nodes have same "hosts" entry defined in file "/etc/nsswitch.conf" Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed Post-check for hardware and operating system setup was successful.
Debugging Voting disk problems with: cluvfy comp vdisk
As your CRS stack may not be up, run these commands from a node which is up and running.
[grid@grac42 ~]$ cluvfy comp ocr -n grac41
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ERROR:
PRVF-4194 : Asm is not running on any of the nodes. Verification cannot proceed.
OCR integrity check failed
Verification of OCR integrity was unsuccessful on all the specified nodes.
[grid@grac42 ~]$ cluvfy comp vdisk -n grac41
Verifying Voting Disk:
Checking Oracle Cluster Voting Disk configuration...
ERROR:
PRVF-4194 : Asm is not running on any of the nodes. Verification cannot proceed.
ERROR:
PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdf1"
ERROR:
PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdg1"
ERROR:
PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdh1"
PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
UDev attributes check for Voting Disk locations started...
UDev attributes check passed for Voting Disk locations
Verification of Voting Disk was unsuccessful on all the specified nodes.
Debugging steps at OS level: verify the disk protections and use kfed to read the disk header.
[grid@grac41 ~/cluvfy]$ ls -l /dev/asmdisk1_udev_sdf1 /dev/asmdisk1_udev_sdg1 /dev/asmdisk1_udev_sdh1
b---------. 1 grid asmadmin 8,  81 May 14 09:51 /dev/asmdisk1_udev_sdf1
b---------. 1 grid asmadmin 8,  97 May 14 09:51 /dev/asmdisk1_udev_sdg1
b---------. 1 grid asmadmin 8, 113 May 14 09:51 /dev/asmdisk1_udev_sdh1
[grid@grac41 ~/cluvfy]$ kfed read /dev/asmdisk1_udev_sdf1
KFED-00303: unable to open file '/dev/asmdisk1_udev_sdf1'
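The b--------- protections above point to broken udev permissions. A hedged sketch of how such a problem is usually attacked; the rules file location is an assumption and has to be adapted to your own udev setup:
# check which udev rule is supposed to set owner/group/mode for the ASM disks
grep -i asmdisk /etc/udev/rules.d/*.rules
# after correcting the rule (OWNER="grid", GROUP="asmadmin", MODE="0660"),
# reload the rules and re-trigger the block device events
udevadm control --reload-rules
udevadm trigger --type=devices --action=change
ls -l /dev/asmdisk1_udev_sd[fgh]1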
Debugging file protection problems with: cluvfy comp software
- Related BUG: 18350484 : 112042GIPSU:”CLUVFY COMP SOFTWARE” FAILED IN 112042GIPSU IN HPUX
Investigate file protection problems with cluvfy comp software. Cluvfy checks file protections against ora_software_cfg.xml:
[grid@grac41 cvdata]$ cd /u01/app/11204/grid/cv/cvdata
[grid@grac41 cvdata]$ grep gpnp ora_software_cfg.xml
      <File Path="bin/" Name="gpnpd.bin" Permissions="0755"/>
      <File Path="bin/" Name="gpnptool.bin" Permissions="0755"/>
Change the protections and verify with cluvfy:
[grid@grac41 cvdata]$ chmod 444 /u01/app/11204/grid/bin/gpnpd.bin
[grid@grac41 cvdata]$ cluvfy comp software -verbose | grep gpnpd
  /u01/app/11204/grid/bin/gpnpd.bin..."Permissions" did not match reference
  Permissions of file "/u01/app/11204/grid/bin/gpnpd.bin" did not match the expected value. [Expected = "0755" ; Found = "0444"]
Now correct the problem and verify again:
[grid@grac41 cvdata]$ chmod 755 /u01/app/11204/grid/bin/gpnpd.bin
[grid@grac41 cvdata]$ cluvfy comp software -verbose | grep gpnpd
--> No errors are reported anymore
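For a single file, the expected permissions from ora_software_cfg.xml can be compared against the actual mode with stat. A small sketch under the paths of the example above; the grep/sed extraction is an assumption about the XML layout shown:
# expected mode from the CVU config file vs. actual mode on disk
EXPECTED=$(grep 'Name="gpnpd.bin"' /u01/app/11204/grid/cv/cvdata/ora_software_cfg.xml | sed 's/.*Permissions="\([0-9]*\)".*/\1/')
ACTUAL=$(stat -c %a /u01/app/11204/grid/bin/gpnpd.bin)
# note: stat prints 755 while the XML stores 0755, so compare the numeric values
echo "expected=$EXPECTED actual=$ACTUAL"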
Debugging CTSSD/NTP problems with: cluvfy comp clocksync
[grid@grac41 ctssd]$ cluvfy comp clocksync -n grac41,grac42,grac43 -verbose Verifying Clock Synchronization across the cluster nodes Checking if Clusterware is installed on all nodes... Check of Clusterware install passed Checking if CTSS Resource is running on all nodes... Check: CTSS Resource running on all nodes Node Name Status ------------------------------------ ------------------------ grac43 passed grac42 passed grac41 passed Result: CTSS resource check passed Querying CTSS for time offset on all nodes... Result: Query of CTSS for time offset passed Check CTSS state started... Check: CTSS state Node Name State ------------------------------------ ------------------------ grac43 Observer grac42 Observer grac41 Observer CTSS is in Observer state. Switching over to clock synchronization checks using NTP Starting Clock synchronization checks using Network Time Protocol(NTP)... NTP Configuration file check started... The NTP configuration file "/etc/ntp.conf" is available on all nodes NTP Configuration file check passed Checking daemon liveness... Check: Liveness for "ntpd" Node Name Running? ------------------------------------ ------------------------ grac43 yes grac42 yes grac41 yes Result: Liveness check passed for "ntpd" Check for NTP daemon or service alive passed on all nodes Checking NTP daemon command line for slewing option "-x" Check: NTP daemon command line Node Name Slewing Option Set? ------------------------------------ ------------------------ grac43 yes grac42 yes grac41 yes Result: NTP daemon slewing option check passed Checking NTP daemon's boot time configuration, in file "/etc/sysconfig/ntpd", for slewing option "-x" Check: NTP daemon's boot time configuration Node Name Slewing Option Set? ------------------------------------ ------------------------ grac43 yes grac42 yes grac41 yes Result: NTP daemon's boot time configuration check for slewing option passed Checking whether NTP daemon or service is using UDP port 123 on all nodes Check for NTP daemon or service using UDP port 123 Node Name Port Open? ------------------------------------ ------------------------ grac43 yes grac42 yes grac41 yes NTP common Time Server Check started... NTP Time Server ".LOCL." is common to all nodes on which the NTP daemon is running Check of common NTP Time Server passed Clock time offset check from NTP Time Server started... Checking on nodes "[grac43, grac42, grac41]"... Check: Clock time offset from NTP Time Server Time Server: .LOCL. Time Offset Limit: 1000.0 msecs Node Name Time Offset Status ------------ ------------------------ ------------------------ grac43 0.0 passed grac42 0.0 passed grac41 0.0 passed Time Server ".LOCL." has time offsets that are within permissible limits for nodes "[grac43, grac42, grac41]". Clock time offset check passed Result: Clock synchronization check using Network Time Protocol(NTP) passed Oracle Cluster Time Synchronization Services check passed Verification of Clock Synchronization across the cluster nodes was successful. At OS level you can run ntpq -p [root@grac41 dev]# ntpq -p remote refid st t when poll reach delay offset jitter ============================================================================== *ns1.example.com LOCAL(0) 10 u 90 256 377 0.072 -238.49 205.610 LOCAL(0) .LOCL. 12 l 15h 64 0 0.000 0.000 0.000
Running cluvfy stage -post crsinst after a failed Clusterware startup
- Note: you should run cluvfy from a node which is up and running to get the best results
CRS resource status [grid@grac41 ~]$ my_crs_stat_init NAME TARGET STATE SERVER STATE_DETAILS ------------------------- ---------- ---------- ------------ ------------------ ora.asm ONLINE OFFLINE Instance Shutdown ora.cluster_interconnect.haip ONLINE OFFLINE ora.crf ONLINE ONLINE grac41 ora.crsd ONLINE OFFLINE ora.cssd ONLINE OFFLINE STARTING ora.cssdmonitor ONLINE ONLINE grac41 ora.ctssd ONLINE OFFLINE ora.diskmon OFFLINE OFFLINE ora.drivers.acfs ONLINE OFFLINE ora.evmd ONLINE OFFLINE ora.gipcd ONLINE ONLINE grac41 ora.gpnpd ONLINE ONLINE grac41 ora.mdnsd ONLINE ONLINE grac41 Verify CRS status with cluvfy ( CRS on grac42 is up and running ) [grid@grac42 ~]$ cluvfy stage -post crsinst -n grac41,grac42 -verbose Performing post-checks for cluster services setup Checking node reachability... Check: Node reachability from node "grac42" Destination Node Reachable? ------------------------------------ ------------------------ grac42 yes grac41 yes Result: Node reachability check passed from node "grac42" Checking user equivalence... Check: User equivalence for user "grid" Node Name Status ------------------------------------ ------------------------ grac42 passed grac41 passed Result: User equivalence check passed for user "grid" Checking node connectivity... Checking hosts config file... Node Name Status ------------------------------------ ------------------------ grac42 passed grac41 passed Verification of the hosts config file successful Interface information for node "grac42" Name IP Address Subnet Gateway Def. Gateway HW Address MTU ------ --------------- --------------- --------------- --------------- ----------------- ------ eth0 10.0.2.15 10.0.2.0 0.0.0.0 10.0.2.2 08:00:27:6C:89:27 1500 eth1 192.168.1.102 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth1 192.168.1.59 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth1 192.168.1.178 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth1 192.168.1.170 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:63:08:07 1500 eth2 192.168.2.102 192.168.2.0 0.0.0.0 10.0.2.2 08:00:27:DF:79:B9 1500 eth2 169.254.96.101 169.254.0.0 0.0.0.0 10.0.2.2 08:00:27:DF:79:B9 1500 virbr0 192.168.122.1 192.168.122.0 0.0.0.0 10.0.2.2 52:54:00:ED:19:7C 1500 Interface information for node "grac41" Name IP Address Subnet Gateway Def. Gateway HW Address MTU ------ --------------- --------------- --------------- --------------- ----------------- ------ eth0 10.0.2.15 10.0.2.0 0.0.0.0 10.0.2.2 08:00:27:82:47:3F 1500 eth1 192.168.1.101 192.168.1.0 0.0.0.0 10.0.2.2 08:00:27:89:E9:A2 1500 eth2 192.168.2.101 192.168.2.0 0.0.0.0 10.0.2.2 08:00:27:6B:E2:BD 1500 virbr0 192.168.122.1 192.168.122.0 0.0.0.0 10.0.2.2 52:54:00:ED:19:7C 1500 Check: Node connectivity for interface "eth1" Source Destination Connected? ------------------------------ ------------------------------ ---------------- grac42[192.168.1.102] grac42[192.168.1.59] yes grac42[192.168.1.102] grac42[192.168.1.178] yes grac42[192.168.1.102] grac42[192.168.1.170] yes grac42[192.168.1.102] grac41[192.168.1.101] yes grac42[192.168.1.59] grac42[192.168.1.178] yes grac42[192.168.1.59] grac42[192.168.1.170] yes grac42[192.168.1.59] grac41[192.168.1.101] yes grac42[192.168.1.178] grac42[192.168.1.170] yes grac42[192.168.1.178] grac41[192.168.1.101] yes grac42[192.168.1.170] grac41[192.168.1.101] yes Result: Node connectivity passed for interface "eth1" Check: TCP connectivity of subnet "192.168.1.0" Source Destination Connected? 
------------------------------ ------------------------------ ---------------- grac42:192.168.1.102 grac42:192.168.1.59 passed grac42:192.168.1.102 grac42:192.168.1.178 passed grac42:192.168.1.102 grac42:192.168.1.170 passed grac42:192.168.1.102 grac41:192.168.1.101 passed Result: TCP connectivity check passed for subnet "192.168.1.0" Check: Node connectivity for interface "eth2" Source Destination Connected? ------------------------------ ------------------------------ ---------------- grac42[192.168.2.102] grac41[192.168.2.101] yes Result: Node connectivity passed for interface "eth2" Check: TCP connectivity of subnet "192.168.2.0" Source Destination Connected? ------------------------------ ------------------------------ ---------------- grac42:192.168.2.102 grac41:192.168.2.101 passed Result: TCP connectivity check passed for subnet "192.168.2.0" Checking subnet mask consistency... Subnet mask consistency check passed for subnet "192.168.1.0". Subnet mask consistency check passed for subnet "192.168.2.0". Subnet mask consistency check passed. Result: Node connectivity check passed Checking multicast communication... Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"... Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed. Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"... Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed. Check of multicast communication passed. Check: Time zone consistency Result: Time zone consistency check passed Checking Oracle Cluster Voting Disk configuration... ERROR: PRVF-4193 : Asm is not running on the following nodes. Proceeding with the remaining nodes. --> Expected error as lower CRS stack is not completly up and running grac41 Oracle Cluster Voting Disk configuration check passed Checking Cluster manager integrity... Checking CSS daemon... Node Name Status ------------------------------------ ------------------------ grac42 running grac41 not running ERROR: PRVF-5319 : Oracle Cluster Synchronization Services do not appear to be online. Cluster manager integrity check failed --> Expected error as lower CRS stack is not completely up and running UDev attributes check for OCR locations started... Result: UDev attributes check passed for OCR locations UDev attributes check for Voting Disk locations started... Result: UDev attributes check passed for Voting Disk locations Check default user file creation mask Node Name Available Required Comment ------------ ------------------------ ------------------------ ---------- grac42 22 0022 passed grac41 22 0022 passed Result: Default user file creation mask check passed Checking cluster integrity... Node Name ------------------------------------ grac41 grac42 grac43 Cluster integrity check failed This check did not run on the following node(s): grac41 Checking OCR integrity... Checking the absence of a non-clustered configuration... All nodes free of non-clustered, local-only configurations ERROR: PRVF-4193 : Asm is not running on the following nodes. Proceeding with the remaining nodes. grac41 --> Expected error as lower CRS stack is not completely up and running Checking OCR config file "/etc/oracle/ocr.loc"... 
OCR config file "/etc/oracle/ocr.loc" check successful ERROR: PRVF-4195 : Disk group for ocr location "+OCR" not available on the following nodes: grac41 --> Expected error as lower CRS stack is not completly up and running NOTE: This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR. OCR integrity check failed Checking CRS integrity... Clusterware version consistency passed The Oracle Clusterware is healthy on node "grac42" ERROR: PRVF-5305 : The Oracle Clusterware is not healthy on node "grac41" CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager CRS integrity check failed --> Expected error as lower CRS stack is not completly up and running Checking node application existence... Checking existence of VIP node application (required) Node Name Required Running? Comment ------------ ------------------------ ------------------------ ---------- grac42 yes yes passed grac41 yes no exists VIP node application is offline on nodes "grac41" Checking existence of NETWORK node application (required) Node Name Required Running? Comment ------------ ------------------------ ------------------------ ---------- grac42 yes yes passed grac41 yes no failed PRVF-4570 : Failed to check existence of NETWORK node application on nodes "grac41" --> Expected error as lower CRS stack is not completly up and running Checking existence of GSD node application (optional) Node Name Required Running? Comment ------------ ------------------------ ------------------------ ---------- grac42 no no exists grac41 no no exists GSD node application is offline on nodes "grac42,grac41" Checking existence of ONS node application (optional) Node Name Required Running? Comment ------------ ------------------------ ------------------------ ---------- grac42 no yes passed grac41 no no failed PRVF-4576 : Failed to check existence of ONS node application on nodes "grac41" --> Expected error as lower CRS stack is not completly up and running Checking Single Client Access Name (SCAN)... SCAN Name Node Running? ListenerName Port Running? ---------------- ------------ ------------ ------------ ------------ ------------ grac4-scan.grid4.example.com grac43 true LISTENER_SCAN1 1521 true grac4-scan.grid4.example.com grac42 true LISTENER_SCAN2 1521 true Checking TCP connectivity to SCAN Listeners... Node ListenerName TCP connectivity? ------------ ------------------------ ------------------------ grac42 LISTENER_SCAN1 yes grac42 LISTENER_SCAN2 yes TCP connectivity to SCAN Listeners exists on all cluster nodes Checking name resolution setup for "grac4-scan.grid4.example.com"... Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ... Checking if "hosts" entry in file "/etc/nsswitch.conf" is consistent across nodes... 
Checking file "/etc/nsswitch.conf" to make sure that only one "hosts" entry is defined More than one "hosts" entry does not exist in any "/etc/nsswitch.conf" file All nodes have same "hosts" entry defined in file "/etc/nsswitch.conf" Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed SCAN Name IP Address Status Comment ------------ ------------------------ ------------------------ ---------- grac4-scan.grid4.example.com 192.168.1.165 passed grac4-scan.grid4.example.com 192.168.1.168 passed grac4-scan.grid4.example.com 192.168.1.170 passed Verification of SCAN VIP and Listener setup passed Checking OLR integrity... Checking OLR config file... ERROR: PRVF-4184 : OLR config file check failed on the following nodes: grac41 grac41:Group of file "/etc/oracle/olr.loc" did not match the expected value. [Expected = "oinstall" ; Found = "root"] Fix : [grid@grac41 ~]$ ls -l /etc/oracle/olr.loc -rw-r--r--. 1 root root 81 May 11 14:02 /etc/oracle/olr.loc root@grac41 Desktop]# chown root:oinstall /etc/oracle/olr.loc Checking OLR file attributes... OLR file check successful OLR integrity check failed Checking GNS integrity... Checking if the GNS subdomain name is valid... The GNS subdomain name "grid4.example.com" is a valid domain name Checking if the GNS VIP belongs to same subnet as the public network... Public network subnets "192.168.1.0" match with the GNS VIP "192.168.1.0" Checking if the GNS VIP is a valid address... GNS VIP "192.168.1.59" resolves to a valid IP address Checking the status of GNS VIP... Checking if FDQN names for domain "grid4.example.com" are reachable PRVF-5216 : The following GNS resolved IP addresses for "grac4-scan.grid4.example.com" are not reachable: "192.168.1.168" PRKN-1035 : Host "192.168.1.168" is unreachable --> GNS resolved IP addresses are reachable GNS resolved IP addresses are reachable GNS resolved IP addresses are reachable GNS resolved IP addresses are reachable Checking status of GNS resource... Node Running? Enabled? ------------ ------------------------ ------------------------ grac42 yes yes grac41 no yes GNS resource configuration check passed Checking status of GNS VIP resource... Node Running? Enabled? ------------ ------------------------ ------------------------ grac42 yes yes grac41 no yes GNS VIP resource configuration check passed. GNS integrity check passed OCR detected on ASM. Running ACFS Integrity checks... Starting check to see if ASM is running on all cluster nodes... PRVF-5110 : ASM is not running on nodes: "grac41," --> Expected error as lower CRS stack is not completly up and running Starting Disk Groups check to see if at least one Disk Group configured... Disk Group Check passed. At least one Disk Group configured Task ACFS Integrity check failed Checking to make sure user "grid" is not in "root" group Node Name Status Comment ------------ ------------------------ ------------------------ grac42 passed does not exist grac41 passed does not exist Result: User "grid" is not part of "root" group. Check passed Checking if Clusterware is installed on all nodes... Check of Clusterware install passed Checking if CTSS Resource is running on all nodes... 
Check: CTSS Resource running on all nodes Node Name Status ------------------------------------ ------------------------ grac42 passed grac41 failed PRVF-9671 : CTSS on node "grac41" is not in ONLINE state, when checked with command "/u01/app/11204/grid/bin/crsctl stat resource ora.ctssd -init" --> Expected error as lower CRS stack is not completly up and running Result: Check of CTSS resource passed on all nodes Querying CTSS for time offset on all nodes... Result: Query of CTSS for time offset passed Check CTSS state started... Check: CTSS state Node Name State ------------------------------------ ------------------------ grac42 Observer CTSS is in Observer state. Switching over to clock synchronization checks using NTP Starting Clock synchronization checks using Network Time Protocol(NTP)... NTP Configuration file check started... The NTP configuration file "/etc/ntp.conf" is available on all nodes NTP Configuration file check passed Checking daemon liveness... Check: Liveness for "ntpd" Node Name Running? ------------------------------------ ------------------------ grac42 yes Result: Liveness check passed for "ntpd" Check for NTP daemon or service alive passed on all nodes Checking NTP daemon command line for slewing option "-x" Check: NTP daemon command line Node Name Slewing Option Set? ------------------------------------ ------------------------ grac42 yes Result: NTP daemon slewing option check passed Checking NTP daemon's boot time configuration, in file "/etc/sysconfig/ntpd", for slewing option "-x" Check: NTP daemon's boot time configuration Node Name Slewing Option Set? ------------------------------------ ------------------------ grac42 yes Result: NTP daemon's boot time configuration check for slewing option passed Checking whether NTP daemon or service is using UDP port 123 on all nodes Check for NTP daemon or service using UDP port 123 Node Name Port Open? ------------------------------------ ------------------------ grac42 yes NTP common Time Server Check started... NTP Time Server ".LOCL." is common to all nodes on which the NTP daemon is running Check of common NTP Time Server passed Clock time offset check from NTP Time Server started... Checking on nodes "[grac42]"... Check: Clock time offset from NTP Time Server Time Server: .LOCL. Time Offset Limit: 1000.0 msecs Node Name Time Offset Status ------------ ------------------------ ------------------------ grac42 0.0 passed Time Server ".LOCL." has time offsets that are within permissible limits for nodes "[grac42]". Clock time offset check passed Result: Clock synchronization check using Network Time Protocol(NTP) passed PRVF-9652 : Cluster Time Synchronization Services check failed --> Expected error as lower CRS stack is not completly up and running Checking VIP configuration. Checking VIP Subnet configuration. Check for VIP Subnet configuration passed. Checking VIP reachability Check for VIP reachability passed. Post-check for cluster services setup was unsuccessful. Checks did not pass for the following node(s): grac41
Verify your DHCP setup ( only if using GNS )
[root@gract1 Desktop]# cluvfy comp dhcp -clustername gract -verbose Checking if any DHCP server exists on the network... PRVG-5723 : Network CRS resource is configured to use DHCP provided IP addresses Verification of DHCP Check was unsuccessful on all the specified nodes. --> If network resource is ONLINE you aren't allowed to run this command DESCRIPTION: Checks if DHCP server exists on the network and is capable of providing required number of IP addresses. This check also verifies the response time for the DHCP server. The checks are all done on the local node. For port values less than 1024 CVU needs to be run as root user. If -networks is specified and it contains a PUBLIC network then DHCP packets are sent on the public network. By default the network on which the host IP is specified is used. This check must not be done while default network CRS resource configured to use DHCP provided IP address is online. In my case even stopping nodeapps doesn't help . Only a full cluster shutdown the command seems query the DHCP server ! [root@gract1 Desktop]# cluvfy comp dhcp -clustername gract -verbose Verifying DHCP Check Checking if any DHCP server exists on the network... Checking if network CRS resource is configured and online Network CRS resource is offline or not configured. Proceeding with DHCP checks. CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.170/255.255.255.0, lease time: 21600 At least one DHCP server exists on the network and is listening on port 67 Checking if DHCP server has sufficient free IP addresses for all VIPs... Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.170/255.255.255.0, lease time: 21600 Sending DHCP "REQUEST" packets for client ID "gract-scan1-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.170/255.255.255.0, lease time: 21600 Sending DHCP "DISCOVER" packets for client ID "gract-scan2-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.169/255.255.255.0, lease time: 21600 Sending DHCP "REQUEST" packets for client ID "gract-scan2-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.169/255.255.255.0, lease time: 21600 Sending DHCP "DISCOVER" packets for client ID "gract-scan3-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.168/255.255.255.0, lease time: 21600 Sending DHCP "REQUEST" packets for client ID "gract-scan3-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.168/255.255.255.0, lease time: 21600 Sending DHCP "DISCOVER" packets for client ID "gract-gract1-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.174/255.255.255.0, lease time: 21600 Sending DHCP "REQUEST" packets for client ID "gract-gract1-vip" CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.174/255.255.255.0, lease time: 21600 CRS-10012: released DHCP server lease for client ID gract-scan1-vip on port 67 CRS-10012: released DHCP server lease for client ID gract-scan2-vip on port 67 CRS-10012: released DHCP server lease for client ID gract-scan3-vip on port 67 CRS-10012: released DHCP server lease for client ID gract-gract1-vip on port 67 DHCP server was able to provide sufficient number of IP addresses The DHCP server response time is within acceptable limits Verification of DHCP Check was successful. 
The nameserver /var/log/messages shows the following:
Jan 21 14:42:53 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPOFFER on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPOFFER on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPOFFER on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:55 ns1 dhcpd: Wrote 6 leases to leases file.
Jan 21 14:42:55 ns1 dhcpd: DHCPREQUEST for 192.168.1.170 (192.168.1.50) from 00:00:00:00:00:00 via eth2
Jan 21 14:42:55 ns1 dhcpd: DHCPACK on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:55 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:56 ns1 dhcpd: DHCPOFFER on 192.168.1.169 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:56 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
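To correlate the cluvfy DHCP checks with the DHCP server in real time, something like the following can be run on the nameserver while the check executes; the log path assumes a syslog-based dhcpd setup as in the output above:
# follow the DHCP server log while cluvfy comp dhcp is running on a cluster node
tail -f /var/log/messages | grep --line-buffered dhcpd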
Reference :
- Common cluvfy errors and warnings
2. Common cluvfy errors and warnings including first debugging steps
PRIF-10, PRVG-1060, PRCT-1011 [ cluvfy stage -pre crsinst ]
Current Configuration :
- Your CRS stack doesn't come up and you want to verify your CRS stack
- You are running cluvfy stage -pre crsinst against an already installed CRS stack
ERROR:
PRVG-1060 : Failed to retrieve the network interface classification information from an existing CRS home at path "/u01/app/121/grid" on the local node
PRCT-1011 : Failed to run "oifcfg". Detailed error: PRIF-10: failed to initialize the cluster registry
Command : cluvfy stage -pre crsinst run against an already installed CRS stack
Workaround 1: Try to start the clusterware in exclusive mode
# crsctl start crs -excl
Oracle High Availability Services is online
CRS-4692: Cluster Ready Services is online in exclusive mode
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
$ bin/cluvfy stage -pre crsinst -n gract1
Note: if you can start the stack in exclusive mode, cluvfy stage -post crsinst should work too:
$ cluvfy stage -post crsinst -n gract1
Workaround 2: Needed if you cannot start the CRS stack in exclusive mode
If you cannot start up the CRS stack you may use the workaround from Bug 17505999 : CVU CHECKS FOR ACTIVEVERSION WHEN CRS STACK IS NOT UP.
# mv /etc/oraInst.loc /etc/oraInst.loc_sav
# mv /etc/oracle /etc/oracle_sav
$ bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
Now the command below should work, and as said before, always download the latest cluvfy version!
$ bin/cluvfy stage -pre crsinst -n gract1
..
Check for /dev/shm mounted as temporary file system passed
Pre-check for cluster services setup was successful.
Reference : Bug 17505999 : CVU CHECKS FOR ACTIVEVERSION WHEN CRS STACK IS NOT UP.
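Workaround 2 temporarily hides the existing installation from CVU, so the moved files must be restored afterwards. A minimal sketch of the full sequence, run as root, using the same file names as above:
# hide the existing CRS configuration from cluvfy ...
mv /etc/oraInst.loc /etc/oraInst.loc_sav
mv /etc/oracle /etc/oracle_sav
# ... run the pre-install checks with the standalone cluvfy ...
bin/cluvfy stage -pre crsinst -n gract1
# ... and restore the original files afterwards
mv /etc/oraInst.loc_sav /etc/oraInst.loc
mv /etc/oracle_sav /etc/oracle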
PRVF-0002 : Could not retrieve local nodename
Command    : $ ./bin/cluvfy -h
Error      : PRVF-0002 : Could not retrieve local nodename
Root cause : Nameserver down, or host not yet known in DNS
$ nslookup grac41 returns an error:
Server:     192.135.82.44
Address:    192.135.82.44#53
** server can't find grac41: NXDOMAIN
Fix        : Restart or configure DNS. nslookup should work in any case!
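A quick sanity check that the local node name resolves in DNS before rerunning cluvfy; a small sketch using standard tools:
# the local node name must be resolvable before cluvfy is run
hostname
nslookup $(hostname -s)
nslookup $(hostname -f)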
PRVG-1013 : The path “/u01/app/11203/grid” does not exist or cannot be created
Command : cluvfy stage -pre nodeadd -n grac3 -verbose
Error   : PRVG-1013 : The path "/u01/app/11203/grid" does not exist or cannot be created on the nodes to be added
          Shared resources check for node addition failed
Logfile : Check the cluvfy log: $GRID_HOME/cv/log/cvutrace.log.0
[ 15025@grac1.example.com] [Worker 1] [ 2013-08-29 15:17:08.266 CEST ] [NativeSystem.isCmdScv:499] isCmdScv: cmd=[/usr/bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyChecking=yes -o NumberOfPasswordPrompts=0 grac3 -n /bin/sh -c "if [ -d /u01 -a -w /u01 ] ; then echo exists; fi"]
...
[15025@grac1.example.com] [main] [ 2013-08-29 15:17:08.270 CEST ] [TaskNodeAddDelete.checkSharedPath:559] PRVG-1013 : The path "/u01/app/11203/grid" does not exist or cannot be created on the nodes to be added
[15025@grac1.example.com] [main] [ 2013-08-29 15:17:08.270 CEST ] [ResultSet.traceResultSet:359]
Node Add/Delete ResultSet trace.
Overall Status->VERIFICATION_FAILED
grac3-->VERIFICATION_FAILED
Root cause : The cluvfy command tries to check the /u01 directory with the write attribute and fails:
/bin/sh -c "if [ -d /u01 -a -w /u01 ] ; then echo exists; fi"
Code fix   : Drop the -w argument and we get the required output:
$ /bin/sh -c "if [ -d /u01 -a /u01 ] ; then echo exists; fi"
exists
Related BUG: Bug 13241453 : LNX64-12.1-CVU: "CLUVFY STAGE -POST NODEADD" COMMAND REPORTS PRVG-1013 ERROR
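Before rerunning the node-add check it is worth verifying the directory manually from the existing node, the same way cluvfy does; a small sketch using the node name from the example above:
# reproduce manually, from the existing node, what cluvfy checks on the node to be added
ssh grac3 '/bin/sh -c "if [ -d /u01 -a -w /u01 ] ; then echo /u01 exists and is writable; fi"'
ssh grac3 'ls -ld /u01 /u01/app 2>/dev/null'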
PRVF-5229 : GNS VIP is active before Clusterware installation
Command : $ ./bin/cluvfy comp gns -precrsinst -domain grid.example.com -vip 192.168.1.50 -verbose -n grac121
Verifying GNS integrity
Checking GNS integrity...
Checking if the GNS subdomain name is valid...
The GNS subdomain name "grid.example.com" is a valid domain name
Checking if the GNS VIP is a valid address...
GNS VIP "192.168.1.50" resolves to a valid IP address
Checking the status of GNS VIP...
Error   : PRVF-5229 : GNS VIP is active before Clusterware installation
GNS integrity check passed
Fix     : If your clusterware is already installed and up and running, ignore this error.
          If this is a new install, use an unused TCP/IP address for your GNS VIP (note: ping should fail!)
PRVF-4007 : User equivalence check failed for user “oracle”
Command : $ ./bin/cluvfy stage -pre crsinst -n grac1
Error   : PRVF-4007 : User equivalence check failed for user "oracle"
Fix     : Run sshUserSetup.sh
$ ./sshUserSetup.sh -user grid -hosts "grac1 grac2" -noPromptPassphrase
Verify SSH connectivity:
$ /usr/bin/ssh -x -l grid grac1 date
Tue Jul 16 12:14:17 CEST 2013
$ /usr/bin/ssh -x -l grid grac2 date
Tue Jul 16 12:14:25 CEST 2013
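A small loop to confirm that passwordless SSH works to every node for the installation owner before rerunning cluvfy; the node list is an example and should match your cluster:
# verify passwordless ssh for the grid user against all cluster nodes
for node in grac1 grac2; do
    ssh -o BatchMode=yes -x -l grid $node date || echo "user equivalence broken for $node"
done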
PRVF-9992 : Group of device “/dev/oracleasm/disks/DATA1” did not match the expected group
Command    : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1
Checking consistency of device group across all nodes...
Error      : PRVF-9992 : Group of device "/dev/oracleasm/disks/DATA1" did not match the expected group. [Expected = "dba"; Found = "{asmadmin=[grac1]}"]
Root cause : Cluvfy doesn't know that the grid user belongs to a different group
Fix        : Run cluvfy with -asmgrp asmadmin to provide the correct group mapping:
$ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1 -asmgrp asmadmin
PRVF-9802 : Attempt to get udev info from node “grac1” failed
Command : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1
Error   : PRVF-9802 : Attempt to get udev info from node "grac1" failed
          UDev attributes check failed for ASM Disks
Bug     : Bug 12804811 : [11203-LIN64-110725] OUI PREREQUISITE CHECK FAILED IN OL6
Fix     : If using ASMLIB you can currently ignore this error
PRVF-7539 – User “grid” does not belong to group “dba”
Error   : PRVF-7539 : User "grid" does not belong to group "dba"
Command : $ ./bin/cluvfy comp sys -p crs -n grac1
Fix     : Add the grid owner to the DBA group
Note    : CVU found following errors with Clusterware setup : User "grid" does not belong to group "dba" [ID 1505586.1]
          Cluster Verification Utility (CLUVFY) FAQ [ID 316817.1]
Bug     : Bug 12422324 : LNX64-112-CMT: HIT PRVF-7539 : GROUP "DBA" DOES NOT EXIST ON OUDA NODE ( Fixed : 11.2.0.4 )
PRVF-7617 : Node connectivity between “grac1 : 192.168.1.61” and “grac1 : 192.168.1.55” failed
Command  : $ ./bin/cluvfy comp nodecon -n grac1
Error    : PRVF-7617 : Node connectivity between "grac1 : 192.168.1.61" and "grac1 : 192.168.1.55" failed
Action 1 : Disable firewall / IP tables
# service iptables stop
# chkconfig iptables off
# iptables -F
# service iptables status
If the firewall is enabled again after a reboot, disable it permanently (chkconfig iptables off).
Action 2 : Check ssh connectivity
$ id
uid=501(grid) gid=54321(oinstall) groups=54321(oinstall),504(asmadmin),506(asmdba),507(asmoper),54322(dba)
$ ssh grac1 date
Sat Jul 27 13:42:19 CEST 2013
Fix : It seems that cluvfy comp nodecon needs to be run with at least 2 nodes.
Working command: $ ./bin/cluvfy comp nodecon -n grac1,grac2
  -> Node connectivity check passed
Failing command: $ ./bin/cluvfy comp nodecon -n grac1
  -> Verification of node connectivity was unsuccessful.
     Checks did not pass for the following node(s): grac1 : 192.168.1.61
Ignore this error if running with a single RAC node and rerun later when both nodes are available; verify that ping works with all involved IP addresses.
Action 3 : Two or more network interfaces are using the same network address
Test your node connectivity by running:
$ /u01/app/11203/grid/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
Interface information for node "grac32"
Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address         MTU
------ --------------- --------------- --------------- --------------- ------------------ ------
eth0   10.0.2.15       10.0.2.0        0.0.0.0         10.0.2.2        08:00:27:88:32:F3  1500
eth1   192.168.1.122   192.168.1.0     0.0.0.0         10.0.2.2        08:00:27:EB:39:F1  1500
eth3   192.168.1.209   192.168.1.0     0.0.0.0         10.0.2.2        08:00:27:69:AE:D2  1500
Verify the current settings via ifconfig:
eth1   Link encap:Ethernet  HWaddr 08:00:27:5A:61:E3
       inet addr:192.168.1.121  Bcast:192.168.1.255  Mask:255.255.255.0
eth3   Link encap:Ethernet  HWaddr 08:00:27:69:AE:D2
       inet addr:192.168.1.209  Bcast:192.168.1.255  Mask:255.255.255.0
--> Both eth1 and eth3 are using the same network address 192.168.1
Fix : Set up your network devices and provide a different network address, e.g. 192.168.3, for eth3
Action 4 : Intermittent PRVF-7617 error with cluvfy 11.2.0.3 ( cluvfy bug )
$ /u01/app/11203/grid/bin/cluvfy -version
11.2.0.3.0 Build 090311x8664
$ /u01/app/11203/grid/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
--> Fails intermittently with the following error:
    PRVF-7617 : Node connectivity between "grac31 : 192.168.1.121" and "grac33 : 192.168.1.220" failed
$ /home/grid/cluvfy_121/bin/cluvfy -version
12.1.0.1.0 Build 062813x8664
$ /home/grid/cluvfy_121/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
--> Works for each run
Fix : Always use the latest 12.1 cluvfy utility to test node connectivity
References:
PRVF-7617: TCP connectivity check failed for subnet (Doc ID 1335136.1)
Bug 16176086 - SOLX64-12.1-CVU:CVU REPORT NODE CONNECTIVITY CHECK FAIL FOR NICS ON SAME NODE
Bug 17043435 : EM 12C: SPORADIC INTERRUPTION WITHIN RAC-DEPLOYMENT AT THE STEP INSTALL/CLONE OR
PRVG-1172 : The IP address “192.168.122.1” is on multiple interfaces “virbr0” on nodes “grac42,grac41”
Command    : $ ./bin/cluvfy stage -pre crsinst -asm -presence local -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3,/dev/oracleasm/disks/DATA4 -n grac41,grac42
Error      : PRVG-1172 : The IP address "192.168.122.1" is on multiple interfaces "virbr0,virbr0" on nodes "grac42,grac41"
Root cause : There are multiple networks ( eth0,eth1,eth2,virbr0 ) defined
Fix        : Run cluvfy with -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect -n grac41,grac42
Sample     : $ ./bin/cluvfy stage -pre crsinst -asm -presence local -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3,/dev/oracleasm/disks/DATA4 -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect -n grac41,grac42
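If the virbr0 interface comes from an unused libvirt default network, removing that network on all nodes is another way to get rid of the warning. A hedged sketch, only applicable when libvirt virtual networking is really not needed on the cluster nodes:
# stop and disable the libvirt default network that creates virbr0 (run as root on every node)
virsh net-destroy default
virsh net-autostart default --disable
ip addr show virbr0   # should now report that the interface is gone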
Cluvfy Warnings:
PRVG-1101 : SCAN name “grac4-scan.grid4.example.com” failed to resolve ( PRVF-4664, PRVF-4657 )
Warning: PRVG-1101 : SCAN name "grac4-scan.grid4.example.com" failed to resolve
Cause  : An attempt to resolve the specified SCAN name to a list of IP addresses failed because the SCAN could not be resolved in DNS or GNS using 'nslookup'.
Action : Verify your GNS/SCAN setup using ping, nslookup and cluvfy:
$ ping -c 1 grac4-scan.grid4.example.com
PING grac4-scan.grid4.example.com (192.168.1.168) 56(84) bytes of data.
64 bytes from 192.168.1.168: icmp_seq=1 ttl=64 time=0.021 ms
--- grac4-scan.grid4.example.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 1ms
rtt min/avg/max/mdev = 0.021/0.021/0.021/0.000 ms
$ ping -c 1 grac4-scan.grid4.example.com
PING grac4-scan.grid4.example.com (192.168.1.170) 56(84) bytes of data.
64 bytes from 192.168.1.170: icmp_seq=1 ttl=64 time=0.031 ms
--- grac4-scan.grid4.example.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 0.031/0.031/0.031/0.000 ms
$ ping -c 1 grac4-scan.grid4.example.com
PING grac4-scan.grid4.example.com (192.168.1.165) 56(84) bytes of data.
64 bytes from 192.168.1.165: icmp_seq=1 ttl=64 time=0.143 ms
--- grac4-scan.grid4.example.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.143/0.143/0.143/0.000 ms
$ nslookup grac4-scan.grid4.example.com
Server:     192.168.1.50
Address:    192.168.1.50#53
Non-authoritative answer:
Name:    grac4-scan.grid4.example.com
Address: 192.168.1.168
Name:    grac4-scan.grid4.example.com
Address: 192.168.1.165
Name:    grac4-scan.grid4.example.com
Address: 192.168.1.170
$ $GRID_HOME/bin/cluvfy comp scan
Verifying scan
Checking Single Client Access Name (SCAN)...
Checking TCP connectivity to SCAN Listeners...
TCP connectivity to SCAN Listeners exists on all cluster nodes
Checking name resolution setup for "grac4-scan.grid4.example.com"...
Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed
Verification of SCAN VIP and Listener setup passed
Verification of scan was successful.
Fix: As nslookup, ping and cluvfy work as expected, you can ignore this warning.
Reference: PRVF-4664 PRVF-4657: Found inconsistent name resolution entries for SCAN name (Doc ID 887471.1)
WARNING : Could not find a suitable set of interfaces for the private interconnect
Root cause : The public (192.168.1.60) and private (192.168.1.61) interfaces use the same network address
Fix        : Provide a separate network address (e.g. 192.168.2.x) for the private interconnect
After the fix cluvfy reports:
Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
grac1 eth0:192.168.1.60
Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
grac1 eth1:192.168.2.101
WARNING: Could not find a suitable set of interfaces for VIPs
WARNING: Could not find a suitable set of interfaces for VIPs
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed.
Fix : Ignore this warning
Root Cause : Per BUG:4437727, cluvfy makes an incorrect assumption based on RFC 1918 that any IP address/subnet that
begins with any of the following octets is private and hence may not be fit for use as a VIP:
172.16.x.x through 172.31.x.x
192.168.x.x
10.x.x.x
However, this assumption does not take into account that it is possible to use these IPs as Public IP's on an
internal network (or intranet). Therefore, it is very common to use IP addresses in these ranges as
Public IP's and as Virtual IP(s), and this is a supported configuration.
Reference:
Note: CLUVFY Fails With Error: Could not find a suitable set of interfaces for VIPs or Private Interconnect [ID 338924.1]
PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option “-x”
Error    : PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option "-x"
Solution : Change /etc/sysconfig/ntpd
           # OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid"
           to
           OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
           Restart the NTPD daemon
           [root@ract1 ~]# service ntpd restart
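A minimal sketch to roll the same change out to all nodes in one pass (node names are examples; it assumes root ssh access and an active OPTIONS="-u ..." line as in the default file, so review /etc/sysconfig/ntpd afterwards):
for n in ract1 ract2
do
   # prepend the slewing option -x and restart ntpd on each node
   ssh root@$n "sed -i 's/^OPTIONS=\"-u/OPTIONS=\"-x -u/' /etc/sysconfig/ntpd && service ntpd restart"
done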
PRVF-5217 : An error occurred while trying to look up IP address for “grac1cl.grid2.example.com”
WARNING: PRVF-5217 : An error occurred while trying to look up IP address for "grac1cl.grid2.example.com"
Action : Verify with dig and nslookup that the VIP IP address is working:
$ dig grac1cl-vip.grid2.example.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6 <<>> grac1cl-vip.grid2.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23546
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1
;; QUESTION SECTION:
;grac1cl-vip.grid2.example.com. IN A
;; ANSWER SECTION:
grac1cl-vip.grid2.example.com. 120 IN A 192.168.1.121
;; AUTHORITY SECTION:
grid2.example.com. 3600 IN NS ns1.example.com.
grid2.example.com. 3600 IN NS gns2.grid2.example.com.
;; ADDITIONAL SECTION:
ns1.example.com. 3600 IN A 192.168.1.50
;; Query time: 12 msec
;; SERVER: 192.168.1.50#53(192.168.1.50)
;; WHEN: Mon Aug 12 09:39:24 2013
;; MSG SIZE rcvd: 116
$ nslookup grac1cl-vip.grid2.example.com
Server: 192.168.1.50
Address: 192.168.1.50#53
Non-authoritative answer:
Name: grac1cl-vip.grid2.example.com
Address: 192.168.1.121
Fix: Ignore this warning.
The DNS server on this system has stripped the authoritative flag. This results in an UnknownHostException being thrown
when CVU calls InetAddress.getAllByName(..), which is why cluvfy returns a WARNING.
Reference: Bug 12826689 : PRVF-5217 FROM CVU WHEN VALIDATING GNS
Running cluvfy comp dns -server fails silently – Cluvfy logs show PRCZ-2090 error
The command runcluvfy.sh comp dns -server ... just exits with SUCCESS, which is not what we expect.
Indeed this command should create a local DNS server and block until runcluvfy.sh comp dns -client -last was executed.
[grid@ractw21 linuxx64_12201_grid_home]$ runcluvfy.sh comp dns -server -domain grid122.example.com -vipaddress 192.168.1.59/255.255.255.0/enp0s8 -verbose -method root
Enter "ROOT" password:
Verifying Task DNS configuration check ...
Waiting for DNS client requests...
Verifying Task DNS configuration check ...PASSED
Verification of DNS Check was successful.
CVU operation performed:      DNS Check
Date:                         Apr 11, 2017 3:23:56 PM
CVU home:                     /media/sf_kits/Oracle/122/linuxx64_12201_grid_home/
User:                         grid
Reviewing the CVU traces shows that the cluvfy command fails with error PRCZ-2090 :
PRCZ-2090 : failed to create host key repository from file "/home/grid/.ssh/known_hosts" to establish SSH connection to node "ractw21"
[main] [ 2017-04-14 17:38:09.204 CEST ] [ExecCommandNoUserEqImpl.runCmd:374] Final CompositeOperationException: PRCZ-2009 : Failed to execute command "/media/sf_kits/Oracle/122/linuxx64_12201_grid_home//cv/admin/odnsdlite" as root within 0 seconds on nodes "ractw21"
Fix : log in as user grid via ssh and create the proper ssh environment
[grid@ractw21 linuxx64_12201_grid_home]$ ssh grid@ractw21.example.com
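One way to build the missing host key repository is to pre-populate ~/.ssh/known_hosts for the grid user, for example with ssh-keyscan. This is a sketch using the node name from the example; simply answering "yes" on the first interactive ssh login, as shown above, works as well:
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# collect the node's RSA host key under both short and fully qualified name
ssh-keyscan -t rsa ractw21 ractw21.example.com >> ~/.ssh/known_hosts
chmod 644 ~/.ssh/known_hosts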
PRVF-5636 , PRVF-5637 : The DNS response time for an unreachable node exceeded “15000” ms
Problem 1:
Command    : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1
Error      : PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: grac1
Root Cause : nslookup returns the wrong status code
# nslookup hugo.example.com
Server:        192.168.1.50
Address:    192.168.1.50#53
** server can't find hugo.example.com: NXDOMAIN
# echo $?
1
--> Note the error "can't find hugo.example.com" is ok - but not the status code
Note: PRVF-5637 : DNS response time could not be checked on following nodes [ID 1480242.1]
Bug : Bug 16038314 : PRVF-5637 : DNS RESPONSE TIME COULD NOT BE CHECKED ON FOLLOWING NODES

Problem 2:
Version : 12.1.0.2
Command : $GRID_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={gract3}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={auto}" "CLUSTER_NEW_NODE_ROLES={hub}"
Error   : SEVERE: [FATAL] [INS-13013] Target environment does not meet some mandatory requirements.
FINE: [Task.perform:594] sTaskResolvConfIntegrity:Task resolv.conf Integrity[STASKRESOLVCONFINTEGRITY]:TASK_SUMMARY:FAILED:CRITICAL:VERIFICATION_FAILED
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: gract1,gract3
Verify  : Run ping and nslookup against the SCAN address for a long time to check out node connectivity
$ ping -v gract-scan.grid12c.example.com
$ nslookup gract-scan.grid12c.example.com
Note you may need to run the above commands for a long time until the error comes up.
Root Cause : Due to the intermittent hang of the above OS commands a firewall issue could be identified
Fix        : Disable firewall
Reference  : PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes (Doc ID 1356975.1)
             PRVF-5637 : DNS response time could not be checked on following nodes (Doc ID 1480242.1)
Using the 11.2 workaround by setting : $ export IGNORE_PREADDNODE_CHECKS=Y did not help
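Because the failures in Problem 2 were intermittent, a small loop saves retyping the lookup by hand (SCAN name from the example above; the iteration count and sleep interval are arbitrary choices):
for i in $(seq 1 100)
do
   # log every failed or hanging lookup with a timestamp
   nslookup gract-scan.grid12c.example.com > /dev/null 2>&1 || echo "lookup $i failed at $(date)"
   sleep 1
done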
PRVF-4037 : CRS is not installed on any of the nodes
Error   : PRVF-4037 : CRS is not installed on any of the nodes
          PRVF-5447 : Could not verify sharedness of Oracle Cluster Voting Disk configuration
Command : $ cluvfy stage -pre crsinst -upgrade -n grac41,grac42,grac43 -rolling -src_crshome $GRID_HOME -dest_crshome /u01/app/grid_new -dest_version 12.1.0.1.0 -fixup -fixupdir /tmp -verbose
Root Cause: /u01/app/oraInventory/ContentsXML/inventory.xml was corrupted ( missing node_list for GRID HOME )
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11204/grid" TYPE="O" IDX="1" CRS="true"/>
<HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11204/racdb" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="grac41"/>
      <NODE NAME="grac42"/>
      <NODE NAME="grac43"/>
      ....
Fix: Correct entry in inventory.xml
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11204/grid" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="grac41"/>
      <NODE NAME="grac42"/>
      <NODE NAME="grac43"/>
   </NODE_LIST>
   ...
Reference : CRS is not installed on any of the nodes (Doc ID 1316815.1)
            CRS is not installed on any of the nodes. Inventory.xml is changed even when no problem with TMP files. (Doc ID 1352648.1)
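A quick sanity check (inventory path from the example above) that the Grid Infrastructure HOME entry now carries a NODE_LIST before rerunning cluvfy:
# show the GI home line plus the following lines, which should contain the NODE_LIST
grep -A 5 'CRS="true"' /u01/app/oraInventory/ContentsXML/inventory.xml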
avahi-daemon is running
Cluvfy report :
Checking daemon "avahi-daemon" is not configured and running
Daemon not configured check failed for process "avahi-daemon"
Check failed on nodes: ract2,ract1
Daemon not running check failed for process "avahi-daemon"
Check failed on nodes: ract2,ract1
Verify whether the avahi-daemon is running
$ ps -elf | grep avahi-daemon
5 S avahi     4159     1  0  80   0 -  5838 poll_s Apr02 ?        00:00:00 avahi-daemon: running [ract1.local]
1 S avahi     4160  4159  0  80   0 -  5806 unix_s Apr02 ?        00:00:00 avahi-daemon: chroot helper
Fix it ( run on all nodes ) :
To shut it down, as root
# /etc/init.d/avahi-daemon stop
To disable it, as root:
# /sbin/chkconfig avahi-daemon off
Reference: Cluster After Private Network Recovered if avahi Daemon is up and Running (Doc ID 1501093.1)
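The same fix applied to all nodes in one pass (node names are examples; assumes root ssh access to each node):
for n in ract1 ract2
do
   # stop avahi-daemon now and keep it disabled after reboot
   ssh root@$n "/etc/init.d/avahi-daemon stop; /sbin/chkconfig avahi-daemon off"
done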
Reference data is not available for verifying prerequisites on this operating system distribution
Command : ./bin/cluvfy stage -pre crsinst -upgrade -n gract3 -rolling -src_crshome $GRID_HOME -dest_crshome /u01/app/12102/grid -dest_version 12.1.0.2.0 -verbose
Error   : Reference data is not available for verifying prerequisites on this operating system distribution
          Verification cannot proceed
          Pre-check for cluster services setup was unsuccessful on all the nodes.
Root cause: cluvfy runs rpm -qa | grep release --> if this command fails the above error is thrown
Working Node
[root@gract1 log]# rpm -qa | grep release
oraclelinux-release-6Server-4.0.4.x86_64
redhat-release-server-6Server-6.4.0.4.0.1.el6.x86_64
oraclelinux-release-notes-6Server-9.x86_64
Failing Node
[root@gract1 log]# rpm -qa | grep release
rpmdb: /var/lib/rpm/__db.003: No such file or directory
error: db3 error(2) from dbenv->open: No such file or directory
-> Due to space pressure /var/lib/rpm was partially deleted on a specific RAC node
Fix : Restore the RPM packages from a REMOTE RAC node or from a backup
[root@gract1 lib]# pwd
/var/lib
[root@gract1 lib]# scp -r gract3:/var/lib/rpm .
Verify the RPM database
[root@gract1 log]# rpm -qa | grep release
oraclelinux-release-6Server-4.0.4.x86_64
redhat-release-server-6Server-6.4.0.4.0.1.el6.x86_64
oraclelinux-release-notes-6Server-9.x86_64
Related Notes:
- Oracle Secure Enterprise Search 11.2.2.2 Installation Problem On RHEL 6 - [INS-75028] Environment Does Not Meet Minimum Requirements: Unsupported OS Distribution (Doc ID 1568473.1)
- RHEL6: 12c OUI INS-13001: CVU Fails: Reference data is not available for verifying prerequisites on this operating system distribution (Doc ID 1567127.1)
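After copying /var/lib/rpm back from the healthy node, rebuilding the RPM database indexes is a common extra step (an addition, not part of the fix described above) before re-checking:
# rebuild the Berkeley DB indexes of the restored RPM database, then re-run the check
rpm --rebuilddb
rpm -qa | grep release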
Cluvfy Debug : PRVG-11049
Create a problem - shut down the cluster interconnect:
$ ifconfig eth1 down
Verify the error with cluvfy
$ cluvfy comp nodecon -n all -i eth1
Verifying node connectivity
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
ERROR:
PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
...
Step 1 - check the cvutrace.log.0 trace:
# grep PRVG /home/grid/cluvfy112/cv/log/cvutrace.log.0
[21684@grac1.example.com] [main] [ 2013-07-29 18:32:46.429 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1394]  Found Bad node(s): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
ERRORMSG(grac2): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
Step 2: Create a script and set the trace level SRVM_TRACE_LEVEL=2
rm -rf /tmp/cvutrace
mkdir /tmp/cvutrace
export CV_TRACELOC=/tmp/cvutrace
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=2
./bin/cluvfy comp nodecon -n all -i eth1 -verbose
ls /tmp/cvutrace
Run the script and check the cluvfy trace file:
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.125 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1367]  getting interface eth1 on node grac2
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.126 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1374]  Node: grac2 has no 'eth1' interfaces!
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.126 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1367]  getting interface eth1 on node grac1
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.127 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1394]  Found Bad node(s): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
Verify the problem with ifconfig on grac2 ( eth1 is not up )
# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 08:00:27:8E:6D:24
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:413623 errors:0 dropped:0 overruns:0 frame:0
          TX packets:457739 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:226391378 (215.9 MiB)  TX bytes:300565159 (286.6 MiB)
          Interrupt:16 Base address:0xd240
Fix : Restart eth1 and restart CRS
# ifconfig eth1 up
# $GRID_HOME/bin/crsctl stop crs -f
# $GRID_HOME/bin/crsctl start crs
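As a follow-up (an assumption, not part of the fix above) it is worth checking that eth1 is configured to come up at boot, so the interconnect survives the next node restart:
# expected: ONBOOT=yes
grep ONBOOT /etc/sysconfig/network-scripts/ifcfg-eth1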
Debug PRVF-9802
From the cluvfy log the following command is failing
$ /tmp/CVU_12.1.0.1.0_grid/exectask.sh -getudevinfo oracleasm/disks/DATA1
<CV_ERR><SLOS_LOC>CVU00310</SLOS_LOC><SLOS_OP></SLOS_OP><SLOS_CAT>OTHEROS</SLOS_CAT><SLOS_OTHERINFO>No UDEV rule found for device(s) specified</SLOS_OTHERINFO></CV_ERR>
<CV_VRES>1</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG><CV_CMDLOG>
<CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo oracleasm/disks/DATA1 </CV_INITCMD>
<CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT>  Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("oracleasm/disks/DATA1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>
--> No Output
Failing Command
$ /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("oracleasm/disks/DATA1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'
Diagnostics : cluvfy is scanning the directory /etc/udev/rules.d/ for udev rules for device oracleasm/disks/DATA1 - but couldn't find a rule for that device
Fix: set up the udev rules. After fixing the udev rules the above command works fine and cluvfy doesn't complain anymore
$ /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/'
kvm @ /etc/udev/rules.d/80-kvm.rules: KERNEL=="kvm", GROUP="kvm", MODE="0666"
fuse @ /etc/udev/rules.d/99-fuse.rules: KERNEL=="fuse", MODE="0666",OWNER="root",GROUP="root"
Verify:
$ /tmp/CVU_12.1.0.1.0_grid/exectask.sh -getudevinfo /dev/asmdisk1_udev_sdb1
<CV_VAL><USMDEV><USMDEV_LINE>/etc/udev/rules.d/99-oracle-asmdevices.rules KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"
</USMDEV_LINE><USMDEV_NAME>sdb1</USMDEV_NAME><USMDEV_OWNER>grid</USMDEV_OWNER><USMDEV_GROUP>asmadmin</USMDEV_GROUP><USMDEV_PERMS>0660</USMDEV_PERMS></USMDEV></CV_VAL><CV_VRES>0</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG><CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo /dev/asmdisk1_udev_sdb1 </CV_INITCMD><CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT>  Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("/dev/asmdisk1_udev_sdb1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT> /etc/udev/rules.d/99-oracle-asmdevices.rules KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"
</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>
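For reference, the kind of rule cluvfy expects, reconstructed from the Verify output above (the file name is the one shown there; a rule matching on a scsi_id RESULT string would be environment specific):
$ cat /etc/udev/rules.d/99-oracle-asmdevices.rules
KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"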
Debug and Fix PRVG-13606 Error
- Setup Chrony to avoid PRVG-13606 in a VirtualBox/RAC env
Reference:
- Cluvfy Usage
3. Debug Cluvfy error ERROR: PRVF-9802
ERROR: PRVF-9802 : Attempt to get udev information from node "hract21" failed
No UDEV rule found for device(s) specified
Checking: cv/log/cvutrace.log.0
ERRORMSG(hract21): PRVF-9802 : Attempt to get udev information from node "hract21" failed
No UDEV rule found for device(s) specified
[Thread-757] [ 2015-01-29 15:56:44.157 CET ] [StreamReader.run:65]  OUTPUT><CV_ERR><SLOS_LOC>CVU00310</SLOS_LOC><SLOS_OP> </SLOS_OP><SLOS_CAT>OTHEROS</SLOS_CAT>
<SLOS_OTHERINFO>No UDEV rule found for device(s) specified</SLOS_OTHERINFO> </CV_ERR><CV_VRES>1</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG>
<CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo asmdisk1_10G,asmdisk2_10G,asmdisk3_10G,asmdisk4_10G </CV_INITCMD><CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT>  Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("asmdisk1_10G" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
..
[Worker 3] [ 2015-01-29 15:56:44.157 CET ] [RuntimeExec.runCommand:144]  runCommand: process returns 0
[Worker 3] [ 2015-01-29 15:56:44.157 CET ] [RuntimeExec.runCommand:161]  RunTimeExec: output>
Run the exectask from the OS prompt :
[root@hract21 ~]# /tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo asmdisk1_10G,asmdisk2_10G,asmdisk3_10G,asmdisk4_10G
<CV_ERR><SLOS_LOC>CVU00310</SLOS_LOC><SLOS_OP></SLOS_OP><SLOS_CAT>OTHEROS</SLOS_CAT><SLOS_OTHERINFO>No UDEV rule found for device(s) specified</SLOS_OTHERINFO></CV_ERR><CV_VRES>1</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG>
<CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo asmdisk1_10G,asmdisk2_10G,asmdisk3_10G,asmdisk4_10G </CV_INITCMD><CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT>  Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("asmdisk1_10G" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g' </CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("asmdisk2_10G" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g' </CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
Test the exectask in detail:
[root@hract21 rules.d]# cat /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk ' {if ("asmdisk1_10G" ~ $1) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}'
--> Here awk returns nothing !
[root@hract21 rules.d]# cat /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk ' { print $1, $2, $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}'
sd?1 @ NAME="asmdisk1_10G", KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VBe7363848-cbf94b0c", OWNER="grid"
--> The above sed script adds sd?1 as parameter $1 and @ as parameter $2. Later awk searches for "asmdisk1_10G" in parameter $1
    if ("asmdisk1_10G" ~ $1) ...
    but the string "asmdisk1_10G" can only be found in parameter $3, not in parameter $1 !!
Potential Fix : If we modify the search string we get a record back !
[root@hract21 rules.d]# cat /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk ' /asmdisk1_10G/ { print $1, $2, $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}'
sd?1 @ NAME="asmdisk1_10G", KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VBe7363848-cbf94b0c", OWNER="grid", ..
--> It seems the way Oracle extracts the UDEV data does not work for OEL 6, where UDEV records can look like:
NAME="asmdisk1_10G", KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VBe7363848-cbf94b0c", OWNER="grid", GROUP="asmadmin", MODE="0660"
As the ASM disks have the proper permissions I decided to ignore the warnings
[root@hract21 rules.d]# ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 17 Jan 29 09:33 /dev/asmdisk1_10G
brw-rw---- 1 grid asmadmin 8, 33 Jan 29 09:33 /dev/asmdisk2_10G
brw-rw---- 1 grid asmadmin 8, 49 Jan 29 09:33 /dev/asmdisk3_10G
brw-rw---- 1 grid asmadmin 8, 65 Jan 29 09:33 /dev/asmdisk4_10G