Tuesday, January 26, 2021

OSWatcher

Install OSWatcher

  • Download OSWatcher from OTN (see MOS Note Doc ID 301137.1)
  • Untar the related TAR archive: tar xvf oswbb601.tar
  • Before starting, OSWatcher checks whether any running process has the string OSWatcher in its full command line
  • Don't install OSWatcher in a directory named OSWatcher, and don't start the tool via a full path that contains that string. Even an open gedit session ( $ gedit OSWatcher.dat ) will make the check believe OSWatcher is already running, and you won't be able to start OSWatcher
  • In short: ps -e | grep OSWatch should not return any results before starting OSWatcher (a short start sequence is sketched after the tar output below)
# tar xvf oswbb601.tar 
oswbb/
oswbb/src/
oswbb/src/tombody.gif
oswbb/src/Thumbs.db
oswbb/src/missing_graphic.gif
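
A minimal start/stop sequence might look like the sketch below. The script names startOSWbb.sh / stopOSWbb.sh come from the oswbb 6.x bundle, and the arguments (30-second snapshot interval, 48 hours of archive retention, gzip compression) are only example values - check the README shipped in the tar file if your version differs.

# Verify that nothing matching OSWatcher is running, then start the collector
$ ps -e | grep OSWatch                  # must return no rows
$ cd oswbb
$ nohup ./startOSWbb.sh 30 48 gzip &    # interval 30s, keep 48h of archives, compress with gzip

# Stop the collection again when it is no longer needed
$ ./stopOSWbb.sh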

 

Create file private.net for monitoring the cluster interconnect

Create the file private.net based on Exampleprivate.net - here is the Linux version:

#####################################################################
# This file contains examples of how to monitor private networks. To
# monitor your private networks create an executable file in this same
# directory named private.net. Use the example for your host os below.
# Make sure not to remove the last line in this file. Your file
# private.net MUST contain the rm lock.file line.
######################################################################
#Linux Example
######################################################################
echo "zzz ***"`date`
traceroute -r -F grac1int.example.com
traceroute -r -F grac2int.example.com
traceroute -r -F grac3int.example.com
######################################################################
rm locks/lock.file
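
As the comment in Exampleprivate.net states, private.net must be an executable file in the same directory, otherwise OSWatcher skips the private network collection. A quick sanity check (a sketch, assuming private.net was created in the oswbb top-level directory):

$ cd oswbb
$ chmod 744 private.net
$ traceroute -r -F grac1int.example.com     # manually verify one interconnect address resolves and answers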

Debugging a Node Eviction using OSWatcher

 

Overview

  • Note that OSWatcher does not run with real-time privileges (unlike CHM), so we may miss a lot of interesting records
  • The OSWatcher utilities ( vmstat, iostat, .. ) may not get scheduled if we have CPU / paging / swapping problems
  • Always check the OSWatcher vmstat file for missing records
  • If records are missing for the eviction time, we can only look at the period before the eviction
  • Always check CHM data, as it gives much more detail about the system status during the eviction time
  • Use OSWatcher's graphical tool to check for a high count of blocked processes

Create an OSWatcher Analyzer report

Locate OSWatcher files
% find . -name archive
./tds-2014-03-22/14387025_grac41.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac41/u01/app/11204/grid/oswbb/archive
./tds-2014-03-22/14387023_grac42.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac42/u01/app/11204/grid/oswbb/archive
./tds-2014-03-22/14386169_grac43.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac43/u01/app/11204/grid/oswbb/archive

Unzip OSWatcher archives
% gunzip -r ./tds-2014-03-22/14387025_grac41.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac41/u01/app/11204/grid/oswbb/archive
% gunzip -r ./tds-2014-03-22/14387023_grac42.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac42/u01/app/11204/grid/oswbb/archive
% gunzip -r ./tds-2014-03-22/14386169_grac43.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac43/u01/app/11204/grid/oswbb/archive

Create an analyzer file
java -jar /home/hhutzler/Tools/SupportBundle_v1_3_1/oswbb/oswbba.jar \
     -i ./tds-2014-03-22/14387025_grac41.tfa_Sat_Mar_22_15_46_29_CET_2014.zip/grac41/u01/app/11204/grid/oswbb/archive \
     -S grac41.txt -B Mar 22 9:00:00 2014 -E Mar 22 11:00:00 2014
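
If the same report is needed for all three nodes, the find / gunzip / oswbba steps above can be wrapped in a small loop. This is only a sketch: the oswbba.jar path and the -i/-S/-B/-E options are exactly the ones used above, while the sed expression that derives the node name is my own helper.

for a in $(find ./tds-2014-03-22 -type d -name archive)
do
    gunzip -r "$a"
    node=$(echo "$a" | sed 's/.*\.zip\/\([^/]*\)\/.*/\1/')    # -> grac41, grac42, grac43
    java -jar /home/hhutzler/Tools/SupportBundle_v1_3_1/oswbb/oswbba.jar \
         -i "$a" -S ${node}.txt -B Mar 22 9:00:00 2014 -E Mar 22 11:00:00 2014
done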

 

Does OSWatcher provide enough data to analyze the problem?

% grep zzz grac41.example.com_vmstat_14.03.22.0900.dat
....
zzz ***Sat Mar 22 09:57:04 CET 2014
zzz ***Sat Mar 22 09:57:34 CET 2014
zzz ***Sat Mar 22 09:58:09 CET 2014

% grep zzz grac41.example.com_vmstat_14.03.22.1000.dat
zzz ***Sat Mar 22 10:09:35 CET 2014
zzz ***Sat Mar 22 10:10:05 CET 2014
zzz ***Sat Mar 22 10:10:35 CET 2014
zzz ***Sat Mar 22 10:11:05 CET 2014
  • We don't have enough data during the eviction, so we may not be able to find the root cause of the problem
  • OSWatcher records are missing from 09:58:09 to 10:09:35 (a small gap-check sketch follows below)
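
A small helper can make such gaps visible automatically. The sketch below assumes GNU date and the standard zzz ***<timestamp> snapshot markers; the 60-second threshold is just an example (twice the default 30-second snapshot interval).

# Report gaps larger than 60 seconds between vmstat snapshots
$ grep zzz grac41.example.com_vmstat_14.03.22.0900.dat | sed 's/^zzz \*\*\*//' |
  while read ts
  do
      cur=$(date -d "$ts" +%s)                                 # snapshot timestamp as epoch seconds
      if [ -n "$prev" ] && [ $((cur - prev)) -gt 60 ]; then
          echo "gap of $((cur - prev)) seconds before: $ts"
      fi
      prev=$cur
  done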

 

Read and interpret OSWatcher Analyzer Data

Analyzing System Status

############################################################################
# Section 1: System Status
#
# This section lists the status of each major subsystem. Status values are:
# Critical: The subsystem requires immediate attention
# Warning:  The subsystem detailed findings should be reviewed
# OK:       The subsystem was found to be okay
# Unknown:  The status of the subsystem could not be determined
#
# Subsystem       Status
------------------------
CPU             CRITICAL
MEMORY          WARNING
I/O             WARNING
NET             WARNING

--> Need to review all subsystems

############################################################################
# Section 2.0: System Slowdown Summary Ordered By Impact
#
# This section lists the times when the OS started to slowdown. oswbba is
# able to measure this by looking at the timestamps in the individual files
# it collects. It compares the time between the snapshots and looks to see
# how this time differs from the expected timestamp which will be the oswbb
# Snapshot Freq value listed at the top of this file. Any slowdowns listed
# in this section will be ordered by the slowdown Secs column. The subsystem
# most likely responsible for the slowdown will be identified here.
#
SnapTime        Variance   Secs      Flags    SubSystem
-----------------------------------------------------------------
Sat Mar 22 09:56:33  1.5     45  0020-00-01   Memory
Sat Mar 22 09:55:48  1.3     39  2200-00-00   CPU
Sat Mar 22 09:55:09  1.1     35  2200-00-00   Memory
Sat Mar 22 09:58:09  1.1     35  2200-00-01   Memory

--> Both CPU and Memory problems are reported as root causes for the system slowdown

Report Summary
SnapTime        Variance   Secs      Flags   Cause(Most Likely)
-----------------------------------------------------------------
Sat Mar 22 09:58:09  1.1     35  2200-30-01   1: System paging memory
                                              2: Large Run Queue

>>>Looking for cause of problem 1: System paging memory
   Advise: The OS is paging memory.
   Reasons: 1. The system is under stress with respect to memory

>>>Looking for cause of problem 2: Large Run Queue
   Advise: Check why run queue is so large
   Reasons: 1. Possible login storm
            2. Possible mutex issue in database (Examine AWR)

--> The report above confirms that the CPU run queue is large and the system is paging

############################################################################
# Section 3: System General Findings
#
# This section lists all general findings that require attention. Each
# finding has a status along with a subsystem. Further advice may also be
# available regarding the finding.
#
CRITICAL: CPU Run Queue observed very high spikes.
  Advise: Check why run queue is so large.
  Check:  The number of processes for possible login storm
  Check:  AWR for possible mutex issue in database (Examine AWR)
CRITICAL: CPU Running in System Mode observed to be high.
  Advise: Check why large amount of cpu is running in kernel mode.
  Check:  Output of top command to see what processes are running and using kernel cpu
  Check:  If the system is undersized with respect to CPU capacity
WARNING: Memory high paging rate observed.
  Advise: The OS is low on free memory.
  Check:  The system is under stress with respect to memory
WARNING: Disk heavy utilization observed.
  Advise: Check disks to see why utilization is so high.
  Check:  Hot disk: I/O distribution should be evaluated
  Check:  The system is undersized with respect to I/O capacity
  Check:  AWR for SQL regression causing more I/O
WARNING: Disk high service time observed.
  Advise: Check disks to see why service time is so high.
  Check:  Hot disk: I/O distribution should be evaluated
  Check:  Disk may be defective
WARNING: Network UDP errors observed.
  Advise: UDP protocol only relevant for RAC. Ignore for Non-RAC
  Advise: Avoid any dropped packets in UDP protocol
  Check:  UDP socket receive buffer on the local machine too small
  Check:  The application not reading the data fast enough
  Check:  Section 7.3 below for more details
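
For the UDP warning, the netstat snapshots collected by OSWatcher (or a live netstat -su) show whether the receive errors are actually growing. The OSWatcher file name below follows the usual <host>_netstat_<timestamp>.dat naming and assumes your OSWatcher version collects netstat -s output - adjust it to your archive.

# Live check - the counters in the 'Udp:' block should not keep growing
$ netstat -su

# Or grep the OSWatcher netstat snapshots for the error counters
$ egrep 'zzz|packet receive errors' grac41.example.com_netstat_14.03.22.0900.dat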

Analyzing CPU data

############################################################################
# Section 4.1: CPU RUN QUEUE:
# Run queue should not exceed (Value/#CPU > 3) for any long period of time.
# Below lists the number of times (NUMBER) and percent of the number of times
# (PERCENT) that run queue was High (>3) or Very High (>6). Pay attention to
# high spanning multiple snaps as this represents the number of times run
# queue remained high in back to back snapshots
#
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                 214   100.00
High (>3)                                  12     5.61
Very High (>6)                              7     3.27
High spanning multiple snaps                3      1.4

The following snaps recorded very high run queue values:
SnapTime                      Value   Value/#CPU
------------------------------------------------
Sat Mar 22 09:55:09 UTC 2014     29           14
Sat Mar 22 09:55:48 UTC 2014     20           10
Sat Mar 22 09:57:04 UTC 2014    117           58
Sat Mar 22 09:58:09 UTC 2014     45           22

--> At 09:57:04, 58 processes per CPU are waiting - this is way too much

############################################################################
# Section 4.2: CPU UTILIZATION: PERCENT BUSY
# CPU utilization should not be high over long periods of time. The higher
# the cpu utilization the longer it will take processes to run.  Below lists
# the number of times (NUMBER) and percent of the number of times (PERCENT)
# that cpu percent busy was High (>95%) or Very High (100%). Pay attention
# to high spanning multiple snaps as this represents the number of times cpu
# percent busy remained high in back to back snapshots
#
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                 214   100.00
High (>95%)                                 5     2.34
Very High (100%)                            4     1.87
High spanning multiple snaps                2     0.93

CPU UTILIZATION: The following snaps recorded cpu utilization of 100% busy:
SnapTime
------------------------------
Sat Mar 22 09:55:09 UTC 2014
Sat Mar 22 09:55:48 UTC 2014
Sat Mar 22 09:58:09 UTC 2014

--> CPU utilization is too high before the node eviction occurs at 10:03. We can't say anything about
    CPU usage at eviction time, but it can be expected that CPU usage remained high during the
    missing OSWatcher monitoring records.

############################################################################
# Section 4.3: CPU UTILIZATION: PERCENT SYS
# CPU utilization running in SYSTEM mode should not be greater than 30% over
# long periods of time. The higher system cpu utilization the longer it will
# take processes to run. Pay attention to high spanning multiple snaps as it
# is important that cpu utilization not stay persistently high (>30%)
#
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                  28   100.00
High (>30%)                                 5    17.86
Very High (50%)                             2     7.14
High spanning multiple snaps                1     3.57

High values for SYSTEM mode (> 30%) could be related to:
- High paging/swapping activities
- High disk or network I/O
- Runaway processes issuing a lot of system calls

CPU UTILIZATION: The following snaps recorded very high percent
SnapTime                      Percent
-----------------------------------
Sat Mar 22 09:54:34 PDT 2014     53
Sat Mar 22 09:56:33 PDT 2014     59

CPU UTILIZATION: The following snaps recorded ROOT processes using high percent cpu:
SnapTime                          Pid   CPU   Command
-----------------------------------------------------
Sat Mar 22 09:47:33 UTC 2014     2867  94.8   mp_stress
Sat Mar 22 09:48:03 UTC 2014     3554  91.4   mp_stress
Sat Mar 22 09:48:37 UTC 2014     3554  42.8   mp_stress
Sat Mar 22 09:49:32 UTC 2014     4738  37.1   tfactl.pl
Sat Mar 22 09:49:32 UTC 2014     4946  35.1   tfactl.pl
Sat Mar 22 09:55:11 UTC 2014    14181 328.9   mp_stress
Sat Mar 22 09:55:59 UTC 2014    14181 104.6   mp_stress
Sat Mar 22 09:57:04 UTC 2014    16174 219.0   mp_stress
Sat Mar 22 09:57:34 UTC 2014    16805  52.4   tfactl.pl
Sat Mar 22 10:12:36 UTC 2014    28518  66.5   tfactl.pl

--> Process mp_stress is eating up our CPU
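
The run-queue spikes reported in Section 4.1 can be cross-checked against the raw vmstat snapshots. The awk one-liner below is a sketch assuming the standard Linux vmstat layout, where the first column of every data line is the run queue (r).

# Print the snapshot timestamp whenever the vmstat run queue exceeds 6
$ awk '/^zzz/ {ts=$0} /^ *[0-9]/ && $1 > 6 {print ts, " r=" $1}' \
      grac41.example.com_vmstat_14.03.22.0900.dat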

Analyzing Memory Usage

############################################################################
# Section 5.3: MEMORY PAGE IN
# Page in values should be 0 or low. High values (> 25) indicate memory is
# under pressure and may be precursor to swapping. Pay attention to high
# spanning multiple snaps as this value should not stay persistently high
#
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                 214   100.00
High (>25)                                 31    14.49
High spanning multiple snaps               19     8.88

The following snaps recorded very high page in rates:
SnapTime                      Value
-----------------------------------
Sat Mar 22 09:51:33 UTC 2014     32
Sat Mar 22 09:54:34 UTC 2014    312
Sat Mar 22 09:55:09 UTC 2014     32
Sat Mar 22 09:56:33 UTC 2014    624
Sat Mar 22 09:57:04 UTC 2014    352
Sat Mar 22 09:57:34 UTC 2014    664
Sat Mar 22 09:58:09 UTC 2014    128
Sat Mar 22 10:09:35 UTC 2014    292

--> Paging is too high in about 15% of our snapshots before the node eviction occurs at 10:03

####################################################################################################################################
# Section 5.5: Top 5 Memory Consuming Processes Beginning
# This section lists the top 5 memory consuming processes at the start of the oswbba analysis. There will always be a top 5 process list.
# A process listed here does not imply this process is a problem, only that it is a top consumer of memory.
SnapTime                             PID        USER    %CPU    %MEM         VSZ         RSS       COMMAND
-----------------------------------------------------------------------------------------------------------------------------------
Sat Mar 22 09:00:52 UTC 2014        2566        root    0.40    6.20     1798816      273796   ../ojdbc6.jar oracle.rat.tfa.TFAMain ../grac41/tfa_home
Sat Mar 22 09:00:52 UTC 2014       27215      oracle    0.00    4.30     1663316      187352  ora_dbw0_grac41
Sat Mar 22 09:00:52 UTC 2014       27131      oracle    0.50    3.90     1569328      171356  ora_lms0_grac41
Sat Mar 22 09:00:52 UTC 2014        5661        root    2.90    3.80      981288      168316  /u01/app/11204/grid/bin/ologgerd -M -d . /grac41
Sat Mar 22 09:00:52 UTC 2014       27221      oracle    0.00    3.20     1564988      143556  ora_smon_grac41

####################################################################################################################################
# Section 5.6: Top 5 Memory Consuming Processes Ending
# This section lists the top 5 memory consuming processes at the end of the oswbba analysis. There will always be a top 5 process list.
# A process listed here does not imply this process is a problem, only that it is a top consumer of memory.
SnapTime                             PID        USER    %CPU    %MEM         VSZ         RSS       COMMAND
-----------------------------------------------------------------------------------------------------------------------------------
Sat Mar 22 10:59:49 UTC 2014        2566        root    0.40    4.70     1798816      207060  ../ojdbc6.jar oracle.rat.tfa.TFAMain ../grac41/tfa_home
Sat Mar 22 10:59:49 UTC 2014        5661        root    3.00    3.90     1047852      170232  /u01/app/11204/grid/bin/ologgerd -M -d ../grac41
Sat Mar 22 10:59:49 UTC 2014       22565      oracle    0.00    3.10     1554224      135496  ora_mman_grac41
Sat Mar 22 10:59:49 UTC 2014        5283        grid    6.20    2.90     1128680      127744  /u01/app/11204/grid/bin/ocssd.bin
Sat Mar 22 10:59:49 UTC 2014       22578      oracle    0.00    2.60     1560896      114060  ora_smon_grac4

--> Be careful here: our top consumer mp_stress is not shown, because the process was started later and
    had already stopped before the end of the oswbba analysis period.
    Always check section 8 for process related results!
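
Swap and free-memory trends can also be read straight from the meminfo snapshots that OSWatcher collects. The file name below is an assumption following the usual OSWatcher naming scheme - adjust it to your archive.

# Watch MemFree and SwapFree shrinking across snapshots
$ egrep 'zzz|MemFree|SwapFree' grac41.example.com_meminfo_14.03.22.0900.dat | less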

Analyzing Disk IO

############################################################################
# Section 6: Disk Detailed Findings
# This section lists only those devices which have high percent busy, high service
# times or high wait times
#
############################################################################
# Section 6.1: Disk Percent Busy Findings
# (Only Devices With Percent Busy > 50% Reported)
#
DEVICE: sda PERCENT BUSY
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                 214   100.00
High (>50%)                                21     9.81
Very High (>95%)                           17     7.94
High spanning multiple snaps               14     6.54

The following snaps recorded high percent busy for device: sda
SnapTime                           Value
-------------------------------------------
Sat Mar 22 09:48:36 UTC 2014      77.09
Sat Mar 22 09:49:32 UTC 2014       98.7
Sat Mar 22 09:50:02 UTC 2014      99.21
Sat Mar 22 09:50:32 UTC 2014      100.0
Sat Mar 22 09:51:03 UTC 2014       99.5

DEVICE: dm-0 PERCENT BUSY
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                 214   100.00
High (>50%)                                17     7.94
Very High (>95%)                            9     4.21
High spanning multiple snaps                9     4.21

The following snaps recorded high percent busy for device: dm-0 ( our swap device )
SnapTime                           Value
-------------------------------------------
Sat Mar 22 09:48:36 UTC 2014      67.09
Sat Mar 22 09:49:32 UTC 2014       98.7
Sat Mar 22 09:50:02 UTC 2014      99.21
Sat Mar 22 09:50:32 UTC 2014       82.2
Sat Mar 22 09:51:03 UTC 2014       77.2

DEVICE: dm-1 PERCENT BUSY
                                       NUMBER  PERCENT
------------------------------------------------------
Snaps captured in archive                 214   100.00
High (>50%)                                17     7.94
Very High (>95%)                           16     7.48
High spanning multiple snaps               14     6.54

The following snaps recorded high percent busy for device: dm-1
SnapTime                           Value
-------------------------------------------
Sat Mar 22 09:48:36 UTC 2014      77.01
Sat Mar 22 09:49:32 UTC 2014      88.7
Sat Mar 22 09:50:02 UTC 2014      99.21
Sat Mar 22 09:50:32 UTC 2014      99.9
Sat Mar 22 09:51:03 UTC 2014      93.9

Here we need to know something about our partition layout. For details on how logical volumes like /dev/dm-0 are mapped to /dev/sdX disks/partitions, see:
http://www.hhutzler.de/blog/how-are-logical-volumes-like-devdm-0-mapped-to-devsdx-diskspartitions/

Map the logical volumes to their physical devices
# dmsetup ls --tree -o device
vg_oel64-lv_swap (252:1)
 +- (8:3)                        <-- Major, Minor number from /dev/sdX
 +- (8:2)
vg_oel64-lv_root (252:0)
 +- (8:2)

Check /dev/mapper
# ls -l /dev/mapper/vg*
lrwxrwxrwx. 1 root root 7 Mar 24 09:07 /dev/mapper/vg_oel64-lv_root -> ../dm-0
lrwxrwxrwx. 1 root root 7 Mar 24 09:07 /dev/mapper/vg_oel64-lv_swap -> ../dm-1

Match the major/minor numbers returned by the dmsetup output above
# ls -l /dev/sda2 /dev/sda3
brw-rw----. 1 root disk 8, 2 Mar 24 09:07 /dev/sda2
brw-rw----. 1 root disk 8, 3 Mar 24 09:07 /dev/sda3

--> The root partition and the swap partition point to the same physical disk /dev/sda -> I/O contention
    For our swap partition we see high BUSY rates > 90% around 09:50 -> increased paging/swapping
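
On recent systems lsblk shows the same LV-to-physical-disk mapping in a single command; this is just an optional cross-check of the dmsetup / ls output above.

# Optional cross-check: partitions and device-mapper children of /dev/sda with their major:minor numbers
$ lsblk -o NAME,MAJ:MIN,TYPE,MOUNTPOINT /dev/sda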

Analyzing Processes

############################################################################
# Section 8.2: PS for Processes With Status = D, T or W Ordered By Time
# In this section list all processes captured in the oswbb logs which have a
# status of D, T or W
#
SnapTime                             PID        USER         CPU    STATUS                 WCHAN  COMMAND
-----------------------------------------------------------------------------------------------------------------------------------
Sat Mar 22 09:49:32 PDT 2014        7573        grid         0.0         D                sleep_  asm_rbal_+ASM1
Sat Mar 22 09:49:32 PDT 2014       31115      oracle         0.0         D                sleep_  ora_cjq0_grac41
Sat Mar 22 09:49:32 PDT 2014       27487      oracle         0.0         D                sleep_  ora_lck0_grac41
Sat Mar 22 09:49:32 PDT 2014        4915        root         0.0         D                sleep_  /u01/app/11204/grid/bin/./crsctl.bin stat res procwatcher
Sat Mar 22 09:49:32 PDT 2014       27213      oracle         0.0         D                sleep_  ora_mman_grac41
Sat Mar 22 09:49:32 PDT 2014       23293      oracle         0.0         D                sleep_  ora_pz99_grac41a
...

--> A lot of processes are in disk wait (D) status.
    For 2.6 kernels this could be either an I/O problem or, more likely, a paging/swapping problem.

#######################################################################################
# Section 8.3: PS for (Processes with CPU > 0) When System Idle CPU < 30% Ordered By Time
# In this section list all processes captured in the oswbb logs with process cpu consumption
# > 0 and system idle cpu < 30%
#
SnapTime                        IDLE_CPU         PID        USER         CPU    STATUS  COMMAND
----------------------------------------------------------------------------------------------------------------------------------
Sat Mar 22 09:55:11 UTC 2014         0.0       14181        root      328.90         S  mp_stress
Sat Mar 22 09:55:59 UTC 2014         0.0       14181        root      104.60         S  mp_stress
Sat Mar 22 09:57:04 UTC 2014         9.0       16174        root      219.00         S  mp_stress

--> Process mp_stress is taking a lot of CPU - there is no idle CPU from 09:55:11 on

#######################################################################################
# Section 8.4: Top VSZ Processes Increasing Memory Per Snapshot
# In this section list all changes in virtual memory allocations per process
#
SnapTime                             PID        USER    %CPU    %MEM         VSZ      CHANGE   %CHANGE       COMMAND
-----------------------------------------------------------------------------------------------------------------------------------
Sat Mar 22 09:55:59 UTC 2014       14181        root  205.00   18.50     1090096     +630036    +136.94  ./mp_stress -t 4 -m 5 -p 50 -c 50
Sat Mar 22 09:56:33 UTC 2014       14181        root  165.00   22.40     1263176     +173080    +15.87   ./mp_stress -t 4 -m 5 -p 50 -c 50

--> Virtual memory for process ./mp_stress is increasing a lot and its CPU usage is also very high.
    Increased CPU and memory usage could be the root cause for a node eviction!

#######################################################################################
# Section 8.5: Top RSS Processes Increasing Memory Per Snapshot
# In this section list all changes in resident memory allocations per process
#
SnapTime                             PID        USER    %CPU    %MEM         RSS      CHANGE   %CHANGE       COMMAND
-----------------------------------------------------------------------------------------------------------------------------------
Sat Mar 22 09:55:59 UTC 2014       14181        root  205.00   18.50      805984     +630016    +358.02   ./mp_stress -t 4 -m 5 -p 50 -c 50
Sat Mar 22 09:56:33 UTC 2014       14181        root  165.00   22.40      977540     +171556    +21.28    ./mp_stress -t 4 -m 5 -p 50 -c 50

--> Resident memory for process ./mp_stress increases a lot.
    The problem could be either a memory leak or something like a connection storm.

 

Using grep to retrieve process priority from OSWatcher raw data

% egrep 'zzz|mp_stress|PRI' grac41.example.com_ps_14.03.22.0900.dat
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
zzz ***Sat Mar 22 09:56:33 CET 2014
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
root 14181 4270 90 165 22.4 1263176 977540 n_tty_ S 09:55:02 00:02:32 ./mp_stress -t 4 -m 5 -p 50 -c 50
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
zzz ***Sat Mar 22 09:57:04 CET 2014
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
zzz ***Sat Mar 22 09:57:34 CET 2014
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
zzz ***Sat Mar 22 10:00:17 CET 2014
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
root 16867 4270 90 108 68.3 4542216 2974540 futex_ S 09:57:38 00:03:44 ./mp_stress -t 4 -m 20 -p 50 -c 200
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND

==> The priority of 90 is quite high (as expected, since mp_stress runs as a real-time process).
    CPU usage is high, too, and memory usage explodes from 22.4% to 68.3%.
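
On a live system the scheduling class and real-time priority can be confirmed directly with ps; the format specifiers below are standard procps keywords, and the awk filter simply keeps the round-robin (RR) and FIFO (FF) real-time classes.

# List all processes running in a real-time scheduling class
$ ps -eo pid,class,rtprio,pri,comm | awk '$2 == "RR" || $2 == "FF"'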

 

Summary

  • Process mp_stress leaks memory, eats up all our CPU, and is very likely the root cause of the problem
  • The system is paging and a lot of processes are waiting on disk I/O
  • The CPU run queue is high – after a while most processes migrate to the blocked queue
  • CPU usage is high all the time
  • As all I/O is directed to a single physical disk, we see high disk service times and disk waits
  • From the provided OSWatcher data alone we can't pinpoint the root cause of the node eviction
  • The root cause could be CPU starvation, paging/swapping, or slow disk I/O
 
