Performance metrics for SAN environments give people in your organization the real-time, actionable information they need to make informed decisions. Performance analysis is not an exact science, however; it is closer to an art form. The numbers tell only one part of a much larger story. How you interpret them, and the steps you take afterward, are what ultimately make the difference between success and failure.
The Percentage of SP Cache Dirty Pages
SP Cache Dirty Pages are pages in the write cache that have received new data from hosts but have not yet been flushed to disk. This percentage should generally be relatively high, because a high value increases the chance that additional writes to the same block of data will be absorbed by the cache. It should still stay below 100%, since a completely full write cache has no room left to absorb new writes.
The Percentage of SP Utilization
If your SP utilization is at or above 75%, you can expect application response times to increase as a result. If you are performing non-disruptive upgrades, however, SP utilization needs to be at 50% or below, because one SP has to carry the workload of both while its peer is being upgraded.
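The two thresholds above can be sketched as a simple check. This is a minimal illustration, not a vendor tool: the function name `assess_sp_utilization` and the constant names are hypothetical, and the only values taken from the text are the 75% and 50% limits.

```python
# Thresholds from the text; the names are illustrative assumptions.
RESPONSE_TIME_RISK_PCT = 75.0  # at or above this, expect slower application response
NDU_MAX_PCT = 50.0             # at or below this, a non-disruptive upgrade is feasible


def assess_sp_utilization(utilization_pct: float) -> dict:
    """Classify one SP utilization reading against the two thresholds."""
    if not 0.0 <= utilization_pct <= 100.0:
        raise ValueError("utilization must be between 0 and 100 percent")
    return {
        "response_time_at_risk": utilization_pct >= RESPONSE_TIME_RISK_PCT,
        "ndu_safe": utilization_pct <= NDU_MAX_PCT,
    }


if __name__ == "__main__":
    for reading in (45.0, 60.0, 80.0):
        print(reading, assess_sp_utilization(reading))
```

A reading between the two limits, such as 60%, is flagged for neither condition: response time is not yet expected to suffer, but there is not enough headroom for an upgrade.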
SP Response Time
This number, which is commonly measured in milliseconds, should stay under 10 milliseconds at all times. If one of your SPs shows high utilization and a high response time while the other shows a low response time, the workload is probably unbalanced, with the busy SP serving a larger share of the array's I/O.
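The comparison between the two SPs can be sketched as follows. The `SPSample` type and both function names are hypothetical, the sample values are made up for illustration, and the 75% "busy" cutoff is an assumption borrowed from the SP utilization guidance; only the 10 millisecond limit comes from this section.

```python
from dataclasses import dataclass

MAX_RESPONSE_MS = 10.0  # from the text: response time should stay under 10 ms


@dataclass
class SPSample:
    name: str
    utilization_pct: float
    response_ms: float


def slow_sps(samples):
    """Return the names of SPs whose response time is not under the limit."""
    return [s.name for s in samples if s.response_ms >= MAX_RESPONSE_MS]


def looks_unbalanced(a: SPSample, b: SPSample, busy_pct: float = 75.0) -> bool:
    """True when one SP is both busy and slow while its peer responds quickly.

    busy_pct is an assumed cutoff, not a value stated in this section.
    """
    def hot(s):
        return s.utilization_pct >= busy_pct and s.response_ms >= MAX_RESPONSE_MS

    def cool(s):
        return s.response_ms < MAX_RESPONSE_MS

    return (hot(a) and cool(b)) or (hot(b) and cool(a))


if __name__ == "__main__":
    spa = SPSample("SPA", utilization_pct=82.0, response_ms=14.5)
    spb = SPSample("SPB", utilization_pct=30.0, response_ms=3.2)
    print(slow_sps([spa, spb]), looks_unbalanced(spa, spb))
```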
The Percentage of LUN Utilization
This number is the fraction of an observation period during which a LUN in your SAN environment has outstanding requests that have not yet been processed. A LUN that is acting as a bottleneck will show a utilization very close to 100%.
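As a worked example of the definition above, utilization is the busy time divided by the length of the observation period. The function names are hypothetical, and the 95% cutoff used for "very close to 100%" is an assumption chosen for illustration.

```python
def lun_utilization_pct(busy_seconds: float, period_seconds: float) -> float:
    """Percentage of the period during which the LUN had outstanding requests."""
    if period_seconds <= 0:
        raise ValueError("observation period must be positive")
    if not 0 <= busy_seconds <= period_seconds:
        raise ValueError("busy time must lie within the observation period")
    return 100.0 * busy_seconds / period_seconds


def is_likely_bottleneck(utilization_pct: float, cutoff_pct: float = 95.0) -> bool:
    """Flag a LUN whose utilization is near 100% (cutoff_pct is an assumption)."""
    return utilization_pct >= cutoff_pct


if __name__ == "__main__":
    # A LUN with requests outstanding for 57 seconds of a 60-second sample.
    pct = lun_utilization_pct(busy_seconds=57, period_seconds=60)
    print(pct, is_likely_bottleneck(pct))
```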
LUN Queue Length
LUN Queue Length is the average number of requests outstanding against the LUN in question over a particular period of time. A queue length of zero means that the LUN is idle. Remember that only one request can be served at any given time, so there will usually be a few waiting in the queue. If your LUN is performing poorly, you can usually expect to see a queue length greater than 2 per disk drive.
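The rule of thumb above can be sketched as a small calculation. This reads "a queue length larger than 2 for any given disk drive" as the LUN's average queue length divided by the number of drives behind it, which is an interpretation rather than something the text spells out; the function names and the sample values are hypothetical.

```python
def average_queue_length(samples):
    """Average number of outstanding requests across the sampled instants."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(samples) / len(samples)


def queue_per_drive(avg_queue: float, drive_count: int) -> float:
    """Spread the LUN's average queue length over the drives that back it."""
    if drive_count <= 0:
        raise ValueError("drive_count must be positive")
    return avg_queue / drive_count


def is_queue_excessive(avg_queue: float, drive_count: int, limit: float = 2.0) -> bool:
    """True when the per-drive queue length exceeds the limit of 2 from the text."""
    return queue_per_drive(avg_queue, drive_count) > limit


if __name__ == "__main__":
    samples = [8, 12, 10, 14]             # made-up outstanding-request counts
    avg = average_queue_length(samples)   # 11.0
    print(avg, is_queue_excessive(avg, drive_count=5))
```

With these sample values, the same average queue length of 11 is excessive on a 5-drive LUN (2.2 per drive) but not on an 8-drive LUN (about 1.4 per drive).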