Published Tuesday, December 20, 2005 6:36 AM by robertvv

Calculating single disk performance for random I/O

I have been doing a lot of performance analysis lately. Nine times out of ten, the bad performance is caused by a poorly optimized storage solution. Why? When data doesn't fit into a server's memory, it has to come from disk sooner or later. In my experience it's usually later (due to non-optimized solutions), whereas sooner was to be expected. So, it's time to write down some disk performance basics.

I used a Seagate ST-373307FC 10,000 Rotations Per Minute (RPM) disk with a Fibre Channel (FC) interface for my calculations. You can find its specification sheet here. All information used for calculating raw disk performance was obtained from this spec sheet.

                             Random I/O (R/W)
Average Seek (mSec)               4.7/5.3   
Average latency (mSec)           2.99/2.99  
Command and data transfer (mSec)  0.2/0.2   
------------------------------------------- +
Total access time (mSec)          7.9/8.5   

These numbers were obtained from the spec sheet
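
As a sanity check on the "Average latency" row: that figure is rotational latency, and a common approximation (my assumption here, not something taken from the spec sheet) is the time for half a rotation. The helper name below is hypothetical; it is only a rough sketch of that rule.

```python
def avg_rotational_latency_ms(rpm):
    """Approximate average rotational latency as half a rotation (assumed rule)."""
    ms_per_rotation = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_rotation / 2.0


for rpm in (10_000, 15_000):
    print(f"{rpm} RPM: about {avg_rotational_latency_ms(rpm):.2f} ms")
```

For a 10,000 RPM disk this gives about 3 ms, which is close to the 2.99 mSec listed in the table.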

So the total number of I/O operations per second, better known as IOPS, can be calculated by dividing 1 second by the total access time. Using the numbers from the table above: 1/0.0079 = ~126 IOPS for read operations and 1/0.0085 = ~117 IOPS for write operations.
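
The arithmetic above can be sketched in a few lines of Python. The constants are the values from the table; the function names are my own and purely illustrative. The sketch assumes the disk serves one request at a time.

```python
# Figures from the table above (milliseconds), as (read, write) pairs.
SEEK_MS = (4.7, 5.3)
ROTATIONAL_LATENCY_MS = (2.99, 2.99)
TRANSFER_MS = (0.2, 0.2)


def access_time_ms(seek_ms, latency_ms, transfer_ms):
    """Total time for one random I/O: seek + rotational latency + transfer."""
    return seek_ms + latency_ms + transfer_ms


def iops(access_ms):
    """I/O operations per second when requests are served one at a time."""
    return 1000.0 / access_ms


read_ms = access_time_ms(SEEK_MS[0], ROTATIONAL_LATENCY_MS[0], TRANSFER_MS[0])
write_ms = access_time_ms(SEEK_MS[1], ROTATIONAL_LATENCY_MS[1], TRANSFER_MS[1])

print(f"read:  {read_ms:.2f} ms -> {iops(read_ms):.1f} IOPS")
print(f"write: {write_ms:.2f} ms -> {iops(write_ms):.1f} IOPS")
```

Because the sketch uses the unrounded sums (7.89 and 8.49 mSec), its results land a fraction above the rounded-down figures quoted in the text.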

Now we know how to calculate maximum random I/O performance for a single disk. Why is it so important to be able to calculate random I/O performance and not sequential I/O performance? The answer is quite simple. The more clients/processes a server has to serve, the bigger the chance that its data access pattern will be random. Do you really think that 4000 concurrent users connected to a file server generate sequential I/O, and that the server will be able to predict the next piece of data a client will request? No, it won't. And the fact that it can't predict the next request limits the positive performance impact of caching algorithms.

To make things even more complicated, it hardly matters whether a disk has to read one sector of 512 bytes or 32 sectors (a 16 KB block). Its platters rotate at 10,000 RPM (in the case of our Seagate disk), remember, so it reads 32 sectors almost as fast as one sector. Only when a track boundary is reached does performance drop a bit, due to the time needed to seek to the adjacent track.

Knowing this, let's calculate the maximum random I/O performance for reading 8 KB blocks (16 sectors) and 16 KB blocks (32 sectors).

The formula is quite simple: total number of IOPS x block size
Max random I/O (8 KB): 126 x 8 KB = 1008 KB per second!!!
Max random I/O (16 KB): 126 x 16 KB = 2016 KB per second!!!

Max random I/O (32 KB): 126 x 32 KB = 4032 KB per second
Max random I/O (64 KB): 126 x 64 KB = 8064 KB per second
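
The same multiplication as a small Python sketch. It uses the 126 read IOPS derived earlier and, like the figures above, assumes the IOPS number stays constant across block sizes; the function name is illustrative.

```python
READ_IOPS = 126  # read figure derived earlier in this post


def random_throughput_kb_s(iops, block_kb):
    """Maximum random throughput in KB per second: IOPS x block size."""
    return iops * block_kb


for block_kb in (8, 16, 32, 64):
    kb_s = random_throughput_kb_s(READ_IOPS, block_kb)
    print(f"Max random I/O ({block_kb} KB): {kb_s} KB per second")
```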

You might want to increase block sizes, segment sizes, stripe size etc. in order to increase the performance... Please, consider the impact of RAID-levels on this and choose the right one. For more information on the impact of RAID-levels, read this blog.

As a baseline, use the following numbers for your calculations:
For a 10,000 RPM disk: 125 IOPS.
For a 15,000 RPM disk: 175 IOPS.

What about latency then? The killing factor for disk performance is the actual time needed to perform an I/O operation. Most disks run optimally with 2-3 outstanding I/Os in their queue. Using the 7.9 mSec random access time for read operations, this gives us 7.9 mSec x 3 = 23.7 mSec maximum latency. For write operations this is 8.5 mSec x 3 = 25.5 mSec maximum latency (still using our Seagate disk as reference). As a rule of thumb, 21 mSec is considered the maximum; any average latency above that is considered a bottleneck. Any recurring spike far above that value is also considered a bottleneck.
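
A minimal sketch of the latency estimate and the rule-of-thumb check. It assumes queued requests are served strictly one after another, so a request with three I/Os outstanding waits three access times. The function names and the default threshold parameter are my own, taken from the numbers above.

```python
RULE_OF_THUMB_MS = 21.0  # maximum acceptable average latency from the text


def max_latency_ms(access_ms, queue_depth):
    """Worst-case latency when queue_depth requests are served back to back."""
    return access_ms * queue_depth


def is_bottleneck(avg_latency_ms, threshold_ms=RULE_OF_THUMB_MS):
    """True when the measured average latency exceeds the rule-of-thumb limit."""
    return avg_latency_ms > threshold_ms


print(f"read:  {max_latency_ms(7.9, 3):.1f} ms")
print(f"write: {max_latency_ms(8.5, 3):.1f} ms")
print("25 ms average is a bottleneck:", is_bottleneck(25.0))
```

In practice you would feed `is_bottleneck` an average latency measured by your own monitoring tools rather than a calculated value.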

Next time, I will talk about sequential disk performance.

