Published Monday, December 19, 2005 8:43 PM by robertvv
I'm often faced with the following customer situation: "I added CPUs to the system because the average CPU utilization exceeded 80%, but the application doesn't run any smoother. Why is that?" Well, let me try to explain.
To determine how busy a CPU is in terms of the amount of work (threads) waiting to be executed, take a closer look at the following Performance Monitor counter: "System\Processor Queue Length". If it exceeds 10 per installed CPU, you definitely need more CPUs. But... take a close look at the following two Task Manager screenshots. I used Task Manager because it makes my point easier to visualize.

A two CPU system at approx 80%.

The same system with four CPUs at approx. 40% (with extra memory added).
What do you see there? You'd expect to see the percentage of total CPU time that the CPUs spend executing threads... Wrong. You see the time the Windows "idle thread" runs, subtracted from 100 (in effect, it counts how often the "idle thread" gets to run). Why is this important? Because it shows you one important thing: two CPUs at approx. 80% is roughly the same as four CPUs at 40%. 80 x 2 = 40 x 4. Simple mathematics, right? Right. So the total amount of CPU work being done has not changed, and the application doesn't run any smoother or "faster" than it did on the system equipped with two CPUs. This tells you a lot about the application, not about the system. It tells you the application doesn't scale from two CPUs to four CPUs.
So, one should have looked at the "System\Processor Queue Length" counter first. If it indicates more than 20 threads on the two-CPU system (more than 10 per processor), the application generates enough work to take advantage of extra CPUs. If the counter is far below 20, don't bother adding CPUs; chances are that the application won't take advantage of them anyway.
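To make the arithmetic concrete, here is a minimal Python sketch of the two checks described above. The function names and the idea of expressing load as "fully busy CPU equivalents" are my own illustration, not anything Performance Monitor or Task Manager exposes; the threshold of 10 queued threads per CPU is the rule of thumb from this post.

```python
def busy_cpu_equivalents(utilization_percent, cpu_count):
    """Total work expressed as a number of fully busy CPUs (hypothetical helper)."""
    return utilization_percent * cpu_count / 100.0


def queue_suggests_more_cpus(queue_length, cpu_count, per_cpu_threshold=10):
    """Rule of thumb from the post: more than 10 waiting threads per installed CPU."""
    return queue_length > per_cpu_threshold * cpu_count


# The two screenshots: 2 CPUs at roughly 80% versus 4 CPUs at roughly 40%.
two_cpu = busy_cpu_equivalents(80, 2)
four_cpu = busy_cpu_equivalents(40, 4)
print(two_cpu, four_cpu)  # same amount of work in both cases

# Made-up queue lengths on a two-CPU system, for illustration only.
print(queue_suggests_more_cpus(25, 2))  # above 20: extra CPUs may help
print(queue_suggests_more_cpus(8, 2))   # far below 20: probably not
```

Both configurations come out at the same number of busy CPU equivalents, which is exactly the point: the work did not grow, it was only spread thinner.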
So, the moral of the story... If you look at CPU utilization, double the number of CPUs, and see the utilization drop by 50%, run away as fast as you can... You have just encountered one of the many badly scaling applications, and Windows or the system gets the blame for not scaling.
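If you want to turn that moral into a quick sanity check, something along these lines would do. It is only a sketch: the function names and the 10% tolerance are assumptions of mine, and the utilization figures are whatever averages you read off before and after the upgrade.

```python
def scaling_gain(cpus_before, util_before, cpus_after, util_after):
    """Ratio of total CPU work after the upgrade to total CPU work before it."""
    busy_before = cpus_before * util_before / 100.0
    busy_after = cpus_after * util_after / 100.0
    return busy_after / busy_before


def looks_like_poor_scaling(cpus_before, util_before, cpus_after, util_after,
                            tolerance=0.1):
    """True when the added CPUs picked up (almost) no additional work.

    The tolerance is an arbitrary illustrative value, not a documented threshold.
    """
    gain = scaling_gain(cpus_before, util_before, cpus_after, util_after)
    return gain <= 1.0 + tolerance


# The case from this post: CPUs doubled, utilization halved.
print(looks_like_poor_scaling(2, 80, 4, 40))  # True

# A made-up counterexample: CPUs doubled and utilization stayed high.
print(looks_like_poor_scaling(2, 80, 4, 75))  # False
```

A gain close to 1.0 means the application did the same amount of work on twice the hardware, which is the signature described above.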