Friday, February 15, 2013

SCCM 2007 R3 Despooler process slow on a Dell PowerEdge R910

**Updated**
I have added SetTimer.exe and ClockRes.exe to a zip file and shared it publicly on SkyDrive.  The link is http://1drv.ms/1cNCY9x.

I recently went onsite to a client running Windows Server 2008 R2 SP1 and SCCM 2007 R3.  They had decided to upgrade the hardware from a Dell PowerEdge R900 to a PowerEdge R910.  The R900 had 8 processor cores and 16 GB of RAM.  The R910 has 32 processor cores with Hyper-Threading support and 128 GB of RAM.  In all testing, the R900 processed the SCCM despooler logs in roughly 70 minutes.  On the R910, the exact same process with the exact same data took 6 hours.

The troubleshooting process was difficult, to say the least.  We ruled out a fault in this particular server, because the issue occurred on multiple platforms: an R710, an R820, and two other R910s, using Intel Westmere processors.  The disk I/O did not look bad, and when we ran IOMeter against the disks, they were relatively fast.  Memory was configured correctly across the R910's 4 NUMA nodes.

We ruled out SQL Server very quickly, as I verified that the wait stats looked relatively good.  The CXPACKET wait was a little high, so I changed the max degree of parallelism to 8 (there are 4 NUMA nodes with 8 cores in each).  We set max server memory to 100 GB and verified that all indexes and statistics were up to date.  We ran multiple PSSDiags and used SQL Nexus to verify that SQL performance and configuration were solid.  There was no other indication that SQL could be the issue.

The OS also looked relatively good.  The storport.sys driver was a little out of date, but even with the newest driver we still experienced the same issue.  Even after all drivers, firmware, Windows updates, SQL updates, and SCCM updates were installed, the issue persisted.  At this point, we decided to get Microsoft involved to look over the server.

Microsoft looked over the server, including the normal MPS report and PSSDiags.  Even they said everything looked good.  Then they suggested we look at the thread sleep time.  To do this, they had us run ClockRes.exe.  On the R900, which did not have the issue, the current timer interval was 1 ms.  On the R910, the current timer interval was 15.6 ms.  That means a thread asking for a short sleep waits roughly 15 times longer.  This interval is set when the OS boots.  The OS runs an algorithm that calculates it from the CPU frequency and some other information that I have not yet identified.  However, any application can request a different interval, and the change applies system-wide.  Two good examples are the Lync client and Windows Media Player.  Microsoft provided a small application called SetTimer.exe that, when run, changes the current timer interval to 1 ms.  With the interval set to 1 ms, the test ran faster than it had on the R900 configuration.

Before running SetTimer.exe

After running SetTimer.exe.
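
For anyone who wants to see the effect without ClockRes.exe, here is a minimal Python sketch of the same idea.  It is not the SetTimer.exe that Microsoft provided (I do not know how that tool is implemented).  It uses the documented Windows multimedia calls timeBeginPeriod and timeEndPeriod through ctypes, and the helper names measure_sleep_ms and with_timer_resolution are made up for illustration.  On non-Windows systems the timer request is skipped, so the script only reports the sleep granularity it observes.

```python
import ctypes
import sys
import time


def _sleep_ms(ms):
    """Sleep for roughly `ms` milliseconds using the platform's basic sleep."""
    if sys.platform == "win32":
        # Win32 Sleep() wakes on timer ticks, so it is subject to the
        # current timer interval that ClockRes.exe reports.
        ctypes.windll.kernel32.Sleep(int(ms))
    else:
        time.sleep(ms / 1000.0)


def measure_sleep_ms(requested_ms=1.0, samples=50):
    """Return the average wall-clock duration, in ms, of one short sleep."""
    start = time.perf_counter()
    for _ in range(samples):
        _sleep_ms(requested_ms)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / samples


def with_timer_resolution(period_ms, func):
    """Run func() while requesting a finer timer interval (Windows only).

    On other platforms the request is skipped and func() simply runs.
    """
    if sys.platform != "win32":
        return func()
    winmm = ctypes.WinDLL("winmm")
    # timeBeginPeriod returns 0 (TIMERR_NOERROR) on success.
    if winmm.timeBeginPeriod(period_ms) != 0:
        raise OSError("timeBeginPeriod rejected the requested period")
    try:
        return func()
    finally:
        # Each timeBeginPeriod call should be matched by timeEndPeriod.
        winmm.timeEndPeriod(period_ms)


if __name__ == "__main__":
    print("default: %.2f ms per 1 ms sleep" % measure_sleep_ms())
    finer = with_timer_resolution(1, measure_sleep_ms)
    print("after requesting 1 ms: %.2f ms per 1 ms sleep" % finer)
```

On a server showing the problem, I would expect the first number to sit near the 15.6 ms interval and the second to drop toward 1 ms, but I have not run this sketch on the R910.  Note that the request only lasts while the process that made it is still running, so a tool that sets the interval has to stay running for the setting to hold.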

After further research, I found that the kernel variables behind this behavior are KiCyclesPerClockQuantum and KeMaximumIncrement.  More information can be found in Windows Internals, 6th Edition, Part 1, pages 422-423, in the Quantum section.