Copy of email from a software engineer about TSC vs HPET:
FWIW, I wanted to provide some additional insight, as I’ve been involved with this since 2008 R2, when the Time Stamp Counter (TSC) was (re)introduced to Windows. The PlatformClock parameter / tweak is not meant to “improve” performance per se; it’s meant for stability and compatibility purposes, which can (and will) improve performance over use of the TSC when enabled within the correct (server) configuration environment.
You may recall that Windows (like most x86-based OSs) has traditionally functioned on the assumption of a single clock domain for a given server. However, servers can now be physically “scaled” (connected together) to create a larger “multinode” server (i.e. 2, 3, 4, 5, 6, 7 or 8 nodes), and there we have a problem: each server has its own local “clock”. This creates a “multi-clock” domain, which in and of itself is not bad. However, the “clocks” are not synchronized across all nodes (unless hardware clock synchronization is implemented, which is very difficult / involved), so there can be clock skew / drift between nodes and processors (other than node 0). That can lead to thread scheduling and timing issues, which at minimum cause performance problems, in addition to other strange and bizarre behavior (poor network and disk performance, hang conditions, etc.).
The problem is encountered within a multi-clock domain beginning in 2008 R2, when the TSC was re-introduced as the default clock (as mentioned above) in place of the HPET (or Power Management (ACPI / PMclock)) clock that prior OS versions used. The TSC is very fast and reliable, but in using the TSC as the default source clock, the OS assumes a single clock source. Depending upon which node the thread issuing RDTSC executes on, the values it reads can be skewed relative to other nodes, resulting in thread scheduling problems and issues such as those mentioned above.
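To make the skew problem concrete, here is a minimal Python sketch of two nodes with unsynchronized counters. It is only a toy model under stated assumptions: the NodeClock class, the offset of 5000 ticks and the tick values are all hypothetical and are not taken from any real platform or from Windows itself. It shows how an interval measured across a thread migration can come out negative, which is the kind of thing behind a negative ping value.

```python
from dataclasses import dataclass


@dataclass
class NodeClock:
    """Hypothetical model of a free-running, per-node counter (a stand-in for a TSC)."""
    offset_ticks: int        # start-up offset relative to node 0 (assumed value)
    drift_ppm: float = 0.0   # rate error in parts per million (assumed value)

    def read(self, true_ticks: int) -> int:
        # What this node's counter reports at a given "true" global tick count.
        return self.offset_ticks + int(true_ticks * (1 + self.drift_ppm / 1e6))


def elapsed(start_node: NodeClock, end_node: NodeClock,
            t_start: int, t_end: int) -> int:
    # Naive interval: the code assumes a single clock domain and simply
    # subtracts two readings, even if they came from different nodes.
    return end_node.read(t_end) - start_node.read(t_start)


node0 = NodeClock(offset_ticks=0)
node1 = NodeClock(offset_ticks=-5000)  # node 1 is 5000 ticks behind node 0

# Thread stays on node 0: the interval is correct.
same_node = elapsed(node0, node0, 1_000_000, 1_001_000)

# Thread starts on node 0 and is rescheduled onto node 1 before the second read.
migrated = elapsed(node0, node1, 1_000_000, 1_001_000)

print("same node:", same_node)
print("migrated: ", migrated)
```

In this model the true interval is 1000 ticks in both cases, but the migrated measurement absorbs the 5000-tick offset between the nodes and goes negative.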
That said, on some older servers and certain multi-node scalable server platforms, “bcdedit /set useplatformclock true” should be set to avoid these potential problems. Note: even if the default system (platform) clock is HPET, ACPI / PMclock, etc., Windows must be explicitly told to use the platform clock, or else the TSC will be used.
I wrote the following Knowledge Base (Retain Tip) article back in 2011 as a result of problems encountered when TSC was re-introduced in 2008 R2. Please note that these issues are not limited to IBM but can impact any vendor, system / server on which a multi-clock domain environment exists:
A Negative Ping Value, Hang Condition, Poor Performance or Machine Checks may be experienced on IBM System X Multinode Servers running Microsoft Windows 2008 R2 – IBM System x
The following is a good general article from MS related to time stamps, and there are others on the web as well:
Acquiring high-resolution time stamps
Regards,
Ron Arndt
--------------------------------------------------------------------------------------------------------------------
From a Desktop / Gaming standpoint, there should be no need to use the useplatformclock tweak within a Desktop / Laptop / Workstation environment unless there are known problems with the system using the TSC that could not be resolved within the vendor’s BIOS/UEFI code (which should be a known and documented issue within the vendor’s support forums, etc.).
In general, the TSC is much faster for the OS and apps to read than the other clock sources, which should provide better overall performance by avoiding the additional latency and overhead of the other clock counter methods.
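One rough way to see the read cost from user space is to time a large number of clock reads. The Python sketch below is only an approximation under stated assumptions: the helper name mean_read_cost_ns and the call count are made up for illustration, and the result includes Python interpreter overhead, so it is mainly useful for comparing the same machine before and after changing the clock source. On Windows, CPython's time.perf_counter_ns is built on QueryPerformanceCounter, which is the API whose backing source changes with the platform clock setting.

```python
import time


def mean_read_cost_ns(clock, calls: int = 200_000) -> float:
    """Average wall time per call of `clock`, in nanoseconds (hypothetical helper).

    The figure includes Python call overhead, so treat it as a relative
    number for before/after comparisons, not an absolute hardware latency.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    end = time.perf_counter_ns()
    return (end - start) / calls


cost = mean_read_cost_ns(time.perf_counter_ns)
print(f"mean cost per clock read: {cost:.1f} ns")
```

Running this once with the default clock source and once after enabling the platform clock (and rebooting) should give a feel for the difference in overhead on a given system.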
That said, depending upon the era of the HW platform (desktop, laptop, workstation, etc.), some systems running Win7 / 2008 R2 may have stability issues because their HW / FW and OS (TSC) implementations date from a transitional period. Such a system may be more stable when using its default platform clock (or the one configured within BIOS/UEFI) rather than the TSC (i.e. HPET or ACPI/PM_Timer), so stability is also a performance consideration.
In short, the TSC has the lowest overhead / latency of the available clock sources, which generally results in better performance. The TSC should therefore remain the default clock source for Windows 2008 R2 (Windows 7) and later unless known issues or problems are encountered.
Also, you might notice more of a difference in performance when testing single-processor (socket), single-core packages vs. multi-socket / multi-core packages and configurations. I believe some of the more current desktops / workstations or main boards have more than one processor socket and more than one core per processor, which may make for an interesting test as personal systems become more like the servers of a few years ago.
Regards,
Ron Arndt