The behavior of the Windows scheduler changed significantly in Windows 10 2004, in a way that will break a few applications. There appears to have been no announcement, and the documentation has not been updated. This isn’t the first time this has happened, but this change seems bigger than the last one.

The short version is that calls to timeBeginPeriod from one process now affect other processes less than they used to, but there is still an effect.

I think the new behavior is an improvement, but it’s weird, and it deserves to be documented. Fair warning – all I have are the results of experiments I have run, so I can only speculate about the quirks and goals of this change. If any of my conclusions are wrong then please let me know and I will update this.

Timer interrupts and their raison d’être

First, a bit of operating-system design context. It is desirable for a program to be able to go to sleep and then wake up a little while later. This actually shouldn’t be done very often – threads should normally be waiting on events rather than timers – but it is sometimes necessary. And so we have the Windows Sleep function – pass it the desired length of your nap in milliseconds and it wakes you up later, like this:

Sleep(1);

It’s worth pausing for a moment to think about how this is implemented. Ideally the CPU goes to sleep when Sleep(1) is called, in order to save power, so how does the operating system (OS) wake your thread if the CPU is sleeping? The answer is hardware interrupts. The OS programs a timer chip that then triggers an interrupt that wakes up the CPU and the OS can then schedule your thread.

The WaitForSingleObject and WaitForMultipleObjects functions also have timeout values and those timeouts are implemented using the same mechanism.
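
For instance, a minimal sketch of an event wait with a 2 ms timeout (the event and the timeout value here are just for illustration):

    // Sketch: wait on an event with a 2 ms timeout. If nothing signals the event
    // then the wait ends at a timer interrupt at or after the 2 ms mark.
    #include <windows.h>

    int main() {
      HANDLE event = CreateEvent(nullptr, FALSE, FALSE, nullptr);
      DWORD result = WaitForSingleObject(event, 2);  // 2 ms timeout
      // result is WAIT_TIMEOUT here because nothing signaled the event.
      CloseHandle(event);
      return result == WAIT_TIMEOUT ? 0 : 1;
    }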

If there are many threads all waiting on timers then the OS could program the timer chip with individual wakeup times for each thread, but this tends to result in threads waking up at random times and the CPU never getting to have a long nap. CPU power efficiency is strongly tied to how long the CPU can stay asleep (8+ ms is apparently a good number), and random wakeups work against that. If multiple threads can synchronize or coalesce their timer waits then the system becomes more power efficient.

There are lots of ways to coalesce wakeups but the main mechanism used by Windows is to have a global timer interrupt that ticks at a steady rate. When a thread calls Sleep(n) then the OS will schedule the thread to run when the first timer interrupt fires after the time has elapsed. This means that the thread may end up waking up a bit late, but Windows is not a real-time OS and it actually cannot guarantee a specific wakeup time (there may not be a CPU core available at that time anyway) so waking up a bit late should be fine.

The interval between timer interrupts depends on the Windows version and on your hardware but on every machine I have used recently the default interval has been 15.625 ms (1,000 ms divided by 64). That means that if you call Sleep(1) at some random time then you will probably be woken sometime between 1.0 ms and 16.625 ms in the future, whenever the next interrupt fires (or the one after that if the next interrupt is too soon).

In short, it is the nature of timer delays that (unless a busy wait is used, and please don’t busy wait) the OS can only wake up threads at a specific time by using timer interrupts, and a regular timer interrupt is what Windows uses.

Some programs (WPF, SQL Server, Quartz, PowerDirector, Chrome, the Go Runtime, many games, etc.) find this much variance in wait delays hard to deal with, but luckily there is a function that lets them control it. timeBeginPeriod lets a program request a smaller timer interrupt interval by passing in the desired interval in milliseconds. There is also NtSetTimerResolution, which allows setting the interval with sub-millisecond precision, but that is rarely used and never needed so I won’t mention it again.
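
Here is a minimal sketch of how such a request is typically made and released (the 2 ms value is arbitrary, and the program must link against winmm.lib):

    // Sketch: request a 2 ms timer interrupt interval, do the timing-sensitive
    // work, then release the request with a matching timeEndPeriod call.
    #include <windows.h>
    #include <timeapi.h>  // timeBeginPeriod / timeEndPeriod, link with winmm.lib

    int main() {
      if (timeBeginPeriod(2) == TIMERR_NOERROR) {
        Sleep(1);          // In this process Sleep(1) now returns after ~2 ms
                           // rather than after up to ~16 ms.
        timeEndPeriod(2);  // The argument must match the timeBeginPeriod call.
      }
      return 0;
    }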

Decades of madness

Here’s the crazy thing: timeBeginPeriod can be called by any program and it changes the timer interrupt interval, and the timer interrupt is a global resource.

Let’s imagine that Process A is sitting in a loop calling Sleep(1). It shouldn’t be doing this, but it is, and by default it is waking up every 15.625 ms, or 64 times a second. Then Process B comes along and calls timeBeginPeriod(2). This makes the timer interrupt fire more frequently and suddenly Process A is waking up 500 times a second instead of 64 times a second. That’s crazy! But that’s how Windows has always worked.

At this point if Process C came along and called timeBeginPeriod(4) this wouldn’t change anything – Process A would continue to wake up 500 times a second. It’s not last-call-sets-the-rules, it’s lowest-request-sets-the-rules.

To be more specific, whatever still-running program has specified the smallest timer interrupt duration in an outstanding call to timeBeginPeriod gets to set the global timer interrupt interval. If that program exits or calls timeEndPeriod then the new minimum takes over. If a single program called timeBeginPeriod(1) then that is the timer interrupt interval for the entire system. If one program called timeBeginPeriod(1) and another program then called timeBeginPeriod(4) then the 1 ms timer interrupt interval would be the law of the land.
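
As a toy illustration of that rule (just bookkeeping for the sake of explanation, not how Windows implements it), the effective interval is the minimum of all outstanding requests, falling back to the default when there are none:

    // Toy model of lowest-request-sets-the-rules; not Windows code.
    #include <algorithm>
    #include <cassert>
    #include <vector>

    double EffectiveIntervalMs(const std::vector<double>& outstanding_requests) {
      if (outstanding_requests.empty())
        return 15.625;  // Default interval when nobody has called timeBeginPeriod.
      return *std::min_element(outstanding_requests.begin(),
                               outstanding_requests.end());
    }

    int main() {
      assert(EffectiveIntervalMs({}) == 15.625);       // No requests: default rate.
      assert(EffectiveIntervalMs({1.0, 4.0}) == 1.0);  // The 1 ms request wins.
      return 0;
    }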

Output of powercfg /energy /duration 5

This matters because a high timer interrupt frequency – and the associated high frequency of thread scheduling – can waste significant power, as discussed here.

One case where timer-based scheduling is needed is when implementing a web browser. The JavaScript standard has a function called setTimeout which asks the browser to call a JavaScript function some number of milliseconds later. Chromium uses timers (mostly WaitForSingleObject with timeouts rather than Sleep) to implement this and other functionality. This often requires raising the timer interrupt frequency. In order to reduce the battery-life implications of this, Chromium has recently been modified so that it doesn’t raise the timer interrupt frequency above 125 Hz (8 ms interval) when running on battery.

timeGetTime

timeGetTime (not to be confused with GetTickCount) is a function that returns the current time, as updated by the timer interrupt. CPUs have historically not been good at keeping accurate time (their clocks intentionally fluctuate to avoid being FM transmitters, and for other reasons) so they often rely on separate clock chips to keep accurate time. Reading from these clock chips is expensive so Windows maintains a 64-bit counter of the time, in milliseconds, as updated by the timer interrupt. This counter is stored in shared memory so any process can cheaply read the current time from there, without having to talk to the clock chip. timeGetTime calls ReadInterruptTick which at its core just reads this 64-bit counter. Simple!

Since this counter is updated by the timer interrupt we can monitor it and find the timer interrupt frequency.
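
A rough sketch of that measurement (winmm.lib is linked; the busy wait is acceptable only because this is a test program): spin until timeGetTime changes, and the size of the jump is the update granularity, which is the timer interrupt interval.

    // Sketch: estimate the timer interrupt interval from timeGetTime's granularity.
    #include <windows.h>
    #include <timeapi.h>  // timeGetTime, link with winmm.lib
    #include <cstdio>

    int main() {
      const DWORD start = timeGetTime();
      DWORD next = start;
      while (next == start)  // Busy wait until the counter ticks over.
        next = timeGetTime();
      printf("timeGetTime() advanced by %lu ms - roughly the interrupt interval.\n",
             static_cast<unsigned long>(next - start));
      return 0;
    }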

The new undocumented reality

With Windows 10 2004 (the April 2020 release) some of this quietly changed, and in a very confusing way. I first heard about this through reports that timeBeginPeriod didn’t work anymore. The reality turned out to be more complicated than that.

A bit of experimentation gave confusing results. When I ran a program that called timeBeginPeriod(2) then clockres showed that the timer interval was 2.0 ms, but a separate test program with a Sleep(1) loop was only waking up about 64 times a second instead of the 500 times a second that it would have woken up under previous versions of Windows.

It’s time to do science

I then wrote a pair of programs which revealed what was going on. One program (change_interval.cpp) just sits in a loop calling timeBeginPeriod with intervals ranging from 1 to 15 ms. It holds each timer interval request for four seconds, and then goes to the next one, wrapping around when it is done. It’s fifteen lines of code. Easy.
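
I haven’t reproduced the actual file here, but the idea is roughly this (a sketch, with the same one-to-fifteen millisecond sweep and four-second holds; link with winmm.lib):

    // Sketch of the change_interval.cpp idea: request each timer interrupt
    // interval from 1 ms to 15 ms, hold it for four seconds, then move on,
    // wrapping around forever.
    #include <windows.h>
    #include <timeapi.h>
    #include <cstdio>

    int main() {
      for (;;) {
        for (UINT interval = 1; interval <= 15; ++interval) {
          timeBeginPeriod(interval);
          printf("Requested a %u ms timer interrupt interval.\n", interval);
          Sleep(4000);              // Hold the request for four seconds.
          timeEndPeriod(interval);  // Release it before making the next request.
        }
      }
    }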

The other program (measure_interval.cpp) runs some tests to see how much its behavior is altered by the behavior of change_interval.cpp. It does this by gathering three pieces of information.

  1. It asks the OS what the current global timer resolution is, using NtQueryTimerResolution.
  2. It measures the precision of timeGetTime by calling it in a loop until its return value changes. When it changes then the amount it changed by is its precision.
  3. It measures the delay of Sleep(1) by calling it in a loop for a second and counting how many calls it can make. The average delay is just the reciprocal of the number of iterations (a rough sketch of this loop follows the list).
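
Something along these lines, for instance (a sketch of the third measurement only, not the actual measure_interval.cpp code; winmm.lib is needed for timeGetTime):

    // Sketch: count how many Sleep(1) calls complete in roughly one second.
    // The average delay per call is then ~1000 ms divided by the count.
    #include <windows.h>
    #include <timeapi.h>  // timeGetTime, link with winmm.lib
    #include <cstdio>

    int main() {
      const DWORD start = timeGetTime();
      int iterations = 0;
      while (timeGetTime() - start < 1000) {  // Run for about one second.
        Sleep(1);
        ++iterations;
      }
      printf("Sleep(1) average delay: %.2f ms\n", 1000.0 / iterations);
      return 0;
    }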

@FelixPetriconi ran the tests for me on Windows 10 1909 and I ran the tests on Windows 10 2004. The results (cleaned up to remove randomness) are shown here:

Table of timeGetTime precision and Sleep(1) delays

What this means is that timeBeginPeriod still sets the global timer interrupt interval, on all versions of Windows. We can tell from the results of timeGetTime() that the interrupt fires on at least one CPU core at that rate, and the time is updated. Note also that the 2.0 on row one for 1909 was 2.0 on Windows XP, then 1.0 on Windows 7/8, and is apparently back to 2.0? I guess?

However the scheduler behavior changes dramatically in Windows 10 2004. Previously the delay for Sleep(1) in any process was simply the same as the timer interrupt interval (with an exception for timeBeginPeriod(1)), giving a graph like this:

Sleep(1) delays on Windows 10 1909 vs. Global interrupt interval

In Windows 10 2004 the mapping between timeBeginPeriod and the sleep delay in another process (one that didn’t call timeBeginPeriod) is bizarre:

Sleep(1) delays on Windows 10 2004 vs. Global interrupt interval

The exact shape of the left side of the graph is unclear but it definitely slopes in the opposite direction from before!

Why?

Implications

As was pointed out in the reddit/hacker-news discussion, the left half of the graph seems to be an attempt to simulate the “normal” delay as closely as possible given the available precision of the global timer interrupt. That is, with a 6 millisecond interrupt interval they delay for ~12 ms (two cycles) and with a 7 millisecond interrupt interval they delay for ~14 ms (two cycles). However, measuring the actual delays shows that the reality is messier than that. With the timer interrupt set to 7 ms a Sleep(1) delay of 14 ms is not even the most common result:

Histogram of measured Sleep(1) delays with the timer interrupt interval set to 7 ms

Some readers will be tempted to blame this on random noise on the system, but when the timer interrupt interval is 9 ms or greater there is zero noise, so that cannot be the explanation. Try the updated code yourself. The timer interrupt intervals from 4 ms to 8 ms seem to be particularly perplexing. Probably the interval measurements should be done with QueryPerformanceCounter because the current code is affected by changing scheduling rules and changing timer precision, which is messy.
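
Such a measurement might look roughly like this (a sketch, not the code used for the graphs above):

    // Sketch: time individual Sleep(1) calls with QueryPerformanceCounter, which
    // is not affected by the timer-interrupt-updated millisecond counter.
    #include <windows.h>
    #include <cstdio>

    int main() {
      LARGE_INTEGER frequency, begin, end;
      QueryPerformanceFrequency(&frequency);
      for (int i = 0; i < 20; ++i) {
        QueryPerformanceCounter(&begin);
        Sleep(1);
        QueryPerformanceCounter(&end);
        const double elapsed_ms =
            1000.0 * (end.QuadPart - begin.QuadPart) / frequency.QuadPart;
        printf("Sleep(1) took %.3f ms\n", elapsed_ms);
      }
      return 0;
    }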

This is all very weird, and I don’t understand the rationale, or the implementation. Maybe it is a bug, but I doubt it. I think that there is complex backwards compatibility logic behind this. But, the most powerful way to avoid compatibility problems is to document your changes, preferably in advance, and this seems to have been slipped in without anyone being notified.

Most programs will be unaffected. If a process wants a faster timer interrupt then it should be calling timeBeginPeriod itself. That said, here are the problems that this could cause:

  • A program might accidentally assume that Sleep(1) and timeGetTime have similar resolutions, and that assumption is broken now. But, such an assumption seems unlikely.
  • A program might depend on a fast timer resolution and fail to request it. There have been multiple claims that some games have this problem and there is a tool called Windows System Timer Tool and another called TimerResolution 1.2 that “fix” these games by raising the timer interrupt frequency. Those fixes presumably won’t work anymore, or at least not as well. Maybe this will force those games to do a proper fix, but until then this change is a backwards compatibility problem.
  • A multi-process program might have its master control program raise the timer interrupt frequency and then expect that this would affect the scheduling of its child processes. This used to be a reasonable design choice, and now it doesn’t work. This is how I was alerted to this problem. The product in question now calls timeBeginPeriod in all of their processes so they are fine, thanks for asking, but their software was misbehaving for several months with no explanation.

Sacrifice

The change_interval.cpp test program only works if nothing has requested a higher timer interrupt frequency. Since both Chrome and Visual Studio have a habit of doing this I had to do most of my experimentation with no access to the web while writing code in notepad. Somebody suggested Emacs but wading into that debate is more than I’m willing to do.

I’d love to hear more about this from Microsoft, including any corrections to my analysis. Discussions:

10 Comments

  1. It shouldn’t be doing this, but it is

    In my opinion this still remains the conclusion, as it has been for the past decades. I cannot remember when I read a bit on Sleep() behavior and timeBeginPeriod() but I remember that what I read was enough to make clear you just shouldn't rely on these (unless you're 100% sure the consequences are within your spec and will remain so), not least because the workarounds are also widely known (IIRC – things like using WaitForSingleObject if you need an accurate Sleep).

  2. Ah yes, reminds me how on my previous project I was in charge of writing a server to mix audio for multiple clients in real time. The server worked well on my local Windows 10 machine, but when deployed to a cloud instance of Windows Server 2016 it ran very very poorly, just barely quickly enough to process data in time.

    That's when I discovered that doing a "process more data if there is any, if not – sleep(1)" loop is a very bad way of doing it, as on Windows Server 2016 "sleep(1)" means "sleep 16ms". It all worked fine once the timer resolution was changed to 1ms, but yeah, the default value will screw you over if you have anything this time-sensitive and are using sleeps or waits on Windows.

  3. About the game-fixing utilities, while it is annoying that these won't work at the moment, they should still be able to work by installing a hook that attaches itself to the game's process and calls timeBeginPeriod (several other unofficial game patches work like this already).

  4. At work we have an application that calls `timeBeginPeriod(1)` to get timer callbacks (from `CreateTimerQueue`) firing at 5ms resolutions but we are not seeing the behaviour described in the article. We observe no change to the timer resolution after calling `timeBeginPeriod(1)`, which unfortunately is a breaking change to our app.

    The lack of information and response from Microsoft on this has been quite frustrating.

  5. <rant>

    Our models of computer timers are woefully inadequate. These things execute billions of instructions per second. Why shouldn't we be able to schedule a timer at sub-millisecond resolution?

    Answer: we can. But the APIs are very old and assume conditions no longer present. Or something like that. Anyway, they don't get the job done.

    Everybody seems to start a hardware timer at some regular period, then simulate 'timer interrupts' for applications off that timer's interrupt. If you want 12.5ms but the ol' ticker is ticking at 1ms intervals, you get 13 or so; depending on where in an interval you asked, it could be 12.

    Even if nobody is using the timer, it's ticking away wasting CPU time. So the tendency is to make the period as long as possible without pissing everybody off.

    Even back in the 1980's, I worked on an OS running on the 8086 with a service called PIT (Programmable Interval Timer). You said what interval you wanted; it programmed the hardware timer for that. If it was already running, and your interval was shorter than what remained, it would reprogram it for your short time, then when it went off it reprogrammed it for the remainder.

    It kept a whole chain of scheduled expirations sorted by time. When the interrupt occurred it'd call the callback of the 1st entry and discard it. Then it'd reprogram the timer for the remaining time on the next.

    It took into account the time of the callback; the time to take the interrupt and reprogram. And it achieved sub-millisecond scheduling even back on that old sad hardware.

    And when nobody was using the timer, it didn't run! Zero wasted CPU.

    Imagine how precise timers could be today, on our super duper gigahertz hardware.

    But what do we get? We get broken, laggy, high-latency, late timer callbacks at some abominable minimum period. Sigh.

    </rant>

  6. I tested my own system using Bruce's "measure_interval.cpp" program (on Windows 1909):

    – Slack (sometimes) sets the global timer to 1ms when it is in the foreground, but restores it in background

    – Spotify sets the global timer to 1ms, no matter what. Even if it isn't playing.

    – Skype sets 1ms, if started at Startup (which it defaults to), even though I am logged out and it just has a tray icon. But when I manually start it, it doesn't (always) set it to 1ms.

    – VSCode will set it to 1ms when you are interacting with it, but will eventually revert to 15.6ms if left alone (even if it is still in foreground).

    – Firefox doesn't appear to set it (on its own; I presume that if I opened a tab that was using a low setTimeout or requestAnimFrame it might).

    Spotify is interesting. A lot of people probably have that app, and since it sets 1ms unconditionally, it would have been setting fast-timer mode prior to the 2004 update, which could inadvertently "speed up" whatever games people were running.

    That includes my own game, which uses a foreground sleep of as low as 1ms to try to hit its time target, and I don't call timeBeginPeriod. I guess I'll find out when I get the 2004 update.

  7. I once spent ages trying to determine why a Python unit test that sorted timestamps constantly failed on Windows. In the test, we compared the timestamps of performed operations, and checked to confirm that the operations happened in sequence based on their timestamp (I'm sure many of you see where this is going). On Windows, the timestamp for all the actions was exactly the same, so when sorted, the actions appeared out-of-order. It was then that I discovered Python's time library on Windows only reports times with a resolution of ~1ms, whereas on Linux the same code reports times with a resolution of ~10us. That one was actually super fun to track down, but super disappointing to discover it's not something that's easily remedied.

    (For those about to suggest how it should have been done, the application also stored an atomic revision counter, so the unit test was switched to that instead of a timestamp.)

  8. > A program might depend on a fast timer resolution and fail to request it. There have been multiple claims that some games have this problem (…)

    Yup, I wrote such a (small, freeware) game 15+ years ago. I wasn't aware of timeBeginPeriod at the time, but I observed that for some inexplicable reason, the game ran more smoothly when Winamp was running in the background. 🙂

  9. > and the timer interrupt is a global resource.

    Shouldn't this at least be per-core rather than global? Then most cores can keep scheduling at a low tick rate and only one or two have to take care of the jittery processes.
