Monotonic timers

Thursday, November 19, 2009

Factor has had a calendar library for several years now. While it’s great for converting timestamps to human-readable formats, calculating holidays, and finding the number of days between two dates, it’s the wrong concept to use for timing code, alarms, and thread switching. In such cases where you don’t need an actual date, you should use monotonic timers, which are counters that always increment from an unspecified time in the past and aren’t affected by changes to the system time. Even if the user changes the clock, these monotonic timers don’t go back in time – they keep increasing. Let’s look at the implementation.

Implementation of monotonic timers

Although I originally implemented monotonic timers as a Factor library, I moved the code into the C++ VM as a primitive called nano-count. To distinguish the usage of this word from the word formerly known as micros, I renamed micros to system-micros. Having the word “system” in the name of one time-returning word, and having “count” in the other, hopefully leads to less confusion on the user’s part.

Windows

The code I came up with for Windows looks like this:

u64 nano_count()
{
    static double scale_factor;

    static u32 hi = 0;
    static u32 lo = 0;

    LARGE_INTEGER count;
    BOOL ret = QueryPerformanceCounter(&count);
    if(ret == 0)
        fatal_error("QueryPerformanceCounter", 0);

    if(scale_factor == 0.0)
    {
        LARGE_INTEGER frequency;
        BOOL ret = QueryPerformanceFrequency(&frequency);
        if(ret == 0)
            fatal_error("QueryPerformanceFrequency", 0);
        scale_factor = (1000000000.0 / frequency.QuadPart);
    }

#ifdef FACTOR_64
    hi = count.HighPart;
#else
    /* On VirtualBox, QueryPerformanceCounter does not increment
    the high part every time the low part overflows.  Workaround. */
    if(lo > count.LowPart)
        hi++;
#endif
    lo = count.LowPart;

    return (u64)((((u64)hi << 32) | (u64)lo) * scale_factor);
}

It could probably be optimized by only calling QueryPerformanceFrequency once, but I don’t set the processor affinity yet, so I’m not convinced it will work in every case. As you can see, it’s pretty simple: the performance counter is queried and returns a number of clock cycles since some arbitrary beginning epoch, and then that time is scaled by the clock frequency to get nanoseconds.
Edit: This code contains a workaround for a VirtualBox counter bug.

Some Unix systems

u64 nano_count()
{
    struct timespec t;
    int ret;
    ret = clock_gettime(CLOCK_MONOTONIC,&t);
    if(ret != 0)
        fatal_error("clock_gettime failed", 0);
    return t.tv_sec * 1000000000 + t.tv_nsec;
}

Calling clock_gettime from the librt library or, on some platforms, as a system call, gives you the number of nanoseconds since an arbitrary start point in the past. The timespec struct has a seconds and a nanoseconds slots, while the timeval struct (used by system-micros) has seconds and microseconds.

Mac OSX

u64 nano_count()
{
    u64 time = mach_absolute_time();

    static u64 scaling_factor = 0;
    if(!scaling_factor)
    {
        mach_timebase_info_data_t info;
        kern_return_t ret = mach_timebase_info(&info);
        if(ret != 0)
            fatal_error("mach_timebase_info failed",ret);
        scaling_factor = info.numer/info.denom;
    }

    return time * scaling_factor;
}

The MacOSX code is a bit different because Apple didn’t implement clock_gettime. Instead, they have a couple of Mach functions that function just like the Windows code, with one returning a count and the other returning clock frequency information.

Upgraded alarms

The alarms vocabulary now uses monotonic timers instead of system time for scheduling alarms. Previously, the API for scheduling an alarm was the following, where passing f as the last input parameter would schedule a one-time alarm.

add-alarm ( quot start-timestamp interval-duration/f -- alarm )

However, this design is bad because the system time could change, resulting in a huge backlog of alarms to run. Also, most alarms were scheduled for less than a second into the future, which makes timestamps pretty useless since no date calculations are being performed. The new API takes a duration:

add-alarm ( quot start-duration interval-duration/f -- alarm)

Note that duration can be things like

300 milliseconds
5 seconds
200 nanoseconds

Using monotonic timers

Mouse drag alarm

Here’s an example of using an alarm from the mouse handling code:

: start-drag-timer ( -- )
    hand-buttons get-global empty? [
        [ drag-gesture ] 300 milliseconds 100 milliseconds
        add-alarm drag-timer get-global >box
    ] when ;

The drag-gesture word gets called 300 milliseconds after a mouse button has been clicked, and again every 100 milliseconds afterwards until the alarm gets cancelled when the user releases a mouse button. The alarm is put into a global box because storing into a full box throws an error, which in this case would represent impossibility of the user dragging two things at once. Once dragging stops, the alarm gets cancelled with a call to cancel-alarm. You can look at the full source here.

Benchmark word

The benchmark word times a quotation and returns the number of nanoseconds that its execution took. Its implementation follows:

: benchmark ( quot -- runtime )
    nano-count [ call nano-count ] dip - ; inline

This word simply gets the count from the monotonic timer, calls the quotation, gets a new count, and finds the elapsed time by subtraction.

Rescheduling alarms

After repeated alarms execute, they must be rescheduled to run again.

: reschedule-alarm ( alarm -- )
    dup interval>> nano-count + >>start register-alarm ;

The alarm gets rescheduled interval>> nanoseconds into the future.

Remaining issues

Putting the computer to sleep on Snow Leopard in the middle of bootstrap and then resuming does not affect timing. However, is this the case with other operating systems such as Snow Vista or Linux? If not, it might not be worth worrying about. If someone wanted to test, just start a Factor bootstrap and then put the computer to sleep for awhile and see if bootstrap time increases. Otherwise, I’ll get to it eventually.

Update: Someone on the Factor mailing list reported that putting the computer to sleep on bootstrap in Linux did not mess up the timing. Thank you!