Exposing hardware-backed CPU timers to limit overhead from Xen's software timers
Owner: thibodux
Time: Thursday 11:05 Final
Location: Contemporary, 6th Floor

Problem to Solve

Software-based virtual timers implemented in Xen are a source of overhead and non-determinism for virtualized applications. In some industries and use cases, these observable effects prevent Xen from being used, because performance guarantees and determinism trump almost all other concerns in those applications.

Ryan Thibodeaux and Christopher Clark seek to host a design session to discuss a proposal for exposing hardware-backed CPU timers to guests, with an initial emphasis on Intel CPUs and Linux guests. The approach considered would selectively expose the local APIC timers in each Intel CPU core, thereby allowing Linux guests to directly utilize high-resolution timers in hardware.

The proposed approach would likely entail a new guest configuration option that would control access to hardware timers. It is expected that the feature would be available only to specific configurations where side effects and guest features are limited, e.g., CPU-pinned guests that use the NULL scheduler and do not support migration.
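
As a rough illustration of the kind of restricted guest described above, the following xl guest configuration fragment is a sketch only. The guest name, the CPU numbers, and the cpupool name "rt-pool" are assumptions made for the example, and the pool is assumed to have been created separately with the null scheduler. The commented-out "hw_timer" line is a purely hypothetical placeholder for the proposed option; no such setting exists today, and its eventual name and semantics are what the session is meant to discuss.

```
# Sketch of a pinned, non-migrating guest (xl.cfg syntax).
name  = "rt-guest"     # assumed guest name
vcpus = 2
cpus  = "2-3"          # restrict the guest's vCPUs to physical CPUs 2 and 3
pool  = "rt-pool"      # assumed cpupool, created beforehand with the null scheduler

# Hypothetical placeholder for the proposed option; not an existing xl setting.
# hw_timer = 1
```

Whether such an option should be refused for guests that are not pinned, or that may be migrated, is one of the limitations the session is intended to explore.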

Attendee Contributions

Ryan and Christopher seek feedback and guidance from both the Xen and Linux maintainers. Ryan and Christopher will present an initial approach to expose CPU / hardware-backed timers (likely including patches for both projects). It is expected that the audience will review the design concepts and help to identify risks and limitations of this approach.

Ideally, the design session will conclude with a decision on the feasibility of this approach to improving timer performance, and will identify the configuration options to extend or add in support of it.

Preparation

Ryan Thibodeaux has already submitted a related patch to the Linux kernel project that allows a guest kernel to change (at boot) the minimum timer resolution in the kernel (see https://github.com/torvalds/linux/commit/2ec16bc0fc7ab544f2d405fd4fdd0d717c5ec0c5). This mirrors an existing feature in the Xen hypervisor (the "timer_slop" Xen option).