oxenstored currently critically relies on one final unstable interface in Xenctrl [^1]: xc_domain_getinfo_single. This is a design session to propose and discuss an alternative API that could be added to a stable interface.
The interface itself is very inefficient: when a VIRQ is received, oxenstored has to iterate over all domains and call it on each one in turn to figure out which one has changed state to dying/shutdown. So although moving this API into a stable library would solve the immediate dependency problem, an API more suitable for (o)xenstored’s needs would be preferable.
get_domains_dying(4K bitmap*)
Since oxenstored first only wants to determine which domain the notification was for (a single bit of information per domain), and Xen currently supports at most 0x7FF0 domains, a bitmap of info.shutdown|info.dying would conveniently fit into a single 4KiB page (0x7FF0 bits is 4094 bytes). Performing one hypercall to fetch that bitmap should be more efficient than performing 1000 hypercalls in a row to figure out which of the 1000 domains died on a busy system. Once the domain is identified, a targeted hypercall could be made to retrieve information about (a batch of) domains, or just the shutdown code.
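To illustrate the bitmap proposed above, here is a minimal sketch of how a caller might scan one 4KiB page for dead or shut-down domains. The bit layout (domid d stored in byte d/8, bit d%8) and the names mark_dying and collect_dying are assumptions made for illustration; nothing here is an existing Xen interface, and the hypercall that would fill the page is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Domain IDs from 0x7FF0 upwards are reserved, so at most 0x7FF0 bits are needed. */
#define MAX_DOMAINS   0x7FF0u
#define BITMAP_BYTES  4096u   /* one 4KiB page holds 32768 bits */

/* Hypothetical helper standing in for what the hypervisor would do:
 * set the bit for a domain that is dying or has shut down.
 * Assumed layout: domid d lives in byte d/8, bit d%8. */
static void mark_dying(uint8_t *bitmap, unsigned domid)
{
    assert(bitmap != NULL && domid < MAX_DOMAINS);
    bitmap[domid / 8] |= (uint8_t)(1u << (domid % 8));
}

/* Scan the page and store up to max_out domain IDs whose bit is set.
 * Returns the number of IDs written to out. */
static size_t collect_dying(const uint8_t *bitmap, uint16_t *out, size_t max_out)
{
    size_t n = 0;

    assert(bitmap != NULL && out != NULL);
    for (unsigned byte = 0; byte < BITMAP_BYTES && n < max_out; byte++) {
        if (bitmap[byte] == 0)
            continue;               /* common case: nothing changed in these 8 domains */
        for (unsigned bit = 0; bit < 8 && n < max_out; bit++) {
            unsigned domid = byte * 8 + bit;
            if (domid >= MAX_DOMAINS)
                return n;           /* remaining bits are padding */
            if ((bitmap[byte] >> bit) & 1u)
                out[n++] = (uint16_t)domid;
        }
    }
    return n;
}
```

On a mostly idle page the scan skips whole zero bytes, so the cost is dominated by the single fetch of the page rather than by one call per domain.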
If the system is prepared to run N domains, then allocating a few more pages per domain to signal when it dies shouldn’t consume a lot of resources, and it would be good if the notification delivered to oxenstored also contained this information, to avoid the need to iterate over all N domains. Giving each domain its own event channel, on which oxenstored is notified that the domain has died, should be more efficient (provided that oxenstored used something more efficient than its current poll-disguised-as-select loop, which is a separate problem to solve anyway), and would avoid the need to introduce new APIs.
If there were an alternative to this API call in a library that guarantees a stable interface (i.e. not xenctrl, which by definition does not), then oxenstored could be built and distributed separately from the hypervisor. This in turn would remove longstanding barriers to contribution: it would allow reusing existing (external) libraries and adopting a more modern build system and review workflow that is more familiar to OCaml developers, which should prevent oxenstored from stagnating.
[^1] Technically there is also Xenctrl.map_foreign_range, but that already has a solution using stable interfaces, posted but not yet committed, at https://lore.kernel.org/all/2e703b8a3e75370ed0208b2c1da9a3562df82a14.1620755943.git.edvin.torok@citrix.com/, which would currently involve vendoring/copying in some code from the mirage project to use ‘xenforeignmemory’ bindings. If external dependencies could be introduced to oxenstored, that would avoid the code duplication, and we could just depend on the library already provided by the mirage sub-project.