Feature: Limit IOps for VMs #576

Open
sharnoff opened this issue Oct 21, 2023 · 5 comments · May be fixed by #693
Labels: c/autoscaling/neonvm (Component: autoscaling: NeonVM), t/feature (Issue type: feature, for new features or requests)

Comments

sharnoff (Member) commented Oct 21, 2023

Problem description / Motivation

This hasn't happened yet for VMs, but in theory a noisy tenant can saturate disk IO by itself, leading to significant degradation on the underlying k8s node (affecting other pods and the kubelet itself).

Recent inspiration: https://neondb.slack.com/archives/C061XEGSCE7/p1697733194985739?thread_ts=1697732054.624899&cid=C061XEGSCE7

This is also potentially affected by recently moving the file cache to disk.

Feature idea(s) / DoD

IO rate limiting for VMs should not be an accidental side-effect of the speed of QEMU; i.e. we should have intentional safeguards to cap the amount of IO a single VM can do.

This could be implemented as a compiled-in global constant, or we could make it part of the VM spec (with some default value); it could perhaps be combined with the settings from #547.

Implementation ideas

We're already running QEMU in a cgroup; we can additionally set limits there, e.g. io.max, scaled based on the VM's CPU.
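As a rough illustration (cgroup v2, with placeholder device numbers, limits, and cgroup path; not the actual neonvm layout), setting io.max on the QEMU cgroup could look like this:

# Cap the QEMU cgroup at 1000 read/write IOPS and 100 MiB/s on device 252:0.
# "252:0" and the cgroup path are placeholders; the real major:minor comes from
# the device backing the VM's disks, and the io controller must be enabled in
# the parent's cgroup.subtree_control.
echo "252:0 riops=1000 wiops=1000 rbps=104857600 wbps=104857600" > /sys/fs/cgroup/neonvm-qemu/io.max
cat /sys/fs/cgroup/neonvm-qemu/io.max    # verify the applied limits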

We should also consider how this looks from within the VM: if QEMU is blocked on disk, does the VM kernel observe that as the underlying device being slow, or does the VM get invisibly paused? Does time spent waiting on disk count towards the QEMU cgroup's cpu.max? (if so, do we need to change that?)

sharnoff added the t/feature and c/autoscaling/neonvm labels on Oct 21, 2023

cicdteam (Member) commented Oct 29, 2023

@sharnoff

Just FYI: we can manage IOPS limits through QEMU itself. There are parameters for the -drive option (used when QEMU starts), or we can use QMP to adjust limits at runtime. A simple example from the docs: link.
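For reference, a rough sketch of both approaches (the drive id, file name, and limit values here are placeholders, not what neonvm actually uses):

# At QEMU start: throttle a virtio-blk drive to ~1000 total IOPS / 100 MiB/s
qemu-system-x86_64 ... \
    -drive file=rootdisk.qcow2,if=virtio,id=rootdisk,throttling.iops-total=1000,throttling.bps-total=104857600

# At runtime, via the QMP monitor (fields that aren't being limited are passed as 0):
{ "execute": "block_set_io_throttle",
  "arguments": { "id": "rootdisk", "iops": 1000, "iops_rd": 0, "iops_wr": 0,
                 "bps": 104857600, "bps_rd": 0, "bps_wr": 0 } }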

lassizci (Contributor) commented Dec 7, 2023

By the way, we seem to have mq-deadline as the scheduler in the guests. I think we should use noop instead: the CPU cycles the guest spends on scheduling go to waste, because the host (or, in the case of NVMe drives, the storage controller) will do its own scheduling anyway.

cicdteam (Member) commented Dec 7, 2023

From one of the NeonVMs:

  • root disk IO scheduler
root@neonvm:~# cat /sys/block/vda/queue/scheduler 
[mq-deadline] kyber none
  • /neonvm/cache IO scheduler
root@neonvm:~# cat /sys/block/vdc/queue/scheduler 
[mq-deadline] kyber none

Disks in VMs are VirtIO-blk devices, so we can try the none IO scheduler to see whether it improves anything.
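For example, switching the root disk inside a guest (takes effect immediately; persisting it would need something like a udev rule or an init-time script in the guest image):

root@neonvm:~# echo none > /sys/block/vda/queue/scheduler
root@neonvm:~# cat /sys/block/vda/queue/scheduler
mq-deadline kyber [none]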

sharnoff linked a pull request on Dec 19, 2023 that will close this issue
sharnoff (Member, Author) commented

Blocked on design work. Moving back from "in progress" to "selected".
