
inotify support for kubernetes_logs #20541

Open
mrzor opened this issue May 21, 2024 · 1 comment
Labels
source: kubernetes_logs (Anything `kubernetes_logs` source related) · type: feature (A value-adding code addition that introduces new functionality.)

Comments


mrzor commented May 21, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

Some pods may succeed (enter the Succeeded state) very quickly. Because the kubernetes_logs source discovers new log files by periodically globbing the pod log directory (paced by glob_minimum_cooldown_ms), logs from pods that finish before the next scan can disappear before Vector ever sees them. The lifecycle of the log folder is presently tied to the pod lifecycle itself, and how long the folder sticks around is not straightforward to control (it is actually unclear to what extent it can be controlled at all).

Attempted Solutions

  • Adding artificial delays to workloads allows Vector to pick up the logs even with longer glob cooldowns. We found that a delay of 1.1-2.1x the glob cooldown restores perfect delivery. That is the most reliable workaround, but it requires workload-author involvement.
  • Lowering glob_minimum_cooldown_ms from 60s to 5s reduces the number of impacted workloads at the cost of vastly increased CPU usage (see the config sketch after this list). There are presumably side effects on the node itself as well, since it has to service the extra filesystem syscalls.
  • A mix of the two above (extra latency to be safe, plus a lowered cooldown) lets one walk the tradeoff space.
  • We tried to find a way to have Kubernetes keep logs around longer, but did not find a sure way (Source). Failed containers do seem to stay around longer.
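For concreteness, here is a minimal sketch of the lowered-cooldown workaround. The source name (k8s_logs) is arbitrary and all other options are omitted; glob_minimum_cooldown_ms is a real option of the kubernetes_logs source:

```toml
[sources.k8s_logs]
type = "kubernetes_logs"
# Default is 60000 (60 s). A lower value narrows the window in which a
# short-lived pod's logs can be deleted unseen, but every extra scan
# walks the pod log directory again, so CPU and syscall load grow.
glob_minimum_cooldown_ms = 5000
```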

Proposal

I can't do much beyond expressing my strong support for adding inotify support to the kubernetes_logs source. It may or may not be implemented as part of the file source; it could also be an entirely new source on which to base the kubernetes one. A rough sketch of the idea follows below.
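To make this concrete, here is a minimal, hypothetical sketch of inotify-driven discovery using the notify crate (which wraps inotify on Linux). This is not Vector's actual code; the watch path and event handling are purely illustrative:

```rust
use notify::{Event, RecursiveMode, Result, Watcher};
use std::path::Path;

fn main() -> Result<()> {
    // Hypothetical sketch: receive filesystem events as they happen
    // instead of polling on a glob cooldown.
    let mut watcher = notify::recommended_watcher(|res: Result<Event>| match res {
        // A real implementation would filter for file-creation events
        // and hand the new path to the file server for tailing.
        Ok(event) => println!("fs event: {event:?}"),
        Err(e) => eprintln!("watch error: {e:?}"),
    })?;

    // Kubelet writes container logs under /var/log/pods on each node.
    // Watching it recursively surfaces new pod log files immediately,
    // with no scan interval in the delivery path.
    watcher.watch(Path::new("/var/log/pods"), RecursiveMode::Recursive)?;

    std::thread::park(); // keep the process alive to keep receiving events
    Ok(())
}
```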

In some of the referenced issues, an argument for platform independence was made. While the file source certainly has to be platform independent, why should the kubernetes one be? I claim no expertise on Kubernetes on Windows, but a quick look at Kubernetes | Windows User Guide | Capturing logs from workloads indicates that the preferred approach there is not to use files at all. Because of that, it seems to me that the kubernetes source does not support Windows adequately today, and as such the platform-independence argument is moot. Outside of Linux, Kubernetes only runs on Windows as far as I'm aware (k8s/BSD is hardly a thing).

One should not have to choose between efficiency and delivery, which is the tradeoff we are forced to make here. Let's have both with inotify!

References

Version

vector 0.36.0 (x86_64-unknown-linux-gnu)

@mrzor mrzor added the type: feature A value-adding code addition that introduces new functionality. label May 21, 2024
@jszwedko jszwedko added the source: kubernetes_logs Anything `kubernetes_logs` source related label May 21, 2024
@jszwedko (Member)

Thanks for these thoughts, @mrzor! We had discussed this a bit before, and I think we'd be open to seeing inotify used in the kubernetes_logs source (and also the file source) so long as there is also a fallback mechanism (scanning).
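For what such a fallback could look like, here is a hypothetical sketch using the same notify crate: its PollWatcher implements the same Watcher trait as the native inotify backend, so a source could swap between them without changing its event loop. Again, this is not Vector code, and the 5 s poll interval merely mirrors the lowered cooldown discussed above:

```rust
use notify::{Config, Event, PollWatcher, RecursiveMode, Result, Watcher};
use std::{path::Path, time::Duration};

fn main() -> Result<()> {
    // Scanning fallback for platforms (or filesystems) without inotify:
    // same trait, same event handler, just a polling backend.
    let config = Config::default().with_poll_interval(Duration::from_secs(5));
    let mut watcher = PollWatcher::new(
        |res: Result<Event>| {
            if let Ok(event) = res {
                println!("fs event: {event:?}");
            }
        },
        config,
    )?;
    watcher.watch(Path::new("/var/log/pods"), RecursiveMode::Recursive)?;
    std::thread::park(); // keep the process alive to keep receiving events
    Ok(())
}
```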
