
Set GOMAXPROCS and GOMEMLIMIT environment variables #6977

Open · wants to merge 1 commit into base: master
Conversation

@jnoordsij commented Apr 30, 2024

Pull Request Motivation

Set GOMAXPROCS and GOMEMLIMIT environment variables based on container resources.

Inspired by traefik/traefik-helm-chart#1029, this should reduce potential CPU throttling and OOMKills on containers.
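For illustration only (not part of the PR text): a minimal sketch of the intended effect, assuming hypothetical limits of cpu: 2 and memory: 512Mi in values.yaml. The downward API resolves the resourceFieldRef entries added by this change to plain integers, so the container would see GOMAXPROCS=2 and GOMEMLIMIT=536870912 (512Mi in bytes).

#values.yaml (hypothetical values, for illustration)
resources:
  limits:
    cpu: 2
    memory: 512Mi
# Resulting container environment, as resolved by the Kubernetes downward API:
#   GOMAXPROCS=2
#   GOMEMLIMIT=536870912   # 512Mi expressed in bytes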

Kind

/kind feature

Release Note

Autoset `GOMAXPROCS` and `GOMEMLIMIT` environment variables for cert-manager pods based on requested CPU and memory values

@cert-manager-prow cert-manager-prow bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. area/deploy Indicates a PR modifies deployment configuration labels Apr 30, 2024
@cert-manager-prow (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joshvanl for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cert-manager-prow cert-manager-prow bot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 30, 2024
@cert-manager-prow (Contributor)

Hi @jnoordsij. Thanks for your PR.

I'm waiting for a cert-manager member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cert-manager-prow cert-manager-prow bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 30, 2024
@inteon (Member) commented May 1, 2024

@wallrj FYI

@wallrj wallrj self-requested a review May 9, 2024 05:55
@SgtCoDFish (Member) left a comment


Hey, appreciate you getting involved with the project!

I don't think we can merge this as is, and I think to merge something like this we'd need to discuss further about how to do this safely and in a way that won't lead to confusing outcomes. I'd encourage you to attend one of our meetings if you want to discuss this!

Comment on lines +166 to +177
{{- if (.Values.resources.limits).cpu }}
- name: GOMAXPROCS
valueFrom:
resourceFieldRef:
resource: limits.cpu
{{- end }}
{{- if (.Values.resources.limits).memory }}
- name: GOMEMLIMIT
valueFrom:
resourceFieldRef:
resource: limits.memory
{{- end }}
(Member)

suggestion: A valid value for limits.cpu would be 100m. That value wouldn't be valid for GOMAXPROCS as far as I can tell, and I think it'd be ignored.

Similarly, a valid amount of memory in limits.memory would be 1e6 or 120M, and both of those would be invalid entries for GOMEMLIMIT (this says that supported suffixes are "B, KiB, MiB, GiB, and TiB").

I don't think we can generally apply the values of resource limits like this. This won't have the expected effect for a lot of totally valid resource limits.

It's maybe possible to construct some Helm function to convert resource limits to GOMAXPROCS / GOMEMLIMIT, but I think that'd be hard to do and a pain to debug. My instinct is that it's probably not worth the effort to add this - what do you think?
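To make the concern concrete, a hypothetical sketch of what a naive pass-through of the raw limit strings would look like (this is not what the template above actually does; the resourceFieldRef behaviour is discussed below):

# Hypothetical env entries if the raw Kubernetes quantities were injected as-is
env:
  - name: GOMAXPROCS
    value: "100m"   # a valid limits.cpu quantity, but not an integer the Go runtime accepts
  - name: GOMEMLIMIT
    value: "120M"   # a valid limits.memory quantity, but "M" is not among the suffixes GOMEMLIMIT supports (B, KiB, MiB, GiB, TiB)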

@inteon (Member) May 13, 2024


It might actually work: https://billglover.me/2022/09/14/use-the-kubernetes-downwards-api-to-set-gomemlimit/
That article seems to suggest that the downward API actually returns the computed integer value.

@SgtCoDFish (Member) May 13, 2024


Then this PR needs confirmation, tests and documentation of that 😁 That article might be worth adding to the PR description but an article isn't really enough to get this change over the line.

Specifically, I'd like to see examples of what GOMAXPROCS is set to if my CPU limits are 0.01, 1m or 2.5 as initial test cases. I'd also like to see what GOMEMLIMIT is set to if I specify 1e6, or 1KB as memory limits.

@jnoordsij (Author)


resourceFieldRef is a Kubernetes directive created specifically for passing resource-related values: it rounds the CPU value up to the nearest whole number (e.g. 250m becomes 1) and passes the memory as a plain numeric byte value, so 64Mi results in the environment variable being set to 67108864. This makes it compatible by design with what the Go runtime expects.

An example is documented within Kubernetes documentation itself: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/#use-container-fields-as-values-for-environment-variables.

Would referencing the Kubernetes documentation suffice here, given that this basically ensures the correct behavior by design? And if so, what would be a suitable place to add this reference?
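For reference, a minimal sketch of the pattern from the linked Kubernetes documentation, using the example values from the comment above (hypothetical limits; the resolved values are shown as comments):

# Hypothetical container snippet using the downward API
resources:
  limits:
    cpu: 250m        # rounded up by the downward API: GOMAXPROCS=1
    memory: 64Mi     # converted to bytes by the downward API: GOMEMLIMIT=67108864
env:
  - name: GOMAXPROCS
    valueFrom:
      resourceFieldRef:
        resource: limits.cpu
  - name: GOMEMLIMIT
    valueFrom:
      resourceFieldRef:
        resource: limits.memory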

(Member)

Thank you for the link!

Honestly, having the docs linked in this PR is probably enough. I don't think we need to complicate the Helm chart with a link, and anyone who cares will be able to git blame and find this PR easily enough.

I'll enable testing for this PR 👌

@SgtCoDFish (Member)

/ok-to-test

@cert-manager-prow cert-manager-prow bot added ok-to-test and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 14, 2024
@SgtCoDFish SgtCoDFish added kind/feature Categorizes issue or PR as related to a new feature. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels May 14, 2024
@SgtCoDFish SgtCoDFish added this to the 1.15 milestone May 14, 2024
@inteon inteon modified the milestones: 1.15, 1.16 May 21, 2024
@inteon (Member) commented Jun 17, 2024

@jnoordsij the next step is to verify that this PR actually improves the performance of cert-manager. @wallrj might be able to help here, as he did some benchmarking of cert-manager in the past.

@wallrj (Member) commented Jun 20, 2024

@jnoordsij thanks for bringing this to our attention. I've been testing the effect of GOMEMLIMIT by setting the environment variables directly.

My conclusion from the experiments below is that setting GOMEMLIMIT equal to the memory limit has the following advantages:

  1. Significantly lower memory usage (at the cost of slightly higher CPU usage)
  2. Fewer OOMKILL container failures
  3. Improved ability to recover from OOMKILL failures and continue reconciling Certificates.

Memory Limit 200MiB / GOMEMLIMIT 200MiB

In this experiment I set both a memory limit of 200Mi and GOMEMLIMIT of 200MiB.

  1. cert-manager reconciles 5000 self-signed RSA 2048 Certificate resources in ~500s (same as the baseline experiment below).
  2. Peak memory usage was ~195MiB (around half the memory usage in the baseline experiment below).
  3. CPU usage looks higher and choppier during the ramp-up phase (compared to the baseline experiment below), maybe due to increased GC activity?
  4. The cert-manager controller was OOMKilled only once (highlighted in the second graph), and it recovered and was able to complete the benchmark.
  5. The Grafana graph shows the cert-manager-controller exceeding its memory limit; I don't know if that's a glitch in the report or a bug in the quota enforcement.
#values.yaml
resources:
  requests:
    cpu: 1
    memory: 200Mi
  limits:
    memory: 200Mi
extraEnv:
  - name: GOMEMLIMIT
    value: 200MiB

prometheus:
  enabled: true
  servicemonitor:
    enabled: true

config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  kubernetesAPIQPS: 10000
  kubernetesAPIBurst: 10000
  maxConcurrentChallenges: 400
  numberOfConcurrentWorkers: 8
  featureGates:
    ServerSideApply: true

[Grafana screenshots: memory and CPU usage graphs for this experiment]

Memory Limit 200MiB

In this experiment I only set a memory limit, not GOMEMLIMIT.

  1. cert-manager fails to reconcile 5000 self-signed RSA 2048 Certificate resources
  2. The cert-manager controller was OOMKilled when memory usage approached 200MiB, and it was OOMKilled again each time the container was restarted.
richard@LAPTOP-HJEQ9V9G:~$ kubectl get pods -n venafi --watch
NAME                                                   READY   STATUS    RESTARTS   AGE
cert-manager-69f6f7585f-vnplg                          1/1     Running   0          4m29s
cert-manager-approver-policy-75664b78fc-jsnp8          1/1     Running   0          3m49s
cert-manager-cainjector-56846796ff-j8rs7               1/1     Running   0          4m29s
cert-manager-webhook-6979b54d5f-brgmg                  1/1     Running   0          4m29s
tlspk-monitoring-kube-state-metrics-68945cb7fd-l5pwb   1/1     Running   0          34s
trust-manager-9445c9f58-d6p2x                          1/1     Running   0          3m26s
venafi-enhanced-issuer-6687f4dcd5-26cxn                1/1     Running   0          3m49s




cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled   0          9m51s
cert-manager-69f6f7585f-vnplg                          1/1     Running     1 (2s ago)   9m52s
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled   1 (4s ago)   9m54s
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   1 (7s ago)   10m
cert-manager-69f6f7585f-vnplg                          1/1     Running            2 (19s ago)   10m
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled          2 (21s ago)   10m
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   2 (7s ago)    10m
cert-manager-69f6f7585f-vnplg                          1/1     Running            3 (33s ago)   10m
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled          3 (35s ago)   10m
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   3 (3s ago)    10m
cert-manager-69f6f7585f-vnplg                          1/1     Running            4 (45s ago)   11m
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled          4 (46s ago)   11m
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   4 (7s ago)    11m
cert-manager-69f6f7585f-vnplg                          1/1     Running            5 (92s ago)   13m
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled          5 (93s ago)   13m
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   5 (5s ago)    13m
cert-manager-69f6f7585f-vnplg                          1/1     Running            6 (2m42s ago)   15m
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled          6 (2m44s ago)   15m
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   6 (1s ago)      15m
cert-manager-69f6f7585f-vnplg                          1/1     Running            7 (5m10s ago)   20m
cert-manager-69f6f7585f-vnplg                          0/1     OOMKilled          7 (5m12s ago)   21m
cert-manager-69f6f7585f-vnplg                          0/1     CrashLoopBackOff   7 (10s ago)     21m
#values.yaml
resources:
  requests:
    cpu: 1
    memory: 200Mi
  limits:
    memory: 200Mi

prometheus:
  enabled: true
  servicemonitor:
    enabled: true

config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  kubernetesAPIQPS: 10000
  kubernetesAPIBurst: 10000
  maxConcurrentChallenges: 400
  numberOfConcurrentWorkers: 8
  featureGates:
    ServerSideApply: true

[Grafana screenshots for this experiment]

Without memory limit

In this baseline experiment, neither a memory limit nor GOMEMLIMIT was set.

  1. cert-manager reconciled 5000 self-signed RSA 2048 Certificate resources in ~500s.
  2. Peak memory usage was 421MiB in the cert-manager-controller.
#values.yaml
resources:
  requests:
    cpu: 1
    memory: 200Mi

prometheus:
  enabled: true
  servicemonitor:
    enabled: true

config:
  apiVersion: controller.config.cert-manager.io/v1alpha1
  kind: ControllerConfiguration
  kubernetesAPIQPS: 10000
  kubernetesAPIBurst: 10000
  maxConcurrentChallenges: 400
  numberOfConcurrentWorkers: 8
  featureGates:
    ServerSideApply: true

[Grafana screenshots for this experiment]

@jnoordsij (Author)

@wallrj thanks a lot for your efforts to benchmark the changes, looking forward to your further findings!

Regarding the exceeding of the memory limit: I've observed similar measurements on my own applications in the past, but I have not yet been able to find a thorough explanation. My best guess so far is some kind of mismeasurement and/or reporting artifact caused by a restart of the container, with the reported value showing the sum of both the just-killed container and the new one, although this is mostly speculative on my part.

@inteon (Member) commented Jun 20, 2024

/hold

@cert-manager-prow cert-manager-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 20, 2024
@inteon inteon removed this from the 1.16 milestone Jun 20, 2024
@wallrj (Member) commented Jun 20, 2024

@jnoordsij wrote:

thanks a lot for your efforts to benchmark the changes, looking forward to your further findings!

I've updated the comment with the remaining findings.
Not claiming this is the most scientific analysis, but it seems to me that setting GOMEMLIMIT does have some clear benefits.

I have some doubts about setting GOMEMLIMIT equal to the resources.limits.memory.

  1. This article recommends setting GOMEMLIMIT to be 5-10% below the hard memory limit (see the sketch after this list).
  2. This article recommends setting GOMEMLIMIT to be "slightly below" the cgroups memory limit.
  3. I wondered if it might ever be advantageous to set only resources.requests.memory and GOMEMLIMIT (omitting resources.limits.memory) as a sort of soft memory limit.
  4. If some users are already setting GOMEMLIMIT by adding extraEnv values when deploying the Helm chart, then the changes in this PR will force those users to adopt this new mechanism.
  5. Users will no longer be able to set bespoke GOMEMLIMIT values.
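A hedged sketch of what that headroom recommendation could look like with the chart's existing extraEnv mechanism, assuming a hypothetical 200Mi container limit (the 180MiB value is illustrative, roughly 10% below the limit):

#values.yaml (hypothetical)
resources:
  requests:
    memory: 200Mi
  limits:
    memory: 200Mi
extraEnv:
  - name: GOMEMLIMIT
    value: 180MiB   # ~10% headroom below the 200Mi container limit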

I might start by trying to document the advantages of setting GOMEMLIMIT in https://cert-manager.io/docs/devops-tips/scaling-cert-manager/#set-appropriate-memory-requests-and-limits

@jnoordsij wrote:

Regarding the exceeding of the memory limit: I've observed similar measurements on my own applications in the past, but I have not yet been able to find a thorough explanation. My best guess so far is some kind of mismeasurement and/or reporting artifact caused by a restart of the container, with the reported value showing the sum of both the just-killed container and the new one, although this is mostly speculative on my part.

I found a couple of possibly related issues:
