Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use node label instead of extracting from instance #2423

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

deepy
Copy link

@deepy deepy commented Dec 28, 2023

Given a common cAdvisor setup the instance label will refer to the node by IP rather than hostname
This uses the node label instead, relying on a setup where that label is properly configured rather than an unexpected instance setup.

What does this PR change?

  • Switches the querying to the node label instead of relying on users configuring prometheus to add the node to the instance hostname

How will this PR impact users?

  • Common monitoring setups will no longer break opencost (this also affects kubecost)
  • If you're configuring prometheus directly you need the relabel_configs from below
  • If you're using ServiceMonitors you need to make certain you have honorLabels: true

Does this PR address any GitHub or Zendesk issues?

How was this PR tested?

  • On a local cluster
  • In an IDE debugging
  • (metrics breaking kubecost was confirmed in local cluster as well)

Does this PR require changes to documentation?

Yes, the following relabel_configs needs to be added:

relabel_configs:
- action: labelmap
   regex: __meta_kubernetes_node_name
   replacement: node

Have you labeled this PR and its corresponding Issue as "next release" if it should be part of the next OpenCost release? If not, why not?

  • No access to labels

Fixes opencost#2281

Given a common cAdvisor setup the instance label will refer to
the node by IP rather than hostname
This uses the `node` label instead, relying on a setup where that
label is properly configured rather than an unexpected `instance`.

Signed-off-by: Alex Nordlund <[email protected]>
Copy link

vercel bot commented Dec 28, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
opencost ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 28, 2023 1:41pm

deepy added a commit to deepy/cost-analyzer-helm-chart that referenced this pull request Dec 28, 2023
This is a companion to opencost/opencost#2423

Without the changes from opencost/2423 this _will_ break queries in
kubecost.

Signed-off-by: Alex Nordlund <[email protected]>
@deepy
Copy link
Author

deepy commented Dec 28, 2023

image

@mattray mattray added P2 Estimated Priority (P0 is highest, P4 is lowest) E2 Estimated level of Effort (1 is easiest, 4 is hardest) opencost OpenCost issues vs. external/downstream kubecost Relevant to Kubecost's downstream project labels Jan 16, 2024
@mattray
Copy link
Collaborator

mattray commented Jan 16, 2024

@deepy thanks for the PR! This seems relatively straightforward, but I'd like the @opencost/opencost-maintainers (especially @AjayTripathy) to weigh in on this.

Copy link

sonarcloud bot commented Jan 16, 2024

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@deepy
Copy link
Author

deepy commented Jan 16, 2024

Absolutely, same here :-)
I'm currently running this setup live in AWS and while both the reports in the UI/API and the metrics are working fine and showing correctly results, I haven't fully validated every metric and every part of it.

And there's some places where I've left it as instance as everything resolved correctly when inspected under a debugger

Thanks for having a look at this!

@mattray
Copy link
Collaborator

mattray commented Jan 24, 2024

@opencost/opencost-maintainers anyone want to weigh in? This came from #2281

@mbolt35
Copy link
Contributor

mbolt35 commented Jan 24, 2024

This needs to be thoroughly tested across all providers. IIRC, there is a lot of nuance with this particular label.

@AjayTripathy
Copy link
Contributor

#2444 We're playing with making this configurable.

@r2k1
Copy link
Contributor

r2k1 commented Jan 29, 2024

One note. Currently recommended way of installing opencost is using latest image tag.
This can be a potentially breaking change that will be automatically delivered to many users.

@thomasvn
Copy link
Contributor

@deepy I've recently also noticed an environment with cAdvisor returning an IP instead of hostname. This ended up causing issues with OpenCost computing Allocations.

@deepy Do you know what versions of the Kubelet or what versions of cAdvisor show this behavior? Many of my environments still show the "Instance" label as having the hostname. Regardless, I agree with this PR's approach and think we should give users an option on how they want to query cAdvisor here.

Copy link

This pull request has been marked as stale because it has been open for 90 days with no activity. Please remove the stale label or comment or this pull request will be closed in 5 days.

@github-actions github-actions bot added the Stale label Jun 18, 2024
@mattray
Copy link
Collaborator

mattray commented Jun 18, 2024

@thomasvn what do you want to do with this PR?

@deepy
Copy link
Author

deepy commented Jun 18, 2024

I don't think kubelet or cAdvisor versioning is what factors in here, I'm willing to bet a very small sum on it being configuration. Potentially just as simple as honorLabels: true

I'll put a little effort into another investigation at some point next week, but for now I think everything is in #2281 (comment)

@github-actions github-actions bot removed the Stale label Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E2 Estimated level of Effort (1 is easiest, 4 is hardest) kubecost Relevant to Kubecost's downstream project needs-follow-up opencost OpenCost issues vs. external/downstream P2 Estimated Priority (P0 is highest, P4 is lowest)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No request or usage data found during CPU allocation calculation
6 participants