Memory leak in sql_exporter in compute node? #7966

Open
Bodobolero opened this issue Jun 5, 2024 · 1 comment
Labels: a/reliability (Area: relates to reliability of the service), c/compute (Component: compute, excluding postgres itself), t/bug (Issue Type: Bug)

Comments

@Bodobolero
Contributor

Steps to reproduce

I ran multiple tests (pgvector indexing, index from subselect, etc.); see https://neondb.slack.com/archives/C0732L0A4AH/p1717575836689089?thread_ts=1717499447.241659&cid=C0732L0A4AH

At the end we hit an OOM in the compute node, and the error log said:

2024-06-05 11:36:07.198 [12358.306313] Out of memory: Killed process 168 (sql_exporter) total-vm:1268868kB, anon-rss:12712kB, file-rss:8672kB, shmem-rss:0kB, UID:65534 pgtables:200kB oom_score_adj:0

About 1.2 GB of virtual memory (total-vm) for a component that only runs SQL queries to collect metrics seems expensive.
We probably have a memory leak in sql_exporter, or it cannot release collected metrics quickly enough.
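
To make the numbers in the OOM line above easier to compare, here is a minimal Python sketch that parses the kernel message and separates virtual size from resident memory. In that line, total-vm is the process's virtual address space (roughly 1.2 GB), while anon-rss, file-rss, and shmem-rss are the pages actually resident at kill time (about 21 MB combined). The helper names `parse_oom_line` and `resident_kb` are hypothetical, and the regular expressions assume the log format shown in this issue; other kernel versions may format the line differently.

```python
import re

# The OOM-killer line quoted in this issue, copied verbatim.
OOM_LINE = (
    "2024-06-05 11:36:07.198 [12358.306313] Out of memory: Killed process 168 "
    "(sql_exporter) total-vm:1268868kB, anon-rss:12712kB, file-rss:8672kB, "
    "shmem-rss:0kB, UID:65534 pgtables:200kB oom_score_adj:0"
)

# Assumption: sizes always appear as "<field>:<digits>kB", as in the line above.
FIELD_RE = re.compile(r"(total-vm|anon-rss|file-rss|shmem-rss|pgtables):(\d+)kB")
PROC_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def parse_oom_line(line):
    """Extract the pid, process name, and size fields (in kB) from an OOM line."""
    proc = PROC_RE.search(line)
    if proc is None:
        raise ValueError("not an OOM-killer 'Killed process' line")
    sizes_kb = {name: int(value) for name, value in FIELD_RE.findall(line)}
    return {"pid": int(proc.group(1)), "name": proc.group(2), "sizes_kb": sizes_kb}


def resident_kb(sizes_kb):
    """Sum the resident-set fields; total-vm is deliberately excluded."""
    return sum(sizes_kb.get(key, 0) for key in ("anon-rss", "file-rss", "shmem-rss"))


if __name__ == "__main__":
    info = parse_oom_line(OOM_LINE)
    sizes = info["sizes_kb"]
    print(f"process: {info['name']} (pid {info['pid']})")
    print(f"virtual size (total-vm): {sizes['total-vm'] / 1024:.1f} MiB")
    print(f"resident at kill time:   {resident_kb(sizes) / 1024:.1f} MiB")
```

This does not confirm or rule out a leak by itself. It only shows that the resident set recorded at kill time was much smaller than the virtual size, so the memory graphs linked under "Logs, links" are worth checking for growth over time.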

Expected result

Actual result

Environment

Logs, links

https://neonprod.grafana.net/goto/gT_JXPyIR?orgId=1

Bodobolero added the t/bug (Issue Type: Bug), c/cloud/compute, a/reliability (Area: relates to reliability of the service), and c/compute (Component: compute, excluding postgres itself) labels on Jun 5, 2024
@bayandin
Member

bayandin commented Jun 5, 2024

Currently, we use sql_exporter 0.13:

FROM burningalchemist/sql_exporter:0.13 AS sql-exporter

We can try updating it to 0.14.3 to check whether that helps.


4 participants