getaddrinfo ENOTFOUND occasionally #4798
Comments
Same steps as in #4765
Most commonly, this issue is caused by using a DNS resolver that does not like the volume of DNS requests it is getting.
I have no clue what could be causing this. Let's rule out the stupid causes first:
I'm sure the TTL is 600.
I enabled the errors plugin but not the log plugin. I'll try enabling the log plugin to find more details.
@sudoexec Alpine or another musl-based Linux? Can you post a copy of your host's and the running container's /etc/resolv.conf? I have seen similar issues in the past, including with Kubernetes, usually involving multiple DNS servers or search domains. The musl resolver sends out multiple parallel queries and ignores all replies but the first one. If that response is an error, this is what you get. If the "good" lookup usually wins the race, you won't see this error often. Also, a regular https://jvns.ca/blog/2022/02/23/getaddrinfo-is-kind-of-weird/ (this is just a personal opinion, but I wouldn't touch
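To make the race described above concrete, here is a toy simulation in Python. It is only an illustration of the behaviour as described in this comment, not musl's actual code: the function names, the latency numbers, and the "address"/"NXDOMAIN" labels are all made up for the example.

```python
import random


def first_reply_wins(replies):
    """replies is a list of (latency_ms, answer); the earliest reply is used,
    even if it is an error. This mirrors the behaviour described above."""
    return min(replies, key=lambda reply: reply[0])[1]


def simulate(lookups, good_latency, bad_latency, jitter, seed=0):
    """Return the fraction of lookups in which the error reply arrives first.

    good_latency / bad_latency are hypothetical base latencies (ms) of a
    server that knows the name and one that does not; jitter is the maximum
    random delay added to each reply.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(lookups):
        replies = [
            (good_latency + rng.uniform(0, jitter), "address"),
            (bad_latency + rng.uniform(0, jitter), "NXDOMAIN"),
        ]
        if first_reply_wins(replies) == "NXDOMAIN":
            failures += 1
    return failures / lookups


if __name__ == "__main__":
    # The good server is usually faster, but jitter occasionally lets the
    # error reply arrive first.
    print(simulate(lookups=100_000, good_latency=1.0, bad_latency=8.0, jitter=8.0))
```

With overlapping latency ranges the failure fraction is small but nonzero, which would match an error that shows up only a few times a day.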
@thielj The host machine is Ubuntu 18.04.
Thanks for the info you provided, I've learned more about DNS internals from it. Additionally, I've added another nameserver to the Uptime Kuma pod, and there have been no errors in the past 2 days.
If you get more getaddrinfo-related errors: those resolv.conf settings and the internal DNS they lead to are the rabbit hole you need to dig into, all the way from the container/pod down your stack. https://coredns.io/2017/06/08/how-queries-are-processed-in-coredns/
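As a starting point for that digging, a small Python sketch like the following can summarise the parts of a resolv.conf that matter here: how many nameservers there are, the search list, and the options. `parse_resolv_conf` is a hypothetical helper written for this thread, and the parsing is deliberately simplified. Run it against both the host's and the container's file and compare.

```python
def parse_resolv_conf(text):
    """Extract nameservers, search domains and options from resolv.conf text.

    Simplified on purpose: everything after '#' or ';' is treated as a comment.
    """
    config = {"nameservers": [], "search": [], "options": []}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword in ("search", "domain"):
            # If several search/domain lines are present, the last one wins.
            config["search"] = values
        elif keyword == "options":
            config["options"].extend(values)
    return config


if __name__ == "__main__":
    try:
        with open("/etc/resolv.conf") as handle:
            config = parse_resolv_conf(handle.read())
    except OSError as error:
        print(f"could not read /etc/resolv.conf: {error}")
    else:
        print(config)
        if len(config["nameservers"]) > 1:
            print("note: more than one nameserver; each should resolve all your hosts")
```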
We should likely document this here. What is your second nameserver? (How did you find its IP? Do you have multiple CoreDNS instances running?) (Not a Kubernetes/DNS wizard 😅)
@thielj Thanks again for your help. I'll try it.
In fact, "another nameserver" is 1.1.1.1. I added it in case the errors are caused by CoreDNS.
@sudoexec This probably doesn't do what you expect, and if it does, you're relying on implementation-specific behaviour of POSIX getaddrinfo. There are at least four different major implementations, and most of them can be further configured; see nsswitch.conf for an example. The two most common, and their default behaviour with regard to the DNS resolver, are:
So: if you specify more than one server in resolv.conf, BOTH should be able to resolve ALL your hosts. If you want to implement fallbacks, query routing and such, configure a coredns or dnsmasq instance appropriately and point your resolv.conf to that. If you still want two DNS entries in your resolv.conf, configure two identical, redundant instances. Also, if you run frequent probes, you will eventually see failures. That's pretty normal. With 99.99% reliability, a < 0.01% failure rate would be acceptable. Maybe configure your probes to allow for one retry?
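A rough sketch of the "allow for one retry" idea in Python, together with the arithmetic behind it. `resolve_with_retry` and `failure_probability` are hypothetical names, and the probability calculation assumes attempts fail independently, which is optimistic for DNS problems since failures often come in bursts.

```python
import socket
import time


def resolve_with_retry(host, retries=1, delay=0.5, resolver=socket.getaddrinfo):
    """Resolve host, retrying after a failed lookup before giving up."""
    for attempt in range(retries + 1):
        try:
            return resolver(host, None)
        except socket.gaierror:
            if attempt == retries:
                raise  # Out of retries: report the failure to the caller.
            time.sleep(delay)


def failure_probability(per_lookup_failure, retries=1):
    """Chance that every attempt fails, assuming independent attempts."""
    return per_lookup_failure ** (retries + 1)


if __name__ == "__main__":
    # A 0.01% per-lookup failure rate with one retry, under the
    # independence assumption stated above.
    print(failure_probability(0.0001, retries=1))
```

Under that assumption, a single retry turns a 1-in-10,000 failure into roughly 1 in 100 million, so a handful of errors per day would be expected to disappear from the monitor.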
I started seeing this behavior after setting up AdGuard Home. In my previous setup I only had Unbound DNS running on my OPNsense router/firewall. Now, AdGuard will relay all requests that it doesn't decide to block to Unbound, so AdGuard is the primary DNS. My entire home network is whitelisted in AdGuard as is the Uptime Kuma IP, so no blocking should be happening there. I am running Uptime Kuma as an LXC container on my Proxmox host.
Weeks ago, I changed my upstream DNS (which is provided by the cloud service and managed by systemd-resolved) to two other public DNS servers. There's no
📑 I have found these related issues/pull requests
🛡️ Security Policy
Description
There are occasional getaddrinfo ENOTFOUND errors (0-3 per day). Uptime Kuma is running in k8s. The upstream DNS is the cluster's CoreDNS, and CoreDNS doesn't have any error logs.
I used
while true; do nslookup example.com && sleep 1; done
to test DNS resolution and saw no errors. The error occurs randomly and I can't reproduce it.
Is there any way to find more details about this error?
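One thing that may help: nslookup sends its queries to the DNS server directly and does not go through getaddrinfo, so a clean nslookup loop does not rule out a problem in the getaddrinfo path that produces ENOTFOUND. Below is a minimal Python probe that calls the system's getaddrinfo through `socket.getaddrinfo` and logs only the failures with a timestamp. `probe_once` and `probe_forever` are hypothetical helper names, and the sketch assumes Python is available in an environment that uses the same libc and resolv.conf as the Uptime Kuma container.

```python
import socket
import time
from datetime import datetime, timezone


def probe_once(host, resolver=socket.getaddrinfo):
    """Run one lookup and return (timestamp, ok, detail)."""
    started = time.monotonic()
    stamp = datetime.now(timezone.utc).isoformat()
    try:
        results = resolver(host, None)
        # Each result is (family, type, proto, canonname, sockaddr).
        detail = ",".join(sorted({entry[4][0] for entry in results}))
        ok = True
    except socket.gaierror as error:
        detail = f"gaierror {error.errno}: {error.strerror}"
        ok = False
    elapsed_ms = (time.monotonic() - started) * 1000
    return stamp, ok, f"{detail} ({elapsed_ms:.0f} ms)"


def probe_forever(host, interval=1.0):
    """Probe in a loop and print failed lookups only."""
    while True:
        stamp, ok, detail = probe_once(host)
        if not ok:
            print(stamp, host, detail, flush=True)
        time.sleep(interval)
```

Calling `probe_forever("example.com")` and comparing the failure timestamps with the ENOTFOUND entries in Uptime Kuma (and with the CoreDNS log plugin output, once enabled) should show whether the failures originate in the resolver path or elsewhere.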
👟 Reproduction steps
Can't reproduce.
👀 Expected behavior
No getaddrinfo ENOTFOUND errors.
😓 Actual Behavior
getaddrinfo ENOTFOUND
🐻 Uptime-Kuma Version
1.23.11
💻 Operating System and Arch
k8s
🌐 Browser
125.0.6422.112 (Official Build) Arch Linux (64-bit)
🖥️ Deployment Environment
📝 Relevant log output