Asked Thursday, March 10, 2022, 9:54:38

For the past 3 months, I have been struggling with an intermittent issue on my home server: DNS resolution drops for a brief period (10-60 seconds) for no apparent reason. Pinging by hostname fails with ping: signal.org: Temporary failure in name resolution, and any service that attempts a DNS lookup fails almost instantly. Nothing from systemd-resolved or dnsmasq appears in /var/log/syslog when these outages happen, but other services do report errors. For example:


ddclient[573749]: message repeated 14 times: [ WARNING: cannot connect to checkip.dyndns.org:80 socket: IO::Socket::INET: Bad hostname 'checkip.dyndns.org']


dockerd[1811]: time="2021-04-29T13:50:19.080258289-05:00" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers: [nameserver 8.8.8.8 nameserver 8.8.4.4]"


rsyslogd: DNS error: Can't resolve "<local_domain>" [v8.2001.0]


whoopsie[1816]: [17:38:15] Sent; server replied with: Couldn't resolve host name
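Since the failures are brief and nothing is logged by the resolver itself, a small probe that timestamps every failed lookup can help line the outages up with other events. The following is only a minimal sketch: the names `probe_once` and `watch` are made up for illustration, and the hostname and 5-second interval are arbitrary choices, not something from the original setup.

```python
import socket
import time
from datetime import datetime


def probe_once(hostname, resolve=socket.getaddrinfo):
    """Try one lookup; return (ok, seconds_taken, error_text)."""
    start = time.monotonic()
    try:
        resolve(hostname, None)
        return True, time.monotonic() - start, ""
    except socket.gaierror as exc:
        # gaierror is what Python raises for resolver failures such as
        # "Temporary failure in name resolution".
        return False, time.monotonic() - start, str(exc)


def watch(hostname="signal.org", interval=5.0):
    """Print a timestamped line for every failed lookup, until interrupted."""
    while True:
        ok, took, err = probe_once(hostname)
        if not ok:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} lookup of {hostname} failed after {took:.3f}s: {err}")
        time.sleep(interval)
```

Calling `watch()` in a terminal (stop it with Ctrl+C) leaves a list of failure timestamps that can be compared against syslog and against the upstream DNS server's own logs.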


Current setup: Ubuntu 20.04.2, Netplan set to a static IP, dnsmasq as the DNS server with dns-forward-max=1024, and systemd-resolved disabled and stopped. The server is a Ryzen 3950X with 64GB RAM, and the OS is installed on an NVMe drive. It runs many webapp-type services, but the noisiest for DNS requests is easily matrix-synapse.


Things I have tried:


· I have restarted the systemd-resolved service hundreds of times, disabled the service a dozen times, turned off/on the stub resolver, and deleted and re-created the symlink.


· I set a static IP with netplan, and played with /etc/NetworkManager/NetworkManager.conf.


· I installed pihole and unbound via apt for just the server itself. (pihole is currently uninstalled, and unbound is running but nothing is using it to resolve.)


· I installed dnsmasq and completely disabled systemd-resolved.


· I've disabled IPv6 completely on the server.


· I've set * soft nofile 1048576 and * hard nofile 1048576 in /etc/security/limits.conf, and /proc/sys/fs/file-max shows 9223372036854775807.


I suspect Docker is the issue, but I have no idea how to verify this. I've currently got 38 Docker containers running, and when I run sudo lsof -i :53 while the issue is happening, I will see:


thomcat@servername:~$ sudo lsof -i :53
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
dockerd 1623 root 217u IPv4 1577888 0t0 UDP localhost:46003->localhost:domain
dockerd 1623 root 226u IPv4 1605902 0t0 UDP localhost:50192->localhost:domain
dockerd 1623 root 227u IPv4 1610070 0t0 UDP localhost:52637->localhost:domain
dockerd 1623 root 228u IPv4 1605907 0t0 UDP localhost:55021->localhost:domain
dockerd 1623 root 229u IPv4 1618981 0t0 UDP localhost:57618->localhost:domain
dockerd 1623 root 230u IPv4 1610081 0t0 UDP localhost:35776->localhost:domain
dockerd 1623 root 231u IPv4 1610086 0t0 UDP localhost:60635->localhost:domain
dockerd 1623 root 232u IPv4 1589998 0t0 UDP localhost:43036->localhost:domain
dockerd 1623 root 234u IPv4 1602056 0t0 UDP localhost:58408->localhost:domain
dockerd 1623 root 235u IPv4 1614011 0t0 UDP localhost:43421->localhost:domain
dockerd 1623 root 236u IPv4 1589999 0t0 UDP localhost:60957->localhost:domain
dockerd 1623 root 237u IPv4 1597695 0t0 UDP localhost:53026->localhost:domain
dockerd 1623 root 242u IPv4 1590000 0t0 UDP localhost:41842->localhost:domain
dockerd 1623 root 244u IPv4 1597696 0t0 UDP localhost:49179->localhost:domain
dockerd 1623 root 246u IPv4 1572736 0t0 UDP localhost:46471->localhost:domain
dockerd 1623 root 266u IPv4 1616008 0t0 UDP localhost:35262->localhost:domain
dockerd 1623 root 267u IPv4 1616009 0t0 UDP localhost:54501->localhost:domain
dockerd 1623 root 268u IPv4 1579887 0t0 UDP localhost:33130->localhost:domain
dockerd 1623 root 269u IPv4 1579888 0t0 UDP localhost:33491->localhost:domain
dockerd 1623 root 270u IPv4 1613280 0t0 UDP localhost:49504->localhost:domain
dockerd 1623 root 273u IPv4 1579890 0t0 UDP localhost:43801->localhost:domain
dockerd 1623 root 278u IPv4 1613283 0t0 UDP localhost:44804->localhost:domain
dockerd 1623 root 279u IPv4 1568692 0t0 UDP localhost:39425->localhost:domain
dockerd 1623 root 293u IPv4 1577890 0t0 UDP localhost:52194->localhost:domain
dockerd 1623 root 296u IPv4 1605903 0t0 UDP localhost:50866->localhost:domain
dockerd 1623 root 319u IPv4 1605904 0t0 UDP localhost:58574->localhost:domain
dockerd 1623 root 341u IPv4 1605910 0t0 UDP localhost:37123->localhost:domain
dockerd 1623 root 342u IPv4 1610067 0t0 UDP localhost:48734->localhost:domain
dockerd 1623 root 343u IPv4 1610069 0t0 UDP localhost:35580->localhost:domain
dockerd 1623 root 344u IPv4 1605905 0t0 UDP localhost:45133->localhost:domain
dockerd 1623 root 345u IPv4 1618982 0t0 UDP localhost:53052->localhost:domain
dockerd 1623 root 346u IPv4 1589996 0t0 UDP localhost:56714->localhost:domain
dockerd 1623 root 347u IPv4 1614009 0t0 UDP localhost:37216->localhost:domain
dockerd 1623 root 348u IPv4 1589997 0t0 UDP localhost:38032->localhost:domain
dockerd 1623 root 349u IPv4 1618984 0t0 UDP localhost:53714->localhost:domain
dockerd 1623 root 350u IPv4 1610084 0t0 UDP localhost:42922->localhost:domain
dockerd 1623 root 351u IPv4 1577893 0t0 UDP localhost:32865->localhost:domain
dockerd 1623 root 352u IPv4 1608975 0t0 UDP localhost:58307->localhost:domain
dockerd 1623 root 353u IPv4 1597699 0t0 UDP localhost:33564->localhost:domain
dockerd 1623 root 354u IPv4 1608977 0t0 UDP localhost:58235->localhost:domain
dockerd 1623 root 355u IPv4 1577896 0t0 UDP localhost:46068->localhost:domain
dockerd 1623 root 356u IPv4 1597702 0t0 UDP localhost:32827->localhost:domain
systemd-r 106795 systemd-resolve 12u IPv4 980615 0t0 UDP localhost:domain
systemd-r 106795 systemd-resolve 13u IPv4 980616 0t0 TCP localhost:domain (LISTEN)
http 165553 _apt 3u IPv4 1611999 0t0 UDP localhost:54478->localhost:domain
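In the listing above, dockerd holds 42 of the 45 sockets talking to port 53. To track that number over time rather than eyeballing it, the output can be tallied per process. This is a rough sketch that assumes the column layout shown above (command name in the first column, at least nine columns per row); `count_by_command` is a hypothetical helper name.

```python
from collections import Counter


def count_by_command(lsof_text):
    """Count lsof rows per COMMAND column, skipping the header and stray lines."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Real lsof rows have at least nine columns; this also skips
        # blank lines and a pasted shell prompt.
        if len(fields) < 9 or fields[0] == "COMMAND":
            continue
        counts[fields[0]] += 1
    return counts


sample = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
dockerd 1623 root 217u IPv4 1577888 0t0 UDP localhost:46003->localhost:domain
dockerd 1623 root 226u IPv4 1605902 0t0 UDP localhost:50192->localhost:domain
http 165553 _apt 3u IPv4 1611999 0t0 UDP localhost:54478->localhost:domain
"""
for command, total in count_by_command(sample).most_common():
    print(command, total)
```

Feeding it the saved output of sudo lsof -i :53 during and outside an outage shows whether the number of open dockerd lookups actually spikes when resolution fails.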

More things to note:


· The upstream DNS server is a Raspberry Pi 3 B+ running pihole. Nothing else on my network has these DNS resolution problems, so the problem is not with the pihole.


· SSH sessions to the server do not drop when this issue is happening.


· Pinging external IPs works just fine when the issue is happening.


 


I've been pulling my hair out trying to figure this out. If anyone has any ideas, I would be glad to hear them.



 Answers

TL;DR: Make sure your Pi-hole isn't rate-limiting your requests.


Today, I finally Googled "pihole rate limit", and lo and behold, this recent blog post mentioned:



...we decided to implement a customizable rate-limiting into FTL itself. It defaults to the rather conservative limit of allowing no more than 1000 queries in a 60 seconds window for each client.



I was beside myself; I had completely missed this news. I've opened a feature request with Pi-hole to add a log entry when this happens, hopefully to keep a future home sysadmin from pulling their hair out.
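Until such a log entry exists, one way to spot rate limiting from the client side is to look at the response code of a query sent straight to the upstream server. The sketch below is hedged on one assumption: that a rate-limited Pi-hole answers with REFUSED (RCODE 5), which is how I understand the FTL documentation, so verify it against your version. The function names and the IP address in the usage note are placeholders, not values from the original post.

```python
import socket
import struct

REFUSED = 5  # RCODE value defined in RFC 1035


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for an A record."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_rcode(response):
    """Return the 4-bit RCODE of a DNS response (0 = NOERROR, 5 = REFUSED)."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    flags = struct.unpack("!H", response[2:4])[0]
    return flags & 0x000F


def query_rcode(server_ip, hostname, timeout=2.0):
    """Send one UDP query to server_ip:53 and return the RCODE of the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (server_ip, 53))
        response, _ = sock.recvfrom(512)
    return parse_rcode(response)
```

For example, `query_rcode("192.0.2.53", "signal.org") == REFUSED` (with the placeholder address replaced by the Pi-hole's IP) run from the affected server during an outage would point at the upstream server refusing queries rather than at the local resolver.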


1,000 queries in 60 seconds might sound like a lot, but with 38 active Docker containers (especially Watchtower and matrix-synapse), that allowance gets used up in a hurry.
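To put numbers on that, here is a back-of-the-envelope model. It is a deliberately simplified fixed-window counter, not FTL's actual implementation, and the steady rate of one query per second per container is an assumption chosen only for illustration.

```python
def per_container_budget(limit, window_seconds, containers):
    """Average queries per second each container can send within the shared limit."""
    return limit / window_seconds / containers


class FixedWindowLimiter:
    """Simplified per-client limiter: at most `limit` queries per window."""

    def __init__(self, limit=1000, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = 0.0
        self.count = 0

    def allow(self, now):
        # Start a fresh window once the current one has expired.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.limit


def simulate(rate_per_second, duration_seconds, limiter):
    """Return how many queries are refused at a steady query rate."""
    refused = 0
    for i in range(int(rate_per_second * duration_seconds)):
        if not limiter.allow(i / rate_per_second):
            refused += 1
    return refused


# All 38 containers share one client IP, so they share one allowance.
budget = per_container_budget(1000, 60, 38)
print(f"{budget * 60:.1f} queries per container per minute")
print(simulate(38, 60, FixedWindowLimiter()), "queries refused in one minute")
```

Under these assumptions each container gets roughly 26 queries per minute, and at 38 queries per second the allowance is exhausted a little under halfway through the window, leaving every lookup refused for the rest of it. That is in the same range as the 10-60 second outages described in the question. Pi-hole's FTL documentation describes a RATE_LIMIT setting for adjusting the limit; check the documentation for your version before changing it.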


Answered Saturday, March 12, 2022