The symptom that makes me suspicious is that ping takes about a second or so to start printing when I run it on my Ubuntu desktop, but starts almost immediately on an OSX laptop. (I don't have Ubuntu on a laptop, so the OS difference may be a coincidence.)
Normally I would suspect DNS but this is my Ubuntu DNS test:
dig +trace www.stackoverflow.com
; <<>> DiG 9.16.1-Ubuntu <<>> +trace www.stackoverflow.com
;; global options: +cmd
;; Received 40 bytes from 10.0.0.1#53(10.0.0.1) in 7 ms
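One caveat with this test: dig queries the DNS server directly, while ping resolves names through the system resolver (the NSS modules configured in /etc/nsswitch.conf), so a fast dig may not rule out slow name resolution. A minimal sketch for timing the two paths separately follows; the helper name `elapsed_ms` is my own invention, and it assumes GNU date (for `%N`), as shipped on Ubuntu.

```shell
# elapsed_ms CMD [ARGS...]
# Run a command, discard its output, and print the elapsed wall-clock
# time in milliseconds. Assumes GNU date, which supports %N (nanoseconds).
elapsed_ms() {
    start=$(date +%s%N)
    # The exit status is ignored on purpose: only the duration matters here.
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
```

For example, `elapsed_ms getent ahosts www.stackoverflow.com` should go through the same getaddrinfo path that ping uses, while `elapsed_ms dig +short www.stackoverflow.com` bypasses it. A large gap between the two would point at the resolver configuration rather than the network.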
mtr, ping and a speed test all show normal metrics. For example, this is ping on the Ubuntu desktop:
ping www.stackoverflow.com
PING stackoverflow.com (151.101.1.69) 56(84) bytes of data.
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=1 ttl=57 time=9.87 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=2 ttl=57 time=8.95 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=3 ttl=57 time=9.17 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=4 ttl=57 time=8.83 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=5 ttl=57 time=9.14 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=6 ttl=57 time=9.08 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=7 ttl=57 time=9.16 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=8 ttl=57 time=9.03 ms
64 bytes from 151.101.1.69 (151.101.1.69): icmp_seq=9 ttl=57 time=8.91 ms
but the key thing is that it took nearly two seconds to start printing ping times.
I realize this might be something totally unrelated to Ubuntu, but I am guessing that there might be some Linux internals or debugging knowledge here that could help.
I can look at the strace output of ping, but I am not really sure what I am looking for. strace prints a burst of output and then hangs for two seconds while ping does whatever it is doing. This is the output when I kill it at that point:
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libnss_mdns4_minimal.so.2", O_RDONLY|O_CLOEXEC) = 5
read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\23\0\0\0\0\0\0"..., 832) = 832
fstat(5, {st_mode=S_IFREG|0644, st_size=18504, ...}) = 0
mmap(NULL, 20496, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x7f565ae48000
mmap(0x7f565ae49000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x1000) = 0x7f565ae49000
mmap(0x7f565ae4b000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x3000) = 0x7f565ae4b000
mmap(0x7f565ae4c000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x3000) = 0x7f565ae4c000
close(5) = 0
mprotect(0x7f565ae4c000, 4096, PROT_READ) = 0
munmap(0x7f565ae4e000, 155550) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=155550, ...}) = 0
mmap(NULL, 155550, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7f565ae4e000
close(5) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libnss_dns.so.2", O_RDONLY|O_CLOEXEC) = 5
read(5, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 #\0\0\0\0\0\0"..., 832) = 832
fstat(5, {st_mode=S_IFREG|0644, st_size=31176, ...}) = 0
mmap(NULL, 32984, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 5, 0) = 0x7f565ae3f000
mmap(0x7f565ae41000, 16384, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x2000) = 0x7f565ae41000
mmap(0x7f565ae45000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x6000) = 0x7f565ae45000
mmap(0x7f565ae46000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 5, 0x6000) = 0x7f565ae46000
close(5) = 0
mprotect(0x7f565ae46000, 4096, PROT_READ) = 0
munmap(0x7f565ae4e000, 155550) = 0
socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 5
setsockopt(5, SOL_IP, IP_RECVERR, [1], 4) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, 16) = 0
poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}])
sendmmsg(5, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base=">\326\1\0\0\1\0\0\0\0\0\0\3www\rstackoverflow\3c"..., iov_len=39}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=39}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="w\256\1\0\0\1\0\0\0\0\0\0\3www\rstackoverflow\3c"..., iov_len=39}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=39}], 2, MSG_NOSIGNAL) = 2
poll([{fd=5, events=POLLIN}], 1, 5000^C) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
strace: Process 1242328 detached
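Rather than eyeballing where the trace stalls, one option is to have strace record how long each call took (`-T` appends the duration in angle brackets) and then filter for the slow ones. A rough sketch, assuming the trace was captured with `-T`; the function name `slow_syscalls` is hypothetical:

```shell
# slow_syscalls THRESHOLD_SECONDS
# Read `strace -T` output on stdin and print only the lines whose trailing
# <seconds> duration is at least THRESHOLD_SECONDS.
slow_syscalls() {
    awk -v t="$1" '
        # strace -T ends each completed call with e.g. <0.000123>
        match($0, /<[0-9]+\.[0-9]+>$/) {
            d = substr($0, RSTART + 1, RLENGTH - 2)   # strip the < and >
            if (d + 0 >= t + 0) print
        }'
}
```

Usage would be something like `strace -f -T -o ping.trace ping -c 1 www.stackoverflow.com` followed by `slow_syscalls 0.5 < ping.trace`. If the only slow line is the poll on the DNS socket, that would suggest ping is waiting on a reply from the resolver at 192.168.1.1.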
Any ideas welcome.
UPDATE:
I now suspect it is the connection to the router/modem. But the VERY strange thing is that the following is slow to start on my Ubuntu desktop but normal on the laptops (OSX).
ping dsldevice.lan
PING dsldevice.lan (192.168.1.254) 56(84) bytes of data.
64 bytes from dsldevice.lan (192.168.1.254): icmp_seq=1 ttl=63 time=2.08 ms
64 bytes from dsldevice.lan (192.168.1.254): icmp_seq=2 ttl=63 time=1.89 ms
UPDATE:
After a lot of fooling around, running an apt update and restarting, it seems I am back to normal now. I looked at tcpdump output before (while the problem was occurring) and after, and I am not noticing much difference.
Still have no idea but now the problem is gone for the time being. Will update if it re-occurs. Seems like a good learning exercise.
UPDATE: (This is getting long)
For reference on the MTU questions, I am connecting from Ubuntu wirelessly to a Netgear router that is wired to a Plusnet fibre connection (using ADSL from the premises).
UPDATE: Testing for mtu size as in https://mike632t.wordpress.com/2019/03/03/determine-mtu-size-using-ping/
I think something is very wrong when I test MTU against 8.8.8.8. For larger sizes I get the "message too long" error, but as I decrease the size the message no longer appears and instead I get 100% packet loss, until I reach a payload size of 68 (= 96 - 28). Maybe this is expected for some reason?
ping -c 4 -M do -s 1472 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data.
From 192.168.1.254 icmp_seq=1 Frag needed and DF set (mtu = 1488)
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
ping -s $((97 - 28)) -D 8.8.8.8 -c 1
PING 8.8.8.8 (8.8.8.8) 69(97) bytes of data.
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
ping -s $((96 - 28)) -D 8.8.8.8 -c 1
PING 8.8.8.8 (8.8.8.8) 68(96) bytes of data.
[1620817285.571725] 76 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=10.7 ms
--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 10.680/10.680/10.680/0.000 ms
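For anyone following the arithmetic: the 28 subtracted in these commands is the 20-byte IPv4 header (without options) plus the 8-byte ICMP echo header, which is why ping prints pairs like 1472(1500) and 68(96). A small sketch of the conversion in both directions; the function names are just for illustration:

```shell
# Overhead added to an ICMP echo payload on IPv4:
# 20-byte IP header (no options) + 8-byte ICMP header.
ICMP_OVERHEAD=28

# icmp_payload_for_mtu MTU: largest `ping -s` value that fits in one packet.
icmp_payload_for_mtu() {
    echo $(( $1 - ICMP_OVERHEAD ))
}

# packet_size_for_payload SIZE: on-the-wire IP packet size for `ping -s SIZE`.
packet_size_for_payload() {
    echo $(( $1 + ICMP_OVERHEAD ))
}
```

So `ping -c 4 -M do -s "$(icmp_payload_for_mtu 1500)" 8.8.8.8` is the same as the `-s 1472` test above, and a reported path MTU of 1488 corresponds to a maximum payload of 1460.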
UPDATE: another data point.
I suspect this belongs more in a networking forum, since it may turn out to have nothing to do with the Ubuntu driver side of things as I suspected earlier, but I am not sure yet.
ping -c 3 -s $((1489 - 28)) -M do bbc.co.uk
PING bbc.co.uk (151.101.0.81) 1461(1489) bytes of data.
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
--- bbc.co.uk ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2037ms
ping -c 3 -s $((1488 - 28)) -M do bbc.co.uk
PING bbc.co.uk (151.101.0.81) 1460(1488) bytes of data.
1468 bytes from 151.101.0.81 (151.101.0.81): icmp_seq=1 ttl=58 time=11.8 ms
1468 bytes from 151.101.0.81 (151.101.0.81): icmp_seq=2 ttl=58 time=12.0 ms
1468 bytes from 151.101.0.81 (151.101.0.81): icmp_seq=3 ttl=58 time=10.7 ms
--- bbc.co.uk ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 10.702/11.519/12.029/0.583 ms
UPDATE:
The result of the tracepath query, as requested. This is with the MTU set to 1500. Note that the usual mtr tests show good speed and latency as well as few lost packets.
tracepath www.ebay.com
1?: [LOCALHOST] pmtu 1488
1: www.routerlogin.com 1.048ms
1: www.routerlogin.com 1.003ms
2: dsldevice.lan 2.037ms
3: no reply
4: no reply
5: 128.hiper04.sheff.dial.plus.net.uk 10.954ms asymm 7
6: peer3-et3-1-1.slough.ukcore.bt.net 85.034ms asymm 7
7: peer2-xe8-0-2.telehouse.ukcore.bt.net 25.399ms asymm 8
8: no reply
9: no reply
10: no reply
11: no reply
UPDATE:
Just to confirm, mtu is set to 1500.
ip link | grep wlxa09f10b9ff56
3: wlxa09f10b9ff56: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
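To check every interface at once instead of grepping for one name, the MTU can be pulled out of the same `ip link` output. A minimal sketch, assuming the usual one-header-line-per-interface layout shown above; `link_mtu` is a made-up helper name:

```shell
# link_mtu
# Read `ip link` output on stdin and print "INTERFACE MTU" for each
# interface header line (the lines that start with an index such as "3: ").
link_mtu() {
    awk '/^[0-9]+: / {
        name = $2
        sub(/:$/, "", name)              # drop the trailing colon
        for (i = 3; i < NF; i++)
            if ($i == "mtu") print name, $(i + 1)
    }'
}
```

Running `ip link | link_mtu` lists each interface with its configured MTU. Note that this is the locally configured value, not the path MTU of 1488 that tracepath reports.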
UPDATE: complete log of the ping tests against 8.8.8.8
$ ping -c 4 -M do -s 1472 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data.
From 192.168.1.254 icmp_seq=1 Frag needed and DF set (mtu = 1488)
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3028ms
$ ping -c 4 -M do -s 1462 8.8.8.8 # may show fragmentation
PING 8.8.8.8 (8.8.8.8) 1462(1490) bytes of data.
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
ping: local error: message too long, mtu=1488
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3049ms
$ ping -c 4 -M do -s 1452 8.8.8.8 # no fragmentation?
PING 8.8.8.8 (8.8.8.8) 1452(1480) bytes of data.
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3003ms
$ ping -c 4 -M do -s 1453 8.8.8.8 # still no fragmentation?
PING 8.8.8.8 (8.8.8.8) 1453(1481) bytes of data.
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3005ms
$ ping -c 4 -M do -s 69 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 69(97) bytes of data.
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3004ms
$ ping -c 4 -M do -s 68 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 68(96) bytes of data.
76 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=60.2 ms
76 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=9.07 ms
76 bytes from 8.8.8.8: icmp_seq=3 ttl=116 time=8.89 ms
76 bytes from 8.8.8.8: icmp_seq=4 ttl=116 time=9.04 ms
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 8.887/21.802/60.215/22.177 ms
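Trying sizes by hand like this is slow, so here is a sketch of a binary search for the largest payload that gets through. It assumes that pass/fail is monotonic in size (everything at or below the answer passes) and that the lower bound passes. My 8.8.8.8 results break that assumption, since 68 gets replies while 69, 1452 and 1453 get none, so it is probably only meaningful against a host that answers large pings. The names `max_passing_size`, `ping_probe` and `TARGET` are hypothetical.

```shell
# max_passing_size LO HI PROBE
# Print the largest size in [LO, HI] for which `PROBE size` succeeds.
# Assumes PROBE succeeds at LO and that success is monotonic in size.
# Uses global variables lo/hi/mid/probe, so PROBE must not modify them.
max_passing_size() {
    lo=$1
    hi=$2
    probe=$3
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))     # round up so the loop always shrinks
        if "$probe" "$mid"; then
            lo=$mid                      # mid passes: answer is mid or larger
        else
            hi=$(( mid - 1 ))            # mid fails: answer is below mid
        fi
    done
    echo "$lo"
}

# Example probe: one ping with "don't fragment" set and the given payload.
# TARGET is an assumed environment variable naming the host to test.
ping_probe() {
    ping -c 1 -W 2 -M do -s "$1" "${TARGET:-8.8.8.8}" >/dev/null 2>&1
}
```

For example, `TARGET=bbc.co.uk max_passing_size 0 1472 ping_probe` would print the largest payload that gets a reply with DF set; adding 28 to that gives the usable packet size along the path.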
UPDATE:
For what it is worth, I am "on fibre" with Plusnet but with a short ADSL connection (that is what they tell me anyway). According to this thread, that means the Plusnet router defaults to 1500, and I have everything behind it (Netgear router, Ubuntu desktop) set to 1500 as well.