Asked Thursday, June 17, 2021, 2:31:26 / answers: 1

The setup


I have finally decided to ask for help with an issue that started a couple of weeks ago.
I am running a headless RPI 3B+ with Ubuntu Server 20.04, hosting mainly a WireGuard server and a few lightweight Docker containers (Homebridge, Pi-hole, Portainer, etc.).


My investigation


For a couple of weeks now (I cannot point to an exact date or action that started it), the Pi has been randomly becoming inaccessible over the network until it is hard-rebooted. After further investigation, here is what I can report:



  • No particular overload at any point that could crash the Pi; power levels are always good, and the power supply is brand new and definitely outputs enough power.

  • When the crash happens, the Pi is unreachable from the network but continues to run: the on-screen console is still visible, the Docker containers keep running in the background according to the syslog retrieved later, and the activity LED occasionally lights up.

  • The on-screen console shows an error message (attached here)

  • Syslog says the following:


Dec 15 18:07:51 rpi kernel: [47182.438053] ------------[ cut here ]------------
Dec 15 18:07:51 rpi kernel: [47182.438136] NETDEV WATCHDOG: eth0 (lan78xx): transmit queue 0 timed out
Dec 15 18:07:51 rpi kernel: [47182.438270] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438276] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter wireguard ip6_udp_tunnel udp_tunnel aufs overlay dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua btsdio bluetooth ecdh_generic ecc brcmfmac brcmutil cfg80211 bcm2835_codec(CE) bcm2835_isp(CE) bcm2835_v4l2(CE) v4l2_mem2mem bcm2835_mmal_vchiq(CE) videobuf2_vmalloc videobuf2_dma_contig snd_bcm2835(CE) videobuf2_memops videobuf2_v4l2 snd_pcm raspberrypi_hwmon videobuf2_common snd_timer videodev snd mc vc_sm_cma(CE) uio_pdrv_genirq uio sch_fq_codel drm ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce spidev phy_generic aes_neon_bs aes_neon_blk crypto_simd cryptd
Dec 15 18:07:51 rpi kernel: [47182.438514] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G C E 5.4.0-1025-raspi #28-Ubuntu
Dec 15 18:07:51 rpi kernel: [47182.438521] Hardware name: Raspberry Pi 3 Model B Plus Rev 1.3 (DT)
Dec 15 18:07:51 rpi kernel: [47182.438529] pstate: 60400005 (nZCv daif +PAN -UAO)
Dec 15 18:07:51 rpi kernel: [47182.438539] pc : dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438547] lr : dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438554] sp : ffff80001000bd80
Dec 15 18:07:51 rpi kernel: [47182.438559] x29: ffff80001000bd80 x28: ffff0000363a4380
Dec 15 18:07:51 rpi kernel: [47182.438569] x27: 00000000ffffffff x26: ffff00002bf0f680
Dec 15 18:07:51 rpi kernel: [47182.438579] x25: ffffd97df4309018 x24: ffff00002bf0f740
Dec 15 18:07:51 rpi kernel: [47182.438588] x23: ffff0000352cf45c x22: ffff0000352cf000
Dec 15 18:07:51 rpi kernel: [47182.438598] x21: ffff0000352cf480 x20: ffffd97df4607000
Dec 15 18:07:51 rpi kernel: [47182.438607] x19: 0000000000000000 x18: 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438616] x17: 0000000000000000 x16: 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438626] x15: ffff000035a322f0 x14: ffffffffffffffff
Dec 15 18:07:51 rpi kernel: [47182.438636] x13: 0000000000000000 x12: ffffd97df4742000
Dec 15 18:07:51 rpi kernel: [47182.438646] x11: ffffd97df462c000 x10: ffffd97df4742a80
Dec 15 18:07:51 rpi kernel: [47182.438655] x9 : 0000000000000000 x8 : 0000000000000004
Dec 15 18:07:51 rpi kernel: [47182.438663] x7 : 0000000000000000 x6 : 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438672] x5 : 0000000000000000 x4 : 0000000000000002
Dec 15 18:07:51 rpi kernel: [47182.438681] x3 : ffffd97df3c15790 x2 : 0000000000000040
Dec 15 18:07:51 rpi kernel: [47182.438689] x1 : 0000000000000000 x0 : 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438699] Call trace:
Dec 15 18:07:51 rpi kernel: [47182.438708] dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438720] call_timer_fn+0x40/0x1e8
Dec 15 18:07:51 rpi kernel: [47182.438729] run_timer_softirq+0x1d4/0x590
Dec 15 18:07:51 rpi kernel: [47182.438738] __do_softirq+0x170/0x424
Dec 15 18:07:51 rpi kernel: [47182.438748] irq_exit+0xb4/0xe8
Dec 15 18:07:51 rpi kernel: [47182.438760] __handle_domain_irq+0x74/0xc8
Dec 15 18:07:51 rpi kernel: [47182.438768] bcm2836_arm_irqchip_handle_irq+0x78/0xf0
Dec 15 18:07:51 rpi kernel: [47182.438775] el1_irq+0x108/0x200
Dec 15 18:07:51 rpi kernel: [47182.438784] arch_cpu_idle+0x40/0x238
Dec 15 18:07:51 rpi kernel: [47182.438793] default_idle_call+0x28/0x6c
Dec 15 18:07:51 rpi kernel: [47182.438805] do_idle+0x214/0x2a0
Dec 15 18:07:51 rpi kernel: [47182.438813] cpu_startup_entry+0x2c/0x78
Dec 15 18:07:51 rpi kernel: [47182.438825] secondary_start_kernel+0x18c/0x1c8
Dec 15 18:07:51 rpi kernel: [47182.438833] ---[ end trace 8fa731254680f7cd ]---


  • A simple reboot restores full functionality

  • The crash seems to happen every day and a half or so (I don't know yet whether it always occurs at an exact time)

  • Being busy this week, I tried a temporary workaround: scheduling a daily software reboot at 4 am in the hope of preventing the crash, but without success. It seems a full power cycle is required.
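
To check whether the crashes really recur at a fixed interval, the watchdog lines could be pulled out of syslog with a short script. This is only a minimal sketch: `find_watchdog_events` and `intervals` are hypothetical helper names, the `year` argument is an assumption (syslog timestamps do not carry one), and month parsing assumes an English locale. The pattern is modelled on the "NETDEV WATCHDOG" line shown in the log above.

```python
import re
from datetime import datetime

# Modelled on the syslog line quoted above, e.g.:
# Dec 15 18:07:51 rpi kernel: [47182.438136] NETDEV WATCHDOG: eth0 (lan78xx): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] NETDEV WATCHDOG: "
    r"(?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_events(lines, year):
    """Return one dict per NETDEV WATCHDOG line found in `lines`.

    `year` has to be supplied by the caller because classic syslog
    timestamps omit it (an assumption of this sketch).
    """
    events = []
    for line in lines:
        match = WATCHDOG_RE.match(line)
        if match is None:
            continue
        when = datetime.strptime(f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S")
        events.append(
            {
                "when": when,
                "uptime_s": float(match["uptime"]),  # seconds since boot
                "iface": match["iface"],
                "driver": match["driver"],
                "queue": int(match["queue"]),
            }
        )
    return events


def intervals(events):
    """Return the time elapsed between consecutive watchdog events."""
    times = [event["when"] for event in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

Feeding it the lines of the syslog file (and any rotated copies, oldest first) and printing `intervals(...)` should show whether the gaps cluster around a day and a half. The `uptime_s` field also tells how long after boot each timeout occurred.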


My understanding


From what I understand, the issue originates in or affects eth0 and its associated modules, which would explain why I cannot connect to the Pi remotely while the services keep running.
Beyond that, I am not sure which steps to take to resolve the problem. Any help would be greatly appreciated. Also, let me know if I need to attach more logs.
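
If more data would help, the state of eth0 could be sampled periodically from the local console (or a cron job) so that there is a record from just before the next lock-up. The following is a rough sketch, not a tested diagnostic: `read_iface_state` is a hypothetical helper, and it assumes the standard Linux sysfs layout under `/sys/class/net/<iface>/`.

```python
from pathlib import Path


def read_iface_state(iface="eth0", sysfs_root="/sys/class/net"):
    """Read a few link-state files for `iface` from sysfs.

    Values are returned as raw strings; a file that is missing or
    cannot be read (for example `carrier` while the interface is
    down) is reported as None.
    """
    base = Path(sysfs_root) / iface

    def read(relative_path):
        try:
            return (base / relative_path).read_text().strip()
        except OSError:
            return None

    return {
        "operstate": read("operstate"),
        "carrier": read("carrier"),
        "tx_packets": read("statistics/tx_packets"),
        "tx_errors": read("statistics/tx_errors"),
    }
```

Logging the returned dictionary with a timestamp every minute or so would show whether `tx_packets` stops increasing before the watchdog message appears.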


Thank you very much for reading, and let's resolve this!


_cilusse



 Answers

See https://bugs.launchpad.net/bugs/1861936. A fix is coming shortly.


Answered Thursday, June 17, 2021
ainubt