-
Notifications
You must be signed in to change notification settings - Fork 31
Description
The spherion and tomato platforms experience serial issues sometimes. Example on Spherion:
https://lava.collabora.dev/scheduler/job/13962478
This issue happens at random, and by just re-running the job I got it to work: https://lava.collabora.dev/scheduler/job/13995464
Log snippet:
/ # /lava-13962478/bin/lava-test-runner /lava-13962478/0
[ 9.705822] hub 1-1.4:1.0: USB hub found
/lava-13962478/bin/lava-tes[ 9.706251] hub 1-1.4:1.0: 2 ports detected
[ 9.782304] r8152-cfgselector 2-1.3: reset SuperSpeed USB device number 4 using xhci-mtk
[ 9.811444] r8152 2-1.3:1.0: Direct firmware load for rtl_nic/rtl8153a-4.fw failed with error -2
[ 9.811497] r8152 2-1.3:1.0: unable to load firmware patch rtl_nic/rtl8153a-4.fw (-2)
[ 9.844050] r8152 2-1.3:1.0 eth0: v1.12.13
[ 9.844408] r8152-cfgselector 2-1.3: USB disconnect, device number 4
[ 9.919491] Console: switching to colour frame buffer device 240x67
t-runa-13962478/[ 9.953121] ttyS ttyS0: 1 input overrun(s)
[ 10.006751] mediatek-drm mediatek-drm.11.auto: [drm] fb0: mediatekdrmfb frame buffer device
0
/bin/sh: /lava-13962478/bin/lava-test-runa-13962478/0: not found
/ # [ 10.034163] usb 2-1: reset SuperSpeed USB device number 2 using xhci-mtk
After that LAVA reports lava-test-shell timed out after 60 seconds.
As can be seen from the log, the lava-test-runner command gets messed up, and the shell replies with not found. There's also a ttyS ttyS0: 1 input overrun(s) message, which is worrying.
Example on Tomato: https://lava.collabora.dev/scheduler/job/13962479
It looks slightly different. It seems that the shell prompt itself got split, so LAVA didn't recognize it
. /lava-13962479[ 10.704192] mtk-mdp3 14001000.dma-controller: can't get SCP node
/environment
/ [ 10.711195] mtk-mdp3 14001000.dma-controller: Driver registered as /dev/video2
[ 10.711728] mtk-mdp3 1400c000.dma-controller: Adding to iommu group 1
# [ 10.726424] mtk-mdp3 14f08000.dma-controller: Adding to iommu group 1
And the LAVA error message is wait for prompt timed out.
Researching a bit I found this LAVA documentation page, which suggests setting boot_character_delay and/or test_character_delay as ways to avoid missing characters in the serial when the device can't keep up.
Also, when the login prompt is reached, there are still many messages being printed by the kernel, which might be interfering, so another idea would be to add a delay so LAVA only sends the command once the serial output has settled down.