I would like to see mgd get away from completely unmanaged threads that expose nothing for reporting, insight or control.
I don't think all threads need explicit management/joining since the Arc<AtomicBool> signaling we have today suffices for most situations, but I think having a reusable coding pattern would enable consistency (no bespoke threading logic), control (ability to join where appropriate w/o needing one-off types), and insight (at its most basic, reporting which state a thread is currently in).
I would like to see mgd take a more structured/deliberate approach to thread lifecycle management.
In particular, I would like us to start holding onto thread handles, which would allow:
- Explicit joining of threads on shutdown where it makes sense
- Explicit state reporting for threads (ready, running, shutdown/panicked)
- Panic reporting
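To make the three items above concrete, here is a minimal sketch of what a reusable pattern could look like. Everything in it (`ManagedThread`, `ThreadState`, `StateGuard`, the `spawn`/`shutdown` signatures) is hypothetical and not existing mgd code; it only assumes the `Arc<AtomicBool>` stop signal we already use and the default `panic = "unwind"` behavior.

```rust
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Lifecycle states to report (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum ThreadState {
    Ready = 0,
    Running = 1,
    Shutdown = 2,
    Panicked = 3,
}

/// Records the final state when the thread body exits, whether it
/// returned normally or is unwinding from a panic.
struct StateGuard(Arc<AtomicU8>);

impl Drop for StateGuard {
    fn drop(&mut self) {
        let end = if thread::panicking() {
            ThreadState::Panicked
        } else {
            ThreadState::Shutdown
        };
        self.0.store(end as u8, Ordering::SeqCst);
    }
}

/// Hypothetical wrapper that keeps the JoinHandle, a stop flag and the state.
pub struct ManagedThread {
    name: String,
    state: Arc<AtomicU8>,
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl ManagedThread {
    /// Spawn `body`, handing it the stop flag it should poll.
    pub fn spawn<F>(name: &str, body: F) -> std::io::Result<Self>
    where
        F: FnOnce(Arc<AtomicBool>) + Send + 'static,
    {
        let state = Arc::new(AtomicU8::new(ThreadState::Ready as u8));
        let stop = Arc::new(AtomicBool::new(false));
        let (st, sp) = (state.clone(), stop.clone());
        let handle = thread::Builder::new().name(name.to_string()).spawn(move || {
            let _guard = StateGuard(st.clone());
            st.store(ThreadState::Running as u8, Ordering::SeqCst);
            body(sp);
        })?;
        Ok(Self { name: name.to_string(), state, stop, handle: Some(handle) })
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    /// Current state, cheap enough to call from a status query.
    pub fn state(&self) -> ThreadState {
        match self.state.load(Ordering::SeqCst) {
            0 => ThreadState::Ready,
            1 => ThreadState::Running,
            2 => ThreadState::Shutdown,
            _ => ThreadState::Panicked,
        }
    }

    /// Signal the thread to stop and join it. Returns the panic message
    /// if the thread panicked; a repeated call after joining returns Ok.
    pub fn shutdown(&mut self) -> Result<(), String> {
        self.stop.store(true, Ordering::SeqCst);
        let handle = match self.handle.take() {
            Some(h) => h,
            None => return Ok(()),
        };
        handle.join().map_err(|payload| {
            payload
                .downcast_ref::<&str>()
                .map(|s| (*s).to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "non-string panic payload".to_string())
        })
    }
}
```

Threads that don't need joining could simply never have `shutdown` called on them; they would still show up with a name and a state.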
What I'm envisioning is something akin to FRR's show thread cpu (recently renamed to show event cpu):
lima-ubuntu-22-04# show thread cpu
  <cr>
  FILTER  Display filter (rwtexb)
lima-ubuntu-22-04# show thread cpu zebra

Thread statistics for zebra:

Showing statistics for pthread default
--------------------------------------
                              CPU (user+system):   Real (wall-clock):
Active  Runtime(ms)  Invoked  Avg uSec  Max uSecs  Avg uSec  Max uSecs  CPU_Warn  Wall_Warn  Type  Thread
     1        0.095        1        95         95        96         96         0          0  R     zserv_accept
     1        0.060        3        20         37        21         39         0          0  R     vtysh_accept
     1        1.616        9       179        496       182        500         0          0  R     kernel_read
     0        0.008        2         4          8         5          9         0          0  E     rib_process_dplane_results
     0        0.076        1        76         76        76         76         0          0  E     zserv_process_messages
     1       12.645       75       168        873       170        873         0          0  R     vtysh_read
     0        0.009        1         9          9        10         10         0          0  E     frr_config_read_in

Showing statistics for pthread Zebra dplane thread
--------------------------------------------------
                              CPU (user+system):   Real (wall-clock):
Active  Runtime(ms)  Invoked  Avg uSec  Max uSecs  Avg uSec  Max uSecs  CPU_Warn  Wall_Warn  Type  Thread
     1        1.092        9       121        300       124        300         0          0  R     dplane_incoming_read
     0        0.114        2        57         66       225        402         0          0  E     dplane_thread_loop

Showing statistics for pthread Zebra Opaque thread
--------------------------------------------------
                              CPU (user+system):   Real (wall-clock):
Active  Runtime(ms)  Invoked  Avg uSec  Max uSecs  Avg uSec  Max uSecs  CPU_Warn  Wall_Warn  Type  Thread
     0        0.002        1         2          2         3          3         0          0  E     process_messages

Showing statistics for pthread Zebra API client thread
------------------------------------------------------
                              CPU (user+system):   Real (wall-clock):
Active  Runtime(ms)  Invoked  Avg uSec  Max uSecs  Avg uSec  Max uSecs  CPU_Warn  Wall_Warn  Type  Thread
     1        0.014        1        14         14        14         14         0          0  R     zserv_read

Total thread statistics
-------------------------
                              CPU (user+system):   Real (wall-clock):
Active  Runtime(ms)  Invoked  Avg uSec  Max uSecs  Avg uSec  Max uSecs  CPU_Warn  Wall_Warn  Type  Thread
     6       15.731      105       149        873       154        873         0          0  R E   TOTAL
FRR's implementation here collects stats/usage in each pthread's event loop as each event is handled, then displays it when the CLI command is run.
I'd like to have something similar integrated into mgd so we can simply and easily query the thread status/stats from mgadm/API, as a quick way to get an idea of what's going on without needing to immediately jump into mdb or DTrace.
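As a rough illustration of the "collect as each event is handled, display on demand" idea, here is a sketch of a shared stats registry. It is not FRR's implementation and not existing mgd code: `StatsRegistry`, `EventStats`, `timed` and `report` are hypothetical names, it tracks wall-clock time only (no CPU time or warning counters), and it assumes a plain `Mutex` is acceptable for the recording path.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Per-event counters, loosely modeled on the Invoked/Runtime/Avg/Max columns.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct EventStats {
    pub invoked: u64,
    pub total: Duration,
    pub max: Duration,
}

impl EventStats {
    fn record(&mut self, elapsed: Duration) {
        self.invoked += 1;
        self.total += elapsed;
        self.max = self.max.max(elapsed);
    }

    /// Average wall-clock time per invocation, in microseconds.
    pub fn avg_usec(&self) -> u64 {
        if self.invoked == 0 {
            0
        } else {
            (self.total.as_micros() / self.invoked as u128) as u64
        }
    }
}

/// Hypothetical registry shared by all threads, keyed by (thread, event).
#[derive(Clone, Default)]
pub struct StatsRegistry {
    inner: Arc<Mutex<BTreeMap<(String, String), EventStats>>>,
}

impl StatsRegistry {
    /// Record one handled event that took `elapsed`.
    pub fn record(&self, thread: &str, event: &str, elapsed: Duration) {
        let mut map = self.inner.lock().unwrap();
        map.entry((thread.to_string(), event.to_string()))
            .or_default()
            .record(elapsed);
    }

    /// Run one event handler and record how long it took.
    pub fn timed<T>(&self, thread: &str, event: &str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.record(thread, event, start.elapsed());
        out
    }

    /// Snapshot of a single entry, if it has been recorded.
    pub fn get(&self, thread: &str, event: &str) -> Option<EventStats> {
        let map = self.inner.lock().unwrap();
        map.get(&(thread.to_string(), event.to_string())).copied()
    }

    /// Render a plain-text table that an API handler or CLI could return.
    pub fn report(&self) -> String {
        let map = self.inner.lock().unwrap();
        let mut out = format!(
            "{:<16} {:<24} {:>8} {:>12} {:>10} {:>10}\n",
            "Thread", "Event", "Invoked", "Runtime(ms)", "Avg uSec", "Max uSec"
        );
        for ((thread, event), s) in map.iter() {
            out.push_str(&format!(
                "{:<16} {:<24} {:>8} {:>12.3} {:>10} {:>10}\n",
                thread,
                event,
                s.invoked,
                s.total.as_secs_f64() * 1000.0,
                s.avg_usec(),
                s.max.as_micros()
            ));
        }
        out
    }
}
```

A thread's loop would wrap each unit of work in `registry.timed("thread-name", "event-name", || ...)`, and whatever serves the status query would call `report()` or walk the map directly.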