-
Notifications
You must be signed in to change notification settings - Fork 9
RSDK-11788 Forward recent systemd agent logs upon startup #184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
RSDK-11788 Forward recent systemd agent logs upon startup #184
Conversation
benjirewis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is still in "draft" since I'm adding tests and want to make sure "OOM killed" logs specifically get saved, but feel free to review @danielbotros + @jmatth .
103c870 to
6759c78
Compare
|
Tests added; now ready for full review @jmatth + @danielbotros . |
| // A nil lastShutdown value means none was saved in the cache. Bail in this case. We | ||
| // _could_ attempt to upload all historical systemd agent logs if we do not know when | ||
| // the agent last shutdown, but there may be quite a few of these logs, and doing so | ||
| // could slow startup. | ||
| if s.lastShutdown == nil { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are some edge cases here:
- If agent gets OOM killed or similar during its first run we won't upload any logs on the next start
- If agent gets OOM killed or similar after one or more clean shutdowns then we're going to upload all the systemd logs since the last shutdown. I'm assuming agent has average uptimes in days or weeks so that could still be a lot of logs. Is it the case that
viam-agent.systemdlogs are only created very rarely during the lifetime of the process?
| journalCmd *exec.Cmd | ||
| cancelFunc context.CancelFunc | ||
| noJournald bool | ||
| lastShutdown *time.Time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needing to keep this extra value on the struct even though it's only really used during startup is kind of annoying. I don't see a good way around it given the current code though. Just flagging it as one more reason we should get rid of the weird subsystem duck typing at some point.
| } | ||
|
|
||
| func TestVersionCacheJSONRoundtrip(t *testing.T) { | ||
| someTime, err := time.Parse("2006-01-02 15:04:05", "2011-11-11 00:00:00" /* https://tinyurl.com/dm4ytr3c */) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RSDK-11788
Adds a
last_shutdowntimestamp to the version cache. Upon startup, usesjournalctl -u viam-agent -t systemd -o json -S [last_shutdown]to forward allsystemd-identified logs from theviam-agentsystemd service since the last shutdown to app. This allows logs like the following to make it up to app: