Skip to content

Conversation

@SantiagoPittella
Copy link
Collaborator

This PR uses main as base branch

Improves the instrumentation in the miden-network-monitor binary.

This is the current view of the traces (in jaeger):

Screenshot 2025-11-13 at 17 21 17

@SantiagoPittella SantiagoPittella force-pushed the santiagopittella-improve-tracing-in-monitor branch from 7c6ec2b to 97531ec Compare November 13, 2025 20:32
@SantiagoPittella
Copy link
Collaborator Author

@bobbinth @Mirko-von-Leipzig mdbook publishing a new, non compatible, version a few hours ago. I tried removing the multilingual option but other errors arised. Should I lock the Ci version to the previous (v0.4.52) until we complete the migration? mdbook people provides a migration guide https://github.com/rust-lang/mdBook/blob/master/CHANGELOG.md#05-migration-guide

Copy link
Collaborator

@igamigo igamigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the plan is still to merge this and patch release it, but LGTM! Sorry for the delay in the review.
Not super familiar with the frequencies of some processes. Doesn't seem like these changes would be very spammy but it's probably something to watch out for.

Comment on lines 39 to 42
debug!(target: COMPONENT, "Initializing monitoring tasks");

// Initialize the RPC Status endpoint checker task.
debug!(target: COMPONENT, "Initializing RPC status checker");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Emitting 2 events here feels a bit redundant/unnecessary

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the first one. For the debug! level I left one log per component.

level = "info",
ret(level = "debug")
)]
pub(crate) async fn check_remote_prover_status(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that could be nice for troubleshooting is to raise info events for service status changes. Something like

  let mut last = Status::Unknown;
  // ...
  let status = check_remote_prover_status(...).await;
  if status.status != last {
      info!(target: COMPONENT, prover = %name, status = ?status.status, "Remote prover status changed");
      last = status.status;
  }

But maybe this is already easily filterable with the current traces

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line 443 we have this:

   debug!(target: COMPONENT, prover_name = %name, remote_prover_status = ?status, "Remote prover status check successful");

Is that what you meant?

@SantiagoPittella
Copy link
Collaborator Author

Not sure if the plan is still to merge this and patch release it, but LGTM! Sorry for the delay in the review.

Yeah, maybe now we want to change the base to next and avoid this release + deploy cc @bobbinth

Not super familiar with the frequencies of some processes. Doesn't seem like these changes would be very spammy but it's probably something to watch out for.

Maybe this can be controlled changing the sample rate if we decide that we need to reduce it? That will require some probably minor changes in the utils crate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants