Conversation

@meddle0x53 (Contributor) commented Nov 28, 2025

Motivation

To test upgrading snarkOS versions in a network, we run a special devnet test as described in #4026.

Test Plan

Run the CI and check that the new test passes.

Documentation

The test follows these steps:

  1. Download the latest snarkOS release source from GitHub.
  2. Build it via cargo install --locked --path . --features test_network into a separate prefix (SNARKOS_RELEASE_DIR).
  3. Probe latest_consensus_version for release and PR binaries.
  4. Compute CONSENSUS_VERSION_HEIGHTS for release and PR.
  5. Rebuild release snarkOS with its heights.
  6. Rebuild PR snarkOS with its heights.
  7. Start a devnet with the release binary.
  8. Restart nodes one-by-one to the PR binary.
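The steps above can be sketched as a dry-run shell script. This is illustrative only: the clone URL, the `SNARKOS_RELEASE_DIR` prefix layout, and the way latest_consensus_version is probed are assumptions, not the PR's actual script.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the upgrade-test flow: each step is echoed, not executed.
set -euo pipefail

plan_upgrade_test() {
    # Assumed layout: the release build is installed into its own prefix.
    local release_dir="${SNARKOS_RELEASE_DIR:-/tmp/snarkos-release}"
    echo "git clone --depth 1 https://github.com/ProvableHQ/snarkOS $release_dir/src"
    echo "cargo install --locked --path $release_dir/src --root $release_dir --features test_network"
    echo "probe latest_consensus_version for release and PR binaries"
    echo "compute CONSENSUS_VERSION_HEIGHTS for release and PR"
    echo "rebuild release and PR snarkOS with their respective heights"
    echo "start devnet with $release_dir/bin/snarkos"
    echo "restart nodes one-by-one with the PR binary"
}

plan_upgrade_test
```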

Added a new workflow for the new test, extracted from the devnet tests, since it is time-consuming with all the compilations and the waiting on the consensus version.

@meddle0x53 meddle0x53 marked this pull request as ready for review November 28, 2025 09:12
@meddle0x53 meddle0x53 marked this pull request as draft November 28, 2025 09:12
@meddle0x53 meddle0x53 marked this pull request as ready for review December 2, 2025 10:20
@vicsn vicsn requested a review from kaimast December 5, 2025 09:46
@vicsn (Collaborator) previously approved these changes Dec 5, 2025 and left a comment:

LGTM, let's await Kai's review

Comment on lines +215 to +221
local node_index="$2"
local role="$3" # "validator" or "client"
local log_file="$4"

local flags=( "${common_flags[@]}" "--dev=$node_index" )
@vicsn (Collaborator) commented:

We need to be able to support passing in custom flags when new required flags are added in new releases. CC #4033

Also, it might be easier to read if we just pass a list of flags to start_node like we do in other tests. Currently it is an abstraction on top of an abstraction (common flags, role, etc.). CC @kaimast what do you think?
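One hedged sketch of the flat-flags variant suggested here (the flag names are hypothetical, and a real helper would launch the binary in the background rather than echoing the command line):

```shell
#!/usr/bin/env bash
# Sketch of start_node taking a flat list of flags instead of layering
# common_flags + role. Echoes the command line instead of launching snarkos.
set -euo pipefail

start_node() {
    local node_index="$1"
    shift
    # Everything after the index is passed through verbatim, so a new release
    # that requires an extra flag only changes the call site.
    echo snarkos start --dev="$node_index" "$@"
}

start_node 0 --validator --some-new-required-flag
```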

@kaimast (Contributor) commented Dec 10, 2025

Looks good to me! I pushed one more commit with some minor cleanups (83355c6).

I also rebased #3902 on it. That PR has some networking changes and the tests succeed. We might still want to test it with a branch that has breaking changes to see that those are caught, or did you do that already?

Also, a side note: it would be great if we could keep the number of commits in PRs to a manageable amount. This one has 14 commits for ~700 LOC. It's not a huge deal, but it can be a pain when reading the commit logs or when using git bisect.

@meddle0x53 (Contributor, Author) commented:

@kaimast @vicsn Squashed all into one commit now.

@vicsn (Collaborator) commented Dec 11, 2025

@kaimast good idea. You want to give it another go with a toy PR which changes the gateway handshake in a breaking way?

@vicsn vicsn mentioned this pull request Dec 19, 2025
@vicsn (Collaborator) commented Jan 1, 2026

> We might still want to test it with a branch that has breaking changes to see that those are caught, or did you do that already?

@meddle0x53 can you try on a toy commit which makes a breaking change to both struct ChallengeRequest objects?

@meddle0x53 (Contributor, Author) commented:

Yeah, if I update ChallengeRequest the test fails and I get:

2026-01-07T13:03:59.066676Z DEBUG [MemoryPool] Shaking hands with 127.0.0.1:5003...
2026-01-07T13:03:59.066786Z DEBUG [MemoryPool] Shaking hands with 127.0.0.1:5002...
2026-01-07T13:03:59.066822Z DEBUG [MemoryPool] Shaking hands with 127.0.0.1:5001...
2026-01-07T13:03:59.068359Z  WARN [MemoryPool] Unable to connect to '127.0.0.1:5002' - I/O error: [MemoryPool] the peer disconnected before sending "Event::ChallengeResponse", likely due to peer saturation or shutdown
2026-01-07T13:03:59.068460Z  WARN [MemoryPool] Unable to connect to '127.0.0.1:5003' - I/O error: [MemoryPool] the peer disconnected before sending "Event::ChallengeResponse", likely due to peer saturation or shutdown
2026-01-07T13:03:59.068884Z  WARN [MemoryPool] Unable to connect to '127.0.0.1:5001' - I/O error: [MemoryPool] the peer disconnected before sending "Event::ChallengeResponse", likely due to peer saturation or shutdown
2026-01-07T13:04:00.579279Z DEBUG Skipping batch proposal for round 112 (node is syncing)
2026-01-07T13:04:01.410926Z  INFO Received 'GET /block/height/latest' from '127.0.0.1:54017'
2026-01-07T13:04:03.081466Z DEBUG Skipping batch proposal for round 112 (node is syncing)
2026-01-07T13:04:05.583033Z DEBUG Skipping batch proposal for round 112 (node is syncing)
2026-01-07T13:04:06.454622Z  INFO Received 'GET /block/height/latest' from '127.0.0.1:54020'
2026-01-07T13:04:08.085138Z DEBUG Skipping batch proposal for round 112 (node is syncing)
2026-01-07T13:04:10.587016Z DEBUG Skipping batch proposal for round 112 (node is syncing)
2026-01-07T13:04:11.504465Z  INFO Received 'GET /block/height/latest' from '127.0.0.1:54026'
2026-01-07T13:04:13.089705Z DEBUG Skipping batch proposal for round 112 (node is syncing)
2026-01-07T13:04:14.067889Z  INFO No connected validators
2026-01-07T13:04:14.068108Z DEBUG   Not connected to aleo1ashyu96tjwe63u0gtnnv8z5lhapdu4l5pjsl2kha7fv7hvz2eqxs5dz0rg (25.00% of total stake)
2026-01-07T13:04:14.068159Z DEBUG   Not connected to aleo12ux3gdauck0v60westgcpqj7v8rrcr3v346e4jtq04q7kkt22czsh808v2 (25.00% of total stake)
2026-01-07T13:04:14.068205Z DEBUG   Not connected to aleo1s3ws5tra87fjycnjrwsjcrnw2qxr8jfqqdugnf0xzqqw29q9m5pqem2u4t (25.00% of total stake)
2026-01-07T13:04:14.068246Z  WARN Not connected to 3 validators (of 3 bonded validators) (75.00% of total stake not connected)

Logs like this, @vicsn


upgrade-workflow:
jobs:
- upgrade-test
@vicsn (Collaborator) commented:


Given that it's a bit more expensive and unlikely to be an issue for most PRs, can we make sure to only run it on merges? See for example: ProvableHQ/snarkVM@b5ddc0c

You can also move the verify-windows job in there so it only runs on merges.
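A minimal sketch of what a merge-only trigger could look like; the branch name and job layout are assumptions, and the actual approach taken in ProvableHQ/snarkVM@b5ddc0c may differ:

```yaml
# Hypothetical trigger: run the upgrade test only on pushes (i.e. merges)
# to the default branch, not on every pull request.
name: upgrade-workflow
on:
  push:
    branches:
      - mainnet   # assumed default branch name
jobs:
  upgrade-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... build both binaries and run the devnet upgrade test ...
```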
