Fix: random() reverts once every 3 blocks #458

GabrielMartinezRodriguez · 2025-02-25T15:15:49Z

Description

This PR aims to fix the issue that Aodhgan found in the randomness service: in one out of every 3 blocks, the random() function reverts. This happens on the testnet but not locally, which makes it difficult to diagnose.

The revert occurs because the drand value is not found. It recurs every 3 blocks because, within each 3-block cycle, one block has a smaller time margin than the other two. For that block, we have only about 2.3 seconds from the drand value's generation to include it, whereas for the other two drands the margin is 3.3 and 4.3 seconds, respectively. This makes the 2.3-second case the most problematic.
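
The cycle described above can be reproduced with a small model. This is only a sketch under stated assumptions: the 2-second block time is not stated in this PR, the 3-second drand period is assumed (it matches the 3-second cadence mentioned in the review discussion), the 2.3-second base margin is simply the smallest figure quoted above, and `marginForBlock` is a hypothetical helper, not code from the randomness service.

```typescript
// Hypothetical model. All constants are assumptions, not values read from the codebase.
const BLOCK_TIME_MS = 2000 // assumed block time
const DRAND_PERIOD_MS = 3000 // assumed drand round period
const BASE_MARGIN_MS = 2300 // smallest margin quoted in the description

/** Time available to include the latest drand value for a given block index. */
function marginForBlock(blockIndex: number): number {
    const blockTimeMs = blockIndex * BLOCK_TIME_MS
    // How long ago the most recent drand round was emitted, relative to the block.
    const sinceLastDrandMs = blockTimeMs % DRAND_PERIOD_MS
    return BASE_MARGIN_MS + sinceLastDrandMs
}

console.log([0, 1, 2, 3, 4, 5].map(marginForBlock))
// [2300, 4300, 3300, 2300, 4300, 3300]
```

Under these assumptions the margins repeat with a period of 3 blocks, and exactly one block per cycle gets the tight 2.3-second margin.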

2.3 seconds should be more than enough time to include the drand value. However, I found that when a drand value is requested immediately after it is emitted, the drand network can take up to 1.6 seconds to return it. Since the current timeout is set to 1 second, such a request times out before the value arrives. For this reason, I have increased the timeout to 2 seconds.
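
To illustrate why a 1-second timeout fails against a 1.6-second response, here is a minimal sketch with a simulated request. `withTimeout`, `simulatedDrandRequest` and `outcome` are hypothetical helpers written for this example; no real drand client is called and this is not the service's actual fetch code.

```typescript
// Hypothetical helpers for illustration only.

/** Rejects if `promise` has not settled within `timeoutMs`. */
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
    return new Promise<T>((resolve, reject) => {
        const timer = setTimeout(
            () => reject(new Error(`timed out after ${timeoutMs}ms`)),
            timeoutMs,
        )
        promise.then(
            (value) => {
                clearTimeout(timer)
                resolve(value)
            },
            (error) => {
                clearTimeout(timer)
                reject(error)
            },
        )
    })
}

/** Stands in for a drand request that takes `latencyMs` to answer. */
function simulatedDrandRequest(latencyMs: number): Promise<string> {
    return new Promise((resolve) => setTimeout(() => resolve("drand-round"), latencyMs))
}

/** Reports whether a request with the given latency beats the given timeout. */
async function outcome(latencyMs: number, timeoutMs: number): Promise<"ok" | "timeout"> {
    try {
        await withTimeout(simulatedDrandRequest(latencyMs), timeoutMs)
        return "ok"
    } catch {
        return "timeout"
    }
}
```

With the figures from the description, `outcome(1600, 1000)` resolves to `"timeout"` while `outcome(1600, 2000)` resolves to `"ok"`, and a 2000 ms timeout still fits inside the 2.3-second margin.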

Checklist

Basics

B1. I have applied the proper label & proper branch name (e.g. norswap/build-system-caching).
B2. This PR is not so big that it should be split & addresses only one concern.
B3. The PR targets the lowest branch it can (ideally master).

Reminder: PR review guidelines

Correctness

C1. Builds and passes tests.
C2. The code is properly parameterized & compatible with different environments (e.g. local,
testnet, mainnet, standalone wallet, ...).
C3. I have manually tested my changes & connected features.

< INDICATE BROWSER, DEMO APP & OTHER ENV DETAILS USED FOR TESTING HERE >

< INDICATE TESTED SCENARIOS (USER INTERFACE INTERACTION, CODE FLOWS) HERE >

C4. I have performed a thorough self-review of my code after submitting the PR,
and have updated the code & comments accordingly.

Architecture & Documentation

D1. I made it easy to reason locally about the code, by (1) using proper abstraction boundaries,
(2) commenting these boundaries correctly, (3) adding inline comments for context when needed.
D2. All public-facing APIs & meaningful (non-local) internal APIs are properly documented in code
comments.
D3. If appropriate, the general architecture of the code is documented in a code comment or
in a Markdown document.
D4. An appropriate Changeset has been generated (and committed) for changes that touch npm-published packages (currently packages/core and packages/react), see here for more info.

cloudflare-workers-and-pages · 2025-02-25T15:15:51Z

Deploying happychain with Cloudflare Pages

Latest commit:	`4ffb47a`
Status:	✅ Deploy successful!
Preview URL:	https://8f0f84df.happychain.pages.dev
Branch Preview URL:	https://gabriel-delay-post-drand.happychain.pages.dev

GabrielMartinezRodriguez · 2025-02-25T15:16:20Z

Fix: random() reverts once every 3 blocks #458 👈 (View in Graphite)
Added Interrupted status to transaction entity #434 : 1 other dependent PR (#435 )
Deploy randomness service #416 : 1 other dependent PR (#441 )
Add config contract #404
Randomness service: Improve performance and reliability #372
Add transaction submission failure hook and error handling #366
Increase default RPC timeout to 2000ms #365
Add configurable HTTP polling interval for block monitoring #362
Force initialization before using the TXM #361
Script to monitor randomness service #325
Script to easily launch the randomness service locally #324
Submit drand numbers #318
Random oracle contracts #307
Only prune finalized randomness entries #305
Add NewBlock hook #304
Refactors the randomness service to be able to process pending commitments #303
Add SQLite persistence to randomness service #300
Txm no longer receives Viem objects #298 : 1 other dependent PR (#299 )
Add flush mechanism #289 : 1 other dependent PR (#297 )
Allow run multiple txm simultaneously within the same service #285
Added docs for txm #272
Mutex in tx monitor #264
Purge transactions #232
Txm hooks #229
Use kysely instead of mikro-orm #250
master

This stack of pull requests is managed by Graphite. Learn more about stacking.

norswap · 2025-03-14T01:52:40Z

apps/randomness/src/DrandService.ts

-                    }
+                const maxRetries = 10
+                const retryIntervalMs = 100
+                let retryCount = 0


Do we really need to retry? Since handleNewDrandBeacons is called every 3s, and the timeout is 2s, this will be called the next time, right? At most it could make sense to retry once with a lower timeout of 1s.

GabrielMartinezRodriguez

Yes, I think you are right; retrying is not necessary since it will be retried in a few seconds. However, the timeout has to be 2000ms because when we request a drand too early, the query is sometimes delayed by around 1600ms. If we limit the timeout to 1 second, it will fail. That was the reason we were failing to obtain the drand on time in some blocks.

I also modified the fetch so that it doesn't cancel an already fired request and has an absolute timeout. I think this is exactly what we want. Otherwise, with two attempts at a 2000-millisecond timeout each, we could wait up to 4 seconds, and a response arriving after 2100 milliseconds would be discarded because that attempt had already been cancelled. I think it's better to use a short interval to trigger another fetch without canceling the previous one. So now we retry after one second, but if the previous fetch completes after 1500 milliseconds, we handle it correctly as well.

norswap

Nice, this is very neat!
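
The fetch behaviour described in this thread (start a second request after a short interval without cancelling the first, and bound the whole operation with one absolute timeout) can be sketched as follows. `raceWithStaggeredRetries` and its parameters are hypothetical names chosen for this example; this is not the actual change made to the shared fetch helper in this PR.

```typescript
// Hypothetical sketch of "retry without cancelling, with an absolute timeout".
function raceWithStaggeredRetries<T>(
    attempt: () => Promise<T>, // starts one request
    retryAfterMs: number, // delay before starting the next request
    totalTimeoutMs: number, // absolute deadline for the whole operation
    maxAttempts: number, // upper bound on concurrent requests
): Promise<T> {
    return new Promise<T>((resolve, reject) => {
        let settled = false
        let launched = 0
        let failed = 0
        let retryTimer: ReturnType<typeof setTimeout> | undefined
        let deadline: ReturnType<typeof setTimeout> | undefined

        // Settle the outer promise exactly once and stop all timers.
        const settle = (action: () => void) => {
            if (settled) return
            settled = true
            clearTimeout(retryTimer)
            clearTimeout(deadline)
            action()
        }

        const launch = () => {
            if (settled) return
            launched++
            // Earlier requests are left running: whichever answers first wins.
            attempt().then(
                (value) => settle(() => resolve(value)),
                (error) => {
                    failed++
                    if (failed === maxAttempts) settle(() => reject(error))
                },
            )
            if (launched < maxAttempts) retryTimer = setTimeout(launch, retryAfterMs)
        }

        deadline = setTimeout(
            () => settle(() => reject(new Error(`no response within ${totalTimeoutMs}ms`))),
            totalTimeoutMs,
        )
        launch()
    })
}
```

For example, with `retryAfterMs = 1000`, a first request that answers after 1500 ms still resolves the promise even though a second request was started at 1000 ms. The worst-case wait is `totalTimeoutMs` regardless of how many requests were started.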

GabrielMartinezRodriguez mentioned this pull request Feb 25, 2025

Added Interrupted status to transaction entity #434

Merged

11 tasks

GabrielMartinezRodriguez marked this pull request as ready for review February 25, 2025 15:16

GabrielMartinezRodriguez mentioned this pull request Feb 25, 2025

debug #435

Closed

11 tasks

GabrielMartinezRodriguez self-assigned this Feb 25, 2025

GabrielMartinezRodriguez marked this pull request as draft February 25, 2025 15:17

GabrielMartinezRodriguez added the no-merge For showcase, not to be merged label Feb 25, 2025

GabrielMartinezRodriguez force-pushed the gabriel/transaction-reorder branch from 52ba575 to caa054f Compare February 26, 2025 10:53

GabrielMartinezRodriguez force-pushed the gabriel/delay-post-drand branch from 4803de4 to 7906e19 Compare February 26, 2025 10:53

Base automatically changed from gabriel/transaction-reorder to master February 26, 2025 11:03

GabrielMartinezRodriguez force-pushed the gabriel/delay-post-drand branch from 7906e19 to ac39d24 Compare March 6, 2025 13:39

GabrielMartinezRodriguez added reviewing-1 Ready for, or undergoing first-line review and removed no-merge For showcase, not to be merged labels Mar 6, 2025

GabrielMartinezRodriguez changed the title ~~fix(randomness): reduce sleep time when too early getting drand number~~ Fix: random() reverts once every 3 blocks Mar 6, 2025

GabrielMartinezRodriguez added draft Not ready for review and removed draft Not ready for review reviewing-1 Ready for, or undergoing first-line review labels Mar 6, 2025

GabrielMartinezRodriguez force-pushed the gabriel/delay-post-drand branch from b6ad880 to 0d63755 Compare March 12, 2025 10:20

GabrielMartinezRodriguez marked this pull request as ready for review March 12, 2025 13:53

GabrielMartinezRodriguez added reviewing-1 Ready for, or undergoing first-line review and removed draft Not ready for review labels Mar 12, 2025

norswap reviewed Mar 14, 2025

norswap added question Something has to be cleared up after review and removed reviewing-1 Ready for, or undergoing first-line review labels Mar 14, 2025

GabrielMartinezRodriguez added 4 commits March 14, 2025 11:57

fix(randomness): reduce sleep time when too early getting drand number

037d074

chore(randomness): move sleep after get the first drand

e2d28b7

chore: format

040835e

chore(randomness): increased timeout

c845ff9

chore(randomness): remove retries on _handleNewDrandBeacons

50f228b

GabrielMartinezRodriguez force-pushed the gabriel/delay-post-drand branch from 0d63755 to 50f228b Compare March 14, 2025 11:23

feat(common): modified fetch

4ffb47a

GabrielMartinezRodriguez added reviewing-2 Ready for, or undergoing final review and removed question Something has to be cleared up after review labels Mar 14, 2025

GabrielMartinezRodriguez requested a review from norswap March 14, 2025 15:06

norswap approved these changes Mar 14, 2025

norswap merged commit 89ecd5f into master Mar 14, 2025
3 checks passed

norswap deleted the gabriel/delay-post-drand branch March 14, 2025 19:21
