Skip to content

Conversation

@GabrielMartinezRodriguez
Copy link
Contributor

@GabrielMartinezRodriguez GabrielMartinezRodriguez commented Jan 23, 2025

Linked Issues

Description

This PR aims to improve the performance and reliability of the randomness service, after noticing the following problems:
1 - The 'random()' method in the Random contract was failing in some blocks
2 - We were experiencing multiple drand network errors
3 - The first drand was delayed in submission by about 30 seconds
To fix the first problem, I discovered that it was caused because the drand delay wasn't sufficient to include the drand to be used in the random function in all blocks. For example, if the drand delay is 2, and you are in the block with timestamp 11, you are using the drand for timestamp 9. To include transactions in the block with timestamp 11, you must submit transactions with at least 1 second of margin, meaning you must submit the drand for the block with timestamp 11 at timestamp 11 or later.
This is problematic because we don't receive the block with timestamp 9 exactly at timestamp 9, and simultaneously, we must request drands from the network. The network delay is approximately 300ms, with peaks of 1s. The situation was even worse because we were performing gas estimation for every 'postDrand' transaction, adding about 600ms extra to include the transaction.
These multiple delays were causing us to miss multiple blocks.
To address these issues, this PR adds:

Increased drand delay from 2 to 4
Hardcoded gas limit to avoid estimating gas for every transaction
Increased timeout for the drand network from 500 ms to 1000 ms

With these changes, we fix problems 1 and 2.
To solve problem 3, I found a bug: if there were no reveal transactions, we weren't sending the post drand transaction. This was causing a delay of about 30 seconds after the randomness service starts to reveal commitments that were committed approximately 30 seconds earlier.

Toggle Checklist

Checklist

Basics

  • B1. I have applied the proper label & proper branch name (e.g. norswap/build-system-caching).
  • B2. This PR is not so big that it should be split & addresses only one concern.
  • B3. The PR targets the lowest branch it can (ideally master).

Reminder: PR review guidelines

Correctness

  • C1. Builds and passes tests.

  • C2. The code is properly parameterized & compatible with different environments (e.g. local,
    testnet, mainnet, standalone wallet, ...).

  • C3. I have manually tested my changes & connected features.

  • C4. I have performed a thorough self-review of my code after submitting the PR,
    and have updated the code & comments accordingly.

Architecture & Documentation

  • D1. I made it easy to reason locally about the code, by (1) using proper abstraction boundaries,
    (2) commenting these boundaries correctly, (3) adding inline comments for context when needed.
  • D2. All public-facing APIs & meaningful (non-local) internal APIs are properly documented in code
    comments.
  • D3. If appropriate, the general architecture of the code is documented in a code comment or
    in a Markdown document.

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Jan 23, 2025

Deploying happychain with  Cloudflare Pages  Cloudflare Pages

Latest commit: ca26b17
Status: ✅  Deploy successful!
Preview URL: https://8a1f88a4.happychain.pages.dev
Branch Preview URL: https://graphite-base-404.happychain.pages.dev

View logs

@GabrielMartinezRodriguez GabrielMartinezRodriguez changed the title feat(randomness): increase Drand delay and improve service reliability Improve performance and reabilit Jan 23, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez marked this pull request as ready for review January 23, 2025 14:08
@GabrielMartinezRodriguez GabrielMartinezRodriguez changed the title Improve performance and reabilit Improve performance and Reliability Jan 23, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez changed the title Improve performance and Reliability Improve performance and reliability Jan 23, 2025
@linear
Copy link

linear bot commented Jan 23, 2025

@GabrielMartinezRodriguez GabrielMartinezRodriguez changed the title Improve performance and reliability Randomness service: Improve performance and reliability Jan 23, 2025
@linear
Copy link

linear bot commented Jan 23, 2025

@GabrielMartinezRodriguez GabrielMartinezRodriguez added the reviewing-1 Ready for, or undergoing first-line review label Jan 23, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/txm-submission-failed-hook branch from a98fbb3 to 82a0cb6 Compare January 23, 2025 14:45
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/improve-performance-and-reability-randomness branch from 62ee648 to 53e3fb6 Compare January 23, 2025 14:46
* will be posted on a block with timestamp T (even if such a block exists) because of network delays.
*/
uint256 public constant DRAND_DELAY = 2;
uint256 public constant DRAND_DELAY = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i always forget block.timestamp is in seconds, not ms 😄

this needs 4s, to be 2 block delay?

I don't full understand how this works, but would it be possible to call randomForTimestamp and getRevealedValue to 'frontrun' someone else calling random() by 1 block then? effectively seeing the future slightly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was a bit confusing. We changed the name of all the time-related variables to indicate whether they are in seconds or milliseconds. Now, this variable is called DRAND_DELAY_SECONDS.

Regarding "frontrunning," yes, you could know the drand value that we are going to use for random(), but you will never know the revealed value that we mix with the drand value. We emit the reveal transaction as soon as we receive the previous block. For example, if we are targeting the reveal of block N, we send the reveal transaction as soon as we receive block N-1.

Additionally, we have modified the node in this PR: HappyChainDevs/op-geth#1 to force the node to always execute the first transaction of the block—actually, the second, since the first is the system transaction "setL1BlockValuesEcotone".

With this, you can never know the output of random() before it has already been executed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@not-reed The delay's goal is that the "canonical" drand value for block T=X is the latest Drand at X-DRAND_DELAY. That's because sometimes there is a block at X and a Drand value at X, and in those cases we cannot post the Drand value to the chain ontime. There are also issues due to needing to hit submit 1s before the block, network latency to get the Drand value etc...

That's why we had to increase this value: it turned at time T, it would sometimes take 1s to get the Drand value, and with the need to submit enough in advance, we'd sometimes miss the block.

Regarding frontrunning, I don't know I would frame it those terms — but anyone knows the Drand T at time T (+ network delay). So people will always know the Drand value before it hits the chain (we have to know it to post it).

Drand is still useful for two things:

  1. commitments to future randomness, if you commit enough in advance (we have this defined in the contracts), then nobody will know the drand value at that time
  2. removing our ability to influence randomness: synchronous randomness if formed by taking the Drand value + a random value to which we committed before the Drand value was knowns — that way, there's absolutely no way we can influence the randomness
    • we will still know the value before anyone else, so there is trust is we don't share it


const url = `${env.EVM_DRAND_URL}/rounds/${round}`
const response = await ResultAsync.fromPromise(fetchWithRetry(url, {}, 2, 500), unknownToError)
const response = await ResultAsync.fromPromise(fetchWithRetry(url, {}, 2, 1000), unknownToError)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1000 is the delay?

block 1 at ts 0 => tries and fails
block 1 at ts 1000 => tries and fails
block 2 at ts 2000 => tries and succeeds

is it a problem that the final try if everything fails is guaranteed to be next block now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to submit the drand value at most 4 seconds after it is generated because, after that moment, it will be used by "random()". So, that is not a problem

@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/txm-submission-failed-hook branch from 82a0cb6 to 98067ed Compare January 31, 2025 14:44
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/improve-performance-and-reability-randomness branch from 53e3fb6 to 6e8d5d5 Compare January 31, 2025 14:44
@GabrielMartinezRodriguez GabrielMartinezRodriguez mentioned this pull request Feb 3, 2025
10 tasks
@norswap norswap force-pushed the gabriel/txm-submission-failed-hook branch from f5a61b6 to 05dc512 Compare February 10, 2025 20:23
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/txm-submission-failed-hook branch from 05dc512 to b9ba93f Compare February 10, 2025 20:33
Base automatically changed from gabriel/txm-submission-failed-hook to master February 10, 2025 20:43
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/improve-performance-and-reability-randomness branch from 82297e0 to c2dee4f Compare February 11, 2025 14:28
@GabrielMartinezRodriguez GabrielMartinezRodriguez added reviewing-2 Ready for, or undergoing final review and removed updating Updating after review labels Feb 12, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez mentioned this pull request Feb 12, 2025
11 tasks
@aodhgan aodhgan added question Something has to be cleared up after review and removed reviewing-2 Ready for, or undergoing final review labels Feb 13, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez added reviewing-2 Ready for, or undergoing final review and removed question Something has to be cleared up after review labels Feb 14, 2025
@GabrielMartinezRodriguez GabrielMartinezRodriguez force-pushed the gabriel/improve-performance-and-reability-randomness branch from 4ac15e2 to ca26b17 Compare February 14, 2025 10:28
@norswap norswap merged commit 993c12c into master Feb 15, 2025
3 checks passed
@norswap norswap deleted the gabriel/improve-performance-and-reability-randomness branch February 15, 2025 03:30
@GabrielMartinezRodriguez GabrielMartinezRodriguez restored the gabriel/improve-performance-and-reability-randomness branch February 18, 2025 17:26
@norswap norswap deleted the gabriel/improve-performance-and-reability-randomness branch July 24, 2025 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

reviewing-2 Ready for, or undergoing final review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants