-
Notifications
You must be signed in to change notification settings - Fork 108
[RFC007] Migrate to the new compact value representation #2302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
b9ebd1b to
01c0f94
Compare
01c0f94 to
488881e
Compare
9f948c7 to
8102183
Compare
This was referenced Sep 15, 2025
d8edff9 to
03c6a19
Compare
65a5b91 to
5952c73
Compare
7e40ccc to
f65039f
Compare
cafc083 to
833b5ec
Compare
7 tasks
ce6badd to
6805e21
Compare
5920bf7 to
3fa44f7
Compare
3b7ae11 to
66b2be4
Compare
Co-authored-by: jneem <[email protected]>
Co-authored-by: jneem <[email protected]>
5999b59 to
c4d37a8
Compare
jneem
approved these changes
Nov 4, 2025
Member
jneem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! A bunch of little comments and questions, but nothing blocking.
Co-authored-by: jneem <[email protected]>
Co-authored-by: jneem <[email protected]>
Co-authored-by: jneem <[email protected]>
Co-authored-by: jneem <[email protected]>
Co-authored-by: jneem <[email protected]>
Co-authored-by: jneem <[email protected]>
ba9e448 to
58f4bf9
Compare
Contributor
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Migration to the Compact Value Representation
This PR refactors the Nickel codebase to use the new compact value introduced in #2282, and improved in subsequent PRs. Unfortunately, this representation is used everywhere, so the migration is huge. I tried to separate independent parts of it in some other PRs, but I quickly reached a point where things were too coupled to do that meaningfully. I'll first describe the main changes, and then a review guide.
Impact
On two stages/sizes of the same real-life code base, I got a consistent improvement of runtime between 20% and 25%. This is not as drastic as I could hope, but the compact values are an enabling milestone for other memory optimizations; see Follow-ups.
On the memory side, a run with Valgrind on the small version of the codebase (takes a few second to run in release mode) shows that the total number of allocated bytes drops by around 33%. Those are very preliminary results, just to give an order of magnitude; more serious benchmarking is due.
Content
NickelValueUsing
NickelValuein many places led to changes and improvements of its interface, as I was using it more and more.ValueContent& LensesA new module
value::lens, as well asValueContenttypes, allow to conditionally take ownership of the content of aNickelValue. Albeit the usefulness is probably limited at runtime, where we tend to duplicateRcs a lot and thus not really take advantage of avoiding clones, this is useful during transformation time, where allRcs are 1-counted, which avoids duplicating the whole AST. This interface might also provide more gains in the future, once we have a proper bytecode compiler, where environments are handled quite differently and much less duplicated.Position indices
The original design had two different variant for position indices,
PosIdxandInlinePosIdx. The idea is that in inline values, there's not a lot of available space (if we want to keep them pointer-sized), so the position index is encoded on 32 bits. But in the block, there's plenty of space, so we can use a properusize, that is usually 64 bits. However, this was causing a lot of trouble, when inheriting positions during evaluation, because if you need to set the position of an inline value which is inherited from a block, you need to have a mutable access to the position table to allocate a new inline index with the same content, just so that it fits in 32 bits. We decided internally to encode everything on 32bits, and to extend it accordingly (there's still space in inline values) if need.XXXBody wrappers
Any data that goes in
ValueBlockRcis wrapped as a structXXXBody, which is just a wrapper around the actual data (RecordData,ForeignId, etc.). Those were introduced initially because I wanted to control the size and alignment of what goes into a block, using#[repr(_)]. This plan changed for different reasons, and it was annoying to match on those structs, or to use.0everywhere. I got rid of them and replaced them with simple type aliases (which are namedXXXDatanow instead).PosTableownershipThe position table needs to be threaded through many stages of the pipeline now, and included in errors. Most of this state has been moved into
VmContext, that VM instances borrow from. Implemented separately in #2381.In general, the presence of the pos table imposes to better separate the pure AST phase from the
NickelValuephase, if we don't want to require pos tables where they should not be needed (typically during AST import resolution). This has been achieved for cache and typechecking errors in #2359 and #2361. This has been continued in this PR with a leftover, namelywildcards, that were stilllRichTerm-only.Migration to edition 2024
I think this has been a mistake, but when I realized it, it was too late to disentangle the changes. I moved to edition 2024 to get let-chains, and this unfortunately pollutes the diff, because edition 2024 is stricter with a number of things, resulting in unrelated diffs (around
unsafeandrefpatterns in particular). Formatting also has changed. I'll format the remaining files withcargo fmtafter the review, to avoid additional unrelated diff.Review
The diff is huge. I think a lot of the changes are almost mechanical, and don't necessarily deserve deep attention - typically in
eval::operationoreval::merge, although the refactoring is not entirely trivial either. The changes tonickel-lang-cli,nickel-lang-lsp, and various tests modules and thetestscrate should be really mechanical.Modules where more substantial modifications were done, that should be reviewed in priority:
eval::value(originallybytecode::value)nickel-langfacade and the C APIcacheserializetermFollow-ups
NickelValueusizefor the data field to preserve provenance.primops
When migrating
eval::operation, there were a lot of micro decisions to make around when to take something by reference or to try to extract it "by value" (i.e. likeRc::try_unwrap). At some point I tended to maintain the old behavior, but I think a new pass is deserved, to decide when it is worth trying to avoid copying. Same thing in the mainevalmodule, where we currently take by reference and clone when needed.Termterm::recordtovalue::record. I think it doesn't make much sense now to have record data in thetermmodule.CustomContracttoTerm. I kept function-like stuff inTerm, because in the future call-by-push-value VM, functions are computations, and not values. But for some reasonCustomContractis aNickelValue.Term::Value. I introduced it because I thought I would need both direction of the mapTerm <-> NickelValue, but sinceNickelValueis the entry point everywhere, we only ever need to wrap a term as a value, and not the other way round.Other memory savings
Termby boxing large argumentsstack::Marker