[RFC007] Migrate to the new compact value representation #2302

yannham · 2025-07-23T17:27:00Z

Migration to the Compact Value Representation

This PR refactors the Nickel codebase to use the new compact value representation introduced in #2282 and improved in subsequent PRs. Unfortunately, this representation is used everywhere, so the migration is huge. I tried to split independent parts of it into other PRs, but I quickly reached a point where things were too coupled to do that meaningfully. I'll first describe the main changes, and then give a review guide.

Impact

On two stages/sizes of the same real-life codebase, I measured a consistent runtime improvement of between 20% and 25%. This is not as drastic as I had hoped, but the compact values are an enabling milestone for other memory optimizations; see Follow-ups.

On the memory side, a run with Valgrind on the small version of the codebase (which takes a few seconds to run in release mode) shows that the total number of allocated bytes drops by around 33%. These are very preliminary results, just to give an order of magnitude; more serious benchmarking is due.

Content

`NickelValue`

Using NickelValue in many places led to changes and improvements of its interface, as I was using it more and more.

`ValueContent` & Lenses

A new module value::lens, together with the ValueContent types, makes it possible to conditionally take ownership of the content of a NickelValue. The benefit is probably limited at runtime, where we tend to duplicate Rcs a lot and thus rarely avoid clones. It is useful during program transformations, though, where all Rcs are 1-counted: taking ownership there avoids duplicating the whole AST. This interface might also provide more gains in the future, once we have a proper bytecode compiler, where environments are handled quite differently and much less duplicated.
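To illustrate the idea of conditionally taking ownership, here is a minimal sketch built on plain `Rc` from the standard library. The `Node` type and the `take_or_clone` helper are hypothetical stand-ins and are not the actual `value::lens` API; the sketch only shows why a 1-counted `Rc` lets the content be moved out without a clone.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a shared AST node; not Nickel's actual type.
#[derive(Clone, Debug, PartialEq)]
struct Node {
    label: String,
    children: Vec<u32>,
}

/// Hypothetical helper: returns the owned content together with a flag
/// telling whether a clone was needed to obtain it.
fn take_or_clone(rc: Rc<Node>) -> (Node, bool) {
    match Rc::try_unwrap(rc) {
        // The reference count was 1: the content is moved out, no clone.
        Ok(node) => (node, false),
        // The value is shared: fall back to cloning the content.
        Err(shared) => ((*shared).clone(), true),
    }
}

fn main() {
    // Transformation-time situation: the node has a single owner.
    let unique = Rc::new(Node { label: "a".to_string(), children: vec![1, 2] });
    let (node, cloned) = take_or_clone(unique);
    assert!(!cloned);
    assert_eq!(node.children, vec![1, 2]);

    // Runtime-like situation: the node is shared, so a clone is required.
    let shared = Rc::new(Node { label: "b".to_string(), children: vec![] });
    let _other_owner = Rc::clone(&shared);
    let (node, cloned) = take_or_clone(shared);
    assert!(cloned);
    assert_eq!(node.label, "b");
}
```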

Position indices

The original design had two different variants for position indices, PosIdx and InlinePosIdx. The idea was that inline values don't have much space available (if we want to keep them pointer-sized), so their position index is encoded on 32 bits, whereas a block has plenty of space and can use a proper usize, which is usually 64 bits. However, this caused a lot of trouble when inheriting positions during evaluation: to set the position of an inline value to one inherited from a block, you need mutable access to the position table to allocate a new inline index with the same content, just so that it fits in 32 bits. We decided internally to encode everything on 32 bits, and to extend the encoding if needed (there's still space in inline values).
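As a rough sketch of why a single 32-bit index simplifies position inheritance, consider the following. All types here (`Span`, `PosTable`, `InlineValue`, `BlockValue`) are simplified assumptions made for illustration and do not reflect the actual definitions in the codebase. The point is that once both kinds of values share the same index type, inheriting a position is a plain copy that does not touch the table.

```rust
/// Hypothetical source span; a stand-in for real position data.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    start: u32,
    end: u32,
}

/// A single 32-bit index type shared by inline and block values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PosIdx(u32);

/// Hypothetical position table: an index is a slot in a vector.
#[derive(Default)]
struct PosTable {
    spans: Vec<Span>,
}

impl PosTable {
    /// Allocating a new index needs mutable access to the table.
    fn push(&mut self, span: Span) -> PosIdx {
        assert!(self.spans.len() < u32::MAX as usize, "position table is full");
        let idx = self.spans.len() as u32;
        self.spans.push(span);
        PosIdx(idx)
    }

    fn get(&self, idx: PosIdx) -> Span {
        self.spans[idx.0 as usize]
    }
}

/// Simplified inline and block values carrying a position index.
struct InlineValue {
    pos: PosIdx,
    payload: u32,
}

struct BlockValue {
    pos: PosIdx,
    payload: String,
}

/// With one index type, inheriting a position is a copy: no table needed.
fn inherit_pos(inline: &mut InlineValue, block: &BlockValue) {
    inline.pos = block.pos;
}

fn main() {
    let mut table = PosTable::default();
    let p0 = table.push(Span { start: 0, end: 4 });
    let p1 = table.push(Span { start: 5, end: 12 });

    let block = BlockValue { pos: p1, payload: "record".to_string() };
    let mut inline = InlineValue { pos: p0, payload: 42 };

    inherit_pos(&mut inline, &block);
    assert_eq!(inline.pos, block.pos);
    assert_eq!(table.get(inline.pos), Span { start: 5, end: 12 });
    println!("{} inherits the position of {}", inline.payload, block.payload);
}
```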

XXXBody wrappers

Any data that goes in ValueBlockRc is wrapped as a struct XXXBody, which is just a wrapper around the actual data (RecordData, ForeignId, etc.). Those were introduced initially because I wanted to control the size and alignment of what goes into a block, using #[repr(_)]. This plan changed for different reasons, and it was annoying to match on those structs, or to use .0 everywhere. I got rid of them and replaced them with simple type aliases (which are named XXXData now instead).
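The ergonomic difference between the two approaches can be sketched as follows. `Label`, `LabelBody` and `LabelData` are hypothetical names chosen for this illustration only; the real payload types are the ones mentioned above (RecordData, ForeignId, etc.).

```rust
/// Hypothetical payload type, used for illustration only.
#[derive(Debug, PartialEq)]
struct Label {
    name: String,
}

/// Before: a newtype wrapper around the payload. Every access goes
/// through `.0`, and pattern matching needs an extra layer.
struct LabelBody(Label);

/// After: a plain type alias. `LabelData` and `Label` are the same type.
type LabelData = Label;

fn name_len_wrapped(body: &LabelBody) -> usize {
    body.0.name.len()
}

fn name_len_alias(data: &LabelData) -> usize {
    data.name.len()
}

fn main() {
    let wrapped = LabelBody(Label { name: "contract".to_string() });
    let aliased: LabelData = Label { name: "contract".to_string() };

    assert_eq!(name_len_wrapped(&wrapped), name_len_alias(&aliased));

    // The alias can be destructured exactly like the underlying type.
    let Label { name } = aliased;
    assert_eq!(name, "contract");
}
```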

`PosTable` ownership

The position table needs to be threaded through many stages of the pipeline now, and included in errors. Most of this state has been moved into VmContext, that VM instances borrow from. Implemented separately in #2381.
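The ownership pattern described here, where long-lived state sits in a context that short-lived VM instances borrow, might look roughly like the sketch below. The fields and methods are assumptions made for the example (including the choice of a mutable borrow) and do not mirror the actual VmContext from #2381.

```rust
/// Hypothetical position table: spans stored as (start, end) pairs.
#[derive(Default)]
struct PosTable {
    spans: Vec<(u32, u32)>,
}

/// Long-lived state that outlives any single VM instance.
#[derive(Default)]
struct VmContext {
    pos_table: PosTable,
}

/// A VM instance borrows the context instead of owning the table.
struct Vm<'ctx> {
    ctx: &'ctx mut VmContext,
    steps: usize,
}

impl<'ctx> Vm<'ctx> {
    fn new(ctx: &'ctx mut VmContext) -> Self {
        Vm { ctx, steps: 0 }
    }

    /// Pretend evaluation step that records a new position and returns
    /// its slot in the shared table.
    fn eval_step(&mut self, span: (u32, u32)) -> usize {
        self.ctx.pos_table.spans.push(span);
        self.steps += 1;
        self.ctx.pos_table.spans.len() - 1
    }
}

fn main() {
    let mut ctx = VmContext::default();

    // First VM instance: borrows the context for its own lifetime only.
    {
        let mut vm = Vm::new(&mut ctx);
        assert_eq!(vm.eval_step((0, 3)), 0);
        assert_eq!(vm.steps, 1);
    }

    // Second VM instance: sees the positions recorded by the first one.
    {
        let mut vm = Vm::new(&mut ctx);
        assert_eq!(vm.eval_step((4, 9)), 1);
    }

    // The table survives both VMs and can still be attached to errors.
    assert_eq!(ctx.pos_table.spans, vec![(0, 3), (4, 9)]);
}
```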

In general, the presence of the position table forces a cleaner separation between the pure AST phase and the NickelValue phase, if we don't want to require position tables where they should not be needed (typically during AST import resolution). This was achieved for cache and typechecking errors in #2359 and #2361, and is continued in this PR with a leftover, namely wildcards, which were still RichTerm-only.

Migration to edition 2024

I think this was a mistake, but by the time I realized it, it was too late to disentangle the changes. I moved to edition 2024 to get let-chains, which unfortunately pollutes the diff: edition 2024 is stricter about a number of things, resulting in unrelated changes (around unsafe and ref patterns in particular). Formatting has also changed. I'll format the remaining files with cargo fmt after the review, to avoid additional unrelated diff.

Review

The diff is huge. I think a lot of the changes are almost mechanical and don't necessarily deserve deep attention, typically in eval::operation or eval::merge, although the refactoring is not entirely trivial either. The changes to nickel-lang-cli, nickel-lang-lsp, the various test modules and the tests crate should be really mechanical.

Modules with more substantial modifications, which should be reviewed first:

eval::value (originally bytecode::value)
nickel-lang facade and the C API
cache
serialize
term

Follow-ups

`NickelValue`

use a pointer instead of usize for the data field to preserve provenance.
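For context on this follow-up, the sketch below shows one way to keep a tag in the low bits of a word while storing it as a pointer rather than as a usize, so that the pointer's provenance is carried along. This is an assumption-laden illustration: `Block`, `Tagged` and the 3-bit tag layout are hypothetical and unrelated to the actual NickelValue encoding. It relies only on `wrapping_add` and `wrapping_sub` on raw pointers, which adjust the address without going through an integer.

```rust
/// Hypothetical heap block, aligned to 8 bytes so that the 3 low bits of
/// its address are always zero and can hold a tag.
#[repr(align(8))]
struct Block(u64);

const TAG_MASK: usize = 0b111;

/// The data field is a pointer, not a usize: the tag is added with
/// pointer arithmetic, which keeps the provenance of the allocation.
struct Tagged {
    data: *mut u8,
}

impl Tagged {
    fn new(value: u64, tag: usize) -> Self {
        assert!(tag <= TAG_MASK);
        let raw = Box::into_raw(Box::new(Block(value))) as *mut u8;
        // Offsetting by at most 7 bytes stays inside the 8-byte block.
        Tagged { data: raw.wrapping_add(tag) }
    }

    /// Reading the tag only inspects the address.
    fn tag(&self) -> usize {
        (self.data as usize) & TAG_MASK
    }

    /// Recovers the original, correctly aligned pointer to the block.
    fn untagged(&self) -> *mut Block {
        self.data.wrapping_sub(self.tag()) as *mut Block
    }

    fn get(&self) -> u64 {
        // SAFETY: `untagged` is the pointer returned by `Box::into_raw`,
        // and the block stays alive until `self` is dropped.
        unsafe { (*self.untagged()).0 }
    }
}

impl Drop for Tagged {
    fn drop(&mut self) {
        // SAFETY: the pointer came from `Box::into_raw` in `new` and is
        // released exactly once, here.
        unsafe { drop(Box::from_raw(self.untagged())) }
    }
}

fn main() {
    let v = Tagged::new(1234, 0b101);
    assert_eq!(v.tag(), 0b101);
    assert_eq!(v.get(), 1234);
}
```

Had the field been a usize, getting back to the block would require an integer-to-pointer cast, which is the pattern this follow-up aims to avoid.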

primops

When migrating eval::operation, there were a lot of micro-decisions to make about when to take something by reference and when to try to extract it "by value" (i.e. like Rc::try_unwrap). At some point I tended to maintain the old behavior, but I think a new pass is warranted, to decide when it is worth trying to avoid copying. The same goes for the main eval module, where we currently take by reference and clone when needed.

`Term`

move term::record to value::record. I think it doesn't make much sense now to have record data in the term module.
move CustomContract to Term. I kept function-like stuff in Term, because in the future call-by-push-value VM, functions are computations, and not values. But for some reason CustomContract is a NickelValue.
get rid of Term::Value. I introduced it because I thought I would need both directions of the map Term <-> NickelValue, but since NickelValue is the entry point everywhere, we only ever need to wrap a term as a value, and not the other way round.

Other memory savings

reduce the size of Term by boxing large arguments
reduce the size of stack elements by reworking the layout of stack::Marker
reduce the size of errors by boxing the actual variant
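The boxing idea behind these items can be sketched with two toy enums. `Fat` and `Slim` are hypothetical and much simpler than Term, stack::Marker or the error types; the sketch only demonstrates that an enum is as large as its largest variant, and that moving a large payload behind a Box shrinks every value of the type.

```rust
use std::mem::size_of;

/// Hypothetical enum with one large inline variant: every value of the
/// type pays for the 128-byte payload, even the small ones.
#[allow(dead_code)]
enum Fat {
    Small(u8),
    Large([u64; 16]),
}

/// Same shape, but the large payload lives behind a pointer.
#[allow(dead_code)]
enum Slim {
    Small(u8),
    Large(Box<[u64; 16]>),
}

fn main() {
    // The inline array alone takes 16 * 8 = 128 bytes.
    assert!(size_of::<Fat>() >= 128);
    // The boxed version needs room for a pointer and a discriminant only.
    assert!(size_of::<Slim>() <= 2 * size_of::<usize>());

    println!("Fat:  {} bytes", size_of::<Fat>());
    println!("Slim: {} bytes", size_of::<Slim>());

    // The trade-off: building the large variant now allocates.
    let value = Slim::Large(Box::new([0; 16]));
    if let Slim::Large(payload) = value {
        assert_eq!(payload.len(), 16);
    }
}
```

The cost is one extra allocation and indirection for the large variant, which tends to pay off when that variant is rare compared to the small ones.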

Co-authored-by: jneem <[email protected]>

jneem

Looks good! A bunch of little comments and questions, but nothing blocking.

nickel/src/lib.rs

nickel/src/capi.rs

core/src/cache.rs

core/src/closurize.rs

core/src/term/mod.rs

core/src/eval/operation.rs

Co-authored-by: jneem <[email protected]>

github-actions · 2025-11-04T17:08:09Z

Bencher Report

Branch	rfc007/migrate-term-to-value
Testbed	ubuntu-latest

Benchmark	Latency (µs)
diagnostics-benches/inputs/goto-perf.ncl	10,997.00
diagnostics-benches/inputs/large-record-tree.ncl	187,100.00
diagnostics-benches/inputs/redis-replication-controller.ncl	301.12
diagnostics-benches/inputs/small-record-tree.ncl	429.69
fibonacci 10	212.77
foldl arrays 50	636.78
foldl arrays 500	14,536.00
foldr strings 50	3,710.60
foldr strings 500	33,640.00
generate normal 250	44,406.00
generate normal 50	1,237.40
generate normal unchecked 1000	41,643.00
generate normal unchecked 200	1,844.60
init-diagnostics-benches/inputs/goto-perf.ncl	52,154.00
init-diagnostics-benches/inputs/large-record-tree.ncl	205,490.00
init-diagnostics-benches/inputs/redis-replication-controller.ncl	50,058.00
init-diagnostics-benches/inputs/small-record-tree.ncl	49,886.00
pidigits 100	1,936.60
pipe normal 20	706.29
pipe normal 200	5,968.50
product 30	347.95
requests-benches/inputs/goto-perf.ncl-000	3,014.00
requests-benches/inputs/large-record-tree.ncl-000	575,950.00
requests-benches/inputs/large-record-tree.ncl-001	87.32
scalar 10	592.63
sum 30	349.64

yannham force-pushed the rfc007/migrate-term-to-value branch from b9ebd1b to 01c0f94 Compare July 25, 2025 10:02

yannham mentioned this pull request Jul 25, 2025

[RFC007] Move position-related stuff to the position module, add mutable content ref #2303

Merged

yannham force-pushed the rfc007/migrate-term-to-value branch from 01c0f94 to 488881e Compare August 26, 2025 09:21

yannham mentioned this pull request Aug 29, 2025

Factor out the parser into a separate crate #2328

Merged

yannham force-pushed the rfc007/migrate-term-to-value branch from 9f948c7 to 8102183 Compare September 8, 2025 08:30

This was referenced Sep 15, 2025

Initial nickel-lang API proposal #2334

Merged

Reduce memory footprint: tracking issue #2349

Open

yannham force-pushed the rfc007/migrate-term-to-value branch from d8edff9 to 03c6a19 Compare September 17, 2025 10:31

jneem mentioned this pull request Sep 18, 2025

Eliminate generated ident collisions #2355

Open

yannham force-pushed the rfc007/migrate-term-to-value branch 4 times, most recently from 65a5b91 to 5952c73 Compare September 24, 2025 08:09

yannham force-pushed the rfc007/migrate-term-to-value branch from 7e40ccc to f65039f Compare October 3, 2025 16:53

yannham mentioned this pull request Oct 9, 2025

Refactor: split external state from VM #2381

Merged

yannham force-pushed the rfc007/migrate-term-to-value branch from cafc083 to 833b5ec Compare October 14, 2025 16:21

yannham mentioned this pull request Oct 16, 2025

Performance improvement ideas: tracking issue #1484

Open

7 tasks

yannham force-pushed the rfc007/migrate-term-to-value branch 2 times, most recently from ce6badd to 6805e21 Compare October 17, 2025 14:54

jneem mentioned this pull request Oct 22, 2025

chore(lints): introduce typos-cli config in Cargo.toml and handle typos #2390

Draft

yannham force-pushed the rfc007/migrate-term-to-value branch from 5920bf7 to 3fa44f7 Compare October 23, 2025 17:24

yannham mentioned this pull request Oct 24, 2025

Adds support for any_of, all_of and Sequence to nls #2391

Merged

yannham force-pushed the rfc007/migrate-term-to-value branch 2 times, most recently from 3b7ae11 to 66b2be4 Compare October 30, 2025 17:51

yannham added 6 commits October 31, 2025 10:18

refactor(wip): migrate term to NickelValue

5f32a36

refactor(wip): migrate eval to NickelValue

db35246

chore: rename Closure::body to value

8505235

chore: formatting

361cac0

refactor(wip): migrate eval to NickelValue (continued)

e4c8e2f

refactor(wip): migrate to term to NickelValue (continued)

079ab7c

yannham and others added 2 commits November 3, 2025 14:14

chore: fix typo in comment

8733488

Co-authored-by: jneem <[email protected]>

chore: use unwrap_or_alloc instead of special case

c4d37a8

Co-authored-by: jneem <[email protected]>

yannham force-pushed the rfc007/migrate-term-to-value branch from 5999b59 to c4d37a8 Compare November 3, 2025 13:21

jneem approved these changes Nov 4, 2025

yannham and others added 20 commits November 4, 2025 11:47

fix: typo in comment

451e00e

Co-authored-by: jneem <[email protected]>

fix: typo in comment

4cbb544

Co-authored-by: jneem <[email protected]>

chore: use PosIdx instead of TermPos in import resolution

c00d1fc

chore: remove obsolete comment

186361c

fix: wrong catch-all case and other errors in should_share

7a79e3c

fix: typo in comment

8577724

Co-authored-by: jneem <[email protected]>

chore: remove duplicated comment

c1be918

chore: remove obsolete comment

98edbf8

chore: fix typo in comment

7784939

Co-authored-by: jneem <[email protected]>

chore: Str -> String in deserialize error messages

df29371

chore: remove commented out dead code

fcf2d03

chore: move impls to the right module

2d09ae9

fix: typo in comment

579b79f

Co-authored-by: jneem <[email protected]>

chore: remove commented out dead code

f9629ed

fix: typo in comment

a4189ab

Co-authored-by: jneem <[email protected]>

chore: Context::to_xxx -> Context::expr_to_xxx

eac815d

chore: remove trailing spaces in comment

e8ff3d0

fix: typo in comment

a581867

fix: do not share environment between bench runs

265343e

chore: formatting (ed 2024)

58f4bf9

yannham force-pushed the rfc007/migrate-term-to-value branch from ba9e448 to 58f4bf9 Compare November 4, 2025 16:41

yannham added this pull request to the merge queue Nov 4, 2025

Merged via the queue into master with commit 59095e0 Nov 4, 2025
5 checks passed

yannham deleted the rfc007/migrate-term-to-value branch November 4, 2025 17:08
