Skip to content

Conversation

@f0ster
Copy link
Contributor

@f0ster f0ster commented Oct 24, 2025

Optimizing javascript formatting with direct string concatenation
Fixed memory leak in previous feature branch #3070

Pull Request Checklist

  • Description which clearly states what problems the PR solves.
  • Description contains a link to the Github Issue, and any relevent
    Discussions, this PR applies to.
  • Include new tests that fail without this PR but passes with it.
  • Include any relevent Documentation changes related to this change.
  • Verify all commits have been signed in accordance with the DCO policy.
  • Reviewed PR commit history to remove unnecessary changes.
  • Make sure your PR passes build, test and lint steps completely.

Signed-off-by: f0ster <[email protected]>

fix: remove unused _keyCache Map to prevent memory accumulation

The _keyCache Map was initialized but never used, creating unnecessary
memory overhead that could accumulate across test runs in shared
Playwright instances, contributing to leak test failures.
Copy link
Member

@texodus texodus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Micro-benchmarks in JavaScript are dangerous (in that VM optimization can play tricks on you), but also aren't relevant to how the project uses benchmarks in the built-in tools/bench suite, which we use to track regressions between Perspective public API versions. They are great supporting data for your PR summary though!

I ultimately believe this is a faster implementation. I assume both the spread operator and Object.values() iteration are not free, in addition to the array allocation. However - incremental string concatenation is also not free, and Object.join (in theory) has the string builder advantage of knowing the total string size in advance (though the measurability of such methods in JavaSCript has a rocky history). I had to do a bit of research myself to be convinced this did anything at all, or that there was any feature impact.

Formatting in this case is a frequently-called internal method of the datagrid rendering process, so I would expect this change to make datagrid rendering faster. <perspective-viewer> has a getRenderStats() method, and <regular-table> has a getDrawFPS() method, which output render times of the viewer and datagrid, respectively. I set up a simple test harness to run capture the output of these methods in a loop while scrolling (which triggers rendering continuously), and I see no overall impact on rendering speed. I can imagine several reason why this may be the case - formatting is not within the margin of error impact on overall render speed, complex optimization behavior makes this method perform differently in benchmarks than in production, my computer/browser is a special case, etc etc.

Theoretical concerns aside, these benchmarks aren't in the right place and don't actually benchmark the production code (see inline note). You've placed them in the general node.js suite, which retroactively runs your tests ~20x against different versions of Perspective, only one of which has this API (and the API itself is not being benchmarked).

Implementation wise, as I said above I believe this is faster on principle, but without test or practical benchmarks I think this needs further scrutiny. Is caching the formatter objects even necessary? Would implementing an overall package policy of object serialization/hashing/etc and implementing it consistently everywhere (instead of bespoke just this one instance) show up in the practical benchmarks, e.g. is this pattern common in the repo and are there libraries that do this optimization for us? Is this the fastest method of property testing?

It is hard to merge these without a hefty burden of proof. Hypothetically, the very next PR to the repo could be another author which reverses this change with the summary "Code quality is improved without impacting overall performance" and I would not want to merge that either, for the same reason - both code quality and microbenchmarks need to have practical impact.

As I'm sure this is a faster implementation and you're not a first-time contributor (thanks!), I am down to merge this PR if the dead code and benchmarks are removed, and some due diligence is given to answering the implementation questions. It would be great to add a new benchmark suite in the spirit of the existing node.js suite that directly benchmarks render performance of <perspective-viewer> (e.g. like this example of getDrawFPS()), but this is a medium-sized undertaking maybe for future work.

new_superstore_table,
} from "./src/js/superstore.mjs";

// Legacy: Array-based key generation with spread operators
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The point of benchmarks is to assert the performance of the code we publish, copying an implementation out of the source code and writing a micro-benchmark does not accomplish this.
  2. createCacheKeyOptimized is not the same implementation in datagrid

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants