Skip to content

Conversation

@Chlorobite
Copy link

@Chlorobite Chlorobite commented Nov 17, 2024

Inspired by a performance problem rightfully pointed out in PR #92, I set out to greatly optimize the process scan:

Scenario Time Notes
main
Before 🟨 133.7ms
After 🟩 2.3ms Highly skewed due to what I think are GC spikes, most scans run way below 1ms!
With #92
Before 🟥 2699ms
After 🟩 0.3ms NOT faster than main, simply no GC spikes happened here

The change works on my end, but is also provably correct (assuming DetectableDB does not change) because:

  • the scan results depend on DetectableDB's contents, args, and toCompare
  • toCompare is derived from splitPath
  • splitPath is derived from path

Notes:

  • for the times, I subtracted the time it takes to get processes using await Native.getProcesses();. Improve linux game detection #92 does add functionality to this call but the performance difference is negligible (16.3ms -> 18.3ms, possibly within margin of error)
  • my system has a Ryzen 9 7940HS, the difference is much more significant on lower end setups where the process scan with the other PR included could have taken more than the 5 second budget and got into a loop of continuously consuming 1 CPU core!
  • the results were recorded under no system activity, new processes coming to be will cause spikes but hopefully never worse than redundantly running every single process by DetectableDB.

Open questions:

  • I'm not sure that the theoretically possible case of multiple DetectableDB entries being detected with one process is intentional, but I cache an array of all detected entries to keep this behavior
  • Could it be worth to reduce the frequency of process scans (from 5000ms to, say, 1000ms) along with this change to make the detection of games quicker?
  • A memory leak might be possible if many, many processes with different paths and args are created, should I purge unseen processes from the cache?

@DekoDX
Copy link

DekoDX commented Nov 18, 2024

Seems like it works with some slight reduction in cpu and memory usage as well. I will update after trying in combination with #92 but it seems good so far.

@DekoDX
Copy link

DekoDX commented Nov 18, 2024

It appears in combo with 92 the cpu and memory usage goes up but not by much. I do see the memory usage going up at a very slow rate consistently while idling and slightly faster while multiple different processes are running.
I do think purging unseen processes from the cache may solve this, but I lack the expertise to know for sure.

I have direct contact with chlorobyte if anyone asks how I managed to test the combined pr.

Copy link
Contributor

@CanadaHonk CanadaHonk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this is great, thank you! Small request: could you just store detected in the cache instead of { detected }?

This prevents a hypothetical case of rotating lots of processes to bloat the RAM usage
@Chlorobite
Copy link
Author

Sure, that makes sense. Also implemented some sort of garbage collection to resolve @Dekomoro's concern, in a normal use case you basically never hit this (or maybe you do like once an hour of runtime or something)

@hazzuk
Copy link

hazzuk commented Apr 9, 2025

When evaluating this change, I was really blown away by the performance improvements.

Using the logging in src/process/index.js, on average I saw a before scan time of 95ms, and an after of <0.5ms. Inline with the original performance testing documented above.

Additionally, running clinic.js clinic doctor -- node src gives another view of the improvements to CPU usage, memory stability, and latency.

Before

std-Screenshot 2025-04-09 142712

After

cache+std-Screenshot 2025-04-09 142550

Looking forward to seeing this merged, amazing work.



const timestamps = {}, names = {}, pids = {};
const timestamps = {}, names = {}, pids = {}, cache = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know most of these already existed before this PR, but using plain objects for this is bad. When you have frequent writes / deletes, Maps perform way better, both in terms of speed and memory usage. Changing all of these to Maps might also make it perform better

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants