This Luanti mod lets you use LuaJIT's built-in profiler to create a flame graph data file. This profiler is good for pinpointing expensive parts of games and mods.
Image: the flame graph data being analyzed inside speedscope. You can use a different flame graph visualizer, of course 😁

Please try it out and give feedback! This is still very much untested, so I invite game and mod developers to try it out. Aside from GitHub, you can talk to ACorp in Luanti's official Discord.
TODO+HELP
- In-game flame graph visualizer. No more going to speedscope or running flamegraph.pl or whatever.
- HELP Approach to making zones that the profiler can pick up on. Attempting to do something like `zone_start('a') ... zone_stop('a')` does not work.
- HELP Need ideas for existing visualizations other than flame graphs, and the data format needed to produce them. I don't want to lock people into using flame graphs and only flame graphs.
- HELP Need testing for coroutine contexts. Does it work? How accurate is the data?
- HELP Using this mod with Mesecons luacontrollers can cause crashes for some reason.
  - TODO is this still true?
This profiler implementation uses LuaJIT's low-level profiling API to record stack dumps, the number of samples between profiler callbacks, and the estimated average execution time between all samples at a resolution of 0.1 milliseconds. The output is written in the "folded stack" format, so go ahead and read it as plain text. If you read the code, you'll see that it's quite simple at its core, all extra features aside.
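Since the output is plain text in the folded stack format, it can be inspected with a few lines of Lua. The following is a minimal sketch, assuming the common folded layout of semicolon-separated frames followed by a space and an integer score. The helper names `parse_folded_line` and `self_totals` are hypothetical and not part of this mod.

```lua
-- Parse one folded stack line such as "frameA;frameB;frameC 42".
-- Returns the list of frames (root first) and the numeric score,
-- or nil if the line does not have that shape.
local function parse_folded_line(line)
    -- The score is the trailing run of digits; everything before it is the stack.
    local stack, score = line:match("^(.-)%s+(%d+)%s*$")
    if not stack then
        return nil
    end
    local frames = {}
    for frame in stack:gmatch("[^;]+") do
        frames[#frames + 1] = frame
    end
    return frames, tonumber(score)
end

-- Sum the scores per leaf frame (the last frame of each stack).
local function self_totals(lines)
    local totals = {}
    for _, line in ipairs(lines) do
        local frames, score = parse_folded_line(line)
        if frames and #frames > 0 then
            local leaf = frames[#frames]
            totals[leaf] = (totals[leaf] or 0) + score
        end
    end
    return totals
end

-- Made-up sample data, for illustration only.
local sample = {
    "@M/my_mod/init.lua:91;@M/my_mod/init.lua:my_fn 3",
    "@M/my_mod/init.lua:91;@M/my_mod/init.lua:other_fn 4",
    "@M/my_mod/init.lua:91;@M/my_mod/init.lua:my_fn 5",
}
local totals = self_totals(sample)
```

A quick tally like this can help decide which stacks are worth opening in a visualizer.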
CAVEAT: Profiling is not a free or zero-cost operation. It will slow down the server. This mod tries its best to defer computation to separate steps outside the profiling session and to profile as lightly as possible. Compared to `debug.*` or an instrumentation approach, this is much more efficient at discovering potentially slow code. However, once you have identified the slow parts, you should consider following up with more precise instrumentation profiling.
- Add the mod to the `secure.trusted_mods` list. This is required to load the low-level profiling module. I cannot claim that this mod is 100% secure, so please don't use it outside of development and testing.
- Enable the mod in your world.
- In-game, start the profiler with `/jp_start <interval> <filename>`. The profiling data will be written to `<worldpath>/jitprofiles/<filename>`. `<interval>` is in milliseconds and is usually between `1` and `10`.
- Collect data for a while. Set up what you're trying to profile first, as the collected data can easily grow to 1 MB within a few minutes depending on the game you are playing.
- Stop profiling with `/jp_stop`.
- Give your data file to a flame graph visualizer.
- for example with the original flame graph by Brendan Gregg:
./flamegraph.pl $worldpath/jitprofiles/$filename >graph.svg
LuaJIT is not always able to deduce the correct function names in frames, e.g. `@M/my_mod/init.lua:91`. Fortunately, the Lua code is available for reading.
- After profiling, you can run `/jp_learn <data filename> <mapping name>` to create frame mappings by crawling through the profiling data and then reading the Lua code.
- Then, run `/jp_remap <data filename> <mapping name>` to apply the frame mapping to the profiling data. This will overwrite the file.
- Done! Now most of the frames will have nicer names, e.g. `@M/my_mod/init.lua:91 (core.register_on_globalstep HOOK)`.
- Module prettifying. LuaJIT's low-level profiling API can emit the full path to the Lua module that contains the relevant function in its stack dump. This is useful for identifying where functions are, but it results in really large frames (e.g. `C:\Users\USER\Documents\...\games\my_game\mods\my_mod\my_module\init.lua:my_fn`). Through `jitprofiler.pretty()`, the module paths are transformed following this convention:
  - `@B` points to `<luanti>/builtin`
  - `@W` points to `<luanti world>/worldmods`
  - `@G` points to `<luanti>/games/<current game>/mods`
  - `@M` points to `<luanti>/mods`

  So the example given would be prettified into `@G/my_mod/my_module/init.lua:my_fn`.

  To prettify the paths on installs that are NOT `RUN_IN_PLACE`, the mod expands each of the `<luanti>` path components above. For example, builtin Lua code might be installed in `/usr/share/luanti` on Linux, so that's where `<luanti>` in `@B` points to.
- Filtering out stack dumps. Without any filtering, the profiler collects a lot of unhelpful noise. You can pattern match frames to ignore. Read the code for the current defaults and how to use it. This is performed through `jitprofiler.pretty()`.
- Filtering out negligible stacks. Some stacks have a score of `0`, meaning they would not normally appear in flame graph visualizers. These are filtered out when `jitprofiler.trim_zeroes = true`, which is the default. This is performed through `jitprofiler.pretty()`.
- Frame re-mapping. You can re-map a frame. Read the code for the current defaults and how to use it. This is performed through `jitprofiler.pretty()`.
- Frame re-mapping learning. It's impractical to expect game developers and modders to insert every re-mapping manually or to write their own crawler. This mod tries to alleviate this problem by doing that for you through `jitprofiler.learn()` and `jitprofiler.remap()`.
- Custom scoring. You may want a different score based on the data collected per callback, for example to give a larger weight to more frequent calls. Set `jitprofiler.scorer = function(time, samples) return ... end`. The function should return an integer, as that's the only supported number format for flame graphs.
  - I'm open to feature requests for specific data to be passed to the scorer 😊.
- Chat commands for convenience, but also a Lua API for programmatic profiling needs. Not sure if the latter is even remotely useful, but it's there.
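As an illustration of custom scoring, here is a minimal sketch of a scorer that weights a callback more heavily the more samples it covers. The weighting formula and the name `frequency_weighted_scorer` are hypothetical, and the exact meaning and units of `time` and `samples` are assumptions; check the mod's code before relying on them.

```lua
-- Hypothetical scorer: grows with the square of the sample count, so
-- callbacks that cover many samples stand out more in the flame graph.
-- Assumption: both arguments are non-negative numbers.
local function frequency_weighted_scorer(time, samples)
    local score = time * samples * samples
    -- Flame graph scores must be integers, so round to the nearest one.
    return math.floor(score + 0.5)
end

-- Only install the scorer when running inside Luanti with the mod loaded.
-- rawget avoids undeclared-global warnings when jitprofiler is absent.
if rawget(_G, "jitprofiler") then
    jitprofiler.scorer = frequency_weighted_scorer
end
```

Rounding instead of truncating keeps small but non-zero scores from collapsing to `0`, which matters if zero-score stacks are trimmed.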
- `/jp_start [<period>] [<data filename>]` starts the profiler with a sampling resolution of `period` milliseconds. Profiling data is stored in `<world dir>/jitprofiles/<data filename>`. Arguments can be omitted after the first run, as this mod remembers the most recent period and file. A good `<period>` to start with is 10 milliseconds.
- `/jp_stop` stops the profiler. Profiling data is then prettified immediately.
- `/jp_learn <data filename> <mapping name>` creates a frame mapping by crawling every frame of every stack. It updates or writes frame mappings in `<jitprofiler mod>/mappings/<mapping name>.lua`.
- `/jp_remap <data filename> <mapping name>` applies the frame mapping to the profiling data.
NOTE: To exploit duck typing, please read the code to understand how it quacks. The following is a quick reference; optionally, see the LuaLS annotations of each function (if you don't have LuaLS or don't want to use it, just read the code 😁).
- `jitprofiler.start(period, data_path, open_data)` starts the profiler with a sampling resolution of `period` milliseconds, and opens `data_path` for writing raw profiling data.
- `jitprofiler.stop()` stops the profiler.
- HEAVY `jitprofiler.pretty(data_path, open_data)` opens `data_path`, which must contain raw profiling data, then prettifies the contents.
  - `open_data`: an `io.open`-like function. The file will be closed afterwards. Duck typing this is allowed, and it must create a (duck-typed) file handle.
- HEAVY `jitprofiler.learn(data_path, open_data, map_path, open_map)` opens `data_path`, which must contain pretty profiling data. Then, it creates a frame mapping by crawling every frame of every stack. This is performed by reading the Lua modules loaded by Luanti, even builtin. It updates the `load()`-able frame mapping in `<jitprofiler mod>/mappings/<map_path>` if it exists, otherwise it simply writes a new one. The mapping is stored in the same format as `jitprofiler.frame_map`.
- HEAVY `jitprofiler.remap(data_path, open_data, map_path, open_map)` opens `data_path`, which must contain pretty profiling data. Then, it transforms the stacks according to the frame mapping in `map_path`. This function allows you to perform only the frame re-mapping part of prettify. `jitprofiler.frame_map` overrides the mapping file.
- TODO document more jitprofiler API
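To make the duck typing of the `open_data` and `open_map` parameters more concrete, here is a minimal sketch of an in-memory stand-in for `io.open`. Which methods the mod actually calls on the returned handle is an assumption here (`write`, `read`, `lines` and `close`), and `make_memory_opener` is a hypothetical name; read the code to see how it really quacks.

```lua
-- Build an io.open-like function backed by a table instead of the disk.
-- Returns the opener and the backing store (path -> string contents).
local function make_memory_opener()
    local store = {}

    local function open(path, mode)
        mode = mode or "r"
        local writing = mode:find("[wa]") ~= nil
        if not writing and store[path] == nil then
            -- Mimic io.open's failure convention: nil plus an error message.
            return nil, path .. ": No such file or directory"
        end
        if mode:find("w") then
            store[path] = "" -- write mode truncates
        end
        store[path] = store[path] or "" -- append mode may create the entry

        local handle = {}
        function handle:write(...)
            -- Accepts strings and numbers, like a real file handle.
            store[path] = store[path] .. table.concat({...})
            return self
        end
        function handle:read()
            -- Simplification: always returns the whole contents.
            return store[path]
        end
        function handle:lines()
            -- Simplification: empty lines are skipped.
            return store[path]:gmatch("[^\n]+")
        end
        function handle:close()
            return true
        end
        return handle
    end

    return open, store
end

-- Write some made-up folded stack data through the fake opener.
local open, store = make_memory_opener()
local file = assert(open("profile.txt", "w"))
file:write("a;b 3\n", "a;c 4\n")
file:close()

-- Inside Luanti, the opener could then be passed where the API expects
-- open_data, e.g. jitprofiler.pretty("profile.txt", open).
```

An opener like this could be handy for tests or for post-processing profiling data without touching the world directory.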
My deepest gratitude to the late Jude Melton-Houghton (known online as TurkeyMcMac) for bringing this profiling approach to light.
