-
Notifications
You must be signed in to change notification settings - Fork 467
chore(internal): add process_tags to APM payload #15146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
Bootstrap import analysisComparison of import times between this PR and base. SummaryThe average import time from this PR is: 243 ± 3 ms. The average import time from base is: 245 ± 2 ms. The import time difference between this PR and base is: -2.1 ± 0.1 ms. Import time breakdownThe following import paths have appeared:
|
ffb1dea to
660bd64
Compare
Performance SLOsComparing candidate dubloom/process-tags-collection (b66d6a4) with baseline main (83a032e) 📈 Performance Regressions (2 suites)📈 iastaspectsospath - 24/24✅ ospathbasename_aspectTime: ✅ 4.328µs (SLO: <10.000µs 📉 -56.7%) vs baseline: +2.8% Memory: ✅ 37.277MB (SLO: <39.000MB -4.4%) vs baseline: +4.9% ✅ ospathbasename_noaspectTime: ✅ 1.086µs (SLO: <10.000µs 📉 -89.1%) vs baseline: +1.3% Memory: ✅ 37.297MB (SLO: <39.000MB -4.4%) vs baseline: +4.9% ✅ ospathjoin_aspectTime: ✅ 6.223µs (SLO: <10.000µs 📉 -37.8%) vs baseline: +1.3% Memory: ✅ 37.297MB (SLO: <39.000MB -4.4%) vs baseline: +5.0% ✅ ospathjoin_noaspectTime: ✅ 2.307µs (SLO: <10.000µs 📉 -76.9%) vs baseline: +0.5% Memory: ✅ 37.297MB (SLO: <39.000MB -4.4%) vs baseline: +4.9% ✅ ospathnormcase_aspectTime: ✅ 3.506µs (SLO: <10.000µs 📉 -64.9%) vs baseline: -0.2% Memory: ✅ 37.277MB (SLO: <39.000MB -4.4%) vs baseline: +4.8% ✅ ospathnormcase_noaspectTime: ✅ 0.572µs (SLO: <10.000µs 📉 -94.3%) vs baseline: +0.6% Memory: ✅ 37.336MB (SLO: <39.000MB -4.3%) vs baseline: +5.1% ✅ ospathsplit_aspectTime: ✅ 4.816µs (SLO: <10.000µs 📉 -51.8%) vs baseline: -0.8% Memory: ✅ 37.257MB (SLO: <39.000MB -4.5%) vs baseline: +4.9% ✅ ospathsplit_noaspectTime: ✅ 1.595µs (SLO: <10.000µs 📉 -84.1%) vs baseline: +0.4% Memory: ✅ 37.257MB (SLO: <39.000MB -4.5%) vs baseline: +4.7% ✅ ospathsplitdrive_aspectTime: ✅ 4.141µs (SLO: <10.000µs 📉 -58.6%) vs baseline: 📈 +13.0% Memory: ✅ 37.336MB (SLO: <39.000MB -4.3%) vs baseline: +5.1% ✅ ospathsplitdrive_noaspectTime: ✅ 0.703µs (SLO: <10.000µs 📉 -93.0%) vs baseline: +0.8% Memory: ✅ 37.316MB (SLO: <39.000MB -4.3%) vs baseline: +5.0% ✅ ospathsplitext_aspectTime: ✅ 5.245µs (SLO: <10.000µs 📉 -47.6%) vs baseline: 📈 +14.9% Memory: ✅ 37.277MB (SLO: <39.000MB -4.4%) vs baseline: +4.7% ✅ ospathsplitext_noaspectTime: ✅ 1.382µs (SLO: <10.000µs 📉 -86.2%) vs baseline: -0.9% Memory: ✅ 37.218MB (SLO: <39.000MB -4.6%) vs baseline: +4.6% 📈 telemetryaddmetric - 30/30✅ 1-count-metric-1-timesTime: ✅ 2.978µs (SLO: <20.000µs 📉 -85.1%) vs baseline: +0.5% Memory: ✅ 31.909MB (SLO: <34.000MB -6.1%) vs baseline: +4.7% ✅ 1-count-metrics-100-timesTime: ✅ 202.690µs (SLO: <220.000µs -7.9%) vs baseline: +2.2% Memory: ✅ 31.929MB (SLO: <34.000MB -6.1%) vs baseline: +4.9% ✅ 1-distribution-metric-1-timesTime: ✅ 3.299µs (SLO: <20.000µs 📉 -83.5%) vs baseline: +1.9% Memory: ✅ 31.929MB (SLO: <34.000MB -6.1%) vs baseline: +4.6% ✅ 1-distribution-metrics-100-timesTime: ✅ 213.750µs (SLO: <220.000µs -2.8%) vs baseline: +0.5% Memory: ✅ 31.949MB (SLO: <34.000MB -6.0%) vs baseline: +4.9% ✅ 1-gauge-metric-1-timesTime: ✅ 2.463µs (SLO: <20.000µs 📉 -87.7%) vs baseline: 📈 +12.6% Memory: ✅ 31.988MB (SLO: <34.000MB -5.9%) vs baseline: +4.9% ✅ 1-gauge-metrics-100-timesTime: ✅ 136.639µs (SLO: <150.000µs -8.9%) vs baseline: +0.2% Memory: ✅ 31.968MB (SLO: <34.000MB -6.0%) vs baseline: +5.4% ✅ 1-rate-metric-1-timesTime: ✅ 3.480µs (SLO: <20.000µs 📉 -82.6%) vs baseline: 📈 +12.9% Memory: ✅ 32.027MB (SLO: <34.000MB -5.8%) vs baseline: +5.5% ✅ 1-rate-metrics-100-timesTime: ✅ 214.787µs (SLO: <250.000µs 📉 -14.1%) vs baseline: ~same Memory: ✅ 31.890MB (SLO: <34.000MB -6.2%) vs baseline: +4.6% ✅ 100-count-metrics-100-timesTime: ✅ 20.098ms (SLO: <22.000ms -8.6%) vs baseline: +0.2% Memory: ✅ 31.949MB (SLO: <34.000MB -6.0%) vs baseline: +5.0% ✅ 100-distribution-metrics-100-timesTime: ✅ 2.274ms (SLO: <2.300ms 🟡 -1.1%) vs baseline: +1.9% Memory: ✅ 31.949MB (SLO: <34.000MB -6.0%) vs baseline: +4.8% ✅ 100-gauge-metrics-100-timesTime: ✅ 1.402ms (SLO: <1.550ms -9.5%) vs baseline: +0.2% Memory: ✅ 31.929MB (SLO: <34.000MB -6.1%) vs baseline: +5.0% ✅ 100-rate-metrics-100-timesTime: ✅ 2.196ms (SLO: <2.550ms 📉 -13.9%) vs baseline: -0.3% Memory: ✅ 31.909MB (SLO: <34.000MB -6.1%) vs baseline: +4.9% ✅ flush-1-metricTime: ✅ 4.548µs (SLO: <20.000µs 📉 -77.3%) vs baseline: +3.3% Memory: ✅ 31.949MB (SLO: <34.000MB -6.0%) vs baseline: +4.7% ✅ flush-100-metricsTime: ✅ 176.354µs (SLO: <250.000µs 📉 -29.5%) vs baseline: -0.2% Memory: ✅ 31.988MB (SLO: <34.000MB -5.9%) vs baseline: +4.9% ✅ flush-1000-metricsTime: ✅ 2.118ms (SLO: <2.500ms 📉 -15.3%) vs baseline: -1.7% Memory: ✅ 32.657MB (SLO: <34.500MB -5.3%) vs baseline: +4.5% 🟡 Near SLO Breach (5 suites)🟡 djangosimple - 30/30✅ appsecTime: ✅ 20.509ms (SLO: <22.300ms -8.0%) vs baseline: +0.3% Memory: ✅ 66.122MB (SLO: <67.000MB 🟡 -1.3%) vs baseline: +5.0% ✅ exception-replay-enabledTime: ✅ 1.342ms (SLO: <1.450ms -7.4%) vs baseline: ~same Memory: ✅ 64.315MB (SLO: <67.000MB -4.0%) vs baseline: +5.1% ✅ iastTime: ✅ 20.431ms (SLO: <22.250ms -8.2%) vs baseline: -0.2% Memory: ✅ 66.107MB (SLO: <67.000MB 🟡 -1.3%) vs baseline: +5.0% ✅ profilerTime: ✅ 15.622ms (SLO: <16.550ms -5.6%) vs baseline: +0.5% Memory: ✅ 53.941MB (SLO: <54.500MB 🟡 -1.0%) vs baseline: +4.6% ✅ resource-renamingTime: ✅ 20.581ms (SLO: <21.750ms -5.4%) vs baseline: -0.2% Memory: ✅ 66.111MB (SLO: <67.000MB 🟡 -1.3%) vs baseline: +4.8% ✅ span-code-originTime: ✅ 25.437ms (SLO: <28.200ms -9.8%) vs baseline: ~same Memory: ✅ 67.059MB (SLO: <69.500MB -3.5%) vs baseline: +4.6% ✅ tracerTime: ✅ 20.647ms (SLO: <21.750ms -5.1%) vs baseline: +0.7% Memory: ✅ 66.033MB (SLO: <67.000MB 🟡 -1.4%) vs baseline: +4.8% ✅ tracer-and-profilerTime: ✅ 22.709ms (SLO: <23.500ms -3.4%) vs baseline: +0.1% Memory: ✅ 67.678MB (SLO: <68.000MB 🟡 -0.5%) vs baseline: +4.7% ✅ tracer-dont-create-db-spansTime: ✅ 19.353ms (SLO: <21.500ms -10.0%) vs baseline: +0.3% Memory: ✅ 66.074MB (SLO: <67.000MB 🟡 -1.4%) vs baseline: +4.9% ✅ tracer-minimalTime: ✅ 16.610ms (SLO: <17.500ms -5.1%) vs baseline: ~same Memory: ✅ 65.973MB (SLO: <67.000MB 🟡 -1.5%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 20.482ms (SLO: <21.750ms -5.8%) vs baseline: ~same Memory: ✅ 67.672MB (SLO: <72.500MB -6.7%) vs baseline: +4.7% ✅ tracer-no-cachesTime: ✅ 18.490ms (SLO: <19.650ms -5.9%) vs baseline: +0.2% Memory: ✅ 66.035MB (SLO: <67.000MB 🟡 -1.4%) vs baseline: +4.9% ✅ tracer-no-databasesTime: ✅ 18.727ms (SLO: <20.100ms -6.8%) vs baseline: ~same Memory: ✅ 65.936MB (SLO: <67.000MB 🟡 -1.6%) vs baseline: +4.6% ✅ tracer-no-middlewareTime: ✅ 20.202ms (SLO: <21.500ms -6.0%) vs baseline: ~same Memory: ✅ 65.916MB (SLO: <67.000MB 🟡 -1.6%) vs baseline: +4.8% ✅ tracer-no-templatesTime: ✅ 20.252ms (SLO: <22.000ms -7.9%) vs baseline: -0.2% Memory: ✅ 66.022MB (SLO: <67.000MB 🟡 -1.5%) vs baseline: +5.0% 🟡 errortrackingdjangosimple - 6/6✅ errortracking-enabled-allTime: ✅ 18.195ms (SLO: <19.850ms -8.3%) vs baseline: +0.7% Memory: ✅ 66.257MB (SLO: <66.500MB 🟡 -0.4%) vs baseline: +5.1% ✅ errortracking-enabled-userTime: ✅ 18.228ms (SLO: <19.400ms -6.0%) vs baseline: +1.0% Memory: ✅ 66.237MB (SLO: <66.500MB 🟡 -0.4%) vs baseline: +5.0% ✅ tracer-enabledTime: ✅ 18.107ms (SLO: <19.450ms -6.9%) vs baseline: +0.4% Memory: ✅ 65.711MB (SLO: <66.500MB 🟡 -1.2%) vs baseline: +4.6% 🟡 errortrackingflasksqli - 6/6✅ errortracking-enabled-allTime: ✅ 2.096ms (SLO: <2.300ms -8.9%) vs baseline: +1.6% Memory: ✅ 52.632MB (SLO: <53.500MB 🟡 -1.6%) vs baseline: +4.7% ✅ errortracking-enabled-userTime: ✅ 2.064ms (SLO: <2.250ms -8.3%) vs baseline: -0.6% Memory: ✅ 52.534MB (SLO: <53.500MB 🟡 -1.8%) vs baseline: +4.9% ✅ tracer-enabledTime: ✅ 2.071ms (SLO: <2.300ms -10.0%) vs baseline: +0.3% Memory: ✅ 52.612MB (SLO: <53.500MB 🟡 -1.7%) vs baseline: +4.9% 🟡 flasksimple - 18/18✅ appsec-getTime: ✅ 4.597ms (SLO: <4.750ms -3.2%) vs baseline: -0.2% Memory: ✅ 62.388MB (SLO: <65.000MB -4.0%) vs baseline: +4.9% ✅ appsec-postTime: ✅ 6.659ms (SLO: <6.750ms 🟡 -1.3%) vs baseline: +0.3% Memory: ✅ 62.387MB (SLO: <65.000MB -4.0%) vs baseline: +4.9% ✅ appsec-telemetryTime: ✅ 4.594ms (SLO: <4.750ms -3.3%) vs baseline: -0.6% Memory: ✅ 62.427MB (SLO: <65.000MB -4.0%) vs baseline: +5.0% ✅ debuggerTime: ✅ 1.861ms (SLO: <2.000ms -7.0%) vs baseline: +0.4% Memory: ✅ 45.517MB (SLO: <47.000MB -3.2%) vs baseline: +5.4% ✅ iast-getTime: ✅ 1.858ms (SLO: <2.000ms -7.1%) vs baseline: -0.1% Memory: ✅ 42.151MB (SLO: <49.000MB 📉 -14.0%) vs baseline: +4.6% ✅ profilerTime: ✅ 1.915ms (SLO: <2.100ms -8.8%) vs baseline: ~same Memory: ✅ 46.679MB (SLO: <47.000MB 🟡 -0.7%) vs baseline: +4.8% ✅ resource-renamingTime: ✅ 3.382ms (SLO: <3.650ms -7.3%) vs baseline: +0.2% Memory: ✅ 52.666MB (SLO: <53.500MB 🟡 -1.6%) vs baseline: +5.0% ✅ tracerTime: ✅ 3.351ms (SLO: <3.650ms -8.2%) vs baseline: -0.2% Memory: ✅ 52.587MB (SLO: <53.500MB 🟡 -1.7%) vs baseline: +4.8% ✅ tracer-nativeTime: ✅ 3.364ms (SLO: <3.650ms -7.8%) vs baseline: +0.4% Memory: ✅ 54.148MB (SLO: <60.000MB -9.8%) vs baseline: +4.8% 🟡 flasksqli - 6/6✅ appsec-enabledTime: ✅ 3.963ms (SLO: <4.200ms -5.6%) vs baseline: ~same Memory: ✅ 62.384MB (SLO: <66.000MB -5.5%) vs baseline: +4.9% ✅ iast-enabledTime: ✅ 2.432ms (SLO: <2.800ms 📉 -13.1%) vs baseline: -0.7% Memory: ✅ 58.963MB (SLO: <60.000MB 🟡 -1.7%) vs baseline: +4.9% ✅ tracer-enabledTime: ✅ 2.060ms (SLO: <2.250ms -8.5%) vs baseline: +0.4% Memory: ✅ 52.652MB (SLO: <54.500MB -3.4%) vs baseline: +4.9%
|
brettlangdon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implementation seems like overkill for what needs to be implemented. this is going to have a negative performance impact, especially with the introduction of the lock
.riot/requirements/1645326.txt
Outdated
| # This file is autogenerated by pip-compile with Python 3.13 | ||
| # by the following command: | ||
| # | ||
| # pip-compile --allow-unsafe --no-annotate .riot/requirements/1645326.in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any riotfile.py changes, does this need to be here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 0428dcd
|
|
||
| @pytest.mark.snapshot | ||
| def test_process_tags_activated(self): | ||
| with patch("sys.argv", ["/path/to/test_script.py"]), patch("os.getcwd", return_value="/path/to/workdir"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there any edge cases/failure scenarios we should be testing?
e.g. what if sys.argv[0] is outside of os.getcwd?
what if the basename of sys.argv[0] doesn't have an extension?
is it possible for Path(sys.argv[0]).resolve() to not have a parent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 0428dcd
I'm not sure if you said:
sys.argv[0] is outside of os.getcwd?
as a real example. Because in terms of code, everything would work. Maybe it would not make sense in terms of tag values but this is not something we can control
| assert normalize_tag(input_tag) == expected | ||
|
|
||
|
|
||
| class TestProcessTags(TracerTestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably want a tearDown to reset process tags to their original configuration value and generated value?
since all of these tests are polluting the global state/configuration and the state stored at the module level of process_tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 0428dcd
tests/internal/test_process_tags.py
Outdated
| pass | ||
|
|
||
| @pytest.mark.snapshot | ||
| def test_process_tags_only_on_local_root_span(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need this test case, you can just roll it into the ones above (have the ones above also create a child span and the snapshot will encode the "only on the root span" for you).
not sure if you care about validating the case with partial flushing?
you could add/update any tests for the TraceTagsProcessor as well, since controlling better the scenarios like partial flushing are pretty basic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing partial is a good idea. I tried to address it in addressed in 0428dcd, tell me if it looks good to you
|
|
||
|
|
||
| class TestProcessTags(TracerTestCase): | ||
| def test_process_tags_deactivated(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be a snapshot test too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 0428dcd
| "DD_TRACE_RESOURCE_RENAMING_ALWAYS_SIMPLIFIED_ENDPOINT", default=False, modifier=asbool | ||
| ) | ||
| self._process_tags_enabled = _get_config( | ||
| "DD_EXPERIMENTAL_PROPAGATE_PROCESS_TAGS_ENABLED", default=False, modifier=asbool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a comment in the RFC that this should be enabled by default if the user also adds DEFAULT_TRACE_EXPERIMENTAL_FEATURES_ENABLED. I'm not sure what that means for Python since we have something called DD_TRACE_EXPERIMENTAL_FEATURES_ENABLED:
https://ddtrace.readthedocs.io/en/stable/configuration.html#DD_TRACE_EXPERIMENTAL_FEATURES_ENABLED, which currently lists runtime metrics as an experimental feature:
dd-trace-py/ddtrace/internal/constants.py
Line 143 in b7a2d2a
| class EXPERIMENTAL_FEATURES: |
I'm not sure if that's also needed in this PR on top of this new environment variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This env variable is meant to be used like:
DD_TRACE_EXPERIMENTAL_FEATURES_ENABLED=DD_RUNTIME_METRICS_ENABLED.
So I don't think that covers our use case.
|
|
||
| @pytest.mark.snapshot | ||
| def test_process_tags_activated(self): | ||
| with patch("sys.argv", ["/path/to/test_script.py"]), patch("os.getcwd", return_value="/path/to/workdir"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to do a windows version of this in the test suite? Windows path do \ instead of /: https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/path#examples .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.path.basename() uses the path separator of the operating system it’s running on. Therefore, if we want to add a test for Windows, we'll have to run the test on a Windows machine which might be overkill.
This PR implements this RFC for the tracing product.
Add process_tags in APM payload. According to the RFC and as the python tracer supports only v0.4 and v0.5 the tag
_dd.tags.processis set in the first first span of each trace chunk set.The set tag are:
entrypoint.workdir. The working dir name, should be the last path segment.entrypoint.name. The entrypoint of the application. It is the the script name.entrypoint.type. I'd assume that it can be only script for Python.entrypoint.basedir. The base dir is the directory where the called executable resides. Should be the last path segment.