Skip to content

Conversation

@thiagohora
Copy link
Contributor

Details

This PR adds an Otel extension to the Otel agent so that it adds the workspace header to our HTTP-related metrics. This is required to avoid losing the filter granularity we have today.

Change checklist

  • User facing
  • Documentation update

Issues

  • Resolves #
    NA

Testing

NA

Documentation

NA

- WebhookSubscriber: Fixed 5 async/non-blocking issues
  * Replaced Mono.fromRunnable with Mono.fromCallable for proper exception handling
  * Wrapped blocking deserialization with Mono.fromCallable + subscribeOn
  * Moved attributes building inside Mono chain
  * Pre-built metrics counters to avoid repeated instantiation
  * Added timeout and validated HTTP client usage

- BaseRedisSubscriber: Added backpressure monitoring
  * Added pre-built backpressure drop counter
  * Logs when backpressure causes events to be dropped
  * Helps identify system overload conditions

- WebhookHttpClient: Major async/resource management improvements
  * Async JSON serialization on boundedElastic scheduler
  * Try-with-resources for automatic response cleanup
  * Optional for null-safe string handling
  * Retry attempt logging for observability
  * Returns response body or 'ok' instead of exposing Response object
  * No duplicate error logging

- AlertService: Updated to work with new Mono<String> return type
  * Handles response body directly
  * Defaults to 200 status code on success

All changes follow reactive best practices and enterprise-grade patterns.
…move unnecessary wrappers, consolidate schedulers, fix resource management, and clarify logging messages
…ound JSON work, remove misleading comment, simplify validation with Mono.fromRunnable()
…hreads support (enableVirtualThreads flag) for future efficiency
@thiagohora thiagohora marked this pull request as ready for review October 27, 2025 08:58
@thiagohora thiagohora requested a review from a team as a code owner October 27, 2025 08:58
Copilot AI review requested due to automatic review settings October 27, 2025 08:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a custom OpenTelemetry extension to the Otel agent to preserve workspace and SDK version information in HTTP-related metrics. The extension is built as a separate Maven module and integrated into the backend's Docker image and startup script to maintain metric filtering granularity.

Key changes:

  • Custom Otel extension module added to enhance HTTP metrics with workspace context
  • Updated OpenTelemetry version from 2.20.0 to 2.21.0
  • Modified Docker build and entrypoint to conditionally load the extension

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
apps/opik-backend/pom.xml Bumped OpenTelemetry version to 2.21.0
apps/opik-backend/entrypoint.sh Added conditional logic to attach telemetry extension to Otel agent
apps/opik-backend/Dockerfile Integrated telemetry extension build and deployment into Docker image

COPY --from=build /opt/opik-backend/target/openapi.yaml redoc/
COPY --from=build /opt/opik-backend/target/*.jar ./
# Copy telemetry extension JAR (built as part of the build process)
COPY --from=build /opt/opik-backend/opik-telemetry-extension/target/opik-telemetry-extension-*.jar ./opik-telemetry-extension.jar
Copy link

Copilot AI Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wildcard pattern opik-telemetry-extension-*.jar may match multiple JAR files if version suffixes or classifiers are present. This could result in copy failures or unpredictable behavior. Consider using a more specific pattern or renaming the artifact during the build phase to ensure exactly one file is copied.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think copilot has a point here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not a problem, we're the ones generating the .jar, it's fixed as 1.0.0-SNAPSHOT

@github-actions
Copy link
Contributor

github-actions bot commented Oct 27, 2025

SDK E2E Tests Results

0 tests   0 ✅  0s ⏱️
0 suites  0 💤
0 files    0 ❌

Results for commit d40a987.

♻️ This comment has been updated with latest results.

@thiagohora thiagohora force-pushed the thiagohora/NA_add_extension_to_add_workspace_and_sdk_version_to_otel_metrics branch from d40a987 to 213328f Compare October 27, 2025 08:59
@github-actions
Copy link
Contributor

github-actions bot commented Oct 27, 2025

Backend Tests Results

5 205 tests   5 198 ✅  49m 27s ⏱️
  258 suites      7 💤
  258 files        0 ❌

Results for commit a30439d.

♻️ This comment has been updated with latest results.

BorisTkachenko
BorisTkachenko previously approved these changes Oct 27, 2025
Copy link
Contributor

@idoberko2 idoberko2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I Had some comments I believe we should attend before merging


@Override
public void customize(AutoConfigurationCustomizer customizer) {
System.out.println("[OpikTelemetryExtension] Applying Opik-specific OpenTelemetry metric provider customizations");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
System.out.println("[OpikTelemetryExtension] Applying Opik-specific OpenTelemetry metric provider customizations");
log.info("[OpikTelemetryExtension] Applying Opik-specific OpenTelemetry metric provider customizations");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replied here #3812 (comment)

Comment on lines +50 to +51
assertNotNull(result);
assertEquals(meterProviderBuilder, result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use assertJ assertions instead of JUnit

int order = customizer.order();

// Then
assertEquals(100, order);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, assertJ

COPY --from=build /opt/opik-backend/target/openapi.yaml redoc/
COPY --from=build /opt/opik-backend/target/*.jar ./
# Copy telemetry extension JAR (built as part of the build process)
COPY --from=build /opt/opik-backend/opik-telemetry-extension/target/opik-telemetry-extension-*.jar ./opik-telemetry-extension.jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think copilot has a point here


// Create a single view that includes workspace context attributes
// This view will be applied to all instruments without changing their names
View workspaceView = View.builder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry i cant understand how this is injecting workspaceId and the sdk version inside the metrics. What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header is added in the propagation context via OTEL_INSTRUMENTATION_HTTP_SERVER_CAPTURE_REQUEST_HEADERS. However, it removes all attributes not specified by the instrumentation by default. Then I had to create this extension following this: https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/incubator#file-configuration

Copy link
Contributor

@ldaugusto ldaugusto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thiagohora thiagohora merged commit 37dd25e into main Oct 27, 2025
14 checks passed
@thiagohora thiagohora deleted the thiagohora/NA_add_extension_to_add_workspace_and_sdk_version_to_otel_metrics branch October 27, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants