- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.2k
[NA] [BE] Add extension to add workspace and SDK version to Otel metrics #3812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NA] [BE] Add extension to add workspace and SDK version to Otel metrics #3812
Conversation
- WebhookSubscriber: Fixed 5 async/non-blocking issues * Replaced Mono.fromRunnable with Mono.fromCallable for proper exception handling * Wrapped blocking deserialization with Mono.fromCallable + subscribeOn * Moved attributes building inside Mono chain * Pre-built metrics counters to avoid repeated instantiation * Added timeout and validated HTTP client usage - BaseRedisSubscriber: Added backpressure monitoring * Added pre-built backpressure drop counter * Logs when backpressure causes events to be dropped * Helps identify system overload conditions - WebhookHttpClient: Major async/resource management improvements * Async JSON serialization on boundedElastic scheduler * Try-with-resources for automatic response cleanup * Optional for null-safe string handling * Retry attempt logging for observability * Returns response body or 'ok' instead of exposing Response object * No duplicate error logging - AlertService: Updated to work with new Mono<String> return type * Handles response body directly * Defaults to 200 status code on success All changes follow reactive best practices and enterprise-grade patterns.
…move unnecessary wrappers, consolidate schedulers, fix resource management, and clarify logging messages
…ound JSON work, remove misleading comment, simplify validation with Mono.fromRunnable()
…th for future virtual threads
…le for virtual threads
…hreads support (enableVirtualThreads flag) for future efficiency
…ture with proper error handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a custom OpenTelemetry extension to the Otel agent to preserve workspace and SDK version information in HTTP-related metrics. The extension is built as a separate Maven module and integrated into the backend's Docker image and startup script to maintain metric filtering granularity.
Key changes:
- Custom Otel extension module added to enhance HTTP metrics with workspace context
- Updated OpenTelemetry version from 2.20.0 to 2.21.0
- Modified Docker build and entrypoint to conditionally load the extension
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description | 
|---|---|
| apps/opik-backend/pom.xml | Bumped OpenTelemetry version to 2.21.0 | 
| apps/opik-backend/entrypoint.sh | Added conditional logic to attach telemetry extension to Otel agent | 
| apps/opik-backend/Dockerfile | Integrated telemetry extension build and deployment into Docker image | 
| COPY --from=build /opt/opik-backend/target/openapi.yaml redoc/ | ||
| COPY --from=build /opt/opik-backend/target/*.jar ./ | ||
| # Copy telemetry extension JAR (built as part of the build process) | ||
| COPY --from=build /opt/opik-backend/opik-telemetry-extension/target/opik-telemetry-extension-*.jar ./opik-telemetry-extension.jar | 
    
      
    
      Copilot
AI
    
    
    
      Oct 27, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The wildcard pattern opik-telemetry-extension-*.jar may match multiple JAR files if version suffixes or classifiers are present. This could result in copy failures or unpredictable behavior. Consider using a more specific pattern or renaming the artifact during the build phase to ensure exactly one file is copied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think copilot has a point here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not a problem, we're the ones generating the .jar, it's fixed as 1.0.0-SNAPSHOT
| SDK E2E Tests Results0 tests   0 ✅  0s ⏱️ Results for commit d40a987. ♻️ This comment has been updated with latest results. | 
d40a987    to
    213328f      
    Compare
  
    | Backend Tests Results5 205 tests   5 198 ✅  49m 27s ⏱️ Results for commit a30439d. ♻️ This comment has been updated with latest results. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I Had some comments I believe we should attend before merging
|  | ||
| @Override | ||
| public void customize(AutoConfigurationCustomizer customizer) { | ||
| System.out.println("[OpikTelemetryExtension] Applying Opik-specific OpenTelemetry metric provider customizations"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| System.out.println("[OpikTelemetryExtension] Applying Opik-specific OpenTelemetry metric provider customizations"); | |
| log.info("[OpikTelemetryExtension] Applying Opik-specific OpenTelemetry metric provider customizations"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replied here #3812 (comment)
| assertNotNull(result); | ||
| assertEquals(meterProviderBuilder, result); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use assertJ assertions instead of JUnit
| int order = customizer.order(); | ||
|  | ||
| // Then | ||
| assertEquals(100, order); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, assertJ
| COPY --from=build /opt/opik-backend/target/openapi.yaml redoc/ | ||
| COPY --from=build /opt/opik-backend/target/*.jar ./ | ||
| # Copy telemetry extension JAR (built as part of the build process) | ||
| COPY --from=build /opt/opik-backend/opik-telemetry-extension/target/opik-telemetry-extension-*.jar ./opik-telemetry-extension.jar | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think copilot has a point here
        
          
                ...telemetry-extension/src/main/java/com/comet/opik/telemetry/OpikMetricProviderCustomizer.java
          
            Show resolved
            Hide resolved
        
      |  | ||
| // Create a single view that includes workspace context attributes | ||
| // This view will be applied to all instruments without changing their names | ||
| View workspaceView = View.builder() | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry i cant understand how this is injecting workspaceId and the sdk version inside the metrics. What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The header is added in the propagation context via OTEL_INSTRUMENTATION_HTTP_SERVER_CAPTURE_REQUEST_HEADERS. However, it removes all attributes not specified by the instrumentation by default. Then I had to create this extension following this:  https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/incubator#file-configuration
…_and_sdk_version_to_otel_metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Details
This PR adds an Otel extension to the Otel agent so that it adds the workspace header to our HTTP-related metrics. This is required to avoid losing the filter granularity we have today.
Change checklist
Issues
NA
Testing
NA
Documentation
NA