Skip to content

Conversation

@Cirilla-zmh
Copy link
Member

@Cirilla-zmh Cirilla-zmh commented Nov 11, 2025

Partly resolve #15174

As the prior PR of #12878

cc @anuraaga @vasantteja

…37.0.

Change-Id: I5fb76c82be9f53414fefe008b02b295410a72078

Change-Id: I2108810fbbb4ea96408637807e6da6abac0875d7
Change-Id: Ifb06d47236bd800cce3f8933e366fc6d69afc5cd
@Cirilla-zmh Cirilla-zmh requested a review from a team as a code owner November 11, 2025 16:12
@Override
public String getOperationTarget(ExecutionAttributes executionAttributes) {
// FIXME: Only work when operation name are chat or text_completion. Fix this if there's more
// kinds of operation names.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga I believe it works well now, but it's necessary to let you know :).

CaptureMessageOptions captureMessageOptions) {
if (captureMessageOptions.emitExperimentalConventions()) {
// According to https://opentelemetry.io/docs/specs/semconv/gen-ai/openai/
// skip if experimental conventions are not enabled
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see OpenTelemetryBuilder.java, where I explain why it looks like a weakening of instrumentation.

private boolean captureMessageContent;

// TODO(cirilla-zmh): Java Instrumentation is still not support structural attributes for
// 'gen_ai.input.messages'. Implement this once it's supported.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laurit
Sorry to bother you but I couldn't find attributes like gen_ai.input.messages in the io.opentelemetry.semconv:opentemetry-semconv-incubating:1.37.0-alpha library. I believe these attributes have been appended in OTel 1.37.0.

Is it the reason why Java currently does not support attributes of type any?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the help @Cirilla-zmh - complex attributes are not trivial to implement unfortunately and there are several ideas being prototyped (/cc @trask )

open-telemetry/opentelemetry-java#7814

At this rate, maybe we'll get to prototype Z before it's in the SDK ;)

Sorry for not reading the actual new spec yet, I was expecting a smaller change but it's huge.... Since it seems the new genai attributes require complex attributes, how about holding off on supporting 1.37 (aka this PR) until the SDK gets support for them?

Also, while I understand the spec strongly suggests outputting both old and new conventions, I would strongly advocate to ignore this advice. They're not simple renames like the http semconv changes, logs vs attributes is large and will make instrumentation really hard to follow I think. I could imagine caring more about migration paths for a widely used AI language like Python, but in Java I think we can take some liberty. If others agree, then we wouldn't need the new experimental flag, we'd just switch to the latest conventions when the SDK is ready for them and only support them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anuraaga Thanks for your suggestions!

At this rate, maybe we'll get to prototype Z before it's in the SDK ;)

Yes, I had a short discussion and maybe we could serialize it into JSON string before we have an available SDK.

Since it seems the new genai attributes require complex attributes, how about holding off on supporting 1.37 (aka this PR) until the SDK gets support for them?

I hope so but we highly depend on this instrumentation. Many users are waiting for this refactor to build data pipelines. So I tend to emit complex attributes with JSON string first.

while I understand the spec strongly suggests outputting both old and new conventions, I would strongly advocate to ignore this advice.

Absolutely agree. 👍
cc @laurit @vasantteja

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I missed the note about the JSON fallback. Makes sense to go for it if we can work out including the serializer, i.e., shading Jackson 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the reason why Java currently does not support attributes of type any?

if you want to use complex attributes on events today, you can use the incubating api ExtendedLogRecordBuilder.set(ExtendedAttributeKey.extendedAttributesKey("key"), ExtendedAttributes)

note this incubating api will probably change in open-telemetry/opentelemetry-java#7814, and can't go stable before Jan 15 based on open-telemetry/opentelemetry-specification#4485

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want to use complex attributes on events today, you can use the incubating api ExtendedLogRecordBuilder.set(ExtendedAttributeKey.extendedAttributesKey("key"), ExtendedAttributes)

What would happend if I serialize complex attributes and record them as stringValue? This at least won't cause compatibility issues due to API changes - it can work according to the standard for a longer period of time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is json string is compliant with the semconv definition?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is json string is compliant with the semconv definition?

I think so. According to OTEP#4485, we now have an unsupported otlp exporter in java, so we could record it as a json string.

In semconv we also have relative documents.

When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good we have support for complex in incubator - since the instrumentation is alpha, personally it seems fine to use the incubator, OTel versions should be synced up by BOM especially for alpha libraries and users would generally not have a problem. It allows us to avoid the Jackson issue which seems like it would be more work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allows us to avoid the Jackson issue which seems like it would be more work.

Maybe you're right... We now don't depend on jackson even in javaagent-tooling module, and it's a little troublesome to shade it. But I don't understand ExtendedAttributes well so this is not my first choice before. Let me have a try and come back to discuss with you later. cc @trask @anuraaga

Change-Id: I265e7f65dfd7694a4d8623daca5eabc8a6f49574
@Override
public String getOutputType(ChatCompletionCreateParams request) {
if (request.responseFormat().isPresent()) {
ResponseFormat responseFormat = request.responseFormat().get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider map instead of isPresent+get

return this;
}

/** Sets whether emitted the latest experimental version of GenAI conventions. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** Sets whether emitted the latest experimental version of GenAI conventions. */
/** Sets whether the latest experimental version of GenAI conventions is emitted. */

private boolean captureMessageContent;

// TODO(cirilla-zmh): Java Instrumentation is still not support structural attributes for
// 'gen_ai.input.messages'. Implement this once it's supported.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the help @Cirilla-zmh - complex attributes are not trivial to implement unfortunately and there are several ideas being prototyped (/cc @trask )

open-telemetry/opentelemetry-java#7814

At this rate, maybe we'll get to prototype Z before it's in the SDK ;)

Sorry for not reading the actual new spec yet, I was expecting a smaller change but it's huge.... Since it seems the new genai attributes require complex attributes, how about holding off on supporting 1.37 (aka this PR) until the SDK gets support for them?

Also, while I understand the spec strongly suggests outputting both old and new conventions, I would strongly advocate to ignore this advice. They're not simple renames like the http semconv changes, logs vs attributes is large and will make instrumentation really hard to follow I think. I could imagine caring more about migration paths for a widely used AI language like Python, but in Java I think we can take some liberty. If others agree, then we wouldn't need the new experimental flag, we'd just switch to the latest conventions when the SDK is ready for them and only support them.

| System property | Type | Default | Description |
|------------------------------------------------------|---------|---------|------------------------------------------------------------------------|
| `otel.instrumentation.genai.capture-message-content` | Boolean | `false` | Record content of user and LLM messages. |
| `otel.semconv-stability.opt-in` | Boolean | `` | Enable experimental features when set as `gen_ai_latest_experimental`. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think this follows our standard convention. This config is used elsewhere, and not as a boolean, but a string that can contain a list of values ConfigPropertiesUtil.getString("otel.semconv-stability.opt-in");

I would think the value would simply be "genai" following the other examples ("database", "code" etc)

Also, the description "gen_ai_latest_experimental" indicates to me that this opts into experimental features, but isn't it the opposite ("stability.opt-in")?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaydeluca Thanks for your reply!

i dont think this follows our standard convention. This config is used elsewhere, and not as a boolean, but a string that can contain a list of values ConfigPropertiesUtil.getString("otel.semconv-stability.opt-in");

Sorry for this typo, it's a string for a comma-separated config list for sure.

Also, the description "gen_ai_latest_experimental" indicates to me that this opts into experimental features, but isn't it the opposite ("stability.opt-in")?

Just follow this spec - https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans.

IMO, otel.semconv-stability.opt-in is just an option to control the behaviors gen-ai instrumentations emit the telemetry data. Setting this configuration is only to make a choice among different 'stabilities', and does not mean choosing the more stable version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm, in #15268 (comment) it seems we agree we don't need to support the old logs, so we wouldn't need this config anymore right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate the semantic conventions of OpenAI instrumentation to 1.37.0.

4 participants