Invalid MetricSnapshot breaks prometheus scraping #1780

@Antibrumm

Hi,
In one of my projects I just updated the KafkaStreams library and discovered that it now exposes a counter with a negative value. This causes an IllegalArgumentException during the collection phase of the scrape call, so the whole scrape fails with a 500 response code and all metric data is lost.

While the implementation in KafkaStreams is wrong, I also feel that prometheus could handle such invalid cases more gracefully by dropping the invalid MetricSnapshots instead of failing the entire scrape.
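
To illustrate the suggestion, here is a rough, self-contained sketch of "drop the invalid snapshot, keep the rest". It does not use the client_java classes: `CounterSample`, `collectTolerant`, and the metric names are hypothetical stand-ins, and the only behavior mirrored from the real library is that constructing a counter with a negative value throws an IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class TolerantScrape {

    // Hypothetical stand-in for a counter snapshot (NOT the client_java class).
    // Like the real validation, it rejects negative counter values.
    static final class CounterSample {
        final String name;
        final double value;

        CounterSample(String name, double value) {
            if (value < 0.0) {
                throw new IllegalArgumentException(name + ": counters cannot have a negative value");
            }
            this.name = name;
            this.value = value;
        }
    }

    // Collects from every source. A source that produces an invalid sample is
    // skipped and its error message recorded, instead of aborting the scrape.
    static List<CounterSample> collectTolerant(
            List<Supplier<CounterSample>> sources, List<String> dropped) {
        List<CounterSample> result = new ArrayList<>();
        for (Supplier<CounterSample> source : sources) {
            try {
                result.add(source.get());
            } catch (IllegalArgumentException e) {
                dropped.add(e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Supplier<CounterSample>> sources = new ArrayList<>();
        sources.add(() -> new CounterSample("requests_total", 42.0));
        sources.add(() -> new CounterSample("broken_total", -1.0)); // invalid
        sources.add(() -> new CounterSample("errors_total", 3.0));

        List<String> dropped = new ArrayList<>();
        List<CounterSample> kept = collectTolerant(sources, dropped);

        // prints "kept=2 dropped=1"
        System.out.println("kept=" + kept.size() + " dropped=" + dropped.size());
    }
}
```

Where exactly such a try/catch would belong in client_java (registry, collector, or exposition writer) is an open question for the maintainers; the sketch only shows the intended outcome, with the dropped entries kept around so they could be logged.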

The Kafka team seems to be aware of the issue, but a fix is not planned before version 5.0: https://issues.apache.org/jira/browse/KAFKA-18495
The metric defines a counter sensor with a value of -1.

This results in an IllegalArgumentException here: https://github.com/prometheus/client_java/blob/main/prometheus-metrics-model/src/main/java/io/prometheus/metrics/model/snapshots/CounterSnapshot.java#L153
The exception then bubbles up to the servlet and results in a 500 response code.
