Detections API (v1/text/contents) integration with base interface pattern and extensible client architecture #14

srikartondapu · 2025-11-26T15:12:49Z

Description

This PR adds support for the Detections API v1/text/contents protocol, enabling NeMo Guardrails to communicate with external detector services that implement this standardized interface (e.g., TrustyAI guardrails-detectors, FMS Guardrails Orchestrator detectors).

Key Changes:

Base Interface Pattern: Introduced BaseDetectorClient abstract class that eliminates code duplication when supporting multiple detector API protocols. Common logic (HTTP communication, session management, authentication, error handling) is shared, while API-specific logic (request/response formats) is isolated in subclass implementations.
Detections API Client: Implemented DetectionsAPIClient that handles:
- Request format: {"contents": [text], "detector_params": {}}
- Response parsing: Nested array structure [[{detection1}, {detection2}]]
- Multiple detections per text with threshold-based filtering
- Rich metadata extraction (spans, categories, confidence scores)
Configuration Support: Added DetectionsAPIConfig to RailsConfigData enabling ConfigMap-driven detector management without code changes.
Action Functions: Implemented detections_api_check_all_detectors() and detections_api_check_detector() for NeMo rails.co integration with parallel execution and proper error separation (system errors vs content violations).
Comprehensive Documentation: Added deployment guide with Granite Guardian HAP example, testing instructions, and guide for adding new detectors.
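The request and response shapes listed above can be sketched as follows. This is a minimal illustration, not the code in this PR: the detection field names (`start`, `end`, `detection_type`, `score`), the helper names, and the `>=` threshold comparison are assumptions.

```python
from typing import Any, Dict, List


def build_request(text: str) -> Dict[str, Any]:
    # Request body as described above: a single text and empty detector params.
    return {"contents": [text], "detector_params": {}}


def filter_detections(
    response: List[List[Dict[str, Any]]], threshold: float
) -> List[Dict[str, Any]]:
    # The outer list has one entry per input text; each entry lists detections.
    detections = [d for per_text in response for d in per_text]
    # Keep only detections whose score reaches the threshold (assumed semantics).
    return [d for d in detections if d.get("score", 0.0) >= threshold]


# Illustrative response for one input text with two detections.
sample_response = [[
    {"start": 0, "end": 5, "detection_type": "hap", "score": 0.91},
    {"start": 6, "end": 9, "detection_type": "hap", "score": 0.12},
]]
kept = filter_detections(sample_response, threshold=0.5)
```

With the sample above, only the first detection survives filtering.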

Design Benefits:

Extensible: Add new API protocols by implementing build_request() and parse_response() methods only
No code duplication: Shared orchestration, HTTP, and error handling across all detector types
Configuration-driven: Add/remove detectors via ConfigMap updates
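A rough sketch of the extension point described above, assuming a synchronous client for brevity (the actions in this PR await `client.detect(text)`). The `Transport` callable, the `threshold` default, and the `"BLOCKED"`/`"ALLOWED"` labels are hypothetical; only the `DetectorResult` field names and the `build_request()`/`parse_response()` method names come from this PR.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class DetectorResult:
    # Field names follow the DetectorResult(...) call quoted in the review.
    allowed: bool
    score: float
    reason: str
    label: str
    detector: str
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical stand-in for the shared HTTP layer:
# takes a payload, returns (http_status, decoded_body).
Transport = Callable[[Dict[str, Any]], Tuple[int, Any]]


class BaseDetectorClient(ABC):
    def __init__(self, detector_name: str, threshold: float = 0.5) -> None:
        self.detector_name = detector_name
        self.threshold = threshold

    @abstractmethod
    def build_request(self, text: str) -> Dict[str, Any]:
        """Return the protocol-specific request body."""

    @abstractmethod
    def parse_response(self, response: Any, http_status: int) -> DetectorResult:
        """Turn the protocol-specific response into a DetectorResult."""

    def detect(self, text: str, transport: Transport) -> DetectorResult:
        # Shared orchestration: subclasses never touch the transport.
        http_status, body = transport(self.build_request(text))
        return self.parse_response(body, http_status)


class DetectionsAPIClient(BaseDetectorClient):
    def build_request(self, text: str) -> Dict[str, Any]:
        return {"contents": [text], "detector_params": {}}

    def parse_response(self, response: Any, http_status: int) -> DetectorResult:
        if http_status != 200:
            return DetectorResult(
                False, 0.0, f"HTTP {http_status} error", "ERROR", self.detector_name
            )
        scores = [d.get("score", 0.0) for group in response for d in group]
        top = max(scores, default=0.0)
        blocked = top >= self.threshold
        return DetectorResult(
            allowed=not blocked,
            score=top,
            reason="detection above threshold" if blocked else "nothing above threshold",
            label="BLOCKED" if blocked else "ALLOWED",
            detector=self.detector_name,
        )


# Usage with a fake transport that always reports one detection.
result = DetectionsAPIClient("hap").detect(
    "some text", lambda payload: (200, [[{"score": 0.9}]])
)
```

A second protocol (for example KServe V1) would add another subclass with its own two methods and reuse `detect()` unchanged.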

Testing Performed:

Deployed Granite Guardian HAP detector using TrustyAI guardrails-detectors
Verified safe content passes through to LLM
Verified harmful content blocked by detector (jailbreak, harm, unethical_behavior detection)

Related Issue(s)

Addresses the need for standardized detector API integration to support multiple detector service protocols (Detections API, KServe V1, future protocols) through a unified, extensible architecture.

Checklist

I've read the CONTRIBUTING guidelines.
I've updated the documentation if applicable.
I've added tests if applicable.
@m-misiura for review

- Implement base interface pattern for extensible detector clients
- Add DetectionsAPIClient for v1/text/contents protocol
- Support configuration-driven detector management via ConfigMap
- Add comprehensive documentation and deployment guide

m-misiura

You should run pre-commit to make this code adhere to the NeMo style

m-misiura · 2025-12-02T09:59:21Z

nemoguardrails/library/detector_clients/detections_api.py

+import logging
+from typing import Any, Dict, List
+
+from .base import BaseDetectorClient, DetectorResult


module imports are handled inconsistently, e.g. here they are relative but in actions.py they are absolute, e.g.

from nemoguardrails.library.detector_clients.base import DetectorResult
from nemoguardrails.library.detector_clients.detections_api import DetectionsAPIClient

m-misiura · 2025-12-02T10:08:13Z

nemoguardrails/library/detector_clients/detections_api.py

+        Returns:
+            DetectorResult with parsed detection outcome
+        """
+        if http_status != 200:


can this distinguish between e.g. 404: Detector not found and 422: Validation error (invalid request)?

if http_status != 200:
    return DetectorResult(
        allowed=False,
        score=0.0,
        reason=f"HTTP {http_status} error",
        label="ERROR",
        detector=self.detector_name,
        metadata={"http_status": http_status},
    )

see Detector API spec: https://foundation-model-stack.github.io/fms-guardrails-orchestrator/docs/api/openapi_detector_api.yaml
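One possible way to make the error reason status-specific is a small lookup table. The 404 and 422 descriptions are the ones quoted in the comment above; the fallback text and the `describe_http_error` helper are assumptions for illustration, and the result is shown as a plain dict rather than a `DetectorResult`.

```python
from typing import Any, Dict

# Descriptions for 404 and 422 come from the review comment;
# any other status falls back to a generic message (an assumption).
STATUS_REASONS = {
    404: "Detector not found",
    422: "Validation error (invalid request)",
}


def describe_http_error(http_status: int, detector_name: str) -> Dict[str, Any]:
    reason = STATUS_REASONS.get(http_status, "Unexpected detector response")
    return {
        "allowed": False,
        "score": 0.0,
        "reason": f"HTTP {http_status}: {reason}",
        "label": "ERROR",
        "detector": detector_name,
        "metadata": {"http_status": http_status},
    }


not_found = describe_http_error(404, "hap")
invalid = describe_http_error(422, "hap")
```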

m-misiura · 2025-12-02T10:18:51Z

nemoguardrails/library/detector_clients/detections_api.py

+        scores = [d.get("score", 0.0) for d in detections]
+        return max(scores) if scores else 0.0
+
+    def _calculate_average_score(self, detections: List[Dict[str, Any]]) -> float:


I am not sure how informative it is to calculate average score across detectors; it might be best to remove _calculate_average_score

m-misiura · 2025-12-02T15:51:17Z

docs/user-guides/detections-api-integration.md

I am not sure if the defined Colang flow definition is correct or if there is something wrong with the implementation

Re-running the same message gives me inconsistent outputs, e.g. sometimes

variant 1

{"messages":[{"role":"assistant","content":"I'm sorry, but I couldn't process your request due to the following reason: Blocked by 2 Detections API detector(s): toxic-prompt-roberta-detector, ibm-hap-38m-detector. Please feel free to ask something else or try rephrasing your question."}]

variant 2:

{"messages":[{"role":"assistant","content":"Sorry, but I'm unable to assist with that request."}]}%

variant 3

{"messages":[{"role":"assistant","content":"This prompt is blocked by 2 Detections API detector(s): toxic-prompt-roberta-detector, ibm-hap-38m-detector"}]}%

and so on; please investigate

m-misiura · 2025-12-02T16:02:15Z

nemoguardrails/library/detector_clients/detections_api.py

+            }
+        )
+
+    def _extract_detections_from_response(


there could be opportunities to potentially simplify _extract_detections_from_response since I am not sure if the API can return a flat array, but please check

m-misiura · 2025-12-02T16:03:34Z

nemoguardrails/library/detector_clients/detections_api.py

+
+        return response
+
+    def _calculate_highest_score(self, detections: List[Dict[str, Any]]) -> float:


is this really necessary, or could the built-in max function be used instead of the custom helper?
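For reference, a sketch of what the built-in could look like here, assuming each detection is a dict with an optional `score` key as in the diff above. `max()` accepts a `default` for the empty case, so no explicit emptiness check is needed.

```python
from typing import Any, Dict, List, Optional


def highest_score(detections: List[Dict[str, Any]]) -> float:
    # max() with default= covers the empty list without a separate branch.
    return max((d.get("score", 0.0) for d in detections), default=0.0)


def highest_detection(detections: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    # key= selects the whole detection with the largest score.
    return max(detections, key=lambda d: d.get("score", 0.0), default=None)


sample = [{"score": 0.2}, {"score": 0.9}, {}]
top_score = highest_score(sample)
top_detection = highest_detection(sample)
```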

m-misiura · 2025-12-02T16:07:28Z

nemoguardrails/library/detector_clients/detections_api.py

+                "average_score": average_score,
+                "individual_scores": individual_scores,
+                "highest_detection": highest_detection,
+                "detections": filtered_detections


L176 seems inconsistent with L149?

if metadata is needed, would it be better to have a consistent format where you always display all detectors and then just say pass / fail per detector?
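A sketch of the consistent format suggested above: every configured detector always appears, each with a pass/fail status. The input shape (detector name mapped to `allowed` and `score`) and the output keys are assumptions for illustration, not the metadata this PR currently emits.

```python
from typing import Any, Dict


def summarize(results: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    # One entry per detector, regardless of whether it flagged the text.
    per_detector = {
        name: {"status": "pass" if r["allowed"] else "fail", "score": r["score"]}
        for name, r in results.items()
    }
    blocked_by = [n for n, entry in per_detector.items() if entry["status"] == "fail"]
    return {
        "allowed": not blocked_by,
        "blocked_by": blocked_by,
        "detectors": per_detector,
    }


summary = summarize({
    "ibm-hap-38m-detector": {"allowed": False, "score": 0.97},
    "toxic-prompt-roberta-detector": {"allowed": True, "score": 0.03},
})
```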

m-misiura · 2025-12-02T16:15:35Z

nemoguardrails/library/detector_clients/actions.py

+        result = await client.detect(text)
+        return result
+
+    except Exception as e:


is this dead code as detections_api.py already has try/except that always returns DetectorResult?

m-misiura · 2025-12-02T16:17:05Z

nemoguardrails/library/detector_clients/actions.py

+    if isinstance(user_message, dict):
+        user_message = user_message.get("content", "")
+
+    detections_api_detectors = getattr(


a safer pattern or proper check might be worthwhile to implement
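The comment does not say which line it targets, so the sketch below covers both candidates in the hunk: normalizing the message and looking up the detector mapping. The attribute name `detections_api_detectors` and both helper names are hypothetical, and `SimpleNamespace` stands in for the real config object.

```python
from types import SimpleNamespace
from typing import Any, Dict


def extract_message_text(message: Any) -> str:
    # Accept a plain string or a {"content": "..."} dict; reject everything else.
    if isinstance(message, str):
        return message
    if isinstance(message, dict) and isinstance(message.get("content"), str):
        return message["content"]
    raise TypeError(f"Unsupported message type: {type(message).__name__}")


def get_detectors(
    rails_config: Any, attr: str = "detections_api_detectors"
) -> Dict[str, Any]:
    # `attr` is a hypothetical attribute name used only for this sketch.
    detectors = getattr(rails_config, attr, None)
    if detectors is None:
        return {}
    if not isinstance(detectors, dict):
        raise TypeError(f"{attr} must be a mapping of detector name to config")
    return detectors


config = SimpleNamespace(detections_api_detectors={"hap": {"threshold": 0.5}})
detectors = get_detectors(config)
text = extract_message_text({"role": "user", "content": "hello"})
```

Failing loudly on an unexpected type keeps a malformed message from being silently checked as an empty string.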

m-misiura · 2025-12-02T16:18:35Z

nemoguardrails/library/detector_clients/actions.py

+        f"{list(detections_api_detectors.keys())}"
+    )
+
+    tasks_with_names = [


Why store tuples then extract with task[1] and tasks_with_names[i][0]?
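An alternative that avoids index juggling, sketched under the assumption that each detector is an async callable taking the text: `asyncio.gather` returns results in the order the awaitables were passed, so the names can be zipped back afterwards. `check_all` and the two fake detectors are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Detector = Callable[[str], Awaitable[Dict[str, Any]]]


async def check_all(detectors: Dict[str, Detector], text: str) -> Dict[str, Any]:
    names = list(detectors)
    # return_exceptions=True keeps one failing detector from cancelling the rest.
    outcomes = await asyncio.gather(
        *(detectors[name](text) for name in names),
        return_exceptions=True,
    )
    # gather preserves input order, so names and outcomes line up.
    return dict(zip(names, outcomes))


async def flags_everything(text: str) -> Dict[str, Any]:
    return {"allowed": False}


async def unreachable(text: str) -> Dict[str, Any]:
    raise ConnectionError("detector unavailable")


results = asyncio.run(
    check_all({"hap": flags_everything, "down": unreachable}, "hello")
)
# System errors and content violations stay separable by type.
system_errors = [n for n, r in results.items() if isinstance(r, Exception)]
violations = [n for n, r in results.items() if isinstance(r, dict) and not r["allowed"]]
```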

m-misiura · 2025-12-02T16:21:21Z

nemoguardrails/library/detector_clients/actions.py

+    if not config:
+        return {"allowed": False, "reason": "No configuration"}
+
+    user_message = context.get("user_message", "")


does this mean I can only set this up as in input guardrail?

what if I would like to also configure this for other message types?

m-misiura · 2025-12-02T16:27:26Z

nemoguardrails/library/detector_clients/actions.py

it seems to me that only input guardrails have been implemented; it would be good to also have this working as output guardrails and work on more than just user_message; perhaps check out how other providers handle this
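One way the action could serve both rail types is to pick the context key from a rail-type argument. This is only a sketch: the `rail_type` parameter and the `select_text` helper are hypothetical, and it assumes the bot's reply is available under `bot_message` in the same context that holds `user_message`.

```python
from typing import Any, Dict

# Assumed mapping from rail type to the context key holding the text to check.
CONTEXT_KEYS = {"input": "user_message", "output": "bot_message"}


def select_text(context: Dict[str, Any], rail_type: str = "input") -> str:
    if rail_type not in CONTEXT_KEYS:
        raise ValueError(f"Unknown rail type: {rail_type!r}")
    value = context.get(CONTEXT_KEYS[rail_type], "")
    # Messages may arrive as {"content": "..."} dicts, as handled in actions.py.
    if isinstance(value, dict):
        value = value.get("content", "")
    return value or ""


context = {"user_message": "hi there", "bot_message": {"content": "hello!"}}
input_text = select_text(context, "input")
output_text = select_text(context, "output")
```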

m-misiura · 2025-12-02T16:45:13Z

nemoguardrails/library/detector_clients/base.py

is there an HTTP session leak in this file? please investigate
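Whether base.py actually leaks is not established here. As a reference point for the investigation, the sketch below shows one lifecycle that cannot leak: the client owns its session, creates it lazily, and closes it through an async context manager. `FakeSession` is a stand-in that only tracks open/closed state, and `SessionOwningClient` is hypothetical.

```python
import asyncio
from typing import Optional


class FakeSession:
    """Stand-in for an HTTP session; only tracks whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class SessionOwningClient:
    def __init__(self) -> None:
        self._session: Optional[FakeSession] = None

    def get_session(self) -> FakeSession:
        # Create lazily and reuse; recreate only if a previous one was closed.
        if self._session is None or self._session.closed:
            self._session = FakeSession()
        return self._session

    async def close(self) -> None:
        if self._session is not None and not self._session.closed:
            await self._session.close()

    async def __aenter__(self) -> "SessionOwningClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions, so the session is always closed.
        await self.close()


async def demo() -> bool:
    async with SessionOwningClient() as client:
        session = client.get_session()
        assert client.get_session() is session  # reused, not recreated
    return session.closed


closed_after_exit = asyncio.run(demo())
```

Things to look for in the real file would be sessions created per request without a matching close, or clients cached at module level with no shutdown hook.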

m-misiura

Please go through all the comments; most importantly:

colang flow in the user guide does not appear to be quite correct (there is stochasticity in the outputs received when sending a request with the same input text)
I think the current implementation only works on inputs; consider how this could be extended by looking at other providers
there are some redundancies and unnecessary code; consider removing any dead code
metadata -- I am not sure if all fields are needed especially things like average score across detectors
pre commit should be run on all files
consider adding some unit tests

srikartondapu force-pushed the feature/detections-api-integration branch from 3c0323a to 548e85a · December 1, 2025 16:53

m-misiura reviewed Dec 2, 2025

m-misiura requested changes Dec 2, 2025
