@aryangorwade
Collaborator

@aryangorwade aryangorwade commented Sep 4, 2025

Helm

The operator now generates the ValidatingWebhookConfiguration in the operator code itself (in cmd/main.go) and also detects the orchestrator type in main.go. The ValidatingWebhookConfiguration has been removed from the Helm charts.

internal/webhook/apps/v1alpha1/configuration.go contains this generation.

New environment variables introduced: TLS_MODE, TLS_CA, TLS_SECRET, OPERATOR_NAMESPACE, OPERATOR_NAME_PREFIX.
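As a rough illustration of how these variables might be consumed at startup, here is a minimal sketch. Only the variable names and the "secret" mode value come from this PR; the `webhookTLSConfig` struct, the `loadWebhookTLSConfig` helper, and the validation rules are hypothetical and are not taken from the operator code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// webhookTLSConfig groups the environment variables named in this PR.
// The struct is hypothetical; the operator may organize these differently.
type webhookTLSConfig struct {
	Mode       string // TLS_MODE
	CABundle   string // TLS_CA (PEM text, may be empty)
	SecretName string // TLS_SECRET
	Namespace  string // OPERATOR_NAMESPACE
	NamePrefix string // OPERATOR_NAME_PREFIX
}

// loadWebhookTLSConfig reads the variables through getenv so that a fake
// environment can be supplied in tests. The checks below are assumptions.
func loadWebhookTLSConfig(getenv func(string) string) (webhookTLSConfig, error) {
	cfg := webhookTLSConfig{
		Mode:       getenv("TLS_MODE"),
		CABundle:   getenv("TLS_CA"),
		SecretName: getenv("TLS_SECRET"),
		Namespace:  getenv("OPERATOR_NAMESPACE"),
		NamePrefix: getenv("OPERATOR_NAME_PREFIX"),
	}
	// Assumption: the namespace is needed to address the webhook Service.
	if cfg.Namespace == "" {
		return cfg, errors.New("OPERATOR_NAMESPACE must be set")
	}
	// Assumption: "secret" mode needs the name of the Secret holding
	// tls.crt and tls.key.
	if cfg.Mode == "secret" && cfg.SecretName == "" {
		return cfg, errors.New("TLS_SECRET must be set when TLS_MODE is \"secret\"")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadWebhookTLSConfig(os.Getenv)
	if err != nil {
		fmt.Println("invalid webhook TLS configuration:", err)
		return
	}
	fmt.Printf("mode=%q secret=%q namespace=%q prefix=%q\n",
		cfg.Mode, cfg.SecretName, cfg.Namespace, cfg.NamePrefix)
}
```

Passing `getenv` as a parameter keeps the parsing logic testable without touching the process environment.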

Additionally, implemented webhook signing configuration in a manner similar to the DRA driver:
This change introduces a new Helm chart value that supplies a CA bundle (e.g., from ca.crt) directly to the ValidatingWebhookConfiguration when cert-manager is not used. In that case, a Secret containing tls.key and tls.crt must be created.
This allows clusters without automated CA injection to configure the webhook manually and ensure proper TLS verification.

Note: in deployment.yaml, this is how TLS_CA is expected to be provided in values, with an indent. Should the indent be removed or kept for aesthetic reasons?

          - name: TLS_CA
          {{- if and (eq .Values.operator.admissionController.tls.mode "secret") (.Values.operator.admissionController.tls.secret.caBundle) }}
            value: |-
{{ .Values.operator.admissionController.tls.secret.caBundle | nindent 14 }}
          {{- else }}
            value: ""
          {{- end }}

OLM

The ValidatingWebhookConfiguration (called webhookdefinitions in OLM) has been removed from the CSV and moved to the operator code (same as above). The non-Helm-specific changes (changes not in deployments) are common to OLM as well, including the new environment variables.

In OLM, "OLM manages the webhook" means the CSV must include webhookdefinitions. OLM manages the webhook lifecycle (including cert volume injection) only while webhookdefinitions are present in the CSV. If they are removed and the operator generates the ValidatingWebhookConfiguration itself, OLM stops managing the webhook.

To work around this:

  • Removed webhookdefinitions from the CSV.
  • Added RBAC for validatingwebhookconfigurations in the CSV (add new apiGroup)
  • Added the OpenShift inject-CA annotation to the webhook: "service.beta.openshift.io/inject-cabundle": "true"
  • Created a new file in bundle/manifests to create the certificate and key: bundle/manifests/k8s-nim-operator.webhookservice.yaml, with an OpenShift Service annotation: service.beta.openshift.io/serving-cert-secret-name: k8s-nim-operator-webhook-server-cert
  • Added the cert under manager container's volumeMounts and under the deployment's volumes.
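Since the operator detects the orchestrator type in main.go, one way the OpenShift-only annotation from the list above could be applied is sketched below. This is an assumption about the shape of the code, not a copy of it: the `webhookAnnotations` helper and the orchestrator strings "openshift" and "kubernetes" are hypothetical, while the annotation key and value are the ones quoted in the list.

```go
package main

import "fmt"

// Annotation quoted in this PR; the OpenShift service CA operator uses it to
// inject the CA bundle into the annotated object.
const injectCABundleAnnotation = "service.beta.openshift.io/inject-cabundle"

// webhookAnnotations returns the annotations to set on the generated
// ValidatingWebhookConfiguration. The orchestrator values are hypothetical
// placeholders for whatever the detection code in main.go reports.
func webhookAnnotations(orchestrator string) map[string]string {
	annotations := map[string]string{}
	if orchestrator == "openshift" {
		annotations[injectCABundleAnnotation] = "true"
	}
	return annotations
}

func main() {
	for _, o := range []string{"openshift", "kubernetes"} {
		fmt.Printf("%s: %v\n", o, webhookAnnotations(o))
	}
}
```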

@copy-pr-bot

copy-pr-bot bot commented Sep 4, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@aryangorwade aryangorwade force-pushed the operator-generating-validatingwebhookconfiguration branch 5 times, most recently from cb2bb64 to 7b9b3b6, September 5, 2025 09:13
@aryangorwade aryangorwade force-pushed the operator-generating-validatingwebhookconfiguration branch from abf9c6e to 14cdda5, September 5, 2025 22:52
@aryangorwade
Collaborator Author

In deployment.yaml, this is how TLS_CA is expected to be provided in values, with an indent so it looks neat. Should the indent be removed or kept for aesthetic reasons?

          - name: TLS_CA
          {{- if and (eq .Values.operator.admissionController.tls.mode "secret") (.Values.operator.admissionController.tls.secret.caBundle) }}
            value: |-
{{ .Values.operator.admissionController.tls.secret.caBundle | nindent 14 }}
          {{- else }}
            value: ""
          {{- end }}

@varunrsekar
Collaborator

Thanks for the change @aryangorwade!
One comment: Can we add another option to generate self-signed certs as part of an init-container or at application startup? This would eliminate external dependencies for an easy install.

@aryangorwade
Collaborator Author

@varunrsekar It will most likely not be used in production or by enterprise customers, but it would help individual users avoid having to generate certs and create an additional secret for them. @shivamerla is looking into its use case.

@varunrsekar
Collaborator

> @varunrsekar It will most likely not be used in production or by enterprise customers, but it would help individual users avoid having to generate certs and create an additional secret for them. @shivamerla is looking into its use case.

Thanks. To add to the use case for an init-container: we could also use it to validate the input cert when using a Secret.
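A validation step of that kind could look roughly like the sketch below. It assumes the Secret's tls.crt and tls.key and the configured CA bundle have been written to files; the `validateServingCert` helper and the command-line interface are hypothetical. It checks that the key matches the certificate and that the certificate chains to the CA bundle for the expected Service DNS name.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// validateServingCert checks that certPEM and keyPEM form a valid pair and
// that the leaf certificate verifies against caPEM for dnsName.
// Hypothetical helper, not part of the operator.
func validateServingCert(certPEM, keyPEM, caPEM []byte, dnsName string) error {
	pair, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return fmt.Errorf("tls.crt and tls.key do not form a valid pair: %w", err)
	}
	leaf, err := x509.ParseCertificate(pair.Certificate[0])
	if err != nil {
		return fmt.Errorf("parsing leaf certificate: %w", err)
	}
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(caPEM) {
		return errors.New("CA bundle contains no usable PEM certificates")
	}
	// Any further certificates in tls.crt are treated as intermediates.
	intermediates := x509.NewCertPool()
	for _, der := range pair.Certificate[1:] {
		if c, perr := x509.ParseCertificate(der); perr == nil {
			intermediates.AddCert(c)
		}
	}
	_, err = leaf.Verify(x509.VerifyOptions{
		DNSName:       dnsName,
		Roots:         roots,
		Intermediates: intermediates,
		KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	})
	if err != nil {
		return fmt.Errorf("certificate does not verify against the CA bundle: %w", err)
	}
	return nil
}

func main() {
	if len(os.Args) != 5 {
		fmt.Println("usage: validate <tls.crt> <tls.key> <ca.crt> <dns-name>")
		return
	}
	var files [3][]byte
	for i := range files {
		data, err := os.ReadFile(os.Args[i+1])
		if err != nil {
			fmt.Println("reading input:", err)
			os.Exit(1)
		}
		files[i] = data
	}
	if err := validateServingCert(files[0], files[1], files[2], os.Args[4]); err != nil {
		fmt.Println("invalid:", err)
		os.Exit(1)
	}
	fmt.Println("certificate, key, and CA bundle are consistent")
}
```

Failing fast in an init-container would surface a mismatched key or wrong CA bundle as a pod start error rather than as rejected admission requests later.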

@aryangorwade
Collaborator Author

Closed. Split into #659 and #660.

3 participants