cpToPod failed after 30 attempts (related to permission issue) #257

@vvanouytsel

I have been trying to implement #244 using version 0.8.0 of runner-container-hooks, but so far I have not been able to get it to work.

This is the template I am using:

template:
  spec:
    securityContext:
      fsGroup: 1000
    serviceAccountName: xxx-redacted
    containers:
    - name: runner
      image: my-company.registry.com/arc-linux-runner:2.328.0-3dacb977f4376955e068080cf4e8a42efe84e83d
      command: ["/home/runner/run.sh"]
      resources:
        limits:
          ephemeral-storage: 2Gi
          memory: 512Mi
        requests:
          cpu: 50m
          ephemeral-storage: 2Gi
          memory: 256Mi
      env:
        - name: ACTIONS_RUNNER_USE_KUBE_SCHEDULER
          value: "true"
        - name: ACTIONS_RUNNER_PREPARE_JOB_TIMEOUT_SECONDS
          value: "1800"
        - name: ACTIONS_RUNNER_CONTAINER_HOOKS
          value: /home/runner/k8s/index.js
        - name: ACTIONS_RUNNER_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
          value: "true"
        - name: ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
          value: /home/runner/hook-extension.yaml
      volumeMounts:
        - name: hook-extension
          mountPath: /home/runner/hook-extension.yaml
          subPath: content
    nodeSelector:
      kubernetes.io/arch: amd64
    volumes:
      - name: hook-extension
        configMap:
          name: hook-extension-{{ gha_runner_scale_set__name }}
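
The `hook-extension` ConfigMap referenced by the template is not included in this report. For readers unfamiliar with the mechanism, the sketch below shows the general shape such a ConfigMap can take. It rests on assumptions: the name `hook-extension-example` and the `runAsUser`/`runAsGroup` values are hypothetical (chosen to match the uid 1000 job image described further down), and the `content` key mirrors the `subPath: content` mount in the template. It only illustrates where a job container's security context can be set. It is not a confirmed fix for the `cpToPod` failure.

```yaml
# Hypothetical example; not the ConfigMap actually used in this report.
apiVersion: v1
kind: ConfigMap
metadata:
  name: hook-extension-example   # assumed name, for illustration only
data:
  # The key must match the `subPath: content` used in the runner's volumeMount.
  content: |
    spec:
      containers:
        - name: "$job"           # "$job" targets the job container created by the hook
          securityContext:
            runAsUser: 1000      # assumed: uid the job image runs as
            runAsGroup: 1000     # assumed: matching primary group
```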

The Initialize Containers step always fails with the following message:

Run '/home/runner/k8s/index.js'
(node:66) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Error: Error: cpToPod failed after 30 attempts: {}
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator

With debug logging enabled, the output shows that the copy fails on permission errors:

##[debug]cpToPod: Attempt 1 failed: Error: Error from cpFromPod - details: 
##[debug] tar: _PipelineMapping: Cannot mkdir: Permission denied
##[debug]tar: _temp: Cannot mkdir: Permission denied
##[debug]tar: _tool: Cannot mkdir: Permission denied
##[debug]tar: tm-pnt-dummy: Cannot utime: Operation not permitted
##[debug]tar: _PipelineMapping: Cannot mkdir: Permission denied
##[debug]tar: _PipelineMapping/TrendMiner: Cannot mkdir: No such file or directory
##[debug]tar: _temp: Cannot mkdir: Permission denied
##[debug]tar: _temp/_github_home: Cannot mkdir: No such file or directory
##[debug]tar: _temp: Cannot mkdir: Permission denied
##[debug]tar: _temp/_github_workflow: Cannot mkdir: No such file or directory
##[debug]tar: _temp: Cannot mkdir: Permission denied
##[debug]tar: _temp/_runner_hook_responses: Cannot mkdir: No such file or directory
##[debug]tar: tm-pnt-dummy/tm-pnt-dummy: Cannot utime: Operation not permitted
##[debug]tar: _PipelineMapping/TrendMiner: Cannot mkdir: No such file or directory
##[debug]tar: _PipelineMapping/TrendMiner/tm-pnt-dummy: Cannot mkdir: No such file or directory
##[debug]tar: _temp/_runner_hook_responses: Cannot mkdir: No such file or directory
##[debug]tar: _temp/_runner_hook_responses/aa549ea6-b227-441e-ab3f-7ca69acc9602.json: Cannot open: No such file or directory
##[debug]tar: _PipelineMapping/TrendMiner: Cannot mkdir: No such file or directory
##[debug]tar: _PipelineMapping/TrendMiner/tm-pnt-dummy/PipelineFolder.json: Cannot open: No such file or directory
##[debug]tar: .: Cannot utime: Operation not permitted
##[debug]tar: Exiting with failure status due to previous errors

This is the workflow I am using:

name: Troubleshoot autoscaling
on:
  workflow_dispatch:
jobs:
  job1:
    runs-on: default-staging
    container: company.registry.com/tm-gha-base:v1
    steps:
      - name: Use a big runner to test autoscaler
        run: sleep 600

Our tm-gha-base container is an Ubuntu-based image that runs as uid 1000, not as root.
As the scale set configuration above shows, we also define a securityContext with fsGroup set to 1000.

Are non-root job containers such as ours supported?
