#1838  Kubernetes Executor hangs after successful run (missing privileges?)
Closed
Martin Langer opened 3 weeks ago

Hi Robin, I managed to get the Kubernetes executor running with an image that was built by a kaniko job earlier in the pipeline.

The image contains tests, which actually pass, but the executor hangs afterwards because of a permission error. The job also does not get cleaned up in the cluster, and the builds basically run forever in the build tab.

Here is the log from the successful test job:

18:38:01 No job executor defined, auto-discovering...
18:38:01 Discovered job executor type: Kubernetes Executor
18:38:01 Checking cluster access...
18:38:01 Preparing job (executor: auto-discovered, namespace: auto-discovered-1-12-0)...
18:38:02 Running job on node lima-rancher-desktop...
18:38:02 Starting job containers...
18:38:03 Retrieving job data from http://192.168.64.2:6610...
18:38:03 Generating command scripts...
18:38:03 Downloading job dependencies from http://192.168.64.2:6610...
18:38:03 Job workspace initialized
18:38:09 Running step "Unittests"...
18:38:09 Running pytest unittests:
18:38:09 ============================= test session starts ==============================
18:38:09 platform linux -- Python 3.11.8, pytest-8.1.1, pluggy-1.4.0
18:38:09 rootdir: /app
18:38:10 collected 53 items                                                             
18:38:10 
18:38:16 tests/unit/test_image_processing.py ...............................      [ 58%]
18:38:19 tests/unit/test_image_solvers.py ........                                [ 73%]
18:38:20 tests/unit/test_optimization_solver_basics.py ..............             [100%]
18:38:20 
18:38:20 ============================= 53 passed in 10.68s ==============================
18:38:20 Step "Unittests" is successful
18:38:20 /usr/bin/touch: cannot touch '/onedev-build/mark/0.successful': Permission denied

It seems like the CI system wants to create some files where the permissions and/or folders are missing. For reference, here is the Dockerfile that yields the base image for the tests:

ARG BASE_IMAGE=python:3.11.8-slim

FROM ${BASE_IMAGE}

# General setup
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_PROGRESS_BAR=off

WORKDIR /app

# Install the required packages
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir --root-user-action=ignore -r /app/requirements.txt

# Create non-privileged user for general use and adjust ownership
RUN adduser --disabled-password \
    --gecos "" \
    --home "/nonexistent" \
    --shell "/sbin/nologin" \
    --no-create-home \
    --uid 10001 appuser \
 && chown -R appuser /app

USER appuser

I intend to run the tests as a non-root user and keep the container non-privileged.
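As an aside, here is a variant of the user setup that pins the gid explicitly as well, so both ids are known up front (just a sketch; the group name appgroup and the ids are my own choices):

```dockerfile
# Create a group and user with fixed, known ids (10001:10001 here),
# so any tooling that needs to match the uid:gid can be configured
# with certainty, instead of relying on adduser's defaults.
RUN addgroup --gid 10001 appgroup \
 && adduser --disabled-password \
    --gecos "" \
    --home "/nonexistent" \
    --shell "/sbin/nologin" \
    --no-create-home \
    --uid 10001 \
    --ingroup appgroup appuser \
 && chown -R appuser:appgroup /app

USER appuser
```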

Is this intended and a mistake on my side, or a bug?

Robin Shen commented 3 weeks ago

For containers running as a non-root user, please tell OneDev their uid:gid in the more settings section of the step, like below:

2024-04-11_09-57-52.png
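If you are unsure which ids an image ends up with, you can query them with `id`; a minimal sketch (run inside the container, or via `docker run` against the image, whose name below is a placeholder):

```shell
# Print the numeric uid and gid of the current user as "uid:gid".
# Inside the test container above this should report appuser's ids.
echo "$(id -u):$(id -g)"

# Against the image directly (assuming docker is available locally;
# "my-test-image" is a placeholder for your actual image name):
# docker run --rm my-test-image sh -c 'echo "$(id -u):$(id -g)"'
```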

Robin Shen commented 3 weeks ago

PS: as for the never-ending build from earlier, you may cancel it, and OneDev will release the k8s resources associated with the build.

Martin Langer commented 3 weeks ago

Thank you very much, that was the problem! Awesome kubernetes CI tooling!

Martin Langer changed state to 'Closed' 3 weeks ago
Previous Value: Open
Current Value: Closed
Type: Bug
Priority: Normal
Assignee:
Affected Versions: OneDev 10.4.0
Labels: No labels
Issue Votes: 0
Watchers: 3
Reference: onedev/server#1838