Build not starting (stuck in "waiting" state) (OD-1425)
devrl opened 3 years ago

I am currently evaluating OneDev on a self-hosted server. I am running it in a container (Podman) with an external PostgreSQL 15 database. Setup, project creation, etc. work fine, but I have a problem with builds.

I wanted to create a GitHub Sync (Push) build and followed the instructions in the documentation. However, the job never seems to start; it always stays in "Waiting" status. There is also no log output, neither in the job nor in the container logs.

Does anyone have an idea what this could be or how I can debug this better?

  • Robin Shen commented 3 years ago

    Please attach a sample project demonstrating this, as well as the command line you are using to launch OneDev via Podman.

  • devrl commented 3 years ago

    I just restarted the container, which resolved the issue with the stuck "waiting" state. The last started job went to "failed" and gave me an error message, as I would expect. I didn't change anything else aside from restarting the container.

    I'm still stuck, because the job fails due to a missing job executor. I added an agent (Docker 1dev/agent), which is recognized correctly, but job execution fails because the Docker socket is not reachable from within the container. I get the following error message when trying to execute "docker ps -a" from within the agent's (or the app's) container.

    Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
    

    Here's the podman run command I'm using to start the agent.

    podman run -d --replace --name onedev-agent --network onedev-net -v $XDG_RUNTIME_DIR/podman/podman.sock:/var/run/docker.sock -v $HOME/onedev/data/agent/work:/agent/work -e serverUrl=http://onedev-app:6610 -e agentToken=XXX docker.io/1dev/agent
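
The bind mount in the command above relies on `$XDG_RUNTIME_DIR` being set in the shell that runs `podman run`. If it is unset or empty, `$XDG_RUNTIME_DIR/podman/podman.sock` expands to `/podman/podman.sock`. A minimal sketch for resolving the rootless socket path first; the function name `rootless_podman_sock` is hypothetical, and the `/run/user/<uid>` fallback assumes a systemd-style runtime directory:

```shell
# Print the rootless Podman socket path for the current user.
# Falls back to /run/user/<uid> when XDG_RUNTIME_DIR is unset or empty
# (assumption: systemd-style runtime directory layout).
rootless_podman_sock() {
    runtime_dir="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
    echo "$runtime_dir/podman/podman.sock"
}
```

The result could then be used as the source of the mount, e.g. `-v "$(rootless_podman_sock)":/var/run/docker.sock`, after confirming with `[ -S "$(rootless_podman_sock)" ]` that the path is actually a socket.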
    
  • Robin Shen commented 3 years ago

    Have you started the Podman API service? For instance, run the command below in a terminal:

    podman system service --time 3600
    

    Then restart the OneDev server or agent (depending on which executor you are using).

  • devrl commented 3 years ago

    Yes, the API service is running (I also tried starting it manually as in your last suggestion). I'm able to connect from the host machine, e.g. via:

    curl -H "Content-Type: application/json" --unix-socket $XDG_RUNTIME_DIR/podman/podman.sock http://localhost/_ping
    

    However, inside the agent / server container (after restarting and recreating them), a connection to the Docker daemon is still not possible. The socket refuses connections entirely, e.g.:

    curl  -H "Content-Type: application/json" --unix-socket /var/run/docker.sock http://localhost/_ping
    curl: (7) Couldn't connect to server
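
The manual checks above (does the mounted path exist, is it a socket, does `/_ping` answer) can be combined into one small helper to run inside the container. This is a sketch only: the function name `probe_socket` and its four result strings are hypothetical, and it assumes `curl` is available in the container image.

```shell
# Classify the state of a Docker-compatible API socket.
# Prints one of: missing, not-a-socket, unreachable, ok
probe_socket() {
    sock="$1"
    if [ ! -e "$sock" ]; then
        # Nothing exists at the path (e.g. the mount was not applied).
        echo "missing"
    elif [ ! -S "$sock" ]; then
        # The path exists but is a regular file or directory.
        echo "not-a-socket"
    elif curl -sf -o /dev/null --max-time 5 --unix-socket "$sock" http://localhost/_ping; then
        # The API answered the ping request successfully.
        echo "ok"
    else
        # A socket file is present, but nothing usable answers on it.
        echo "unreachable"
    fi
}
```

Calling `probe_socket /var/run/docker.sock` inside the agent container narrows down which stage fails. In the situation described here, the `curl: (7)` error would correspond to the "unreachable" branch, provided the mounted path is in fact a socket.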
    
  • Robin Shen commented 3 years ago

    What is the host OS, and what is your podman version? How did you install podman?

  • devrl commented 3 years ago

    The host runs Debian 11.7 with Podman 3.0.1 from the Debian repositories (fairly old, but recommended by the Podman installation guide for Debian).

  • Robin Shen commented 3 years ago

    Works fine on my side on Ubuntu with Podman 3. I will spin up a Debian host and run some tests on that.

  • devrl commented 3 years ago

    I just upgraded to Debian 12, which includes Podman 4. Everything works as expected now, so it seems to be a problem with Debian 11 and/or Podman 3.0.1. Thank you for your assistance!

    Edit: For better separation (because of the socket access), I created a new user and recreated the containers from the ground up. This time everything worked as expected: the build failed immediately on the first start instead of staying in "waiting" until a restart. So the setup was flawless with Debian 12 / Podman 4.
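
Since the problem disappeared after moving from Podman 3.0.1 to Podman 4, a pre-flight check on the Podman major version is one way to catch the same situation on other hosts. A minimal sketch, assuming the usual `podman --version` output format (`podman version X.Y.Z`); the helper names are hypothetical, and the threshold of 4 comes only from this thread, not from a documented OneDev requirement:

```shell
# Extract the major version from a string like "podman version 3.0.1".
podman_major() {
    # Keep the text after the last space, then drop everything from the first dot.
    ver="${1##* }"
    echo "${ver%%.*}"
}

# Succeed when the major version in $1 is at least the minimum given in $2.
podman_at_least() {
    [ "$(podman_major "$1")" -ge "$2" ]
}
```

Usage on a host would look like `podman_at_least "$(podman --version)" 4 || echo "Podman older than 4, socket access from containers may fail"`.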

  • Robin Shen changed state from 'Open' to 'Closed' 3 years ago
  • Robin Shen commented 3 years ago

    Great. I am closing this. Feel free to open other issues if you run into problems going forward.

Type: Bug
Priority: Normal
Assignee: (none)
Affected Versions: 8.3.8
Issue Votes: 0
Watchers: 4
Reference: OD-1425