#1425  Build not starting (stuck in "waiting" state)
Closed
devrl opened 10 months ago

I am currently evaluating OneDev on a self-hosted server. I am running it in a container (Podman) with an external PostgreSQL 15 database. The setup, creation of projects etc. works fine, but I have a problem with builds.

I wanted to create a GitHub Sync (Push) build and followed the instructions in the documentation. However, the job never seems to start; it always stays in "Waiting" status. There is also no log output, neither in the job nor in the container logs.

Does anyone have an idea what this could be or how I can debug this better?

Robin Shen commented 10 months ago

Please attach a sample project demonstrating this, as well as the command line you are using to launch OneDev via Podman.

devrl commented 10 months ago

I just restarted the container, which resolved the issue with the stuck "waiting" state. The last started job went to failed and gave me an error message, as I would expect. I didn't change anything else, aside from restarting the container.

I'm still stuck, as the cause of the failed job is a missing job executor. I added an agent (Docker 1dev/agent), which is recognized, but the job execution fails because the Docker socket is not reachable from within the container. I get the following error message when trying to execute "docker ps -a" from within the agent's (or the app's) container.

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Here's the podman run command I'm using to start the agent.

podman run -d --replace --name onedev-agent --network onedev-net -v $XDG_RUNTIME_DIR/podman/podman.sock:/var/run/docker.sock -v $HOME/onedev/data/agent/work:/agent/work -e serverUrl=http://onedev-app:6610 -e agentToken=XXX docker.io/1dev/agent

Robin Shen commented 10 months ago

Have you started the Podman API service? For instance, run the command below in a terminal:

podman system service --time 3600

Then restart the OneDev server or agent (depending on which executor you are using).
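
Before restarting, it may help to confirm on the host that the API socket actually exists at the path being mounted. The following is only a sketch: `podman_socket_path` and `socket_is_present` are hypothetical helper names, and the fallback to /run/user/<uid> when XDG_RUNTIME_DIR is unset is an assumption about a typical rootless setup.

```shell
#!/bin/sh
# Hypothetical helpers, not part of OneDev or Podman.

# Print the rootless Podman API socket path used in the "podman run" command
# from this thread. Assumption: fall back to /run/user/<uid> when
# XDG_RUNTIME_DIR is unset.
podman_socket_path() {
    printf '%s/podman/podman.sock\n' "${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
}

# Return 0 only if the given path exists and is a Unix socket.
socket_is_present() {
    [ -S "$1" ]
}

# Usage on the host, before restarting the server or agent:
#   socket_is_present "$(podman_socket_path)" || echo "API socket not found"
```

If the check fails, the API service is probably not listening at that path, and the bind mount would have nothing useful to expose to the container.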

devrl commented 10 months ago

Yes, the API service is running (I also tried starting it manually with your last suggestion). I'm able to connect from the host machine, e.g. via:

curl -H "Content-Type: application/json" --unix-socket $XDG_RUNTIME_DIR/podman/podman.sock http://localhost/_ping

But inside the agent / server container (after restarting & recreating them) a connection to the docker daemon is still not possible. It also refuses to connect at all, e.g. via:

curl -H "Content-Type: application/json" --unix-socket /var/run/docker.sock http://localhost/_ping
curl: (7) Couldn't connect to server
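
A "Couldn't connect" result can have several causes, so one way to narrow it down is to look at the mounted path itself from inside the container. This is a minimal sketch, assuming a POSIX shell is available in the container; `classify_socket` is a hypothetical name, not a OneDev or Podman command.

```shell
# Hypothetical diagnostic: report why a mounted socket path might be
# unusable from inside a container.
classify_socket() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing"          # nothing exists at the mount point
    elif [ ! -S "$path" ]; then
        echo "not-a-socket"     # a regular file or directory sits at the mount point
    elif [ ! -r "$path" ] || [ ! -w "$path" ]; then
        echo "no-permission"    # socket exists, but this user cannot open it
    else
        echo "accessible"       # permissions look fine; the listener may still be gone
    fi
}

# Usage inside the agent or server container:
#   classify_socket /var/run/docker.sock
```

An "accessible" result together with a failing curl would suggest that nothing is listening behind the socket anymore, rather than a mount or permission problem.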

Robin Shen commented 10 months ago

What is the host OS, and what is your podman version? How did you install podman?

devrl commented 10 months ago

The host runs Debian 11.7 with Podman 3.0.1 from the Debian repositories (fairly old, but recommended by the Podman installation guide for Debian).

Robin Shen commented 10 months ago

Works fine on my side on Ubuntu with Podman 3. I will spin up a Debian host and run some tests on that.

devrl commented 10 months ago

I just upgraded to Debian 12, which includes Podman 4. Everything works as expected now, so it seems to be a problem with Debian 11 and/or Podman 3.0.1. Thank you for your assistance!

Edit: For more separation (due to the socket access) I created a new user and recreated the containers from the ground up. This time, everything worked as expected: after the first start, the build failed immediately instead of staying in "waiting" until a restart. So, setup was flawless with Debian 12 / Podman 4.
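
For anyone hitting the same symptom, a quick version check on the host can show whether the setup matches the failing one. This is a sketch only: the helper names are hypothetical, and the threshold of 4 comes solely from what was observed in this thread, not from any documented requirement.

```shell
# Hypothetical helper: extract the major version from a line such as
# "podman version 3.0.1", as printed by `podman --version`.
podman_major() {
    version=${1##* }        # keep the text after the last space
    echo "${version%%.*}"   # cut at the first dot
}

# Warn when the major version is below 4 (threshold taken from this thread only).
check_podman_version() {
    major=$(podman_major "$1")
    case $major in
        ''|*[!0-9]*) echo "unknown version"; return 2 ;;
    esac
    if [ "$major" -lt 4 ]; then
        echo "warning: Podman $major.x; this thread saw socket failures on 3.0.1"
        return 1
    fi
    echo "ok: Podman $major.x"
}

# Usage: check_podman_version "$(podman --version)"
```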

Robin Shen changed state from 'Open' to 'Closed' 10 months ago
Robin Shen commented 10 months ago

Great. I am closing this. Feel free to open other issues in case you have problems going forward.

Type: Bug
Priority: Normal
Assignee:
Affected Versions: 8.3.8
Reference: onedev/server#1425