-
OneDev does delete the network upon job cancellation, but this can take a while after the build is cancelled.
-
The leftover networks span a long period. We are currently at build number ~1300, and networks from build ~800 up to recent builds had not been removed. Build cancellation usually happens when a build is running and new commits arrive on that branch. That doesn't happen too often for us, so the networks accumulated slowly over time. I had to manually delete about 30-40 networks spread across three agent VMs.
So I am relatively sure cleanup has not happened. What does "take a while" mean?
-
Type changed from Bug to Question
-
Priority changed from Major to Normal
-
> So I am relatively sure cleanup has not happened. What does "take a while" mean?
OneDev deletes the container and network in the background upon cancellation, and this can take some time. Most of the time this works, but occasionally container and network deletion fails for some unknown reason. I have also seen this happen outside of OneDev before.
-
I have now forced some cancellations by pushing several single commits in a row, and you are right: the corresponding networks were cleaned up, almost immediately actually.
Maybe OneDev should retry the cleanup 3 times with some backoff timer before giving up? Given that the container/network name is unique it should be fine to retry it multiple times.
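To make the suggestion concrete, here is a minimal sketch of a retry with exponential backoff. It is not OneDev's actual implementation; `retry_with_backoff`, `remove_network`, and the network name in the usage comment are hypothetical names chosen for illustration. The only real command assumed is `docker network rm`.

```python
import subprocess
import time
from typing import Callable


def retry_with_backoff(
    action: Callable[[], None],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `action` until it succeeds or `attempts` tries are used up.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
    Returns True on success, False if every try raised an exception.
    """
    for attempt in range(attempts):
        try:
            action()
            return True
        except Exception:
            if attempt == attempts - 1:
                return False  # last try failed, give up
            sleep(base_delay * 2 ** attempt)
    return False  # only reached when attempts <= 0


def remove_network(name: str) -> None:
    """Hypothetical helper: raises CalledProcessError if removal fails."""
    subprocess.run(["docker", "network", "rm", name], check=True)


# Usage (the network name is a made-up placeholder):
# ok = retry_with_backoff(lambda: remove_network("example-job-network"))
```

Because the container/network name is unique per job, repeating the removal should not touch anything belonging to another build, which is what makes a blind retry reasonable here.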
-
Type changed from Question to Improvement
-
Title changed from "Agent fails to cleanup Docker networks if jobs are canceled" to "Retry deleting docker container/network upon failure"
-
OneDev changed state from 'Open' to 'Closed' 12 months ago
-
State changed as code fixing the issue is committed (5f5259da)
-
OneDev changed state from 'Closed' to 'Released' 11 months ago
-
State changed as build OD-5877 is successful
| Type | Improvement |
| Priority | Normal |
| Assignee | |
| Labels | No labels |
We received a build error saying that Docker had no subnets available anymore to create a new network for the job container. Looking at the VM hosts, we discovered that multiple networks created by the agent had not been cleaned up correctly.
Comparing the numbers in the network names, it seems that a network is not cleaned up correctly if the build job has been canceled by OneDev.