I've found a possible source of the issue. It could be caused by a change of nodes that was not reflected in DNS. Only 2 of the 3 IPs assigned to the domain that hosts OneDev were actually accepting requests, so 1 in 3 requests could fail (depending on which IP the HTTP client selected to connect to).
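For anyone hitting the same thing, here is a minimal Python sketch of how such a mismatch could be spotted: resolve every IP that DNS lists for the host and try a TCP connection to each one. The function names `resolve_ips` and `check_ips`, the port 443, and the 3-second timeout are my own assumptions for illustration, not anything OneDev provides.

```python
import socket


def resolve_ips(host, port=443):
    """Return the distinct IP addresses DNS currently lists for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def check_ips(ips, port=443, timeout=3.0, connect=socket.create_connection):
    """Try a TCP connection to every IP and report which ones accept it.

    `connect` is injectable only so the function can be exercised without a network.
    """
    results = {}
    for ip in ips:
        try:
            with connect((ip, port), timeout=timeout):
                results[ip] = True
        except OSError:
            # Covers refused connections and timeouts alike.
            results[ip] = False
    return results
```

Calling `check_ips(resolve_ips("onedev.example.com"))` (a placeholder hostname) returns a dict mapping each IP to `True` or `False`. Any `False` entry is an address that DNS still advertises but that does not accept connections.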
I wonder whether using internal service domain names (in Kubernetes) for this communication wouldn't be better? But I think right now that is not possible, as Server URL may be used for other things than just communication between internal components of OneDev. Maybe it would be worth retrying a request that fails with a timeout or a 502 error?
-
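To illustrate what retrying on a timeout or a 502 could look like on the client side, here is a rough Python sketch. It is not how OneDev implements its requests; `get_with_retry`, `is_retryable`, the attempt count, and the backoff values are all hypothetical names and numbers chosen for the example.

```python
import socket
import time
import urllib.error
import urllib.request

RETRYABLE_STATUS = {502}  # assumption: only Bad Gateway is treated as transient


def is_retryable(exc):
    """Return True for a 502 response or a timeout, False for anything else."""
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code in RETRYABLE_STATUS
    if isinstance(exc, urllib.error.URLError):
        # urllib wraps connect timeouts in URLError with the cause in .reason
        return isinstance(exc.reason, (TimeoutError, socket.timeout))
    return isinstance(exc, (TimeoutError, socket.timeout))


def get_with_retry(url, attempts=3, timeout=5.0, backoff=0.5,
                   opener=urllib.request.urlopen, sleep=time.sleep):
    """GET url, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            with opener(url, timeout=timeout) as resp:
                return resp.read()
        except Exception as exc:
            # Give up on the last attempt or on errors that are not transient.
            if attempt == attempts or not is_retryable(exc):
                raise
            sleep(backoff * 2 ** (attempt - 1))
```

With one dead IP out of three behind the same name, a retry opens a new connection and so has a chance of reaching a working address. That only masks the stale DNS record rather than fixing it.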
Normally the DNS entry should map the domain name to the load balancer IP, and the load balancer then forwards requests to working nodes automatically.
You may also retry a job when a certain condition is satisfied, such as the log containing some text pattern. Check the more settings of the job for details.
-
| Previous Value | Current Value |
| Open | Closed |
| Type | Question |
| Priority | Normal |
| Assignee | |
| Labels | No labels |
I'm using the Kubernetes Executor for running builds. Unfortunately, after a restart of OneDev (which most likely happened during an OOM on the cluster node), builds stopped working. When I try to execute a build I get the following error (I'm attaching a larger part of the log):
As you can see, the previous HTTP requests (in previous steps) were working, so I wonder what the issue could be here. Is it OneDev failing to accept the connection, or the Kubernetes cluster, or maybe something else?