Unexpected cancellation of build jobs (OD-863)
Daniel Gollings opened 3 years ago

Sometimes one or a couple of build jobs get cancelled, and the build just says 'cancelled by onedev'. What are the criteria for cancellation? I'm not sure yet whether this is a bug or a feature (I'm aware of the 'cancel on push' feature).

  • Daniel Gollings commented 3 years ago

    Not sure yet, but I'm getting a feeling it's a race condition between PR builds and commit builds, with one cancelling the other.

    For context, in case I didn't mention it: it's a microservice architecture, and I'm very happy to write integration tests (meaning heavy, slow tests that sometimes fail). That means I've split each service and its tests into its own build job, for easy retry and distribution over multiple build runners/executors.

    So each push triggers ~26 builds

  • Daniel Gollings commented 3 years ago

    I think this was user error, although maybe OneDev could warn the user or dedupe conflicting requests:

    I had the following triggers

       triggers:
       - !BranchUpdateTrigger
         branches: '* -master'
         paths: '**/*.go'
       - !PullRequestUpdateTrigger
         paths: '**/*.go'
    

    I'm not going to pretend it makes a lot of sense, but I do feel there should be a way to detect and warn/ignore such a weird request :)

  • Robin Shen commented 3 years ago

    A running build will be cancelled by OneDev automatically if the same job with the same parameters on the same branch is submitted again, as continuing to run it does not make much sense.

    However, pull request builds will never be cancelled. If you observe unexpected build cancellation, please let me know the steps to reproduce it.
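
The rule described in this comment can be sketched roughly as follows. This is only an illustration, not OneDev's actual implementation: `Build`, `BuildQueue`, and `submit` are hypothetical names, and the sketch assumes that only commit builds supersede other commit builds, so pull request builds are never cancelled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Build:
    # Hypothetical model of a build; field names are assumptions.
    job: str
    branch: str
    params: tuple  # sorted (name, value) pairs, so equal parameters compare equal
    pull_request: Optional[int] = None  # None means a plain commit build
    status: str = "running"

    def key(self):
        # Identity used for deduplication: same job, same branch, same parameters.
        return (self.job, self.branch, self.params)


@dataclass
class BuildQueue:
    builds: list = field(default_factory=list)

    def submit(self, new: Build) -> Build:
        # Cancel running commit builds that the new commit build supersedes.
        # Pull request builds are skipped on both sides (assumption).
        for old in self.builds:
            if (old.status == "running"
                    and old.pull_request is None
                    and new.pull_request is None
                    and old.key() == new.key()):
                old.status = "cancelled"
        self.builds.append(new)
        return new
```

Under these assumptions, submitting the same commit job twice on one branch cancels the first build, while two pull request builds for the same job both keep running.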

  • Daniel Gollings commented 3 years ago

    Best I can offer you is this weird conflicting trigger. Maybe combined with ~20 builds? I definitely saw ~30% of a PR being cancelled but not entirely sure as to the why

  • Daniel Gollings referenced from other issue 3 years ago
  • Robin Shen commented 3 years ago

    I tried with a few connected jobs running both PR and commit builds, and only commit jobs get cancelled which is expected. It will make the investigation a lot easier if you can reproduce the issue with some example setup.

  • Daniel Gollings commented 3 years ago

    I'll see if I can reproduce it again. But I would suggest maybe fixing the cause instead of the symptom. As in, ignore/warn/notify the user that the YAML has conflicting/weird triggers?

    Assuming you're correct and it works entirely as designed, it's still weird that a push would trigger two jobs, one of which is immediately cancelled :)

  • Robin Shen commented 3 years ago

    The triggers you listed here are totally valid. They do two separate things: the branch update trigger tells OneDev to validate the head commit of the branches of interest, while the pull request trigger tells OneDev to validate the merged commit of that pull request. They should not affect one another. If you observe such behavior, please let me know how to reproduce it.
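
To illustrate why the two triggers are independent, here is a minimal sketch of how one push could yield two build requests on different commits. The names `BuildRequest` and `requests_for_push` are hypothetical and this is not OneDev's code. It only assumes, as the `branches: '* -master'` pattern quoted in this thread suggests, that the branch trigger skips `master`.

```python
from typing import NamedTuple, Optional


class BuildRequest(NamedTuple):
    trigger: str  # which trigger produced the request
    commit: str   # the commit the build would validate


def requests_for_push(branch: str,
                      head_commit: str,
                      pr_merge_commit: Optional[str] = None,
                      excluded_branches=("master",)):
    """Return the build requests a single push would produce (illustrative only)."""
    requests = []
    # Branch update trigger: validates the head commit of the pushed branch.
    if branch not in excluded_branches:
        requests.append(BuildRequest("branch-update", head_commit))
    # Pull request trigger: validates the merged commit of an open pull request.
    if pr_merge_commit is not None:
        requests.append(BuildRequest("pull-request-update", pr_merge_commit))
    return requests
```

In this model a push to a feature branch with an open pull request produces two requests for the same job, each on a different commit, which is why neither should supersede the other.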

  • Daniel Gollings commented 3 years ago

    Just now pushed to an old branch, which still has these weird/valid triggers

      triggers:
      - !BranchUpdateTrigger
        branches: '* -master'
        paths: '**/*.go'
      - !PullRequestUpdateTrigger
        paths: '**/*.go'

      triggers:
      - !BranchUpdateTrigger
        branches: '* -master'
        paths: frontend/**/*
      - !PullRequestUpdateTrigger
        paths: frontend/**/*
    

    'PR' pipeline image.png

    'commit' pipeline image_2.png

    It does appear the same jobs are being cancelled in both. I thought each was cancelling the other, but there's still the question of why anything is being cancelled at all :)

    As for how to reproduce on your end, I'm not sure. I'd guess lots of (simple) build jobs and similar file-based triggers?

  • Robin Shen changed fields 3 years ago
    Name          Previous Value     Current Value
    Type          Support Request    Bug
    Seen Builds   empty              4130
  • OneDev changed state to 'Closed' 3 years ago
    Previous Value     Current Value
    Open               Closed
  • OneDev commented 3 years ago

    State changed as code fixing the issue is committed

  • Daniel Gollings commented 3 years ago

    you figured it out? Cool :)

  • Robin Shen commented 3 years ago

    Seems that the auto-cancellation feature is too aggressive. I simply disabled it. 😅

  • OneDev changed state to 'Released' 3 years ago
    Previous Value     Current Value
    Closed             Released
  • OneDev commented 3 years ago

    State changed as build #2904 is successful

Type: Bug
Priority: Normal
Assignee:
Affected Versions: Not Found
Issue Votes: 0
Watchers: 4
Reference: OD-863