
Job cancelling

Motivation

Saving resources of the external services (especially for projects running builds / tests for an extensive matrix of architectures × releases × distros).

Finding the jobs to cancel

danger

We also have to consider subsequent jobs, e.g., running Testing Farm (TF) after a Copr build succeeds.

We should differentiate here based on the trigger / event.

note

In general, when pushing to branches, git forges usually provide the previous commit hash. We should parse it and provide it in the event, so that we can optimize the database lookup on our side.

Triggered by commit / pull request

note

In both cases we should be given the previous commit hash.

Ideally, we should use the provided previous commit to find the latest pipeline that might still be running.

Arch discussion

Do not cancel builds/tests on commit trigger for now.

Lookup based on the commit hash

Arch discussion

Start with the cheapest approach, i.e., this one.

The latest pipeline that might still be running can be found by commit hash: look it up through PipelineModel and the ProjectEventModel it references (via project_event_id), which has a commit hash attribute.
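A minimal sketch of this lookup, using plain dataclasses as stand-ins for the database models. The attribute names (`commit_sha`, `status`, `created_at`) and the helper `latest_running_pipeline` are assumptions made for illustration, not the actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectEvent:
    # Stand-in for ProjectEventModel; `commit_sha` is an assumed attribute name.
    id: int
    commit_sha: str


@dataclass
class Pipeline:
    # Stand-in for PipelineModel; `status` and `created_at` are assumptions.
    id: int
    project_event_id: int
    status: str  # e.g. "running" or "success"
    created_at: int  # any increasing timestamp


def latest_running_pipeline(
    previous_sha: str,
    events: list[ProjectEvent],
    pipelines: list[Pipeline],
) -> Optional[Pipeline]:
    """Return the newest still-running pipeline for the previous commit."""
    # Project events that carry the previous commit hash.
    event_ids = {e.id for e in events if e.commit_sha == previous_sha}
    # Pipelines attached to those events that have not finished yet.
    candidates = [
        p
        for p in pipelines
        if p.project_event_id in event_ids and p.status == "running"
    ]
    return max(candidates, key=lambda p: p.created_at, default=None)


events = [ProjectEvent(1, "abc123"), ProjectEvent(2, "def456")]
pipelines = [
    Pipeline(10, 1, "success", 100),
    Pipeline(11, 1, "running", 110),
    Pipeline(12, 2, "running", 120),
]
found = latest_running_pipeline("abc123", events, pipelines)
```

In a real implementation the filtering would happen in a database query rather than in Python, but the shape of the lookup stays the same: previous commit hash, then project event, then pipeline.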

Alternative approach

note

Looks much simpler using the ORM, but boils down to the enumeration below anyways.

  1. Join on pipelines × project events
  2. Filter by event type (commit or pull request)
  3. Join on (pipelines × project events) × specific events
  4. And then find latest by
    • commit: branch
    • pull request: PR ID
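The enumeration above can be sketched as plain SQL against an in-memory SQLite database. The schema is hypothetical and heavily simplified (table names, column names, and the `created_at` ordering column are all assumptions); it only illustrates the order of joins and filters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified schema; the real tables and columns differ.
    CREATE TABLE project_events (id INTEGER PRIMARY KEY, type TEXT, event_id INTEGER);
    CREATE TABLE pull_requests (id INTEGER PRIMARY KEY, pr_id INTEGER);
    CREATE TABLE branch_pushes (id INTEGER PRIMARY KEY, branch TEXT);
    CREATE TABLE pipelines (
        id INTEGER PRIMARY KEY, project_event_id INTEGER, created_at INTEGER
    );
    INSERT INTO pull_requests VALUES (1, 42), (2, 43);
    INSERT INTO branch_pushes VALUES (1, 'main');
    INSERT INTO project_events VALUES
        (1, 'pull_request', 1), (2, 'pull_request', 1),
        (3, 'pull_request', 2), (4, 'commit', 1);
    INSERT INTO pipelines VALUES
        (10, 1, 100), (11, 2, 110), (12, 3, 120), (13, 4, 130);
    """
)


def latest_pipeline_for_pr(pr_id: int):
    """Latest pipeline ID for a pull request, or None."""
    row = conn.execute(
        """
        SELECT p.id
        FROM pipelines AS p
        JOIN project_events AS e ON e.id = p.project_event_id  -- step 1
        JOIN pull_requests AS pr ON pr.id = e.event_id         -- step 3
        WHERE e.type = 'pull_request'                          -- step 2
          AND pr.pr_id = ?                                     -- step 4
        ORDER BY p.created_at DESC
        LIMIT 1
        """,
        (pr_id,),
    ).fetchone()
    return row[0] if row else None


def latest_pipeline_for_branch(branch: str):
    """Latest pipeline ID for a commit pushed to a branch, or None."""
    row = conn.execute(
        """
        SELECT p.id
        FROM pipelines AS p
        JOIN project_events AS e ON e.id = p.project_event_id  -- step 1
        JOIN branch_pushes AS b ON b.id = e.event_id           -- step 3
        WHERE e.type = 'commit'                                -- step 2
          AND b.branch = ?                                     -- step 4
        ORDER BY p.created_at DESC
        LIMIT 1
        """,
        (branch,),
    ).fetchone()
    return row[0] if row else None
```

Compared to the commit-hash lookup, this needs two joins and a per-event-type query, which is why it is the more expensive option.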

Triggered by release

tl;dr n/a

Doesn't make sense to consider, since there is no reasonable scenario for re-releasing.

Subsequent jobs

Given the pipelines we store, it shouldn't be hard: the approach is basically the same as for cancelling the initial job, we just have to check the other fields in the same row.
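As a rough illustration, assuming each pipeline row references the jobs of every stage, collecting everything to cancel is a matter of reading the non-empty fields. The column names `copr_build_id` and `test_run_id` and the helper `jobs_to_cancel` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineRow:
    # Hypothetical columns; each one references a job of a different stage.
    copr_build_id: Optional[int] = None
    test_run_id: Optional[str] = None


def jobs_to_cancel(row: PipelineRow) -> list[tuple[str, object]]:
    """List (service, job ID) pairs for every job referenced by the row."""
    candidates = [
        ("copr", row.copr_build_id),
        ("testing-farm", row.test_run_id),
    ]
    # Stages that have not started yet have no ID, so there is nothing to cancel.
    return [(service, job_id) for service, job_id in candidates if job_id is not None]


only_build = jobs_to_cancel(PipelineRow(copr_build_id=123))
build_and_tests = jobs_to_cancel(PipelineRow(copr_build_id=123, test_run_id="req-1"))
```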

Cancelling the jobs themselves

Given that we know what we want to cancel, it is relatively easy to execute with the most critical services that we integrate. The exceptions are VM Image Builder, which is rather vague in its description of what DELETE on a compose means, and OpenScanHub, whose docs do not mention cancelling running scans.

Copr

With build ID it's possible to easily cancel the job via the following API call on our side:

# packit/copr_helper.py
self.copr_client.build_proxy.cancel(build_id)

Link to Copr docs

Testing Farm

TF has an API endpoint for deleting the test requests:

DELETE https://api.testing-farm.io/v0.1/requests/{request_id}
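A minimal sketch of preparing that call with the Python standard library. Only the URL pattern and the HTTP method come from the endpoint above; the helper name `build_cancel_request` is hypothetical, and authentication is deliberately left to the caller (via `headers`), since the required scheme is not covered here.

```python
import urllib.request
from typing import Optional

TF_API_URL = "https://api.testing-farm.io/v0.1"


def build_cancel_request(
    request_id: str, headers: Optional[dict] = None
) -> urllib.request.Request:
    """Prepare (but do not send) a DELETE for the given test request."""
    return urllib.request.Request(
        url=f"{TF_API_URL}/requests/{request_id}",
        method="DELETE",
        headers=headers or {},  # credentials, if any, are supplied by the caller
    )


req = build_cancel_request("00000000-0000-0000-0000-000000000000")
# Sending is a separate step, e.g. urllib.request.urlopen(req).
```

Keeping request construction separate from sending makes the cancelling logic easy to unit-test without network access.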

Koji

There are multiple API calls:

Arch discussion

Do not consider for now, but could be beneficial for saving resources of the Fedora Infra once we run as a Fedora CI.

VM Image Builder

It appears that it is possible to delete a compose:

/composes/{composeId}:
  delete:
    description: |
      Deletes a compose, the compose will still count towards quota.
    operationId: deleteCompose
    responses:
      "200":
        description: OK
    summary: delete a compose
  get:
    description: status of an image compose
    operationId: getComposeStatus
    responses:
      "200":
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/ComposeStatus"
        description: compose status
    summary: get status of an image compose
    tags:
      - compose
  parameters:
    - description: Id of compose
      in: path
      name: composeId
      required: true
      schema:
        example: 123e4567-e89b-12d3-a456-426655440000
        format: uuid
        type: string

However, according to the description, a deleted compose still counts towards the quota, and there is no mention that deleting it would cancel a running image build.

OpenScanHub

Based on a brief look through the docs, it appears that it is not possible to cancel running scans.

Breakdown

Suggested splitting into subtasks:

  • Implement cancelling in the Packit API (knowing what needs to be cancelled, i.e., Copr build ID, Testing Farm request, etc.)

  • Implement methods that would yield the respective jobs to be cancelled, e.g., after retriggering a Copr build, we should get a list of running Copr builds associated with the previous trigger

  • Automatically cancel running jobs once an update happens, e.g., push to a PR, branch, or retriggering via comment.

  • Improve the previous method by incorporating subsequent jobs

    • NOTE: this might get more complex after implementation of job dependencies
  • Allow users to cancel running jobs via comment

  • Allow users to cancel running jobs via custom GitHub Check action

    • NOTE: custom action can incorporate additional metadata provided by us, therefore cancelling this way could be pretty cheap (there would be no need to deduce which jobs need to be cancelled)
    • NOTE: there's a smallish issue of differentiating what should be cancelled (could be handled by multiple custom actions), for example:
      • Copr build for specific target
      • Copr build for all targets
      • Copr build for all targets matching an identifier
  • (optionally, low-prio) Allow this to be configurable

    • use case: I want to be able to test multiple Copr builds, even if they were triggered in a succession of pushes

    • NOTE: this use case could be more beneficial for commit events rather than PRs, e.g.

      as a maintainer I'd like to retain all builds for commits that were pushed to main or stable