Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: rework pipeline: short/extended based on labels #11324

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ant31
Copy link
Contributor

@ant31 ant31 commented Jun 25, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

  • The CI VMs are requesting more resources than they are using. It's filling up the servers quickly
  • Too many VMs are created per PR
  • Vagrant and Molecule jobs are using deprecated/archived docker-machine project
  1. This PR reduces the VM resource requests to help pack more VMs per server.
  2. Reduce the number of jobs that are triggered automatically on each PR.
  3. To run additional tests, it's possible to label the pr with "ci-extended", which should be equivalent to the previous pipeline
  4. To run a shorter pipeline : add the label ci-short
  5. Vagrant and Molecule jobs are running inside Kubevirt VM
  6. pre-commit runs in a single job: reduce resources usage without increasing time

Which issue(s) this PR fixes:

Fixes #8786

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/flake Categorizes issue or PR as related to a flaky test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 25, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ant31

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 25, 2024
@ant31 ant31 force-pushed the reduce-vm-requests branch 4 times, most recently from acfeded to f3239e7 Compare June 25, 2024 14:16
@ant31 ant31 added the ci-extended Run additional tests label Jun 25, 2024
@ant31
Copy link
Contributor Author

ant31 commented Jun 25, 2024

/retest

@MrFreezeex
Copy link
Member

MrFreezeex commented Jun 25, 2024

Hey! This kind of approach seems quite sensible to me, but what the policy should be here? Any code change we put the extended test (meaning not a doc only change for instance)? Or is it depending on if the reviewers (as of the persons who is reviewing a PR not the role reviewer on the repo) of a PR think it needs the extended test or not?

Also the kubespray reviewers who are not approvers can't edit labels I believe or maybe that would be just me, not sure 🤷‍♂️.

@ant31
Copy link
Contributor Author

ant31 commented Jun 25, 2024

@MrFreezeex it introduces the mechanism to do it. How it is used is up for discussion and the threasold can move easily.
At least, documentation PR should not trigger massive pipeline.

Which label trigger what can be changed too.
For example, it could trigger the extended when there's a lgtm, or size/L , or area/xyz

In this PR the default pipeline is close to the previous one (removed few redundant test imo) so it's not a big change.

  • Ci-short (manual)
  • Ci-long (default)
  • Ci-extended (manual)
    As mentioned can discuss what is default what is not.

About applying labels, yes it would be the role of the reviewer to determine it. PR owner could self assess too.
For the permissions, we can try to use prow /label or an equivalent bot.

@ant31
Copy link
Contributor Author

ant31 commented Jun 26, 2024

Beside the labels, important change is that there's no more stages, but job dependencies.
It allows to streamline the job. It's not waiting the last job of stage 1 before starting the ones on stage2.

As a result, those new pipelines runs in ~25min.

https://gitlab.com/kargo-ci/kubernetes-sigs-kubespray/-/pipelines/1347530306

@ant31 ant31 added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed kind/flake Categorizes issue or PR as related to a flaky test. labels Jun 26, 2024
@ant31 ant31 requested a review from yankay June 29, 2024 09:29
@ant31
Copy link
Contributor Author

ant31 commented Jun 29, 2024

@MrFreezeex @VannTen @yankay

This is needed to use the new infrastructure, should make CI go through faster, and hopefully improve contributors' experiences.
Both infra are up and running, so we'll have to tear down the previous one soon.

The main blocking point could be the job that runs by default.
a) that can be changed easily; please comment on which jobs you want to add, or we can also do follow-up PR.
b) we can't test every combination of every network+container-engine+os. The tradeoff was to test at least once every option, but not the combination of them.

I tried to remove from the default pipeline only a few:
for example, Instead of :

  • ubuntu-calico
  • centos-calico,
  • ubuntu-flannel
  • centos-flannel

I go for:

  • ubuntu-calico
  • centos-flannel

We can still run full tests on schedule (daily) and for tags/releases, but not on every PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. ci-extended Run additional tests cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

packet_ubuntu20-calico-aio is failing on the task "packet-ci : Wait for vms to have ipaddress assigned"
3 participants