
Add DSS integration tests (New) #1787


Open
wants to merge 4 commits into main

Conversation

motjuste
Contributor

@motjuste motjuste commented Mar 11, 2025

Description

  • In the install-deps script, add installation of git and python3-venv so that the DSS integration tests can be set up and run from their repo.
  • Add a job to clone the data-science-stack repo and set up a venv with tox for running the tests.
  • Add a job to run the CPU integration tests from the repo.
  • Add a job to run the NVIDIA GPU integration tests from the repo if the NVIDIA GPU addon was successfully enabled and validated in the K8s cluster (a sketch of these jobs follows this list).
  • Fix the script name for checking the NVIDIA GPU rollout.
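
As a rough illustration only, these jobs could boil down to shell commands along the following lines. The repo URL, clone path, and the CPU/GPU test selectors are assumptions rather than details taken from this PR; the tox environment name follows the tox -e integration invocation mentioned later in this thread.

```bash
#!/bin/bash
# Hypothetical sketch of the setup and test-run jobs described above.
# Repo URL, clone path, and the -k selectors are assumptions.
set -euo pipefail

DSS_REPO_DIR="${DSS_REPO_DIR:-$HOME/data-science-stack}"

# Job 1: clone the DSS repo and set up a venv with tox
# (this is why git and python3-venv are added to install-deps).
git clone https://github.com/canonical/data-science-stack "$DSS_REPO_DIR"
python3 -m venv "$DSS_REPO_DIR/.venv"
"$DSS_REPO_DIR/.venv/bin/pip" install tox

# Job 2: run the CPU integration tests from the repo.
cd "$DSS_REPO_DIR"
.venv/bin/tox -e integration -- -k "not gpu"

# Job 3: run the NVIDIA GPU integration tests; only reached if the NVIDIA
# GPU addon was enabled and validated in the K8s cluster by earlier jobs.
.venv/bin/tox -e integration -- -k "gpu"
```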

TODO

  • Parse the commit hash from the DSS snap to determine which version of the repo to check out

Resolved issues

Documentation

No changes to Checkbox documentation.

Tests

No new unit tests are added. A full run of the checkbox-provider in Testflinger can be found here.

The first job sets up the repo and venv, then one job runs the CPU integration tests, and the next runs the GPU integration tests if the NVIDIA GPU was enabled successfully.

codecov bot commented Mar 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 49.83%. Comparing base (da6b4e9) to head (272c394).
Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1787   +/-   ##
=======================================
  Coverage   49.83%   49.83%           
=======================================
  Files         377      377           
  Lines       40719    40719           
  Branches     6851     6851           
=======================================
  Hits        20294    20294           
  Misses      19700    19700           
  Partials      725      725           
Flag Coverage Δ
provider-dss ∅ <ø> (∅)

Flags with carried forward coverage won't be shown.


@motjuste motjuste marked this pull request as ready for review March 11, 2025 11:42
@motjuste motjuste requested a review from a team as a code owner March 11, 2025 11:42
@motjuste motjuste requested a review from fernando79513 March 11, 2025 11:43
@motjuste
Contributor Author

@fernando79513, is there anything still holding up this PR? I am unfortunately sick today, but I hope to be able to look at fixing any issues tomorrow.

@fernando79513 fernando79513 self-assigned this Mar 13, 2025
@fernando79513
Collaborator

Hey @motjuste, what's the exact purpose of this PR?
I thought the objective of the checkbox-dss snap was to test the dss snap and make sure it could install/run on devices. I don't see how cloning the repo and installing it locally fits into that.
This also raises more questions:
What if the version of dss in the snap is not the same as what's in main in the repo? How do you ensure that the version you are testing in the snap is the same as the one we are cloning from the repo?

@motjuste
Contributor Author

In this PR, we clone the DSS repo and only run their integration tests. The integration tests expect that the DSS snap is installed and that a Kubernetes cluster is set up. We do not install DSS from this repo. I am sorry for the confusion.

@fernando79513
Collaborator

You are right, the DSS integration tests are basically calling the DSS snap in different scenarios. So what's the difference between these tests and the ones we are already running?
I think that still does not solve the versioning issue. Imagine we are testing the DSS snap on beta, but they include a new feature on main and add integration tests for it. We would be running integration tests for a feature that's not yet in beta.
If we really want to use the integration tests they define in their repo, we would need the same version of the repo that the DSS snap was created from, and I don't think that's trivial.
If they want to use checkbox-dss in the future to run their integration tests, that would be great, but I think these tests should not be in the same test plan as the rest.

Co-authored-by: Fernando Bravo <[email protected]>
@motjuste
Contributor Author

motjuste commented Mar 14, 2025

To be honest, our tests at the moment are more extensive than what the DSS integration tests cover. For example, the DSS integration tests currently do not cover Intel GPUs at all, but we do.

I am following a spec, KF114, that @deusebio et al. created for me. It does make sense to run the integration tests from the product team, especially since they will be in the best position to keep those tests up to date, e.g. with more granular tests that don't make sense to add to this Checkbox provider.

There's one pattern we can use for handling a potential version mismatch:

  • The repo with the integration tests has different branches for the versions of the product they can test. Here that repo would be the DSS one.
  • This branch is provided as an argument to the test runner. Here this would become an argument to the Checkbox provider.

The DSS repo currently doesn't provide separate branches. I believe the default main branch covers all currently published versions of the dss snap. If there is a divergence in the future, i.e. if tests start to differ for different versions of dss, the DSS team can start following the pattern I describe and we can update the provider accordingly.

@deusebio : please provide any comments.

@fernando79513 fernando79513 added the waiting-for-changes The review has been completed but the PR is waiting for changes from the author label Mar 19, 2025
@motjuste
Contributor Author

I had a discussion with @deusebio from the DSS team, and he was indeed in favour of adding these tests to the Checkbox provider. The DSS product team will be able to add more specific tests in the future without having to update this Checkbox provider.

To avoid version mismatch, there are two options:

  1. The CLI for running the Checkbox tests will accept an optional argument to choose which branch of the DSS repo to use for running the integration tests, defaulting to main.
  2. The Checkbox runner can parse the exact hash of the commit in the DSS repo from which the snap was created, and use that (see the sketch after this list).
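
For illustration, a minimal shell sketch of what either option could look like; the DSS_REPO_REF and DSS_COMMIT_HASH variable names, the repo URL, and the clone path are hypothetical, not taken from this PR.

```bash
#!/bin/bash
# Hypothetical sketch of the two options above; names and paths are illustrative.
set -euo pipefail

DSS_REPO_URL="https://github.com/canonical/data-science-stack"
DSS_REPO_DIR="$HOME/data-science-stack"

if [ -n "${DSS_COMMIT_HASH:-}" ]; then
    # Option 2: check out the exact commit the snap was built from.
    # `git clone --branch` only accepts branches/tags, so clone then checkout.
    git clone "$DSS_REPO_URL" "$DSS_REPO_DIR"
    git -C "$DSS_REPO_DIR" checkout "$DSS_COMMIT_HASH"
else
    # Option 1: an optional branch argument, defaulting to main.
    git clone --branch "${DSS_REPO_REF:-main}" --depth 1 "$DSS_REPO_URL" "$DSS_REPO_DIR"
fi
```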

@deusebio
Collaborator

deusebio commented Apr 17, 2025

Yes, @motjuste, thank you for the summary. I'm definitely in favor of externalizing the UAT tests into a repository managed by the product team, so that we can keep updating the tests and also implement new ones without the Checkbox workflow having to be modified.

The versioning mismatch can be fixed in either of the ways you propose. My preference would possibly be option 2, given that for DSS everything is self-contained in one repository. Charmed Kubeflow is a bit more complex (it is made up of multiple repos/charms/services) and we have a dedicated repo for UATs, hence the need to use branches, but this is not required here.

The commit hash could be parsed from the version (e.g. the version of DSS is 0.1-8742e6d3c0a5450c6dbc4ea3788a, where the part after "-" is the commit hash). Otherwise we could also store this info in DSS and provide an API, say dss version, to retrieve it. We can start with just parsing the version if that is simple enough, and move to the dss version entrypoint if needed in the future.
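
As a sketch only: assuming the version string is available, extracting the part after the first "-" is a one-liner in shell. How the version string is obtained (here via snap list) and the exact snap name are assumptions.

```bash
#!/bin/bash
# Sketch of parsing the commit hash from a DSS version string such as
# "0.1-8742e6d3c0a5450c6dbc4ea3788a". The snap name and the use of `snap list`
# to obtain the version are assumptions.
set -euo pipefail

DSS_VERSION="$(snap list data-science-stack | awk 'NR==2 {print $2}')"
DSS_COMMIT_HASH="${DSS_VERSION#*-}"   # strip everything up to the first "-"
echo "DSS repo commit to check out: $DSS_COMMIT_HASH"
```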

@motjuste
Contributor Author

@fernando79513 ... I am looking into parsing the commit hash ... but feel free to add any other comments you may have in the meantime

@fernando79513
Collaborator

I still don't think this should be a checkbox test.

The idea of having these tests in checkbox (and eventually on Test Observer) is to have control over some of the core functionalities of the DSS module across different versions.
That's why we have in Test Observer a way to report which tests have failed in previous versions.
If these tests are modified elsewhere, we won't have a way to know if a new test has been added or removed.

If you really want to use checkbox to run these integration tests, I would have a separate Testflinger job that does all the setup and calls tox -e integration, but all this setup should not be in a checkbox test IMO.
I don't like at all the idea of cloning repos and setting up the environment in a checkbox test.

And these integration tests should not be included in the dss-validation test plan.
What do you think @pieqq @Hook25?

@deusebio
Collaborator

My understanding was that the PR here was to provide the framework to enable testing, but use-case definition and implementation would still be on the product team. From your answer I gather this aims to be "independent" testing and you would own both the definition and the implementation of the tests. I don't have particular concerns with this, and it is definitely your call to make whether externalizing tests is appropriate, but the latitude on the support we can provide for evolving those tests will obviously be more limited (probably for good reasons, since they are independent tests and the product team has limited knowledge of the checkbox framework).

@fernando79513
Collaborator

I don't know who is going to be in charge of testing/implementing the tests, but I don't think having this in a checkbox test is a good approach with the current reporting system we have in place.

I see the point of running the integration tests on a set of devices, since these will probably be more up to date, but I think the setup should be done in a separate script (not in a checkbox test), and it should probably all be included in a separate Testflinger job.
But we should not include that inside the dss-validation test plan (which in my opinion should be tied to a specific dss snap revision).

@mz2
Collaborator

mz2 commented Apr 24, 2025

For the sake of reproducible test results, pulling any sort of floating test definition (one that changes depending on when a test job is run) into a Checkbox job indeed doesn't seem like a good idea. If the tests are executable in some other context and there's a reason to reference them as a git ref (a tag, or a precise commit hash) for example, I can understand that. But since Checkbox is how testing on real hardware in the labs is conducted, and since testing on real hardware seems essential for the relevant use cases, @deusebio, it seems reasonable to expect that your team would become familiar enough with this test tool to help maintain the coverage.

Happy to discuss in a meeting before or at the sprint if you do not find consensus otherwise.

@deusebio
Collaborator

I feel there are a few different points of discussion we should unpack:

  1. using floating references (e.g. branches) vs pinned commit hashes
  2. ownership of the tests and the process for defining, implementing and maintaining them
  3. checkbox requirements for snaps/charms/rocks (who is mandating the requirement and whether there is any specification for this)

Yes, I believe having a dedicated meeting in Frankfurt would definitely be very helpful to unpack those properly. I would prefer during the roadmap week if possible, since the engineering sprint is looking more hectic. Would the relevant stakeholders be there in the first week already? I'm sure we will be able to understand the constraints and the latitude for compromise we have. As a general point, I would like to centralize the business logic for user testing in one place, as re-implementing the same tests in different frameworks or repositories would substantially increase the burden of maintenance and of keeping things in sync, leaving aside the current skill gap. That's really my main concern and point here.

@mz2
Collaborator

mz2 commented Apr 24, 2025

Agreed on all of that, let's resolve at the sprint and conclude back here (product week is indeed the better option for us too).
