providers/base : add tests for Intel QAT (New) #1795

hector-cao · 2025-03-14T15:57:45Z

Description

This MR adds tests for Intel QuickAssist Technology (QAT) devices. QAT is an accelerator for crypto/compression operations, available on Intel server-class hardware equipped with Xeon processors.

Resolved issues

Documentation

Tests

I ran these tests on Intel hardware with and without QAT devices.
On hardware without QAT support, no tests are run: one test is skipped and no jobs are generated from the job templates.

On a machine with a 402xx QAT device (402xx driver), here is sample output of the results:

==================================[ Results ]===================================
 ☑ : Hardware Manifest
 ☑ : Detect Intel QuickAssist Technology device (>= Gen4)
 ☑ : Check PF sysfs for 01:00.0
 ☑ : Check PF sysfs for 0b:00.0
 ☑ : Check PF sysfs for 81:00.0
 ☑ : Check PF sysfs for 8b:00.0
 ☑ : Check SR-IOV support 01:00.0
 ☑ : Check SR-IOV support 0b:00.0
 ☑ : Check SR-IOV support 81:00.0
 ☑ : Check SR-IOV support 8b:00.0
 ☑ : Check telemetry data in debugfs for 01:00.0
 ☑ : Check telemetry data in debugfs for 0b:00.0
 ☑ : Check telemetry data in debugfs for 81:00.0
 ☑ : Check telemetry data in debugfs for 8b:00.0
 ☑ : Check VFIO-PCI support 01:00.0
 ☑ : Check VFIO-PCI support 0b:00.0
 ☑ : Check VFIO-PCI support 81:00.0
 ☑ : Check VFIO-PCI support 8b:00.0
 ☑ : Bring up and down device 01:00.0
 ☑ : Bring up and down device 0b:00.0
 ☑ : Bring up and down device 81:00.0
 ☑ : Bring up and down device 8b:00.0
 ☑ : Attach devices list
 ☑ : Collect information about installed software packages
 ☑ : Run CPA symmetric crypto tests
 ☑ : Run CPA RSA tests
 ☑ : Run compression tests
 ☑ : Run CPA symmetric crypto tests (in standalone mode)
 ☑ : Run CPA RSA tests (in standalone mode)
 ☑ : Run compression tests (in standalone mode)

bladernr · 2025-03-14T20:54:47Z

I made a couple inline comments... I'm interested in including this in the server suite. Mostly my questions are around the dependency on the detect job, which simply checks that someone ticked Y on a manifest entry (server suite doesn't use manifests, it uses resources).

Would it be reasonable for the detect job to pass if either condition is there (e.g. the resource job returns a pf: present or whatever if the QAT PFs are detected by qatctl.py, OR if the manifest is true)?

And add that extra bit to the resource, similar to other resource constraints in use? So, something like this for the detect job:

requires: manifest.has_intel_qat == 'True' or pf.present == 'True'

And keep the rest of it as-is with the dependency on the detect job?
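
For illustration only, here is a minimal Python sketch of how the "either condition" idea could look. The `present` field and both function names are hypothetical (they are not part of this PR or of qatctl.py); only the `key: value` resource record layout and the `requires:` expression come from the discussion above.

```python
def pf_resource_record(pfs):
    """Render a resource record as "key: value" lines.

    pfs is a list of PCI addresses of detected QAT PFs. The `present`
    key is hypothetical and mirrors the `pf.present` field proposed above.
    """
    present = "True" if pfs else "False"
    return "present: {}\n".format(present)


def detect_requirement_met(manifest_has_intel_qat, pf_present):
    """Mimic: manifest.has_intel_qat == 'True' or pf.present == 'True'.

    Both arguments are strings, as they would be in a requires expression.
    """
    return manifest_has_intel_qat == "True" or pf_present == "True"
```

With this shape, a server-suite run with no manifest answer would still start the detect job as soon as the resource job reports a PF.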

hector-cao · 2025-03-17T13:11:43Z

Thanks @bladernr for your feedback!

Based on your comments, I redesigned the tests. Please take a look and let me know what you would like to see improved.

QAT (Intel QuickAssist Technology) is an accelerator for crypto/compression operations. The hardware is available on recent Intel Xeon processors.

pieqq · 2025-05-12T13:17:13Z

@bladernr I just saw your previous comment:

Mostly my questions are around the dependency on the detect job, which simply checks that someone ticked Y on a manifest entry (server suite doesn't use manifests, it uses resources).

Manifest entries can be entered manually when running Checkbox, but they can also be pre-filled (it's just a file stored in /var/tmp/checkbox-ng/machine-manifest.json).

Also, manifest entries and resources do not serve exactly the same purpose. With a manifest, you tell Checkbox that this device does have a given piece of hardware, or a specific feature enabled. Resources retrieve the information automatically from the system, which may lead to jobs being skipped if the resource script fails, or if the driver is not properly loaded and therefore the feature is not exposed to the user.

A typical example is has_wlan_adapter, which tells Checkbox whether or not the device has WiFi. The wireless/detect test will then try to find a wireless interface only if this manifest is set to True. This makes it possible to catch situations where we know a device should have WiFi, but the driver failed to load.

The tutorial has a whole page about how manifests work, you can check it out.
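
To make the difference concrete, here is a small Python sketch of how a manifest-gated detect job behaves. It is a simplified model, not Checkbox code: the function name and the three outcome strings are assumptions made for the example.

```python
def detect_outcome(manifest_value, devices_found):
    """Return the outcome of a detect job gated on a manifest entry.

    manifest_value: string stored for the manifest entry ("True" or "False").
    devices_found:  number of devices the detect command actually sees.
    """
    if manifest_value != "True":
        # The requirement is not met, so the job is not started at all.
        return "skip"
    # The manifest promises the hardware, so a missing device is a failure
    # (for example, the driver did not load) rather than a silent skip.
    return "pass" if devices_found > 0 else "fail"
```

The interesting case is `detect_outcome("True", 0)`: a purely resource-based requirement would skip the job there, while the manifest turns it into a visible failure.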

pieqq

Sorry for taking so long to provide feedback! Please check my comments to see if they make sense.

pieqq · 2025-05-14T09:28:21Z

providers/base/units/intel-qat/category.pxu

@@ -0,0 +1,3 @@
+unit: category
+id: intel-qat
+_name: Intel Quick-Assist Technology


Suggested change

-_name: Intel Quick-Assist Technology
+_name: Intel QuickAssist Technology

As per the Intel page.

pieqq · 2025-05-14T09:31:09Z

providers/base/units/intel-qat/jobs.pxu

+command:
+  PFS=$(qatctl.py list --short | wc -l)
+  if [ "${PFS}" -le 0 ]; then
+    echo "manifest.has_intel_qat is set to True but no device found !"


Suggested change

-    echo "manifest.has_intel_qat is set to True but no device found !"
+    echo "This system is supposed to support Intel QuickAssist Technology, but no Intel QAT device was found!"

pieqq · 2025-05-14T13:34:49Z

providers/base/bin/qatctl.py

Can you provide unit tests for this file?

pieqq · 2025-05-14T13:37:34Z

providers/base/units/intel-qat/jobs.pxu

+unit: template
+template-resource: qat
+template-engine: jinja2
+template-unit: job
+id: intel-qat-common/{{ available }}-attach-devices


Suggested change

-unit: template
-template-resource: qat
-template-engine: jinja2
-template-unit: job
-id: intel-qat-common/{{ available }}-attach-devices
+unit: template
+template-resource: qat
+template-unit: job
+id: intel-qat-common/{available}-attach-devices

Template jobs use Python string formatting by default. I don't think jinja2 is needed here (nor in any of the following template jobs in this file).
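
A quick illustration of the default expansion, assuming it behaves like Python's `str.format()` applied with the fields of one resource record. The record values below are made up for the example.

```python
# Template id written with plain Python format placeholders (no jinja2).
template_id = "intel-qat-common/{available}-attach-devices"

# One resource record, shaped like the output of the `qat` resource job.
record = {"pf": "01:00.0", "driver": "4xxx", "available": "qat"}

# Unused keys are ignored by str.format(), so the whole record can be passed.
job_id = template_id.format(**record)
print(job_id)  # intel-qat-common/qat-attach-devices
```

The jinja2 form `{{ available }}` would only be needed for features that plain formatting lacks, such as conditionals or filters.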

pieqq · 2025-05-14T13:40:54Z

providers/base/units/intel-qat/jobs.pxu

+  package.name == 'qatlib-examples'
+  package.name == 'qatlib-service'


How are these packages made available on the system? Are they pulled from the official repos, or do they need to be built?

pieqq · 2025-05-14T13:42:50Z

providers/base/units/intel-qat/jobs.pxu

+  # switch all devices to crypto sym mode
+  printf "POLICY=0\nServicesEnabled=sym\n" | tee /etc/sysconfig/qat
+  systemctl restart qat
+  cpa_sample_code runTests=1


How is cpa_sample_code made available? (I assume it's part of the packages mentioned above?)

pieqq · 2025-05-14T14:06:50Z

providers/base/units/intel-qat/jobs.pxu

+  rmmod vfio-pci || true
+  nb_vfio=$(qatctl.py status --devices {{ pf }} --vfio | wc -l)
+  [ "$nb_vfio" -le 0 ] || (echo "nb vfio devices should be <= 0" && exit 1)
+  # we have to pass the VF device ids
+  modprobe vfio-pci ids=8086:4941,8086:4943,8086:4945,8086:4947
+  nb_vfio=$(qatctl.py status --devices {{ pf }} --vfio | wc -l)
+  [ "$nb_vfio" -gt 0 ] || (echo "nb vfio devices should be > 0" && exit 1)


Several suggestions (some apply to other jobs in this PR too):

- Put these into a separate bash script in bin/, and start it with set -e so that it fails as soon as possible if anything goes wrong.
- A check flag could be added to the qatctl.py script to avoid relying on additional bash commands (such as [ "$nb_vfio" -le 0 ] || (echo "nb vfio devices should be <= 0" && exit 1)).
- Maybe this test could be split in 2 (the second test would depend on the first):
  - check that no VFIO files are present if the vfio-pci module is removed
  - check that VFIO files are there when the module is reloaded
- If something goes wrong before you reload the vfio-pci module, all the tests running after will fail, so you probably have to make sure the module is reloaded regardless of the outcome.
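
One way to guarantee the reload, sketched here in Python with `try`/`finally` instead of the bash script suggested above (in bash, a `trap` on EXIT would play the same role). This is only an outline under assumptions: `check_vfio` and the `count_vfio` callable are hypothetical, with `count_vfio` standing in for whatever wraps `qatctl.py status --vfio`.

```python
import subprocess

# VF device ids copied from the modprobe line in the job above.
VF_IDS = "8086:4941,8086:4943,8086:4945,8086:4947"


def check_vfio(count_vfio, run=subprocess.run):
    """Check that VFIO devices vanish without vfio-pci and return with it.

    count_vfio: callable returning the current number of VFIO devices
                (hypothetical wrapper, not an existing qatctl.py function).
    run:        command runner, injectable so the logic can be unit tested.
    """
    try:
        # check=False ignores a failing rmmod, like `rmmod vfio-pci || true`.
        run(["rmmod", "vfio-pci"], check=False)
        if count_vfio() > 0:
            raise SystemExit("VFIO devices still present after rmmod")
    finally:
        # Reload the module whatever happened, so later jobs are not affected.
        run(["modprobe", "vfio-pci", "ids=" + VF_IDS], check=True)
    if count_vfio() <= 0:
        raise SystemExit("no VFIO device after reloading vfio-pci")
```

Because the runner is a parameter, the failure path (devices still present after `rmmod`) can be exercised in a unit test without touching kernel modules.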

pieqq · 2025-05-14T14:08:13Z

providers/base/units/intel-qat/manifest.pxu

@@ -0,0 +1,4 @@
+unit: manifest entry
+id: has_intel_qat
+_name: A Intel Quick-Assist Technology (QAT) device


Suggested change

-_name: A Intel Quick-Assist Technology (QAT) device
+_name: An Intel QuickAssist Technology (QAT) device

pieqq · 2025-05-14T14:10:50Z

providers/base/units/intel-qat/resource.pxu

+  for pf in ${PFS}; do
+    driver_path=$(readlink /sys/bus/pci/devices/0000:"${pf}"/driver)
+    driver=$(basename "${driver_path}")
+    if [ "${driver}" == "4xxx" ] || [ "${driver}" == "420xx" ]; then
+      echo "pf: ${pf}"
+      echo "driver: ${driver}"
+      echo "available: qat"
+      echo ""
+      break
+    fi
+  done


This should probably be put into the python script directly (maybe as a qatctl.py resource command). Easier to test and to maintain.
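
A rough sketch of what such a resource command could look like in Python. It mirrors the shell loop above (first PF bound to a `4xxx` or `420xx` driver, then stop); the function name `qat_resource` and its `sysfs` parameter are assumptions for the example, not existing qatctl.py code.

```python
import os

# Driver names accepted by the shell loop above.
SUPPORTED_DRIVERS = ("4xxx", "420xx")


def qat_resource(pfs, sysfs="/sys/bus/pci/devices"):
    """Return the resource record for the first PF with a supported driver.

    pfs:   PCI addresses without the domain, e.g. "01:00.0".
    sysfs: root of the PCI device tree, overridable for unit tests.
    """
    for pf in pfs:
        link = os.path.join(sysfs, "0000:" + pf, "driver")
        if not os.path.islink(link):
            continue  # no driver bound to this device
        driver = os.path.basename(os.readlink(link))
        if driver in SUPPORTED_DRIVERS:
            # Same fields as the shell version: one record, then stop.
            return "pf: {}\ndriver: {}\navailable: qat\n\n".format(pf, driver)
    return ""
```

Taking the sysfs root as a parameter means a test can point the function at a temporary directory containing fake `driver` symlinks.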

pieqq · 2025-05-14T14:12:31Z

providers/base/units/intel-qat/resource.pxu

+#   Hector Cao <[email protected]>
+
+unit: job
+id: qat_pf


This resource job looks like it's doing exactly the same thing as qat below, except it doesn't add the available field. Consider removing it and relying on qat only.

hector-cao force-pushed the dev-add-qat-tests branch 2 times, most recently from c762ecf to 6cacc6d Compare March 14, 2025 16:09

hector-cao changed the title ~~providers/base : add tests for Intel QAT~~ providers/base : add tests for Intel QAT (New) Mar 14, 2025

hector-cao force-pushed the dev-add-qat-tests branch 2 times, most recently from 076bda5 to 5650fda Compare March 14, 2025 16:19

hector-cao force-pushed the dev-add-qat-tests branch 5 times, most recently from 8432a90 to a889a1d Compare March 17, 2025 13:08

hector-cao force-pushed the dev-add-qat-tests branch from a889a1d to d462f2b Compare March 17, 2025 14:50

providers/base : add tests for Intel QAT

0561f5c

QAT (Intel QuickAssist Technology) is an accelerator for crypto/compression operations. The hardware is available on recent Intel Xeon processors.

hector-cao force-pushed the dev-add-qat-tests branch from d462f2b to 0561f5c Compare March 17, 2025 15:36

fernando79513 assigned fernando79513 and pieqq and unassigned fernando79513 May 13, 2025

pieqq requested changes May 14, 2025
