Skip to content

Conversation

@aruneshpa
Copy link
Collaborator

What does this PR do, and why is it needed?

Implement mutation webhook to automatically add SCSI controllers when volumes need them, checking status.hardware.controllers for capacity. We create new ParaVirtual SCSI controller (max 4) only when all existing controllers with matching requirements are full.

Update the validation webhooks to enforce controller slot limits based on controller type (ParaVirtual: 63 slots, others: 15 slots).

Handles OracleRAC (volume sharingMode=MultiWriter -> controller sharingMode=None) and MicrosoftWSFC (applicationType -> controller sharingMode=Physical).

This change also sets controllerBusNumber on volume when creating new controller. Otherwise, there's a chance that after the mutation webhooks determines that we need a new controller and adds it to the spec, another volume has been detached and a slot frees up on an existing controller. So, now we have a controller that does not have any devices attached. To handle that, we are explicitly setting the controllerBusnumber for a volume for which we are creating the controller. That way, we guarantee that the device is attached to the controller that we created.

Please add a release note if necessary:

Add SCSI controller auto-creation and slot validation for volumes

@github-actions github-actions bot added the size/XXL Denotes a PR that changes 1000+ lines. label Oct 15, 2025
Implement mutation webhook to automatically add SCSI controllers when
volumes need them, checking status.hardware.controllers for capacity.
We create new ParaVirtual SCSI controller (max 4) only when all existing
controllers with matching requirements are full.

Update the validation webhooks to enforce controller slot limits based on
controller type (ParaVirtual: 63 slots, others: 15 slots).

Handles OracleRAC (volume sharingMode=MultiWriter -> controller
sharingMode=None) and MicrosoftWSFC (applicationType -> controller
sharingMode=Physical).

This change also sets controllerBusNumber on volume when creating new
controller.  Otherwise, there's a chance that after the mutation
webhooks determines that we need a new controller and adds it to the
spec, another volume has been detached and a slot frees up on an
existing controller.  So, now we have a controller that does not have
any devices attached.  To handle that, we are explicitly setting the
controllerBusnumber for a volume for which we are creating the
controller.  That way, we guarantee that the device is attached to the
controller that we created.
@faisalabujabal faisalabujabal force-pushed the features/rac-and-wsfc/add-scsi-controller-in-mutation-webhook branch 2 times, most recently from 712960c to ca088f2 Compare October 28, 2025 13:35
@faisalabujabal faisalabujabal force-pushed the features/rac-and-wsfc/add-scsi-controller-in-mutation-webhook branch from ca088f2 to 46e95b0 Compare October 28, 2025 13:49
@github-actions
Copy link

Code Coverage

Package Line Rate Health
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/clustercontentlibraryitem 67%
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/contentlibraryitem 67%
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/utils 46%
github.com/vmware-tanzu/vm-operator/controllers/infra/capability/configmap 92%
github.com/vmware-tanzu/vm-operator/controllers/infra/capability/crd 100%
github.com/vmware-tanzu/vm-operator/controllers/infra/configmap 75%
github.com/vmware-tanzu/vm-operator/controllers/infra/node 77%
github.com/vmware-tanzu/vm-operator/controllers/infra/secret 76%
github.com/vmware-tanzu/vm-operator/controllers/infra/validatingwebhookconfiguration 87%
github.com/vmware-tanzu/vm-operator/controllers/infra/zone 73%
github.com/vmware-tanzu/vm-operator/controllers/storageclass 95%
github.com/vmware-tanzu/vm-operator/controllers/storagepolicyquota 98%
github.com/vmware-tanzu/vm-operator/controllers/util/encoding 73%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/storagepolicyusage 96%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/virtualmachine 69%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volume 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volumebatch 81%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineclass 73%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinegroup 89%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinegrouppublishrequest 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineimagecache 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinepublishrequest 82%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinereplicaset 67%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice 82%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice/providers 92%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesetresourcepolicy 81%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesnapshot 92%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest 72%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1 72%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/conditions 88%
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/patch 78%
github.com/vmware-tanzu/vm-operator/controllers/vspherepolicy/policyevaluation 93%
github.com/vmware-tanzu/vm-operator/pkg/bitmask 100%
github.com/vmware-tanzu/vm-operator/pkg/builder 94%
github.com/vmware-tanzu/vm-operator/pkg/conditions 90%
github.com/vmware-tanzu/vm-operator/pkg/config 100%
github.com/vmware-tanzu/vm-operator/pkg/config/capabilities 98%
github.com/vmware-tanzu/vm-operator/pkg/config/env 100%
github.com/vmware-tanzu/vm-operator/pkg/context 25%
github.com/vmware-tanzu/vm-operator/pkg/context/generic 100%
github.com/vmware-tanzu/vm-operator/pkg/context/operation 100%
github.com/vmware-tanzu/vm-operator/pkg/crd 75%
github.com/vmware-tanzu/vm-operator/pkg/errors 75%
github.com/vmware-tanzu/vm-operator/pkg/exit 100%
github.com/vmware-tanzu/vm-operator/pkg/log 100%
github.com/vmware-tanzu/vm-operator/pkg/mem 100%
github.com/vmware-tanzu/vm-operator/pkg/patch 78%
github.com/vmware-tanzu/vm-operator/pkg/prober 89%
github.com/vmware-tanzu/vm-operator/pkg/prober/probe 90%
github.com/vmware-tanzu/vm-operator/pkg/prober/worker 77%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere 74%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/clustermodules 73%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/config 88%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/contentlibrary 75%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/credentials 100%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/network 81%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/placement 74%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/session 51%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/storage 44%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/upgrade/virtualmachine 96%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vcenter 85%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/virtualmachine 86%
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vmlifecycle 71%
github.com/vmware-tanzu/vm-operator/pkg/record 87%
github.com/vmware-tanzu/vm-operator/pkg/topology 91%
github.com/vmware-tanzu/vm-operator/pkg/util 77%
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit 89%
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit/validate 91%
github.com/vmware-tanzu/vm-operator/pkg/util/image 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube 95%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/cource 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/internal 100%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/proxyaddr 73%
github.com/vmware-tanzu/vm-operator/pkg/util/kube/spq 99%
github.com/vmware-tanzu/vm-operator/pkg/util/linuxprep 97%
github.com/vmware-tanzu/vm-operator/pkg/util/netplan 100%
github.com/vmware-tanzu/vm-operator/pkg/util/nil 100%
github.com/vmware-tanzu/vm-operator/pkg/util/ovfcache 75%
github.com/vmware-tanzu/vm-operator/pkg/util/ovfcache/internal 100%
github.com/vmware-tanzu/vm-operator/pkg/util/paused 100%
github.com/vmware-tanzu/vm-operator/pkg/util/ptr 100%
github.com/vmware-tanzu/vm-operator/pkg/util/resize 98%
github.com/vmware-tanzu/vm-operator/pkg/util/sysprep 98%
github.com/vmware-tanzu/vm-operator/pkg/util/vmopv1 88%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/client 66%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/library 96%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/vm 79%
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/watcher 85%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig 95%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/anno2extraconfig 100%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/bootoptions 97%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/crypto 91%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/diskpromo 100%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/policy 96%
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/virtualcontroller 85%
github.com/vmware-tanzu/vm-operator/pkg/webconsolevalidation 100%
github.com/vmware-tanzu/vm-operator/services/vm-watcher 85%
github.com/vmware-tanzu/vm-operator/webhooks/common 98%
github.com/vmware-tanzu/vm-operator/webhooks/persistentvolumeclaim/validation 95%
github.com/vmware-tanzu/vm-operator/webhooks/unifiedstoragequota/validation 88%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/mutation 84%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/validation 94%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/mutation 62%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegroup/mutation 87%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegroup/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegrouppublishrequest/mutation 86%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinegrouppublishrequest/validation 88%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinepublishrequest/validation 93%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinereplicaset/validation 90%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/mutation 67%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesetresourcepolicy/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesnapshot/mutation 83%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesnapshot/validation 89%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/v1alpha1/validation 92%
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/validation 92%
Summary 82% (16265 / 19854)

Minimum allowed line rate is 79%

Copy link
Collaborator

@akutz akutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start @faisalabujabal , thank you! In addition to the comments on the review:

  • The linter will ping any comments that do not end with periods, so please take care to do that.
  • Go is more efficient at memory management when shared by copy than by address. Unless functions need to accept an argument by address or need to return a value by address, then prefer by-copy instead.
  • There is extensive use of the continue directive in these changes. The use of continue is fine, but it will mean that you have to provide coverage for that logic, versus inverting and indenting and only covering the case about which you care.

var (
// MaxSCSISlotsByType is the maximum number of devices per SCSI controller
// type.
// Note: The controller itself occupies one slot.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth noting here that it is always slot 7.

// MaxSCSISlotsByType is the maximum number of devices per SCSI controller
// type.
// Note: The controller itself occupies one slot.
MaxSCSISlotsByType = map[vmopv1.SCSIControllerType]int32{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a function off of the type itself in api/v1alpha5, ex.:

diff --git a/api/v1alpha5/virtualmachine_hardware_types.go b/api/v1alpha5/virtualmachine_hardware_types.go
index 6bd5c3cf..0de95aab 100644
--- a/api/v1alpha5/virtualmachine_hardware_types.go
+++ b/api/v1alpha5/virtualmachine_hardware_types.go
@@ -17,6 +17,21 @@ const (
 	VirtualControllerTypeSATA VirtualControllerType = "SATA"
 )
 
+// MaxSlots returns the the maximum number of slots per controller
+// type.
+func (t VirtualControllerType) MaxSlots() int32 {
+	switch t {
+	case VirtualControllerTypeIDE:
+		return 2
+	case VirtualControllerTypeNVME:
+		return 4
+	case VirtualControllerTypeSATA:
+		return 4
+	default:
+		return 0
+	}
+}
+
 type VirtualControllerSharingMode string
 
 const (
@@ -36,6 +51,21 @@ const (
 	SCSIControllerTypeLsiLogicSAS     SCSIControllerType = "LsiLogicSAS"
 )
 
+// MaxSlots returns the the maximum number of slots per SCSI controller
+// type. Please note, the controller itself occupies one slot.
+func (t SCSIControllerType) MaxSlots() int32 {
+	switch t {
+	case SCSIControllerTypeParaVirtualSCSI:
+		return 63 // 64 - 1 slot for the SCSI controller
+	case SCSIControllerTypeBusLogic,
+		SCSIControllerTypeLsiLogic,
+		SCSIControllerTypeLsiLogicSAS:
+		return 15 // 16 - 1 slot for the SCSI controller
+	default:
+		return 0
+	}
+}
+
 // +kubebuilder:validation:Enum=CDROM;Disk
 
 type VirtualDeviceType string
@@ -54,6 +84,10 @@ type IDEControllerSpec struct {
 	BusNumber int32 `json:"busNumber"`
 }
 
+func (t IDEControllerSpec) MaxSlots() int32 {
+	return VirtualControllerTypeIDE.MaxSlots()
+}
+
 type NVMEControllerSpec struct {
 	// +required
 	// +kubebuilder:validation:Minimum=0
@@ -81,6 +115,10 @@ type NVMEControllerSpec struct {
 	SharingMode VirtualControllerSharingMode `json:"sharingMode,omitempty"`
 }
 
+func (t NVMEControllerSpec) MaxSlots() int32 {
+	return VirtualControllerTypeNVME.MaxSlots()
+}
+
 type SATAControllerSpec struct {
 	// +required
 	// +kubebuilder:validation:Minimum=0
@@ -99,6 +137,10 @@ type SATAControllerSpec struct {
 	PCISlotNumber *int32 `json:"pciSlotNumber,omitempty"`
 }
 
+func (t SATAControllerSpec) MaxSlots() int32 {
+	return VirtualControllerTypeSATA.MaxSlots()
+}
+
 type SCSIControllerSpec struct {
 	// +required
 	// +kubebuilder:validation:Minimum=0
@@ -134,6 +176,10 @@ type SCSIControllerSpec struct {
 	Type SCSIControllerType `json:"type,omitempty"`
 }
 
+func (t SCSIControllerSpec) MaxSlots() int32 {
+	return t.Type.MaxSlots()
+}
+
 type VirtualDeviceStatus struct {
 	// +required
 


// MaxSCSIControllersPerVM is the maximum number of SCSI controllers
// allowed per VM. (4 controllers, indexed 0-3).
MaxSCSIControllersPerVM = int32(4)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this and the next variable a constant, ex.:

const (
	// MaxSCSIControllersPerVM is the maximum number of SCSI controllers 
	// allowed per VM. (4 controllers, indexed 0-3).
	MaxSCSIControllersPerVM int32 = 4

	// SCSIControllerUnitNumber is the unit number reserved for SCSI controller 
	// on its own bus.
	SCSIControllerUnitNumber int32 = 7
)

func MaxSlotsForControllerType(ctrlType vmopv1.SCSIControllerType) int32 {
if _, ok := MaxSCSISlotsByType[ctrlType]; !ok {
// Unknown controller type, default to ParaVirtual
return MaxSCSISlotsByType[vmopv1.SCSIControllerTypeParaVirtualSCSI]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. If it is unknown then declaring it to be 63 slots, that is an issue. How will an unknown controller type even work?

Plus, you can get rid of this with the change I requested up above anyway.

return MaxSCSISlotsByType[ctrlType]
}

// ControllerHasAvailableSlots checks if a controller has available slots for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things:

  • Move this to pkg/util/vmopv1 as it is the utility package for functions that act on just the Kube APIs.
  • Create a new file called hardware.go.
  • In the new files:
    • Define an interface called HasMaxSlots
      // HasMaxSlots is an interface describing a type with a function that
      // returns a maximum number of device slots.
      type HasMaxSlots interface {
          MaxSlots() int32
      }
    • Rename this function to NextAvailableUnitNumber
      // NextAvailableUnitNumber returns the next available unit number for a given
      // controller. A value of -1 is returned if there are no available slots remaining.
      func NextAvailableUnitNumber(
          controller HasMaxSlots,
          busNumber int32,
          statusDeviceCounts, volumeAssignments map[int32]int32) int32 {
      
          // Get max slots for this controller.
          maxSlots := controller.MaxSlots()
          if maxSlots == 0 {
              return -1
          }
      
          // Get current device count from status.
          currentDevices := statusDeviceCounts[busNumber]
      
          // Get number of volumes already assigned to this controller in this mutation.
          assignedVolumes := volumeAssignments[busNumber]
      
          // Check if adding one more volume would exceed capacity.
          return (currentDevices + assignedVolumes + 1) <= maxSlots
      }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change the implementation to return the next available unit number if the caller doesn't care about the unit number but only cares about if a slot is available?

func createSCSIController(
busNumber int32,
sharingMode vmopv1.VirtualControllerSharingMode,
) *vmopv1.SCSIControllerSpec {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be returned by address?


const (
invalidControllerBusNumberRangeFmt = "must be between 0 and %d"
invalidControllerBusNumberDoesNotExist = "SCSI controller with bus number %d does not exist"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify and make generic, ex.: controller %s:%d does not exist. That may be:

  • controller SCSI:0 does not exist
  • controller IDE:1 does not exist
  • controller NVME:2 does not exist
  • controller SATA:3 does not exist

invalidControllerBusNumberDoesNotExist = "SCSI controller with bus number %d does not exist"
invalidUnitNumberReserved = "unit number 7 is reserved for the SCSI controller itself"
invalidUnitNumberRangeFmt = "unit number must be less than %d for %s controller"
invalidControllerCapacityFmt = "SCSI controller %d (type=%s) has %d devices and cannot accommodate %d more volumes (%v). Maximum slots: %d"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify and make generic, ex.: controller %s:%d full, maxDevices=%d. That may be:

  • controller SCSI:0 full, maxDevices=63
  • controller IDE:1 full, maxDevices=4
  • controller NVME:2 full, maxDevices=4
  • controller SATA:3 full, maxDevices=4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XXL Denotes a PR that changes 1000+ lines.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants