When a pure CPU method runs with gpu_enabled set, a block that already holds a NumPy array still triggers a CuPy API call for data transfer, as in the code below:
    def to_cpu(self):
        if not gpu_enabled:
            return
        self._data = xp.asnumpy(self.data, order="C")
It is also possible that an unnecessary copy of the NumPy array is made in the process.
Logically, when the block is already a CPU block, no CuPy functions should be called on it.
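One possible guard, as a rough sketch rather than the project's actual fix: return early when the data is already a NumPy array, so xp.asnumpy is never reached for CPU blocks. The Block class, its constructor, and the data property below are hypothetical stand-ins for the real block type; only to_cpu, gpu_enabled, xp and _data come from the snippet above. The CuPy import fallback is also an assumption about how gpu_enabled and xp are set up.

```python
import numpy as np

try:
    import cupy as xp  # assumption: xp is CuPy when it can be imported
    gpu_enabled = True
except ImportError:
    xp = np
    gpu_enabled = False


class Block:
    """Hypothetical stand-in for the real block class."""

    def __init__(self, data):
        self._data = data

    @property
    def data(self):
        return self._data

    def to_cpu(self):
        # Nothing to do without a GPU, or when the data already lives
        # on the host: no CuPy call and no extra copy in either case.
        if not gpu_enabled or isinstance(self._data, np.ndarray):
            return
        self._data = xp.asnumpy(self._data, order="C")
```

With this check, calling to_cpu() on a block that wraps a NumPy array leaves the same array object in place, whether or not CuPy is installed. A cupy.ndarray is not an instance of np.ndarray, so GPU blocks would still go through xp.asnumpy.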
P.S. This was discovered while enabling rescale_to_int as a pure CPU method handled by the generic wrapper.