Block to cpu function changes of base_block #620
I'm curious, is the We use numpy v1.26.x from what I see in the CI, and that version's docs for
If you're talking about the numpy array inside a block that is either:
I would imagine that, in general, those numpy arrays are not C-contiguous. This is because numpy arrays inside blocks that are read from a source or written to a sink originate from slicing the original numpy array that represents the chunk associated with a process for a given section, and generating a numpy array via slicing another numpy array doesn't produce a C-contiguous numpy array in general. Only under certain circumstances will slicing produce a C-contiguous numpy array, due to how slicing may or may not produce a view of the original data that is still C-contiguous (in particular, whether the elements within the view are stored contiguously in row-major order or not):
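As a minimal sketch of the point above (not the framework's actual code), the snippet below uses a hypothetical array `chunk` as a stand-in for a process's chunk and slices it two ways. The shapes and names are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the chunk held by one process.
chunk = np.zeros((6, 4, 5), dtype=np.float32)

# Slicing only along the first (slowest-varying) axis keeps the selected
# elements adjacent in row-major order, so the view is still C-contiguous.
block_axis0 = chunk[1:4, :, :]

# Slicing along a later axis skips elements between rows of the view,
# so the view is no longer C-contiguous.
block_axis1 = chunk[:, 1:3, :]

print(block_axis0.flags.c_contiguous)        # True
print(block_axis1.flags.c_contiguous)        # False
print(np.shares_memory(block_axis1, chunk))  # True: a view, not a copy
```

Whether a given block ends up contiguous therefore depends on which axis the slice is taken along.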
So, my naive assumption would be that numpy arrays representing chunks are C-contiguous (due to being created by As a reference, numpy docs here mention how slicing a numpy array often produces a "view" of the original numpy array.
Thanks for the clarification, @yousefmoazzam. You're indeed right about
Apologies, I forgot about this PR!
Ok, it sounds like adding order="C" to the np.empty() calls indeed makes no difference, so could those be removed please?
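For reference, a small check of why the argument is redundant: `np.empty` already defaults to `order="C"`, so both calls below allocate arrays with the same layout. The shape and dtype are arbitrary assumptions for illustration.

```python
import numpy as np

# np.empty defaults to order="C", so passing it explicitly changes nothing.
default_order = np.empty((3, 4), dtype=np.float32)
explicit_order = np.empty((3, 4), dtype=np.float32, order="C")

assert default_order.flags.c_contiguous
assert explicit_order.flags.c_contiguous
assert default_order.strides == explicit_order.strides
```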
OK to merge, @yousefmoazzam?
Fixes #619
An additional note: the data input to rescale_to_int (CPU) needs to be C-contiguous, and one approach would be to do the following using NumPy's API instead of CuPy's API as before in here:

However, I decided to add the conversion to C-contiguous on the method side, as this is a requirement of the method, mostly due to the C-wrapped code we use. Also, when the data on the CPU is converted to C-contiguous by the framework, it is counted by the monitor as a GPU transfer, which is confusing when plotting the times.

It would still be interesting to know why the data becomes non-C-contiguous when it is written in the sink or read by the source. Understanding that is a chance to remove the unnecessary data copy made by np.asarray(data, order="C").
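A minimal sketch of the method-side conversion described above, assuming a hypothetical helper named `ensure_c_contiguous` (this is not the actual `rescale_to_int` code). It illustrates that `np.asarray(data, order="C")` only copies when the input is not already C-contiguous.

```python
import numpy as np


def ensure_c_contiguous(data: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return data in C-contiguous layout.

    np.asarray copies only if the input does not already satisfy
    the requested memory order.
    """
    return np.asarray(data, order="C")


contiguous = np.arange(60, dtype=np.float32).reshape(4, 3, 5)
sliced = contiguous[:, 1:3, :]  # non-C-contiguous view

no_copy = ensure_c_contiguous(contiguous)
copied = ensure_c_contiguous(sliced)

print(np.shares_memory(no_copy, contiguous))  # True: input reused as-is
print(np.shares_memory(copied, sliced))       # False: a new buffer was made
print(copied.flags.c_contiguous)              # True
```

If blocks arrived C-contiguous from the source, the call would return the input unchanged and the extra copy would disappear.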
Checklist