-
-
Notifications
You must be signed in to change notification settings - Fork 26
Fancy index #459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Fancy index #459
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
FrancescAlted
approved these changes
Aug 26, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look great to me. It would be interesting to see how the new performance compares with h5py and zarr, for a reference.
7 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Edited code to try and make fancy indexing faster (avoid searching full index every time via ndindex:as_subidx) and incorporate 1D fast path better. Could maybe be revised to allow for interleaved slice and integer arrays (currently it follows ndindex in only allowing slices before and after the integer arrays). Improvements are not huge, but there is a performance benefit, compared to the old algorithm, which found intersecting chunks via ndindex, and looped over those chunks, converting the full index to a local chunk index at every iteration. This involved searching the whole index for every loop (it remains as the default algorithm for boolean indexing).
The algorithm adaps he Zarr vindexing implementation which uses a sorting-based algorithm. Intersecting chunks are found (for slices and integer arrays), the integer array is sorted (by chunks for N>1 integer arrays, by index for a single integer array), and then the code loops over the chunks, taking the indexes corresponding to the chunk from the sorted integer array. For a single integer array, only the relevant part of the chunk (between the minimum and maximum index within a chunk) is loaded - for several integer arrays, because of how
get_slice_numpy
is written, it is necessary to load the whole chunk and then index on it.Hence on avoids repeated searches over the whole index. There is a small detail in using np.unique, which does not quite function like np.bincount, since it sorts and may do a copy, which can cause a slowdown if not handled correctly.
Improvements could be made to try and find a way to use np.bincount directly, as I still do not completely trust np.unique to be fast. Also one could rewrite to interleave slices and indices (matching numpy).
ndindex.expand
also seems to consume quite a bit of memory (via copying) when broadcasting indices, I thinknp.broadcast
would just do a view so might be better. Finally, one could rewriteget_slice_numpy
to allow continuous, flat returns, and not necessarily only slices (a sequence of slices, which is multidimensional ''rectangular'', differs from a range from one multidimensional index to another, which is essentially 1D). See #441.From the plot one can see that the new algorithm is just as fast as the old optimised 1D path, although it handles more cases.
