[RELEASE] kvikio v25.10 #845
Open
AyodeAwe
wants to merge
45
commits into
main
from
branch-25.10
Forward-merge branch-25.08 into branch-25.10
Forward-merge branch-25.08 into branch-25.10
Forward-merge branch-25.08 into branch-25.10
Forward-merge branch-25.08 into branch-25.10
This PR removes the OS suffix from devcontainers, allowing the upstream devcontainer images to determine the OS version. Contributes to rapidsai/build-planning#200. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Gil Forsyth (https://github.com/gforsyth) URL: #780
cuDF PR rapidsai/cudf#19164 currently has 4 failed unit tests when `LIBCUDF_MMAP_ENABLED=ON`:

```
28 - CSV_TEST (Failed)
29 - ORC_TEST (Failed)
32 - JSON_TEST (Failed)
40 - DATA_CHUNK_SOURCE_TEST (Failed)
```

The fix entails code changes on both the KvikIO and cuDF sides. On the KvikIO side, the `MmapHandle::read()` and `MmapHandle::pread()` methods need to:
- Allow the read size to be 0
- Allow `offset` to be equal to `initial_map_offset` (when the read size is 0)

This PR makes this change. In addition, this PR adds more detailed error messages when an out-of-range exception occurs.

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #781
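For illustration only, the relaxed bounds check described in #781 might look like the Python sketch below. `validate_read_range` and its parameters are hypothetical names, not KvikIO API, and the exact boundary rules in `MmapHandle` may differ. In particular, allowing a zero-size read anywhere up to and including the end of the mapping is an assumption.

```python
def validate_read_range(offset, size, map_offset, map_size):
    """Hypothetical bounds check for reading from a mapped region.

    The mapped region is assumed to cover [map_offset, map_offset + map_size).
    Raises IndexError with a detailed message when the read is out of range.
    """
    map_end = map_offset + map_size
    if size < 0:
        raise ValueError(f"read size must be non-negative, got {size}")
    if size == 0:
        # Assumption: a zero-size read is valid at any offset in
        # [map_offset, map_end], including offset == map_offset.
        if not (map_offset <= offset <= map_end):
            raise IndexError(
                f"offset {offset} is outside the mapped range "
                f"[{map_offset}, {map_end}]"
            )
        return
    if offset < map_offset or offset + size > map_end:
        raise IndexError(
            f"read of {size} bytes at offset {offset} is outside the "
            f"mapped range [{map_offset}, {map_end})"
        )
```

The detailed messages mirror the intent of the PR: an out-of-range error should say which offset and size were requested and what the valid range was.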
conda-forge is migrating to gcc 14, so this PR is updating for alignment. See rapidsai/build-planning#188 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Gil Forsyth (https://github.com/gforsyth) URL: #756
rapids_config will use `RAPIDS_BRANCH` contents to determine what branch to use Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Bradley Dice (https://github.com/bdice) URL: #776
Closes #773 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #784
This PR changes KvikIO C++ standard from 17 to 20. Depends on #751 Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) - Bradley Dice (https://github.com/bdice) URL: #749
The mmap unit tests contain lambda expressions that capture the current object (`*this`) inconsistently: some use `[&]` and others use `[=]`. In both cases, `*this` is captured by reference. However, in C++20, implicit capture of `this` when the capture default is `=` is deprecated. This PR fixes the resulting warnings by consistently using `[&]`, on the grounds that `*this` outlives every point where the closure is called. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #785
On top of #740, this PR provides Python binding for file-backed memory mapping. Closes #530 Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Tom Augspurger (https://github.com/TomAugspurger) - Mads R. B. Kristensen (https://github.com/madsbk) URL: #742
## Background
Knowing the size of the remote file before reading is important in remote I/O, as it allows users to pre-allocate a buffer and avoid expensive on-the-fly reallocation. Currently in KvikIO this is not possible for an AWS S3 presigned URL, which is a special link generated by the data owner to grant time-limited access without using AWS credentials. As described in #585, a file size query in KvikIO results in the HTTP 403 (forbidden) status code. This is because the query method is based on the `HEAD` request, and AWS S3 does not allow `HEAD` for presigned URLs.

## Proposed solution
This PR provides a solution. The idea is to send a `GET` request (instead of `HEAD`) with a 1-byte range, so that we can still obtain the header information at a negligible cost. Since the `content-length` header is now at a fixed value of 1, we instead extract the file size value from `content-range`.

This PR adds a new C++ endpoint `S3EndpointWithPresignedUrl` and Python API `kvikio.RemoteFile.open_s3_presigned_url(url)`.

## Result
The following code now works properly without a 403 error:

```python
import kvikio
import cupy

presigned_url = "<long_url_generated_by_data_owner>"
remote_file = kvikio.RemoteFile.open_s3_presigned_url(presigned_url)
print("--> file size: {:}".format(remote_file.nbytes()))

# cupy.zeros defaults to float64 (8 bytes per element)
buf = cupy.zeros(remote_file.nbytes() // 8)
fut = remote_file.pread(buf)
read_size = fut.get()
print("--> read_size: {:}".format(read_size))
print(buf)
```

## Limitation
This PR is tested manually using a presigned URL. In a future PR, we need to add unit tests using `boto`.

Closes #585

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Lawrence Mitchell (https://github.com/wence-)

URL: #789
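As a rough sketch of the size query described in #789: a `GET` with the header `Range: bytes=0-0` returns a `Content-Range` value of the form `bytes 0-0/<total>`, and the total is the file size. The helper name `file_size_from_content_range` is hypothetical and this is not KvikIO's actual parsing code, which is written in C++.

```python
import re

# Matches values such as "bytes 0-0/12345"; the total may be "*" when unknown.
_CONTENT_RANGE = re.compile(r"^bytes\s+(\d+)-(\d+)/(\d+|\*)$")


def file_size_from_content_range(value: str) -> int:
    """Extract the total object size from a Content-Range header value."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized Content-Range value: {value!r}")
    total = match.group(3)
    if total == "*":
        raise ValueError("server did not report the complete length")
    return int(total)
```

For example, `file_size_from_content_range("bytes 0-0/12345")` yields `12345`, whereas the `content-length` of the same response would only report the single byte that was requested.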
Issue: rapidsai/build-planning#207 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Lawrence Mitchell (https://github.com/wence-) - James Lamb (https://github.com/jameslamb) URL: #790
Forward-merge branch-25.08 into branch-25.10
rapids_config will use a user defined branch over `RAPIDS_BRANCH` contents Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #794
## Summary
This PR adds WebHDFS support to KvikIO. The background information is available at #787.

## Limitations
This PR does not address:
- Idiomatic and secure URL parsing and validation
- Testing on URL encoding/decoding (which means percent-decoded URL may or may not work at the moment)
- Advanced authentication such as Kerberos

These features will be added in the future.

Partially addresses #787

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)
- Bradley Dice (https://github.com/bdice)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Bradley Dice (https://github.com/bdice)

URL: #788
…eemed necessary (#796)

This PR improves the Python binding performance by releasing the Global Interpreter Lock (GIL) wherever necessary. The tasks include:
- For function declarations, add `nogil` if missing. Only one such case has been identified, which defines an embedded template function.
- At the call site of a C++ function, add `with nogil` context if missing. All the other changes fall into this category.

Closes #795

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #796
## Summary
This PR adds Python binding for the WebHDFS support.

Depends on PR #788

Closes #787

Python's built-in package `http.server` is well suited to server mocking. It enables high-level testing for the client. Closes #634 too.

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #791
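To illustrate the kind of server mocking that `http.server` makes possible, here is a minimal sketch of a mock WebHDFS server answering a `GETFILESTATUS` query. It is not the test code from #791: the handler name, the file path, and the reported length of 1024 are made up for the example, and the client side uses `urllib` rather than KvikIO.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockFileStatusHandler(BaseHTTPRequestHandler):
    """Hypothetical mock that answers every GET with a WebHDFS FileStatus."""

    def do_GET(self):
        body = json.dumps({"FileStatus": {"length": 1024, "type": "FILE"}}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def query_mock_length():
    """Start the mock server on a free port, query it once, and shut it down."""
    server = HTTPServer(("127.0.0.1", 0), MockFileStatusHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        url = f"http://127.0.0.1:{port}/webhdfs/v1/data/file.bin?op=GETFILESTATUS"
        # Bypass any proxy configured in the environment for the local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            return json.load(response)["FileStatus"]["length"]
    finally:
        server.shutdown()
        server.server_close()
```

Binding to port 0 lets the operating system pick a free port, so several such tests can run side by side without colliding.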
Removes the features that utilize nvCOMP - Python bindings and Zarr 2 support. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Gil Forsyth (https://github.com/gforsyth) - Tom Augspurger (https://github.com/TomAugspurger) - Bradley Dice (https://github.com/bdice) URL: #798
#798 removed usage of nvcomp but left the linkage in place, kvikio extension modules still relied on nvcomp existing even though they didn't actually use any of its functionality. That is now causing problems in #800. Removing the linkage entirely here (while still revendoring manually until we can move the vendoring to cudf) should resolve that. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #801
RAPIDS has deployed an autoscaling cloud build cluster that can be used to accelerate building large RAPIDS projects. This contributes to rapidsai/build-planning#209. Authors: - Paul Taylor (https://github.com/trxcllnt) Approvers: - Bradley Dice (https://github.com/bdice) URL: #797
Follow-up to #798 and #801. After libcudf wheels vendor libnvcomp, we can finalize removal of nvcomp from kvikio. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #804
This makes zarr an optional dependency of kvikio. The `pyproject.toml` now includes an optional dependency group 'zarr' that requires zarr>=3.0.0. `zarr` is no longer present as a (required) dependency in the conda recipes. Authors: - Tom Augspurger (https://github.com/TomAugspurger) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) URL: #802
Upgrade the nvCOMP dependency to 5.0.0.6. This library is not used directly, but it's still vendored and used in libcudf wheels. Future changes will completely remove the dependency in KvikIO. Depends on rapidsai/rapids-cmake#896 Authors: - Vukasin Milovanovic (https://github.com/vuule) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) URL: #800
Contributes to rapidsai/build-planning#208
* uses CUDA 13.0.0 to build and test

Contributes to rapidsai/build-planning#68
* updates to CUDA 13 dependencies in fallback entries in `dependencies.yaml` matrices (i.e., the ones that get written to `pyproject.toml` in source control)

## Notes for Reviewers
This switches GitHub Actions workflows to the `cuda13.0` branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to `branch-25.10`, once all of RAPIDS supports CUDA 13.

Authors:
- James Lamb (https://github.com/jameslamb)
- Bradley Dice (https://github.com/bdice)
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #803
This fully devendors libnvcomp from libkvikio wheels. A complementary PR is needed to vendor libnvcomp.so.* inside of libcudf wheels: rapidsai/cudf#19743 Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Matthew Murray (https://github.com/Matt711) - Mike Sarahan (https://github.com/msarahan) URL: #805
… URL (1/2): C++ implementation (#793)

This PR adds a new remote I/O utility function `RemoteHandle::open(url)` that infers the remote endpoint type from the URL to facilitate `RemoteHandle` creation.
- Supported endpoint types include S3, S3 with presigned URL, WebHDFS, and generic HTTP/HTTPS.
- Optionally, instead of letting `open` figure it out, users can explicitly specify the endpoint type by passing an enum argument `RemoteEndpointType`.
- Optionally, users can provide an allowlist that restricts the endpoint candidates.
- Optionally, users can specify the expected file size. This design is to fully support the existing constructor overload `RemoteHandle(endpoint, nbytes)`.

A byproduct of this PR is an internal utility class `UrlParser` that uses the idiomatic libcurl URL API to validate the URL against "[RFC 3986 plus](https://curl.se/docs/url-syntax.html)".

## This PR depends on
- [x] #791
- [x] #788

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #793
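The idea of inferring an endpoint type from a URL, optionally restricted by an allowlist, can be sketched in Python as follows. This is an illustration, not KvikIO's logic: `EndpointType` and `infer_endpoint_type` are hypothetical names, and the detection heuristics (an `X-Amz-Signature` query parameter for presigned URLs, an `amazonaws.com` host with an `s3` label, a `/webhdfs/v1` path prefix) are assumptions about what such a check could look at.

```python
from enum import Enum
from urllib.parse import parse_qs, urlsplit


class EndpointType(Enum):  # hypothetical stand-in for RemoteEndpointType
    S3 = "s3"
    S3_PRESIGNED_URL = "s3_presigned_url"
    WEBHDFS = "webhdfs"
    HTTP = "http"


def infer_endpoint_type(url, allowlist=None):
    """Return the first plausible endpoint type that the allowlist permits."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if scheme == "s3":
        candidates = [EndpointType.S3]
    elif scheme in ("http", "https"):
        candidates = []
        is_aws_s3_host = host.endswith(".amazonaws.com") and "s3" in host.split(".")
        if "X-Amz-Signature" in parse_qs(parts.query):
            candidates.append(EndpointType.S3_PRESIGNED_URL)
        elif is_aws_s3_host:
            candidates.append(EndpointType.S3)
        if parts.path.startswith("/webhdfs/v1"):
            candidates.append(EndpointType.WEBHDFS)
        candidates.append(EndpointType.HTTP)  # generic fallback
    else:
        raise ValueError(f"unsupported URL scheme: {scheme!r}")
    for candidate in candidates:
        if allowlist is None or candidate in allowlist:
            return candidate
    raise ValueError("no endpoint type permitted by the allowlist matches the URL")
```

Ordering the candidates from most to least specific means the generic HTTP endpoint is chosen only when nothing more specific applies or when the allowlist rules the specific types out.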
This PR updates the rapids-dependency-file-generator hook to get rapidsai/dependency-file-generator#163. Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #809
AWS S3 provides a non-standard S3 scheme for internal use (such as for AWS CLI). The URL takes the form `s3://<bucket-name>/<object-name>`, where `<object-name>` may contain `/` characters indicating subdirectories. The newly added `open` function for remote I/O currently uses an incorrect regular expression, causing object names containing subdirectories to be rejected. This PR fixes this bug. This PR also improves the usage of regular expression by making the pattern constant `static`. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #810
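A small Python sketch of a pattern that accepts object names with subdirectories is shown below. The pattern and the helper `parse_s3_url` are illustrative and are not the regular expression used in KvikIO. Compiling the pattern once at module level plays the same role as making it `static` in C++.

```python
import re

# The bucket name stops at the first "/"; the object name is everything after
# it and may itself contain "/" characters.
_S3_URL = re.compile(r"^s3://(?P<bucket>[^/]+)/(?P<key>.+)$")


def parse_s3_url(url):
    """Split an s3://<bucket-name>/<object-name> URL into (bucket, key)."""
    match = _S3_URL.match(url)
    if match is None:
        raise ValueError(f"not a valid S3 URL: {url!r}")
    return match.group("bucket"), match.group("key")
```

With this pattern, `parse_s3_url("s3://my-bucket/dir/sub/file.bin")` returns `("my-bucket", "dir/sub/file.bin")`. A pattern that used `[^/]+` for the object name as well would reject that URL, which is the kind of bug the PR describes.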
… URL (2/2): Python binding (#808) This PR adds Python binding to #793 Closes #807 Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #808
Contributes to rapidsai/build-planning#208

Now that rapidsai/shared-workflows#413 is merged, this converts all GitHub Actions references from `@cuda13.0` back to `branch-25.10`.

## Notes for Reviewers
This is safe to admin-merge because the change is a no-op... configs on those 2 branches are identical.
…he GPUs in the system (#814) We've seen multiple issues over the months from DGX Spark users when it comes to this specific file. This PR addresses these issues by applying a skip for the max_device_cache_size (cuFileDriverSetMaxCacheSize) setter, determined by examining the output of nvidia-smi. Authors: - https://github.com/ahoyle-nvidia - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Tianyu Liu (https://github.com/kingcrimsontianyu) - Mads R. B. Kristensen (https://github.com/madsbk) URL: #814
Our HTTP library, libcurl, includes a [`CURLOPT_VERBOSE`](https://curl.se/libcurl/c/CURLOPT_VERBOSE.html) setting that can be useful for debugging. To help our users debug things, I've added a new `KVIKIO_REMOTE_VERBOSE` option that configures this. By default, it's off (no change). If the user sets `KVIKIO_REMOTE_VERBOSE=1` then information from the HTTP requests and responses will be printed to stderr. Authors: - Tom Augspurger (https://github.com/TomAugspurger) Approvers: - Tianyu Liu (https://github.com/kingcrimsontianyu) - Bradley Dice (https://github.com/bdice) URL: #815
Previous PR #749 forgot to bring the entrée to the table: only the C++ code in tests and benchmarks uses C++20, not the main library. This PR fixes this oversight. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #819
## Background
`libcurl` has two path parameters related to the certificate authority (CA):
- `CURLOPT_CAINFO`, which specifies the CA bundle file path.
- `CURLOPT_CAPATH`, which specifies the directory of individual CA certificates with hash-based naming.

The default paths are determined at compile time, which can cause issues if `libcurl` is built on one Linux distribution and run on another (e.g. Rocky Linux in our CI vs Ubuntu on our lab system), because the certificate files are likely at different locations. This problem has been observed in KvikIO's wheel distribution, where HTTPS would fail with the message:

>error setting certificate verify locations: CAfile: /etc/pki/tls/certs/ca-bundle.crt CApath: /etc/ssl/certs

## This PR
This PR addresses this problem. The certificate path is now explicitly searched for in the following order. The compile-time parameters, if any, are still used but treated with lowest priority.
- CA bundle file: Check env vars `CURL_CA_BUNDLE` and `SSL_CERT_FILE`
- CA directory: Check env var `SSL_CERT_DIR`
- CA bundle file: Search a set of distribution-specific locations for an accessible bundle
- CA directory: Search a set of distribution-specific locations for an accessible directory
- CA bundle file: Check if the compile-time path is given and accessible
- CA directory: Check if the compile-time parameter is given and accessible

Depends on #819 for the use of `static` structured binding, which is only available in C++ >=20

Closes #711

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #817
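The search order above can be sketched in Python as follows. This is an illustration rather than KvikIO's C++ implementation: `find_ca_paths` is a hypothetical helper, the candidate lists are short examples and not the set KvikIO actually searches, and taking an environment variable at face value without checking that the path is accessible is an assumption.

```python
import os

# Illustrative distribution-specific locations (assumption, not KvikIO's list).
BUNDLE_CANDIDATES = [
    "/etc/ssl/certs/ca-certificates.crt",  # Debian/Ubuntu
    "/etc/pki/tls/certs/ca-bundle.crt",  # RHEL/Rocky
]
DIR_CANDIDATES = ["/etc/ssl/certs", "/etc/pki/tls/certs"]


def find_ca_paths(env=os.environ, is_file=os.path.isfile, is_dir=os.path.isdir,
                  compiled_bundle=None, compiled_dir=None):
    """Return (ca_bundle, ca_dir) for the first match, or (None, None)."""
    # 1. Environment variables for the bundle file.
    for name in ("CURL_CA_BUNDLE", "SSL_CERT_FILE"):
        if env.get(name):
            return env[name], None
    # 2. Environment variable for the certificate directory.
    if env.get("SSL_CERT_DIR"):
        return None, env["SSL_CERT_DIR"]
    # 3. Distribution-specific bundle files.
    for path in BUNDLE_CANDIDATES:
        if is_file(path):
            return path, None
    # 4. Distribution-specific directories.
    for path in DIR_CANDIDATES:
        if is_dir(path):
            return None, path
    # 5 and 6. Compile-time defaults, lowest priority.
    if compiled_bundle and is_file(compiled_bundle):
        return compiled_bundle, None
    if compiled_dir and is_dir(compiled_dir):
        return None, compiled_dir
    return None, None
```

Passing `env`, `is_file`, and `is_dir` as parameters keeps the function testable without touching the real file system. The first element of the result corresponds to `CURLOPT_CAINFO` and the second to `CURLOPT_CAPATH`.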
Some of these APIs were identical but presumably duplicated due to otherwise creating a circular include dependency. Moving the manager out of the compat_mode header resolves that and allows us to remove the duplication. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Mads R. B. Kristensen (https://github.com/madsbk) URL: #816
## Background
KvikIO supports access to private S3 objects that require AWS credentials:

```python
# Method 1
kvikio.RemoteFile.open_s3(bucket, key)
# Method 2
kvikio.RemoteFile.open_s3_url(url)
# Method 3
kvikio.RemoteFile.open(url)
```

For public S3 objects, these functions throw the following exception:

>S3: must provide `aws_region` if AWS_DEFAULT_REGION isn't set.

A workaround is to simply use the generic HTTP/HTTPS endpoint:

```python
# Method 1
kvikio.RemoteFile.open_http(http_url)
# Method 2
kvikio.RemoteFile.open(url, RemoteEndpointType.HTTP)
```

However, this workaround loses the S3 URL syntax check.

## This PR
- Adds support for accessing public S3 objects in C++ and Python by having a new endpoint type `S3PublicEndpoint`. This endpoint does not require AWS credentials.
- Updates the unified interface `open(url)` that can automatically infer the endpoint type. Under `AUTO` mode, for a syntactically valid S3 URL using HTTP/HTTPS protocol, KvikIO now checks the connectivity using a private S3 endpoint, and if that fails, proceeds to use a public S3 endpoint.
- Updates the comments on each endpoint to further improve clarity.
- Adjusts Python APIs `kvikio.RemoteFile.open_*` from class method to static method (which is a breaking change).

Closes #806

Authors:
- Tianyu Liu (https://github.com/kingcrimsontianyu)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #820
Fixes an issue where CUDA 13 packages named like `linux-aarch64/libkvikio-25.10.00a43-cuda13_0_250916_b69d9aea.conda` were getting dependencies on `cuda-version >=12.2.0a0,<14.0a0`, which allowed them to be used in CUDA 12 environments. That is not desired and could cause problems. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) - Mads R. B. Kristensen (https://github.com/madsbk) URL: #827
Resolves #830 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Tianyu Liu (https://github.com/kingcrimsontianyu) URL: #832
Configure repo for automatic release notes generation
`kvikio::RemoteHandle::open()` has supported public S3 objects since #820. When `open()` sees an S3 URL, it first assumes a private S3 object and queries its size. If the query fails, it falls back to treating the object as public. During the construction of a private S3 object, the constructor scans the environment variables for AWS credentials. Manual testing of #820 always had these env vars set, which hid a bug: in the absence of the env vars, the constructor of the private S3 object throws an exception that goes unhandled, so KvikIO never gets a chance to try public S3. This PR fixes this bug. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) - Vukasin Milovanovic (https://github.com/vuule) URL: #831
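The shape of this bug and its fix can be sketched in Python with stand-in classes. None of the names below are KvikIO API: `PrivateS3Endpoint`, `PublicS3Endpoint`, and `open_s3` are hypothetical, and the credential check is reduced to looking up two standard AWS environment variable names.

```python
class PrivateS3Endpoint:
    """Hypothetical stand-in: construction fails without AWS credentials."""

    kind = "private"

    def __init__(self, url, env):
        for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
            if name not in env:
                raise RuntimeError(f"S3: missing environment variable {name}")
        self.url = url


class PublicS3Endpoint:
    """Hypothetical stand-in: needs no credentials."""

    kind = "public"

    def __init__(self, url):
        self.url = url


def open_s3(url, env):
    """Try the private endpoint first and fall back to the public one."""
    try:
        # The construction itself must be inside the try block. If only a
        # later size query were guarded, a missing-credentials error raised
        # by the constructor would escape and the fallback would never run.
        return PrivateS3Endpoint(url, env)
    except RuntimeError:
        return PublicS3Endpoint(url)
```

With an empty environment, `open_s3(url, {})` returns the public endpoint. With both variables present, it returns the private one.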
This is an empty commit to trigger a build. It is used when builds get stuck with an old ABI. Rebuilding updates them to the new one.
❄️ Code freeze for `branch-25.10` and v25.10 release

What does this mean?
Only critical/hotfix level issues should be merged into `branch-25.10` until release (merging of this PR).

What is the purpose of this PR?
`branch-25.10` into `main` for the release