Data import methods free memory #763
Comments
Basically, the idea is that the importer should not free the
I think that would make more sense personally, but it would be a serious backwards-incompatible change. The alternative would be to add additional methods (perhaps an overloaded method signature with a boolean parameter, although that's not really pretty either) that let the caller control what happens. I would be happy to put together a PR for this if you like. Any preference on what this should look like exactly?
New methods are probably cleanest.
I took a first stab at this in #766. The PR crossed your comment, so the initial version is the breaking change approach. I'll close this discussion issue and continue in the PR.
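For readers comparing the two shapes mentioned in the comments above (a boolean overload versus a separately named method), here is a minimal sketch. Everything in it is hypothetical: `SourceStruct`, `ImporterSketch`, and `importArrayWithoutClosing` are stand-ins made up for illustration, not Arrow classes and not necessarily what #766 ended up doing.

```java
// Hypothetical stand-in for an FFI struct wrapper; NOT an Arrow class.
final class SourceStruct implements AutoCloseable {
    boolean released = false; // models a nulled release callback
    boolean closed = false;   // models the wrapper's memory being freed

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical importer showing the two API shapes side by side.
final class ImporterSketch {
    // Behaviour as described in the issue: move the data, then close the source.
    static void importArray(SourceStruct src) {
        importArray(src, true);
    }

    // Option A: an overload with a boolean flag that controls closing.
    static void importArray(SourceStruct src, boolean closeSource) {
        if (src.released) {
            throw new IllegalStateException("Cannot import a released struct");
        }
        src.released = true; // the move itself always happens
        if (closeSource) {
            src.close();
        }
    }

    // Option B: a separately named method that never closes the source.
    static void importArrayWithoutClosing(SourceStruct src) {
        importArray(src, false);
    }
}
```

With either option the existing method keeps its current behaviour, so callers that rely on the importer closing the source are not broken.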
The various import methods in the Data class are documented to transfer ownership of the Arrow data from the input object to internal values as per the C FFI. If I'm reading https://arrow.apache.org/docs/format/CDataInterface.html#moving-an-array correctly, implementations are supposed to hollow out the input object in that case by nulling the release field. In practice a class like ArrayImporter does this
So not only is the array marked as released, it is also closed, so the ArrowArray instance is no longer usable. This may or may not happen when you call `Data#importIntoVector`, for instance, since that might abort early when given a null allocator, which makes the postconditions of `Data#importIntoVector` a bit unclear. You kind of expect the ArrowArray to have been closed, but you still need to call `close` yourself as well or risk having a memory leak.

What's the rationale behind the choice to also call `close` on objects passed to Data rather than expecting people to use try-with-resources/try-finally themselves?
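To make the postcondition question concrete, here is a minimal sketch of the caller-owns-the-wrapper style asked about above. It does not use the Arrow API: `NativeStruct`, `MoveOnlyImporter`, and `CallerExample` are hypothetical stand-ins, and the null-allocator check is only an assumed example of an early abort. The point is that when the importer only performs the move (marks the source released) and the caller closes the wrapper via try-with-resources, the wrapper ends up closed on both the success path and the early-abort path.

```java
import java.util.Objects;

// Hypothetical stand-in for a C Data Interface struct wrapper; NOT the Arrow API.
final class NativeStruct implements AutoCloseable {
    private boolean released = false; // models a nulled release callback
    private boolean closed = false;   // models the wrapper's memory being freed

    void markReleased() { released = true; }
    boolean isReleased() { return released; }
    boolean isClosed() { return closed; }

    @Override
    public void close() { closed = true; } // idempotent, so closing twice is harmless
}

// Hypothetical importer that moves ownership but never closes its input.
final class MoveOnlyImporter {
    static void importFrom(Object allocator, NativeStruct src) {
        // Assumed early abort: nothing has been moved yet, src is untouched.
        Objects.requireNonNull(allocator, "allocator");
        if (src.isReleased()) {
            throw new IllegalStateException("Cannot import a released struct");
        }
        src.markReleased(); // hollow out the source; closing is left to the caller
    }
}

final class CallerExample {
    // Returns the wrapper so its final state can be inspected.
    static NativeStruct importWithTryWithResources(Object allocator) {
        NativeStruct array = new NativeStruct();
        try (NativeStruct a = array) {
            MoveOnlyImporter.importFrom(allocator, a);
        } catch (NullPointerException e) {
            // Import aborted before the move; try-with-resources has already closed the wrapper.
        }
        return array;
    }
}
```

In this sketch the caller never has to guess whether the importer got far enough to close the wrapper, because closing is always the caller's job.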