Flauschbaellchen

Previously, whenever an S3 file was opened, it was immediately downloaded into a `SpooledTemporaryFile`, regardless of whether it was ever read. For large files, this reduced performance and increased traffic and costs in cloud environments.

This commit wraps access to the S3 file in an `S3SeekableFile` wrapper that reads the requested bytes on demand, without loading the entire file into memory or local storage.
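
To illustrate the idea of on-demand reads, here is a minimal sketch of a seekable wrapper that fetches only the requested byte range. It is not the `S3SeekableFile` implementation from this PR: `RangeReader` and `FakeObject` are hypothetical names used for illustration. The sketch assumes the wrapped object behaves like a boto3 `s3.Object`, that is, it exposes `content_length` and a `get(Range="bytes=a-b")` call whose response carries a readable `"Body"`.

```python
import io


class RangeReader(io.RawIOBase):
    """Hypothetical read-only, seekable view that fetches byte ranges lazily."""

    def __init__(self, obj):
        # Assumption: `obj` looks like a boto3 s3.Object (content_length, get(Range=...)).
        self._obj = obj
        self._size = obj.content_length
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Seeking only moves the local cursor; no request is sent.
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def read(self, size=-1):
        if self._pos >= self._size:
            return b""
        if size is None or size < 0:
            end = self._size
        else:
            end = min(self._pos + size, self._size)
        if end == self._pos:
            return b""
        # HTTP byte ranges are inclusive on both ends, hence `end - 1`.
        response = self._obj.get(Range=f"bytes={self._pos}-{end - 1}")
        data = response["Body"].read()
        self._pos += len(data)
        return data


class FakeObject:
    """In-memory stand-in for an S3 object, used only to exercise the sketch."""

    def __init__(self, data):
        self._data = data
        self.content_length = len(data)
        self.requests = []  # records every Range header that was requested

    def get(self, Range):
        self.requests.append(Range)
        start, end = Range[len("bytes="):].split("-")
        return {"Body": io.BytesIO(self._data[int(start): int(end) + 1])}


data = bytes(range(256)) * 4  # 1024 bytes of sample content
fake = FakeObject(data)
f = RangeReader(fake)
f.seek(100)
chunk = f.read(10)  # triggers exactly one ranged request for bytes 100-109
```

In this sketch, opening the file and seeking cost nothing beyond the metadata lookup; only `read` issues a request, and only for the bytes asked for. A production wrapper would likely add buffering so that many small reads do not each become a separate request.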

self._file, ExtraArgs=params, Config=self._storage.transfer_config
)
self._file.seek(0)
self._file = S3SeekableFile(self.obj, ExtraArgs=params)
Author

Please note that `Config=self._storage.transfer_config` was removed. I'm not sure whether it is still needed, since the method also changed from `download_fileobj` to `get`.

Author

Flauschbaellchen commented Jul 14, 2025

@jschneier May I bump this PR? I would love to hear what you think about this change. It would help a lot when working with large files on S3 where only a subset of the data needs to be read. It also reduces the time between opening the file and the application processing the first chunk, since the application no longer has to wait for the full download to complete.
