Skip to content

Use IO instead of BytesIO/TextIO #1495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

RazerM
Copy link

@RazerM RazerM commented Apr 29, 2025

BytesIO/TextIO are the types returned by open(), but this precludes other working types such as tempfile.TemporaryFile().

BytesIO/TextIO are the types returned by open(), but this precludes
other working types such as tempfile.TemporaryFile()
Copy link
Member

@balamurugana balamurugana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need tempfile.TemporaryFile and why can't it be compatible with BinaryIO or TextIO?

@harshavardhana
Copy link
Member

Why do we need tempfile.TemporaryFile and why can't it be compatible with BinaryIO or TextIO?

I think @balamurugana python made it a base class https://docs.python.org/3/glossary.html#term-file-object

@RazerM
Copy link
Author

RazerM commented Apr 29, 2025

I misremembered about TemporaryFile which does return a BinaryIO subclass, but my point still stands that IO is generally recommended as the type to use.

BinaryIO is a subclass of IO[bytes]. IO is what implements read, write, etc. And by accepting the most compatible form of the interface, more argument types are supported.

put_object only calls .read(), and therefore there isn't a reason to require BinaryIO instead of IO[bytes]. I think the general consensus is that the IO types are a bit of a mess1 as introduced in PEP 484, and there is various discourse around to that effect.

Footnotes

  1. There's no protocol for Minio just to accept an argument that implements .read(), for example. You could define that protocol yourself if you feel it's worth it.

@balamurugana
Copy link
Member

Why do we need tempfile.TemporaryFile and why can't it be compatible with BinaryIO or TextIO?

I think @balamurugana python made it a base class https://docs.python.org/3/glossary.html#term-file-object

@harshavardhana It is about typing. Why not https://docs.python.org/3/library/typing.html#abcs-for-working-with-io works?

@balamurugana
Copy link
Member

I misremembered about TemporaryFile which does return a BinaryIO subclass, but my point still stands that IO is generally recommended as the type to use.

BinaryIO is a subclass of IO[bytes]. IO is what implements read, write, etc. And by accepting the most compatible form of the interface, more argument types are supported.

put_object only calls .read(), and therefore there isn't a reason to require BinaryIO instead of IO[bytes]. I think the general consensus is that the IO types are a bit of a mess1 as introduced in PEP 484, and there is various discourse around to that effect.

Footnotes

1. There's no protocol for Minio just to accept an argument that implements `.read()`, for example. You could define that protocol yourself if you feel it's worth it. [↩](#user-content-fnref-1-2017ed1cb690647813e175fd895d45b6)

If TemporaryFile is not BinaryIO, it means you cannot do binary read/write. I do not understand why is it a problem here. This PR just changes types from BinaryIO/TextIO and I don't find any valid reason.

@RazerM
Copy link
Author

RazerM commented Apr 29, 2025

It seems you didn't read what I wrote?

IO is the protocol that provides .read(). .read() is what minio-py is using.

If [whatever] is not BinaryIO, it means you cannot do binary read/write

Sorry, this is objectively not true. Any type that is IO[bytes] can do binary read/write.

Copy link
Member

@balamurugana balamurugana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the check

import tempfile
from typing import BinaryIO, TextIO


def read_text(data: TextIO) -> None:
    print("textio", data)


def read_binary(data: BinaryIO) -> None:
    print("binaryio", data)


binfile = tempfile.TemporaryFile()
read_text(binfile)
read_binary(binfile)

textfile = tempfile.TemporaryFile("w+")
read_text(textfile)
read_binary(textfile)

and mypy check returns error as expected

$ mypy --strict --no-color-output c.py 
c.py:14: error: Argument 1 to "read_text" has incompatible type "BufferedRandom"; expected "TextIO"  [arg-type]
c.py:19: error: Argument 1 to "read_binary" has incompatible type "TextIOWrapper[_WrappedBuffer]"; expected "BinaryIO"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)

This PR doesn't add any value.

@RazerM
Copy link
Author

RazerM commented Apr 30, 2025

You keep talking about TemporaryFile but I acknowledged in my second message already that it does implement BinaryIO:

I misremembered about TemporaryFile which does return a BinaryIO subclass

Rather than construct hypothetical mypy scripts, you could acknowledge that mypy still passes on minio-py after my change, which it wouldn't if IO[bytes] did not provide the methods that minio-py uses.

from typing import IO, BinaryIO, reveal_type


def old_put_object(data: BinaryIO) -> None:
    reveal_type(data.read())


def new_put_object(data: IO[bytes]) -> None:
    reveal_type(data.read())


def third_party_code(data: IO[bytes]) -> None:
    old_put_object(data)
    new_put_object(data)
main.py:5: note: Revealed type is "builtins.bytes"
main.py:9: note: Revealed type is "builtins.bytes"
main.py:13: error: Argument 1 to "old_put_object" has incompatible type "IO[bytes]"; expected "BinaryIO"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

@balamurugana
Copy link
Member

@RazerM Passing CI is not the question here. From python standard libraries, what is not compatible with BinaryIO, but IO[bytes]. I don't find any valid reason for this PR.

@RazerM
Copy link
Author

RazerM commented Apr 30, 2025

Passing CI is not the question here.

But it is? If minio needed something that BinaryIO provides but IO[bytes] doesn't, your CI would fail.

From python standard libraries, what is not compatible with BinaryIO, but IO[bytes]

I have explained several times the difference between these interfaces.

The amount of arguing against this change is frankly baffling. You keep saying my change has no value despite the fact that typing.IO is a standard interface. You haven't provided any reason why minio requires a subclass of IO[bytes].

Here is the entire definition of BinaryIO:

class BinaryIO(IO[bytes]):
    @abstractmethod
    def __enter__(self) -> BinaryIO: ...

As you can see, there is no value in using BinaryIO in argument position of a function, as it doesn't provide anything that IO[bytes] doesn't have. It's mainly used in the standard library in return position.


It doesn't make sense for you to ask for some standard library type which this applies to as if we're not just using public interfaces. The whole point of interfaces is for types to implement them. But since you keep asking here is one I've had to go and find:

from typing import IO, BinaryIO, reveal_type
import tempfile


def old_put_object(data: BinaryIO) -> None:
    reveal_type(data.read())


def new_put_object(data: IO[bytes]) -> None:
    reveal_type(data.read())


with tempfile.NamedTemporaryFile() as tmpfile:
    old_put_object(tmpfile)
    new_put_object(tmpfile)
main.py:6: note: Revealed type is "builtins.bytes"
main.py:10: note: Revealed type is "builtins.bytes"
main.py:14: error: Argument 1 to "old_put_object" has incompatible type "_TemporaryFileWrapper[bytes]"; expected "BinaryIO"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants