Skip to content

Conversation

@amotin
Copy link
Member

@amotin amotin commented Oct 28, 2025

In cases where all issued ZIOs must succeed, and we can't do anything clever about the errors, we should just explicitly set ZIO_FLAG_TRYHARD and let OS to do all the reasonable retries.

In other cases, where retries can be different from the original, for example, some ZIOs are allowed to fail due to redundancy, or we can disable aggregation on retrial to get at least some of the data, we can do first pass without TRYHARD, and only if needed retry with ZIO_FLAG_IO_RETRY (which implies TRYHARD semantics).

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

In cases where all issued ZIOs must succeed, and we can't do
anything clever about the errors, we should just explicitly set
ZIO_FLAG_TRYHARD and let OS to do all the reasonable retries.

In other cases, where retries can be different from the original,
for example, some ZIOs are allowed to fail due to redundancy, or
we can disable aggregation on retrial to get at least some of
the data, we can do first pass without TRYHARD, and only if needed
retry with ZIO_FLAG_IO_RETRY (which implies TRYHARD semantics).

Signed-off-by: Alexander Motin <[email protected]>
@amotin amotin added the Status: Code Review Needed Ready for review and testing label Oct 28, 2025
Copy link
Member

@robn robn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cautious yes. I didn't love it, because IO_RETRY is usually set in the pipeline when it retries from vdev_io_assess, and I was asking myself for a while why the caller should be setting it rather than allowing the pipeline to do it. But then I saw that retries are only done for logical (vd == NULL) IO, and all label work is vdev-specific, so it would have been missed. And with that, I think it makes sense.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Oct 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Accepted Ready to integrate (reviewed, tested)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants