Skip to content

syncfs: don't erroneously return success when the pool is suspended #17420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

robn
Copy link
Member

@robn robn commented Jun 4, 2025

[Sponsors: Klara, Inc., Wasabi Technology, Inc.]

Motivation and Context

See #17416. This PR is tackling the core problem described: syncfs() can return success even when the pool has failed. There's a lot more we can do but this should address correctness, which is more important than anything else.

I'm hesitant to say this closes #17416, since there's more we could do in there to improve the situation. On the other hand, it straight-up says "this is wrong" and this PR un-wrongs it, so idk, reviewer's call :)

Description

See individual commits, described below.

  1. Add a test case to demonstrate the issue, and verify the fix. We don't generally care what syncfs() returns when the pool is up, but we want to be sure that if the pool suspends, it either blocks until it returns, or returns an error. (we do care, of course, but it's not something we can address here; I'm working towards that in ZIL: "crash" the ZIL if the pool suspends during fallback #17398 and later).

  2. Simplify zfs_sync() by removing code for scenarios that can't occur, that is, called with NULL superblock and called with no ZIL. See syncfs() returns success on suspended pools #17416 for more on this.

  3. Change the suspended check to return EIO if the pool suspends. This is nod back to the original reason for this check, which is to not block on suspended pools during shutdown (see discussion in syncfs() returns success on suspended pools #17416). This will likely change in ZIL: "crash" the ZIL if the pool suspends during fallback #17398 once the ZIL can report a failure directly without blocking, but until then, this seems like a reasonable middle road to not block the call.

  4. Ensure safe error behaviour according to what syncfs() can do. The comment explains more, but the short version is that before 5.17 the kernel ignores errors directly returned by sync_fs(), but a workaround is available back to 5.8. If the workaround is not available either, then the call will return success to userspace, and we have no choice but to block until the txg completes. Unamazing, but ensures correctness.

How Has This Been Tested?

The new syncfs_suspend test passes on 5.4.293 (blocks, no fallbacks available) and 6.1.140 (sync_fs return works, immediate error return).

It's difficult to test the superblock writeback error case on a release kernel, since the LTS kernels in the 5.8-5.17 range that have no other options all have the fix backported, and non-LTS 5.x kernels are difficult to build these days. To check, I used 6.1.140, commented out the version check and forced the returned error to 0, leaving the superblock writeback error as the only remaining error state. This went as expected - syncfs() return an error on a suspended pool.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@robn robn changed the title `syncfs: don't erroneously return success when the pool is suspended syncfs: don't erroneously return success when the pool is suspended Jun 4, 2025
Copy link
Contributor

@pcd1193182 pcd1193182 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution is sort of gross, but so it the problem space. I can't think of a better approach than this one, given the constraints we have. I would say you could close the github issue, though I would consider opening a new one specifically for "We can't tell if the system is shutting down in the syncfs() path"

}

zil_commit(zfsvfs->z_log, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice this got rid of the check that z_log isn't NULL; is that always true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fairly sure, but only by code read. It's always set in zfsvfs_setup(), and NULLed in zfsvfs_teardown(). That can fail after NULLing it in very niche situations (something about unmounting during online recieve) but I'm not sure I see any way for a sync op to get here in that case, because they won't beat the teardown lock.

I'm not exactly brimming with confidence, as you see. If I leave the check in, then it seems we have to have a txg_wait_synced() call here instead to ensure anything was written out.

zfsvfs_t *zfsvfs = sb->s_fs_info;
ASSERT3P(zfsvfs, !=, NULL);
if (zfs_enter(zfsvfs, FTAG) == 0) {
txg_wait_synced(dmu_objset_pool(zfsvfs->z_os), 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that blocking system-wide sync calls might be too disruptive. I'd say: "if they don't care about errors, why should we?".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love it, but I don't see any not-ridiculous way to distinguish sync() from syncfs() from here.

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Jun 4, 2025
@robn robn force-pushed the syncfs-suspend branch from d88b864 to def0ed3 Compare June 4, 2025 21:44
robn added 4 commits June 10, 2025 11:10
Fairly coarse, but if it returns while the pool suspends, it must be
with an error.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <[email protected]>
The superblock pointer will always be set, as will z_log, so remove code
supporting cases that can't occur (on Linux at least).

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <[email protected]>
If the pool is suspended, we'll just block in zil_commit(). If the
system is shutting down, blocking wouldn't help anyone. So, we should
keep this test for now, but at least return an error for anyone who is
actually interested.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <[email protected]>
If the kernel will honour our error returns, use them. If not, fool it
by setting a writeback error on the superblock, if available.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <[email protected]>
@robn robn force-pushed the syncfs-suspend branch from def0ed3 to 4b13dd1 Compare June 10, 2025 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

syncfs() returns success on suspended pools
3 participants