Skip to content

Conversation

behlendorf
Copy link
Contributor

Motivation and Context

Initial patch set for zfs-2.4.0-rc2. I expect to include some additional cherry picks from master in this PR before tagging rc2.

Description

This pull in all of the recent fixes on the master branch with a one significant exception. The #17375 patch to drop the legacy FreeBSD aliases wont be included until zfs-2.5.

How Has This Been Tested?

Will be tested by the CI.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

congzhangzh and others added 27 commits September 9, 2025 17:03
Update the zfsunlock initramfs hook to provide instructions on how
to unlock the root filesystem when appropriate.  The intent is to
make the dropbear ssh MOTD more user friendly.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ahelenia Ziemiańska <[email protected]>
Signed-off-by: Cong Zhang <[email protected]>
Closes openzfs#17661
Closes openzfs#17662
The concurrent execution of feature_sync() can lead to a panic due 
to an unprotected update of the feature refcount.  Resolve this by
using the spa->spa_feat_stats_lock to synchronize the update of the 
refcount.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Youzhong Yang <[email protected]>
Closes openzfs#17184
Closes openzfs#17632
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes openzfs#17665
Add a test case to reproduce issue openzfs#17277:

1. Make a pool
2. Write a file to the pool
3. Mount the file as a loopback device
4. Make an XFS filesystem on the loopback device
5. Mount the XFS filesystem... <hangs>

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Issue openzfs#17277
Closes openzfs#17329
If $KERNEL_CC was not defined, configure status output would print an
empty string where the kernel compiler should have been. Fix this and
simplify the code generally.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ivan Shapovalov <[email protected]>
Closes openzfs#16997
The sorting logic is all in cmd/zfs/zfs_iter.c.  I borrowed
where I could from the comments in the source code, but please
note that the comment to zfs_sort() is a little imprecise, or at
least incomplete, because it doesn't give any indication of the
chronological sort that will be used by default for snapshots in
zfs_compare().

While adding this description, I took the liberty to copy-edit
the rest of the file lightly.

In those edits, I've removed "If specified, you can list
property information by the absolute pathname or the relative
pathname" because, in context, it seems more confusing than
helpful.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by: Shawn Bayern <[email protected]>
Closes openzfs#15713
Closes openzfs#15869
Add an openzfs-2.4 compatibility file for the next release.

While there are no compatibility difference between Linux and
FreeBSD for 2.4 symlinks for the -linux and -freebsd names are
created for any scripts expecting that convention.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: ofthesun9 <[email protected]>
Closes openzfs#17672
Closes openzfs#17673
As described in freebsd/freebsd-src#1305,
FreeBSD's installer defaults to zroot/home for user home directories.

For FreeBSD only, set the default prefix for pam_zfs_key to match.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Eric A. Borisch <[email protected]>
Closes openzfs#17600
glibc includes linux/stat.h for statx, but musl defines its own statx
struct and associated constants, which does not include STATX_MNT_ID
yet. Thus, including linux/stat.h directly should be avoided for
maximum libc compatibility.

Tested on:
  - glibc: x86_64, i686, aarch64, armv7l, armv6l
  - musl: x86_64, aarch64, armv7l, armv6l

Reviewed-by: Brian Behlendorf <[email protected]>
Tested-By: Achill Gilgenast <[email protected]>
Signed-off-by: classabbyamp <[email protected]>
Closes openzfs#17675
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Ziaee <[email protected]>
Closes openzfs#17676
LLVM-21 enables -Wuninitialized-const-pointer which results in the
following compiler warning and the bdev_file_open_by_path() interface
not being detected for 6.9 and newer kernels.  The blk_holder_ops
are not used by the ZFS code so we can safely use a NULL argument
for this check.

    bdev_file_open_by_path/bdev_file_open_by_path.c:110:54: error:
    variable 'h' is uninitialized when passed as a const pointer
    argument here [-Werror,-Wuninitialized-const-pointer]

Reviewed-by: Rob Norris <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17682
Closes openzfs#17684
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17690
If ARCH environment variable is set it can cause the failure of the
kernel modules check during the configure step. The resulting error
will be confusing, and may looks like this:

>    checking for kernel config option compatibility... done
>    checking whether CONFIG_MODULES is defined... no
>    configure: error:
>        *** This kernel does not include the required loadable module
>        *** support!

Detect when ARCH is print a warning.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Maksym Shkolnyi <[email protected]>
Closes openzfs#17680
Because GitHub creates a merge commit on top of real head, so the check
on HEAD will fail regardlessly.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#17695
Otherwise it might become `if [ == "" ]` which is ill-formed.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#17695
They will become zarcsummary and zarcstat in 2.4.0.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#16357
Closes openzfs#17695
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#16357
Closes openzfs#17695
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#16357
Closes openzfs#17695
If we call ddt_log_load() for legacy ddt, we will end up going into
ddt_log_update_stats() and filling uninitialized value into ddo_dspace.
This value will then get added to dedup_table_size during
ddt_get_dedup_object_stats().

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Closes openzfs#17019
Closes openzfs#17699

Signed-off-by: Chunwei Chen <[email protected]>
Co-authored-by: Chunwei Chen <[email protected]>
If the target already exists, lt will fail. Force it to recreate the
symlinks.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17702
FreeBSD now has a pathconf name called _PC_CLONE_BLKSIZE
which is the block size supported for block cloning for
the file system.  Since ZFS's block size varies per file,
return the largest size likely to be used, or zero if block
cloning is not supported.

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Rick Macklem <[email protected]>
Closes openzfs#17645
GCC complains about casting a 64-bit integer to a 32-bit pointer.
Originally committed downstream as
freebsd/freebsd-src@2d76470b701

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by:	Alan Somers <[email protected]>
Sponsored by:	ConnectWise
Closes openzfs#17706
This is one problem currently preventing OpenZFS from building on
FreeBSD/i386.

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:	Alan Somers <[email protected]>
Sponsored by:	ConnectWise
Closes openzfs#17704
fault_limits would often hit the 10min timeout and be killed on Fedora
41-42.  Investigation showed that the 'fill_fs' portion of the test,
which would fill the pool with junk data before vdev replacement, was
writing highly compressible data (~126x), which would have taxed the
CPUs, potentially causing the timeout.

The fix is to write random data and reduce the number of writes.
This has an added benefit that more real data being is written to the
pool (~1GB) vs the old way (~300-400MB).  It also speeds up the test.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17709
Accidentally removed calls in ed048fd.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17621
We only have extremely narrow uses, so move it all into a single
function that does only what we need, with and without d_set_d_op().

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17621
@behlendorf behlendorf added the Status: Work in Progress Not yet ready for general review label Sep 10, 2025
robn and others added 2 commits September 10, 2025 15:01
While rw_destroy() may do nothing on Linux, we still want to make sure
that we don't have any holders outstanding like we do for mutexes.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17718
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Colm Buckley <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#16357
Closes openzfs#17712
Harry-Chen and others added 18 commits September 10, 2025 15:01
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Colm Buckley <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#16357
Closes openzfs#17712
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Colm Buckley <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#16357
Closes openzfs#17712
This commit synchronizes the debian packaging files with the distro
version (also maintained by me) as much as possible.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Colm Buckley <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#17712
When attempting to debug performance problems on large systems, one of
the major factors that affect performance is free space
fragmentation. This heavily affects the allocation process, which is an
area of active development in ZFS. Unfortunately, fragmenting a large
pool for testing purposes is time consuming; it usually involves filling
the pool and then repeatedly overwriting data until the free space
becomes fragmented, which can take many hours. And even if the time is
available, artificial workloads rarely generate the same fragmentation
patterns as the natural workloads they're attempting to mimic.

This patch has two parts. First, in zdb, we add the ability to export
the full allocation map of the pool. It iterates over each vdev,
printing every allocated segment in the ms_allocatable range tree. This
can be done while the pool is online, though in that case the allocation
map may actually be from several different TXGs as new ones are loaded
on demand.

The second is a new subcommand for zhack, zhack metaslab leak (and its
supporting kernel changes). This is a zhack subcommand that imports a
pool and then modified the range trees of the metaslabs, allowing the
sync process to write them out normall. It does not currently store
those allocations anywhere to make them reversible, and there is no
corresponding free subcommand (which would be extremely dangerous); this
is an irreversible process, only intended for performance testing. The
only way to reclaim the space afterwards is to destroy the pool or roll
back to a checkpoint.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes openzfs#17576
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#17576
test->id is a uint64_t, not a long.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by:	Alan Somers <[email protected]>
Sponsored by:	ConnectWise
Closes openzfs#17707
Print a warning if you're attempting to run a ZTS test that calls
'user_run', and the ephemeral user doesn't have permissions to
access the test binaries.

This can happen if you're running ZTS from a local git repo.  In
that case the test user (say, 'testuser1') may need access to the
ZTS binaries in:

/home/<your_username>/zfs/tests/zfs-tests/bin/

... but 'testuser1' doesn't have permission to enter your home dir:

/home/<your_username>

The warning will help alert users to what is going on.  This will
not be an issue when ZTS is actually installed on the system
(via 'make install' or from packages).

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17721
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#17227
A single slow responding disk can affect the overall read
performance of a raidz group.  When a raidz child disk is
determined to be a persistent slow outlier, then have it
sit out during reads for a period of time. The raidz group
can use parity to reconstruct the data that was skipped.

Each time a slow disk is placed into a sit out period, its
`vdev_stat.vs_slow_ios count` is incremented and a zevent
class `ereport.fs.zfs.delay` is posted.

The length of the sit out period can be changed using the
`raid_read_sit_out_secs` module parameter.  Setting it to
zero disables slow outlier detection.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Contributions-by: Don Brady <[email protected]>
Contributions-by: Brian Behlendorf <[email protected]>
Closes openzfs#17227
While it would be nice to be able to scrub a pool imported read-only
this will currently trip an ASSERT.  Before we can support this there
are some designs challenges which need to be thought through first.

For starters, a read-only import skips reading certain information 
from disk which it knows won't be needed, such as the space maps.
Furthermore, the scrub process expects to be checkpoint it's progress, 
update the on disk error log, and issue repair IO.  None of which 
would be possible when the pool is imported read-only.  

Each of these wrinkles can certainly be handled, but that will take 
some signifcant work.  In the meanwhile we disable the 'zpool scrub' 
command when the pool is imported read-only.

Reviewed-by: Alan Somers <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#17527
Closes openzfs#17717
Historically, ZED has blindly spawned off zedlets in parallel and never
worried about their completion order.  This means that you can
potentially have zedlets for event number 2 starting before zedlets for
event number 1 had finished.  Most of the time this is fine, and it
actually helps a lot when the system is getting spammed with hundreds
of events.

However, there are times when you want your zedlets to be executed
in sequence with the event ID.  That is where synchronous zedlets
come in.

ZED will wait for all previously spawned zedlets to finish before
running a synchronous zedlet.  Synchronous zedlets are guaranteed to be
the only zedlet running.  No other zedlets may run in parallel with a
synchronous zedlet.  Users should be careful to only use synchronous
zedlets when needed, since they decrease parallelism.

To make a zedlet synchronous, simply add a "-sync-" immediately
following the event name in the zedlet's file name:

	EVENT_NAME-sync-ZEDLETNAME.sh

For example, if you wanted a synchronous statechange script:

	statechange-sync-myzedlet.sh

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17335
A new `zfs allow` permissions that ONLY allows sending replication
streams in raw (encrypted) mode, so encrypted data will not be
decrypted as part of the replication process.

Sponsored-by: Klara, Inc.
Sponsored-by: Karakun AG
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Co-authored-by: JT Pennington <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Closes openzfs#17543
Create tests for the new send:encrypted permission

Sponsored-by: Klara, Inc.
Sponsored-by: Karakun AG
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: JT Pennington <[email protected]>
Closes openzfs#17543
In ddt_log_load(), when removing dup entry from flushing tree, it doesn't
free the entry causing memleak.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Co-authored-by: Chunwei Chen <[email protected]>
Closes openzfs#17657
Closes openzfs#17730
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes openzfs#17728
zfs_aclset_common() might be called for newly created or not even
created vnodes, that triggers assertions on newer FreeBSD versions
with DEBUG_VFS_LOCKS included into INVARIANTS.  In the first case
make sure to call vn_seqc_write_begin()/_end(), in the second just
skip the assertion.

The similar has to be done for project management IOCTL and file-
bases extended attributes, since those are not going through VFS.

Signed-off-by: Alexander Motin <[email protected]>
Closes openzfs#17722
For ABS() to work, the argument must be signed, but rrdd_time is
uint64_t.  Clang noticed it.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mariusz Zaborski <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Fixes openzfs#16853
Closes openzfs#17733
Several small changes intended to make this test reliable.

- Leave the default compression enabled for the pool and switch
  to using /dev/urandom as the data source.  Functionally this
  shouldn't impact the test but it's preferable to test with
  the pool defaults when possible.

- Verify the device is created and removed as required.  Switch
  to a unique volume name for a more clarity in the logs.

- Use the ZVOL_DEVDIR to specify the device path.

- Speed up the test by creating the pool with an ashift=12 and
  testing 4K, 8K, 128K volblocksizes.

Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17725
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Work in Progress Not yet ready for general review
Projects
None yet
Development

Successfully merging this pull request may close these issues.