Wednesday, January 8, 2025

command line – fs_usage, output display regarding io call – the options and flags shown

Some light on I/O throttling and passive file I/O comes from
https://mjacobson.net/blog/2022-02-throttling.html

Some of this may explain why my 2013 iMac ran like treacle after the update to Catalina. It flies now that I boot from an external USB Samsung T7 … and thanks/kudos to Bombich for CCC making a bootable Catalina copy so well. (This may be the last macOS that could be cloned like that.)

In case that post evaporates, here are a few main points…

On Darwin, threads block inside throttle_lowpri_io when they’re being artificially delayed to slow down their I/O operations, with the ultimate goal of optimizing the performance of higher-priority I/O. And indeed, in both of these cases (and in the other similar problems I saw), the chain of blockage ultimately leads to a thread with less-than-highest I/O priority.

..
To keep track of which I/Os should be throttled, the Darwin kernel maintains what I’ll call throttling domains (the source calls them struct _throttle_io_info_t). In rough terms, each throttling domain is meant to correspond one-to-one to a disk device.

When an I/O is issued through the spec_strategy routine, the kernel has to determine which throttling domain the operation lives in, so that the operation may either be throttled or cause throttling of lower-priority operations. The throttling domain is determined first by taking the vnode (i.e., file) the I/O is being done to and walking up to its enclosing mount_t.

From there, the code looks at the mount’s mnt_devbsdunit property. The mnt_devbsdunit describes the “disk number” of the device the filesystem lives on. If a filesystem is mounted from /dev/disk3, then the mount’s mnt_devbsdunit is 3. If the backing disk is actually a partition of a disk, then the number comes from the whole disk, not the partition; e.g., /dev/disk3s2 results in 3.[2]

The mnt_devbsdunit—which can range from 0 to 63[3]—determines which throttling domain is in play.
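To make the numbering rule concrete, here is a rough Python sketch of how a device name maps to a "disk number". It is only a model of the rule described above, not the kernel's code; the function name bsd_unit and the regular expression are my own assumptions.

```python
import re

# Hypothetical helper: models the naming rule only, not the kernel logic.
# Accepts names such as "disk3", "/dev/disk3", "/dev/disk3s2" or "/dev/rdisk3s2".
_DISK_RE = re.compile(r"^(?:/dev/)?r?disk(\d+)(?:s\d+)*$")


def bsd_unit(device: str) -> int:
    """Return the whole-disk number for a BSD disk name."""
    match = _DISK_RE.match(device)
    if match is None:
        raise ValueError(f"not a BSD disk name: {device!r}")
    unit = int(match.group(1))
    # The quoted post gives 0..63 as the valid range for mnt_devbsdunit.
    if not 0 <= unit <= 63:
        raise ValueError(f"unit {unit} is outside 0..63")
    return unit


print(bsd_unit("/dev/disk3"))    # whole disk
print(bsd_unit("/dev/disk3s2"))  # partition: number comes from the whole disk
```

Both calls give 3, matching the /dev/disk3 and /dev/disk3s2 examples in the text.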

..

Logical volume groups and mnt_throttle_mask

Apple added a logical volume manager, called CoreStorage, to Mac OS X Lion. In contrast to traditional disk partitions, in which a contiguous range of a disk device is used as a volume, CoreStorage allows a looser relationship between volumes and backing storage. For instance, a volume might use storage from multiple different disk devices—witness Fusion Drive for example.

This complicates the mnt_devbsdunit situation. Suppose a filesystem is mounted from volume disk2. According to the previous rules, mnt_devbsdunit is 2. However, disk2 might be a CoreStorage logical volume, backed by the real disk devices disk0 and disk1.

Moreover, CoreStorage might not be the only user of disk0 and disk1. Suppose further a second, non-CoreStorage volume on disk0, called disk0s3. I/Os to disk2 and disk0s3 may contend with each other. But the mnt_devbsdunit of disk0s3 is 0, so the two mounts will be in different throttling domains.

To solve this, enter a second mount_t field, mnt_throttle_mask. mnt_throttle_mask is a 64-bit bit array. A bit is set only when I/Os to the mount may involve the correspondingly numbered disk device. For our CoreStorage logical volume disk2, since disk0 and disk1 are included, bits 0 and 1 are set. Bit 2 is also set for the logical volume itself, so the overall mask is 0x7.

In theory, you might imagine a system wherein a mount could reside in multiple throttling domains. Or perhaps the throttling domain decision could be pushed down so that CoreStorage could help make smart decisions about which to use for a particular I/O operation.

The implemented reality is much more mundane. mnt_devbsdunit is set to the index of the lowest bit set in mnt_throttle_mask. For disk2, since bit 0 is set, mnt_devbsdunit is 0. So disk2 and disk0s3 live in the same throttling domain (though, notably, a theoretical disk1s3 would not).
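The mask arithmetic is easy to check by hand. The following Python sketch reproduces the disk2 / disk0s3 example; the helper names throttle_mask and lowest_set_bit are hypothetical, and the code models only the rule as described here, not the actual kernel implementation.

```python
def throttle_mask(own_unit, backing_units=()):
    """Build a 64-bit style mask: one bit per disk number the mount may touch."""
    mask = 1 << own_unit
    for unit in backing_units:
        mask |= 1 << unit
    return mask


def lowest_set_bit(mask):
    """Index of the lowest set bit, which is what mnt_devbsdunit is set to."""
    if mask == 0:
        raise ValueError("mask has no bits set")
    # mask & -mask isolates the lowest set bit; bit_length() - 1 gives its index.
    return (mask & -mask).bit_length() - 1


disk2 = throttle_mask(2, backing_units=(0, 1))  # logical volume on disk0 + disk1
disk0s3 = throttle_mask(0)                      # plain partition on disk0
disk1s3 = throttle_mask(1)                      # the theoretical disk1s3

for name, mask in (("disk2", disk2), ("disk0s3", disk0s3), ("disk1s3", disk1s3)):
    print(name, hex(mask), "-> throttling domain", lowest_set_bit(mask))
```

disk2 comes out with mask 0x7 and domain 0, the same domain as disk0s3, while disk1s3 lands in domain 1.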

This explains what’s happening with /System/Volumes/Data above. disk1s1 is a logical volume presented by a volume manager[4], and its backing storage is on disk0. Tweaking the dtrace script shows that mnt_throttle_mask is 0x3:

..

Use IOPOL_PASSIVE

In addition to assigning a priority tier to its I/O operations, a process may mark its I/O as passive; passive I/O may be throttled but doesn’t cause throttling of other I/Os.

Recompiling dd to call setiopolicy_np(3) would be a hassle. An easier way is to use the taskpolicy(8) modifier utility that comes with recent versions of macOS. Though not documented in the manpage, the -d option can take the argument passive, like:

# taskpolicy -d passive dd if=…
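For a script of your own, where recompiling is not the obstacle, the setiopolicy_np(3) call itself is short. Below is a hedged Python sketch using ctypes. The numeric constants are copied by hand from my reading of <sys/resource.h> on macOS and should be checked against your SDK; the function name mark_process_io_passive is my own, and I have not verified how this interacts with every macOS release.

```python
import ctypes
import sys

# Assumed values from <sys/resource.h> on macOS; verify against your SDK.
IOPOL_TYPE_DISK = 0
IOPOL_SCOPE_PROCESS = 0
IOPOL_PASSIVE = 2


def mark_process_io_passive() -> bool:
    """Ask the kernel to treat this process's disk I/O as passive.

    Returns True on success, False on failure or on a non-macOS system.
    """
    if sys.platform != "darwin":
        return False  # setiopolicy_np only exists on Darwin
    libc = ctypes.CDLL(None, use_errno=True)
    libc.setiopolicy_np.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_int]
    libc.setiopolicy_np.restype = ctypes.c_int
    rc = libc.setiopolicy_np(IOPOL_TYPE_DISK, IOPOL_SCOPE_PROCESS, IOPOL_PASSIVE)
    return rc == 0


if __name__ == "__main__":
    print("passive I/O policy applied:", mark_process_io_passive())
```

This only changes the policy of the Python process that calls it; for an existing binary such as dd, the taskpolicy route above remains the simpler option.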

Turn off throttling temporarily

There are a bunch of sysctls available to tune the behavior of the I/O throttling system, including one to shut it off entirely:

# sysctl debug | fgrep lowpri_throttle
debug.lowpri_throttle_max_iosize: 131072
debug.lowpri_throttle_tier1_window_msecs: 25
debug.lowpri_throttle_tier2_window_msecs: 100
debug.lowpri_throttle_tier3_window_msecs: 500
debug.lowpri_throttle_tier1_io_period_msecs: 40
debug.lowpri_throttle_tier2_io_period_msecs: 85
debug.lowpri_throttle_tier3_io_period_msecs: 200
debug.lowpri_throttle_tier1_io_period_ssd_msecs: 5
debug.lowpri_throttle_tier2_io_period_ssd_msecs: 15
debug.lowpri_throttle_tier3_io_period_ssd_msecs: 25
debug.lowpri_throttle_enabled: 1

# sysctl -w debug.lowpri_throttle_enabled=0
debug.lowpri_throttle_enabled: 1 -> 0
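If you want to record these values before changing anything, the "name: value" output is simple to parse. This is a small Python sketch; the function name parse_sysctl is hypothetical and the sample text is just a few of the lines shown above, not a fresh capture.

```python
def parse_sysctl(text):
    """Turn 'name: value' lines from sysctl output into a dict of integers."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # skip blank lines and anything without 'name: value'
        try:
            settings[name.strip()] = int(value)
        except ValueError:
            continue  # skip non-integer values such as '1 -> 0'
    return settings


SAMPLE = """\
debug.lowpri_throttle_max_iosize: 131072
debug.lowpri_throttle_tier1_window_msecs: 25
debug.lowpri_throttle_tier3_io_period_ssd_msecs: 25
debug.lowpri_throttle_enabled: 1
"""

saved = parse_sysctl(SAMPLE)
print(saved["debug.lowpri_throttle_enabled"])
```

Keeping a saved copy like this makes it easy to restore debug.lowpri_throttle_enabled to its previous value once the heavy copy is finished.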
