|
| 1 | +# Notes on Complex Division |
| 2 | + |
| 3 | +A discussion of the algorithms used to implement complex division |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +How does compute the quotient of two complex numbers z/w? The naïve schoolbook |
| 8 | +answer is to turn it into complex multiplication by scaling both numerator and |
| 9 | +denominator by the conjugate of w: z/w = (zw̅)/(ww̅) = (zw̅)/|w|². Translated |
| 10 | +directly into swift, this expression would become: |
| 11 | + |
| 12 | +```swift |
| 13 | +extension Complex { |
| 14 | + func naïveDivision(_ z: Complex, _ w: Complex) -> Complex { |
| 15 | + (z * w.conjugate).divded(by: w.lengthSquared) |
| 16 | + } |
| 17 | +} |
| 18 | +``` |
| 19 | + |
| 20 | +If both z and w are both "well-scaled" (i.e. their exponents are neither very |
| 21 | +large nor very small), then this expression works reasonably well when |
| 22 | +evaluated in floating-point. Cancellation may occur in the computation of |
| 23 | +either the real or the imaginary part of the zw̅ term, but not in both |
| 24 | +simultaneously, so the relative error in any vector norm is bounded by a |
| 25 | +small multiple of machine epsilon. |
| 26 | + |
| 27 | +However, if z or w is not well-scaled, then either zw̅ or |w|² or both may |
| 28 | +underflow or overflow, even when the actual quotient may be perfectly |
| 29 | +representable. For example, if we have |
| 30 | + |
| 31 | +```swift |
| 32 | +let a: Float = 1.84467441E+19 // large, but not outrageously large |
| 33 | +let z = Complex(a, a) |
| 34 | +let w = z |
| 35 | +``` |
| 36 | + |
| 37 | +then both `(z * w.conjugate)` and `w.lengthSquared` overflow, and the result |
| 38 | +of `naïveDivision` is `(NaN, NaN)` even though the actual mathematical quotient |
| 39 | +is 1. |
| 40 | + |
| 41 | +At least `(NaN, NaN)` is obviously wrong; when intermediate results underflow |
| 42 | +instead of overflow, we can instead get bogus results that are not obvious at |
| 43 | +all. For example: |
| 44 | + |
| 45 | +```swift |
| 46 | +let z = Complex<Float>(imaginary: 6.6174449e-24) |
| 47 | +let w = Complex<Float>(5.29395592e-23, 3.97046694e-23) |
| 48 | +let q = naïveDivision(z, w) |
| 49 | +``` |
| 50 | + |
| 51 | +`q` is computed as `(0.666666686, 0.666666686)`, which looks like a reasonable |
| 52 | +result, but the exact `z/w` would be `(0.6, 0.8)`; both the phase and magnitude |
| 53 | +of the computed result have large errors, and there is no indication that |
| 54 | +anything went wrong. |
| 55 | + |
| 56 | +## Complex division in Swift Numerics |
| 57 | + |
| 58 | +### Goals |
| 59 | + |
| 60 | +Programming languages and libraries have developed a variety of approaches to |
| 61 | +handle this problem. Swift Numerics' approach is somewhat novel, so I will |
| 62 | +first lay out the considerations that lead us to it. |
| 63 | + |
| 64 | +1. Division should be reasonably fast, and permit compiler optimizations that |
| 65 | + make it faster. So long as all inputs are well-scaled, it should not be |
| 66 | + much slower than multiplication. |
| 67 | + |
| 68 | +2. Division should have the same error characteristics as multiplication; |
| 69 | + componentwise error bounds and correct rounding are non-goals. Good error |
| 70 | + bounds in a complex norm are what matters. |
| 71 | + |
| 72 | +3. Division should not be any more sensitive to undue overflow or underflow |
| 73 | + than multiplication is (i.e. it is OK for results very near the boundaries |
| 74 | + to overflow or underflow even though the exact result would be |
| 75 | + representable, but results that are far from those boundaries should always |
| 76 | + be computed with good accuracy). |
| 77 | + |
| 78 | +### Fast path |
| 79 | + |
| 80 | +These considerations lead us to a first draft of our implementation; since we |
| 81 | +want division to be roughly like multiplication, we will "simply" turn it into |
| 82 | +a multiplication via the naïve formula: |
| 83 | + |
| 84 | +```swift |
| 85 | +public static func /(z: Complex, w: Complex) -> Complex { |
| 86 | + let lenSq = w.lengthSquared |
| 87 | + guard lenSq.isNormal else { /* naive formula will not work */ } |
| 88 | + return z * w.conjugate.divided(by: lenSq) |
| 89 | +} |
| 90 | +``` |
| 91 | + |
| 92 | +if `lenSq` can be computed without overflow or underflow, then |
| 93 | +`w.conjugate.divided(by: lenSq)` is a (rounded) reciprocal `1/w`, and so |
| 94 | +multiplying that by `z` satisfies the goals that we laid out above. Note |
| 95 | +in particular that if we are dividing multiple values by a single divisor, |
| 96 | +the reciprocal is a constant, and this check only has to happen once. |
| 97 | + |
| 98 | +That leaves us just needing to figure out what to do when the reciprocal |
| 99 | +is not well-scaled. |
| 100 | + |
| 101 | +### Slow path |
| 102 | + |
| 103 | +When an approximate reciprocal for the divisor is not representable, we use |
| 104 | +a scaling algorithm adapted from Doug Priest's "Efficient Scaling for |
| 105 | +Complex Division". In hand-wavy form, Priest's idea is: |
| 106 | + |
| 107 | +1. choose a scale factor s ~ |w|^(-¾) |
| 108 | +2. let wʹ ← sw |
| 109 | +2. let zʹ ← sz |
| 110 | + |
| 111 | +Note that we _can_ compute a reciprocal for wʹ without overflow or underflow, |
| 112 | +because |wʹ| ~ |w|^(¼), so |wʹ²| ~ |w|^(½) and so |w̅ʹ/wʹ²| ~ |w|^(-¼).¹ |
| 113 | +To compute z/w, we compute zʹ/wʹ. sz might overflow or underflow, but only |
| 114 | +if the final computation would overflow or underflow anyway. This is because |
| 115 | +division by w is a dilation combined with a rotation; because |w| is bounded |
| 116 | +away from 1, the multiplication by s is a dilation if and only if division |
| 117 | +by w is also dilation (this is where we diverge from Priest; because he |
| 118 | +doesn't have a fast-path for well-scaled w, he has to handle this case |
| 119 | +carefully). |
| 120 | + |
| 121 | +Like Priest, we arrange for s to be a power of the radix with the desired |
| 122 | +scale, which makes the computation of wʹ and zʹ exact (unless zʹ over- or |
| 123 | +underflows). This achieves one more desirable property: barring underflow, |
| 124 | +rescaling both w and z by the radix does not change results when we move |
| 125 | +between the fast and slow algorithms. |
| 126 | + |
| 127 | +Instead of `zʹ/wʹ = zʹ(w̅ʹ/|wʹ|²)`, Priest uses `(zw″)/|wʹ|²`. There isn't |
| 128 | +a strong reason to favor one or the other numerically, but our formulation |
| 129 | +lets us preserve `z/w = z * w.reciprocal` under scaling by powers of the |
| 130 | +radix (up to the underflow boundary), which is somewhat nice to have, and |
| 131 | +allows us to extract a little bit more ILP in some cases. |
| 132 | + |
| 133 | +---- |
| 134 | +### Notes |
| 135 | + |
| 136 | +¹ This analysis fails for types like Float16 where the number of |
| 137 | +significand bits is on the order of the greatest exponent because the desired |
| 138 | +s may not be representable if w is subnormal. Note that the problem is _only_ |
| 139 | +in the representation of s, so we can handle this case by rescaling in two |
| 140 | +steps. |
0 commit comments