Skip to content

Commit cbdb286

Browse files
committed
Added some notes on complex division.
1 parent be0d7c6 commit cbdb286

File tree

1 file changed

+140
-0
lines changed

1 file changed

+140
-0
lines changed
Lines changed: 140 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,140 @@
1+
# Notes on Complex Division
2+
3+
A discussion of the algorithms used to implement complex division
4+
5+
## Overview
6+
7+
How does compute the quotient of two complex numbers z/w? The naïve schoolbook
8+
answer is to turn it into complex multiplication by scaling both numerator and
9+
denominator by the conjugate of w: z/w = (zw̅)/(ww̅) = (zw̅)/|w|². Translated
10+
directly into swift, this expression would become:
11+
12+
```swift
13+
extension Complex {
14+
func naïveDivision(_ z: Complex, _ w: Complex) -> Complex {
15+
(z * w.conjugate).divded(by: w.lengthSquared)
16+
}
17+
}
18+
```
19+
20+
If both z and w are both "well-scaled" (i.e. their exponents are neither very
21+
large nor very small), then this expression works reasonably well when
22+
evaluated in floating-point. Cancellation may occur in the computation of
23+
either the real or the imaginary part of the zw̅ term, but not in both
24+
simultaneously, so the relative error in any vector norm is bounded by a
25+
small multiple of machine epsilon.
26+
27+
However, if z or w is not well-scaled, then either zw̅ or |w|² or both may
28+
underflow or overflow, even when the actual quotient may be perfectly
29+
representable. For example, if we have
30+
31+
```swift
32+
let a: Float = 1.84467441E+19 // large, but not outrageously large
33+
let z = Complex(a, a)
34+
let w = z
35+
```
36+
37+
then both `(z * w.conjugate)` and `w.lengthSquared` overflow, and the result
38+
of `naïveDivision` is `(NaN, NaN)` even though the actual mathematical quotient
39+
is 1.
40+
41+
At least `(NaN, NaN)` is obviously wrong; when intermediate results underflow
42+
instead of overflow, we can instead get bogus results that are not obvious at
43+
all. For example:
44+
45+
```swift
46+
let z = Complex<Float>(imaginary: 6.6174449e-24)
47+
let w = Complex<Float>(5.29395592e-23, 3.97046694e-23)
48+
let q = naïveDivision(z, w)
49+
```
50+
51+
`q` is computed as `(0.666666686, 0.666666686)`, which looks like a reasonable
52+
result, but the exact `z/w` would be `(0.6, 0.8)`; both the phase and magnitude
53+
of the computed result have large errors, and there is no indication that
54+
anything went wrong.
55+
56+
## Complex division in Swift Numerics
57+
58+
### Goals
59+
60+
Programming languages and libraries have developed a variety of approaches to
61+
handle this problem. Swift Numerics' approach is somewhat novel, so I will
62+
first lay out the considerations that lead us to it.
63+
64+
1. Division should be reasonably fast, and permit compiler optimizations that
65+
make it faster. So long as all inputs are well-scaled, it should not be
66+
much slower than multiplication.
67+
68+
2. Division should have the same error characteristics as multiplication;
69+
componentwise error bounds and correct rounding are non-goals. Good error
70+
bounds in a complex norm are what matters.
71+
72+
3. Division should not be any more sensitive to undue overflow or underflow
73+
than multiplication is (i.e. it is OK for results very near the boundaries
74+
to overflow or underflow even though the exact result would be
75+
representable, but results that are far from those boundaries should always
76+
be computed with good accuracy).
77+
78+
### Fast path
79+
80+
These considerations lead us to a first draft of our implementation; since we
81+
want division to be roughly like multiplication, we will "simply" turn it into
82+
a multiplication via the naïve formula:
83+
84+
```swift
85+
public static func /(z: Complex, w: Complex) -> Complex {
86+
let lenSq = w.lengthSquared
87+
guard lenSq.isNormal else { /* naive formula will not work */ }
88+
return z * w.conjugate.divided(by: lenSq)
89+
}
90+
```
91+
92+
if `lenSq` can be computed without overflow or underflow, then
93+
`w.conjugate.divided(by: lenSq)` is a (rounded) reciprocal `1/w`, and so
94+
multiplying that by `z` satisfies the goals that we laid out above. Note
95+
in particular that if we are dividing multiple values by a single divisor,
96+
the reciprocal is a constant, and this check only has to happen once.
97+
98+
That leaves us just needing to figure out what to do when the reciprocal
99+
is not well-scaled.
100+
101+
### Slow path
102+
103+
When an approximate reciprocal for the divisor is not representable, we use
104+
a scaling algorithm adapted from Doug Priest's "Efficient Scaling for
105+
Complex Division". In hand-wavy form, Priest's idea is:
106+
107+
1. choose a scale factor s ~ |w|^(-¾)
108+
2. let wʹ ← sw
109+
2. let zʹ ← sz
110+
111+
Note that we _can_ compute a reciprocal for wʹ without overflow or underflow,
112+
because |wʹ| ~ |w|^(¼), so |wʹ²| ~ |w|^(½) and so |w̅ʹ/wʹ²| ~ |w|^(-¼).¹
113+
To compute z/w, we compute zʹ/wʹ. sz might overflow or underflow, but only
114+
if the final computation would overflow or underflow anyway. This is because
115+
division by w is a dilation combined with a rotation; because |w| is bounded
116+
away from 1, the multiplication by s is a dilation if and only if division
117+
by w is also dilation (this is where we diverge from Priest; because he
118+
doesn't have a fast-path for well-scaled w, he has to handle this case
119+
carefully).
120+
121+
Like Priest, we arrange for s to be a power of the radix with the desired
122+
scale, which makes the computation of wʹ and zʹ exact (unless zʹ over- or
123+
underflows). This achieves one more desirable property: barring underflow,
124+
rescaling both w and z by the radix does not change results when we move
125+
between the fast and slow algorithms.
126+
127+
Instead of `zʹ/wʹ = zʹ(w̅ʹ/|wʹ|²)`, Priest uses `(zw″)/|wʹ|²`. There isn't
128+
a strong reason to favor one or the other numerically, but our formulation
129+
lets us preserve `z/w = z * w.reciprocal` under scaling by powers of the
130+
radix (up to the underflow boundary), which is somewhat nice to have, and
131+
allows us to extract a little bit more ILP in some cases.
132+
133+
----
134+
### Notes
135+
136+
¹ This analysis fails for types like Float16 where the number of
137+
significand bits is on the order of the greatest exponent because the desired
138+
s may not be representable if w is subnormal. Note that the problem is _only_
139+
in the representation of s, so we can handle this case by rescaling in two
140+
steps.

0 commit comments

Comments
 (0)