Indeterminacy for multiple transforms #688
Replies: 10 comments
-
We should try with debug=2 to see if there is any difference in the parameters chosen.
-
Right. For 2.4.0rc1, we get one set of parameters, while 2.3.0 gives us another. So the main difference seems to be the automatic choice of upsampfac; indeed, manually overriding upsampfac confirms this.
-
I tuned the heuristic because it is faster. According to the error analysis and #677 it should work.
-
Dear Joakim,

So, I examined this fascinating failure pretty closely. I do not consider it a failure of finufft to provide outputs within the requested accuracy, so I don't think there is anything that actually has to be fixed, as I will explain. But it raises a new type of concern when using upsampfac=1.25 at eps<1e-7, where ns=13...15, close to its max.

What your test code demands is that a multithreaded ntrans=6 produce a "close" answer to doing those transforms separately. (In particular, you've tested index [1,1,0] of the (2,3,1)-shaped set of 6.) Now, this relative change (due to arithmetic order, not tolerance) is about 5e4 times epsmach, which is admittedly a large prefactor: it turns out that upsampfac=1.25 amplifies this prefactor here more than usf=2.0, which seems to max out at a prefactor of <10, i.e., 15 digits are stable. I believe the cause is larger amplification in the deconv step; I will look into that right now.

I do propose to add an alert in the docs. I also propose to tweak Marco's heuristics to push usf=2.0 to be used for eps<=3e-8, so we avoid the ns>12 cases with usf=1.25, at least as defaults. I also propose that you change your unit tests in ASPIRE, since you're testing for a feature that is not promised: repeatability to around 11 digits of doing a transform with different threading, when the tolerance is only 8 digits.

Do you agree? Any suggestions to the above, before we release 2.4.0 asap?

Best wishes, Alex
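The arithmetic-order effect described here can be seen without finufft at all: summing the same floating-point numbers in two different orders (as different thread partitions effectively do) gives answers that differ by a small multiple of epsmach. A minimal standalone Python sketch; the 1e5-term sum is an arbitrary stand-in for the spread/FFT arithmetic, not finufft code:

```python
import random

# Sum the same values in two different orders, mimicking how a different
# thread partition reorders the floating-point additions inside a transform.
random.seed(0)
vals = [random.random() for _ in range(100_000)]

s_fwd = sum(vals)             # one accumulation order
s_rev = sum(reversed(vals))   # another accumulation order
rel = abs(s_fwd - s_rev) / s_fwd

eps_mach = 2.0 ** -52
print(rel / eps_mach)  # prefactor: how many units of epsmach the orders differ by
```

The two sums agree to roughly machine precision times a modest prefactor; it is this prefactor that the deconvolution step can then amplify.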
-
Let me add that rounding error due to the condition number of the problem (which the different arithmetic order exposes) should also scale like max(N1,N2)·epsmach, i.e. N1=512 in this case. This would lead one to expect about 13 digits of stability, best case. (Why usf=2.0 is better than this, I don't know.) So 11 digits does not feel so bad after all.
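As a quick sanity check of that estimate (a back-of-envelope computation; N1=512 is from the report, N2=512 is an assumption):

```python
import math

eps_mach = 2.0 ** -52
N1, N2 = 512, 512          # N1=512 from the report; N2 is an assumption here
bound = max(N1, N2) * eps_mach
digits = -math.log10(bound)
print(bound, digits)       # roughly 1e-13, i.e. about 13 digits of best-case stability
```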
-
Indeed, I just checked the maximum amplification factor relative to the zero frequency, which is called r_{dyn} in the paper (see eqns (3.21) and (4.5)), with a Matlab verification. Whether this is a problem I don't know. usf=5/4 is very useful for speed, so I'll add docs re repeatability, and tweak heuristics.hpp to avoid ns>12 by default.
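For readers without Matlab, a rough standalone sketch of this kind of check is below. It assumes the exp-sqrt ("ES") kernel exp(beta*(sqrt(1-z^2)-1)), a beta ≈ 0.976·π·ns·(1−1/(2σ)) choice, and takes the amplification as the ratio of the kernel's Fourier transform at zero frequency to its value at the largest deconvolved frequency; the constants, scalings, and quadrature here are assumptions of this sketch, not finufft's actual code:

```python
import math

def phihat(xi, beta, n=4000):
    # Fourier transform of the ES kernel exp(beta*(sqrt(1-z^2)-1)) on [-1,1],
    # computed by plain trapezoid quadrature (the integrand is even and smooth
    # inside the interval, and tiny at the endpoints).
    h = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        z = -1.0 + j * h
        w = 0.5 if j in (0, n) else 1.0
        total += w * math.exp(beta * (math.sqrt(max(0.0, 1.0 - z * z)) - 1.0)) * math.cos(xi * z)
    return total * h

def r_dyn(sigma, ns):
    # Deconvolution amplification relative to the zero frequency: ratio of the
    # kernel transform at xi=0 to its value at the largest frequency kept
    # after upsampling by sigma (assumed scaling xi_max = pi*ns/(2*sigma)).
    beta = 0.976 * math.pi * ns * (1.0 - 1.0 / (2.0 * sigma))
    xi_max = math.pi * ns / (2.0 * sigma)
    return phihat(0.0, beta) / phihat(xi_max, beta)

print(r_dyn(2.0, 13))    # modest prefactor for sigma=2
print(r_dyn(1.25, 13))   # much larger prefactor for sigma=5/4
```

With ns=13, the σ=5/4 factor comes out roughly two orders of magnitude larger than the σ=2 one.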
-
I have pushed this to the docs. But I'm not sure heuristics.hpp needs tweaking. Await Joakim's input. Until tomorrow, Alex
-
Hi Alex,

Thanks for taking the time to go through this. Let me see if I've got this correctly:

1. The API does not guarantee that the same transform gives the same output up to machine epsilon. I know we had discussed this at some point, but I wasn't sure what the outcome was. The guarantee is only with respect to the eps parameter times the input l1-norm, got it.
2. Somehow, the different upsampling factor gives a larger error due to arithmetic order (why this depends on the usf, I don't quite understand here). Since the heuristics for picking the usf have changed, we see a different behavior in 2.4.0.
3. What I don't quite understand is how this error (due to arithmetic order) can be guaranteed to be below the overall eps error (well, I guess we don't make any strict guarantees, but still, how do we know that this won't get too big)?

Overall, it seems reasonable here to adjust our tolerances in ASPIRE given that we are not guaranteed machine epsilon accuracy with repeated transforms.
-
With regard to your question:

> 3. What I don't quite understand is how this error (due to arithmetic order) can be guaranteed to be below the overall eps error (well, I guess we don't make any strict guarantees, but still, how do we know that this won't get too big)?

Recall that r_{dyn} sets the amplification of rounding error differences that one would expect due to reordering (repeatability under different threading). So the repeatability variation should be O(r_{dyn} eps_{mach}) or possibly O(r_{dyn} eps_{mach} max(N_i)); I don't know which yet. r_{dyn} < 10 for usf=2.0, but r_{dyn} < 1e3 for usf=5/4. That addresses your question #2 too.

The only reason this is less than ``eps`` (the tol param), in float64 at least, is that ns gets too big for usf=5/4 for ``eps`` < 1e-9, so we always switch to usf=2 for such ``eps``.
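Plugging in these numbers gives a quick back-of-envelope check (assuming r_{dyn} ≈ 1e3 for usf=5/4 and max(N_i) = 512 from the failing case):

```python
eps_mach = 2.0 ** -52
r_dyn = 1e3     # rough upper bound for usf=5/4 (from the discussion)
N = 512         # max(N_i) in the failing test

b_small = r_dyn * eps_mach       # O(r_dyn eps_mach)
b_large = r_dyn * eps_mach * N   # O(r_dyn eps_mach max(N_i))
print(b_small, b_large)
```

The second estimate is about 1e-10, i.e. roughly the ~11 digits of repeatability observed in the failing test.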
Are you ok with us releasing now? Best, Alex
-
Ah I see. That makes sense to me then. Yes I'm fine with releasing.
-
When applying a 2D type 1 transform, we get slightly different results when computing multiple transforms at the same time compared to one transform at a time. The following script reproduces the error in 2.4.0rc1 but gives the expected results in 2.3.0:

data.zip

Specifically, for 2.3.0, we always pass the `allclose` check and the maximum error is between zero and `1e-10`, while for 2.4.0rc1, `allclose` usually fails (non-deterministically) with a maximum error around `1e-6`. The particular layout of the data seems important here. If I get rid of the reshapes, the errors drop somewhat (enough to not trigger the `allclose` check, but still higher compared to 2.3.0).
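For context, the comparison pattern at issue looks like the following standalone stand-in, which uses NumPy's batched FFT in place of finufft (the sizes are placeholders, not those of the report). On equispaced data the two paths agree essentially exactly; the point of this thread is that a multithreaded NUFFT reorders its arithmetic and need not:

```python
import numpy as np

rng = np.random.default_rng(0)
N1, N2, ntrans = 64, 64, 6        # placeholder sizes, not the report's
sig = rng.standard_normal((ntrans, N1, N2)) + 1j * rng.standard_normal((ntrans, N1, N2))

# All transforms in one batched call vs. one transform at a time.
batched = np.fft.fft2(sig, axes=(-2, -1))
looped = np.stack([np.fft.fft2(s) for s in sig])

err = np.max(np.abs(batched - looped)) / np.max(np.abs(looped))
print(err)
assert np.allclose(batched, looped, atol=1e-10)   # the kind of check ASPIRE makes
```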