-
-
Notifications
You must be signed in to change notification settings - Fork 84
deprecate derivatives and error_components #337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
jagerber48
wants to merge
5
commits into
lmfit:master
Choose a base branch
from
jagerber48:deprecate_derivatives_and_error_components
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+19
−0
Open
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
7131839
deprecate derivatives and error_components
jagerber48 4a3c390
message formatting
jagerber48 b187beb
typo
jagerber48 059dbdc
changelog
jagerber48 18b68f4
Merge remote-tracking branch 'lmfit/master' into deprecate_derivative…
jagerber48 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #339 I already asked if it had already been settled to change
error_componentsinstead of use a different name for the property and keep the method. I was also wondering aboutderivatives. Is it correct to summarize that the old code kept both the derivative and the weight of the uncertainty atom as separate values and multiplied them together to get the final uncertainty while the new code eagerly multiplies them and just keeps weights? What is the trade off there? Is the decision that keeping only the final weights makes the code simpler and the derivative information is not interesting enough to keep?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see if I can explain this difference. This is sort of the critical architecture difference so I should be able to.
The
AffineScalarFuncobject stored a mapping fromVariableconstituents to the derivative of the function with respect to thatVariable. When requested, e.g. duringstd_devcalculation, theerror_componentswere calculated by multiplying thestd_devof theVariableby its corresponding derivative. I think lebigot had the function f(x, y) = f(x0, y0) + df/dx dx + df/dy dy in mind.In the new framework we wanted to do away with the
Variable/AffineScalarFuncdistinction. Now we haveUAtomwhich doesn't inherit fromUFloat. In this case, for eachUFloat, we keep track of a mapping fromUAtomto a float weight. Thestd_devof aUAtomis always unity, so we don't need to keep track of it. But if oneUAtommoves through multiple operations we don't keep track of the intermediate derivates. The equation we have in mind is something like f(x, y) = f(x0, y0) + wa da + wb db + wc db + ... where da, db, dc areUAtomon whichxandyrandom variables my covariantly depend. We use derivatives to calculate updatedUCombolinear combination coefficients as as we do computations, but these derivative calculations are considered intermediate results which are NOT saved.So what you say is close to correct but not quite. The old code stores the derivatives and uses the derivatives and std_devs lazily when calculating composite std_dev. The new code doesn't store std_devs of UAtom's because they're always unity. It eagerly multiplies derivatives into the new linear combination. The driver for all of this is a move away from the idea of
AffineScalarFuncbeing a function and moving towards the idea ofUFloatbeing a random variable.What you say in the second sentence is essentially what I think. I think the random variable approach makes the code simpler. Getting rid of the derivatives concept allows us to eliminate the
AffineScalarFunc/Variableconfusion. The new framework doesn't even have a way that we could store these derivatives if we wanted to because we don't keep track of the "history" of a random variable. To do so you need a privileged class ofVariableobjects with respect to which derivatives can be taken and that is exactly what we were trying to eliminate. At #251 (comment) I give some more explication about what I think are the downsides of the derivatives framework.And then yes, I think the derivative information is not interesting to keep. I don't think
uncertaintiesshould try to make derivatives computation a selling point.numdifftoolsorsympyand others exist for that.I'll address your
error_componentsquestion in the other thread where I'm more open to a shim, but it is actually very important and not really optional for this refactor that we do get rid ofUFloat.derivatives.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't use
.derivativesmyself so I am not that attached to it. I do see some usage searching GitHub though. So it is hard for me to assess how important that feature is to users.I think
AffineScalarFuncvsVariableissue is separate from whether derivatives can be kept. HavingAffineScalarFuncorVariableinherit from the other and having both be user facing depending on what operations have been performed on an object with uncertainty I agree are not a good ways for things to operate for users or developers. I like the way #262 makesUFloat(replacingAffineScalarFunc) the only public facing uncertain object and makesUAtom(replacingVariable) completely hidden.I think though that the system in #262 could work similarly with
UAtomhaving astd_devattribute that held that uncertain variable's weight of uncertainty and withUFloatcreating aUCombowith thatUAtomwith a weight of 1 in theUCombo. In this case, theUComboweight is the derivative. The propagation would work similarly but just the weights would be propagating the derivatives and to get theUFloatoverall standard deviation would require multiplying theUComboweights by theUAtomweights before summing. So it doesn't seem that different to me. I don't love the idea of mutating thestd_devof a created value with uncertainty but, since equality is based onUFloatnominal value and the weights and uuids ofUAtoms, I don't think it actually breaks anything related to hashing or equality to do it (the action at a distance concern).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wshanks ok thanks, this is a helpful comment and I'm excited to have this discussion because it's very important I think. I forgot the details. Yes, I agree with you that the need to drop derivatives comes more from the idea that
UAtomshould have unity, rather than controllable,std_devrather than theAffineScalarFunc/Variableissue.So the downsides to letting
UAtomhave non-unitystd_devare:UAtomstd_devdon't appreciably impact performance.Which object would have a mutating std_dev? I think
UAtomwould not have mutatingstd_devunder your proposal.UComboalready has a_std_devattribute that mutates from "unset" to it's value the first time it is calculated. I think it would be the same under your proposal.Another downside of the derivative concept is that there is a question: with respect to what are we taking the derivative. In
masterthe answer is that we are taking the derivative of the special uncertain objects that were instantiated using theufloat,correlated_values, orcorrelated_values_normfunctions. Under your proposal is that we are taking derivatives with respect to theUAtomobjects that get instantiated duringufloat,correlated_values, orcorrelated_values_norm. But that gives a special privilege to certain uncertain numbers that I don't think is warranted. For example, I can doUnder your proposal, I can access the derivative of
z_xwith respect to the singleUAtomthat goes intoxand get the answer I'm expecting. This is kind of like the derivative ofz_xwith respect tox. I can also access the derivative ofz_ywith respect to the singleUAtomthat goes intoxand get the answer I'm expecting. But your proposal (and the code onmaster) doesn't provide a way to access anything like the derivative of eitherz_yorz_xwith respect toy. You could look at the derivatives of bothz_yandywith respect to theUAtomthat goes intoxand do some math on your end to squeeze something out, but this isn't really what you want.I think when you shift from thinking of
UFloatas a function (AffineScalarFunc) and move to thinking about it as a random variable, the importance of derivatives goes away. What exists, rather, is random variables and correlations between those random variables. I think the current code on #262 is an architecture that minimally accomplishes this data model.One final comment
UAtomis not "completely" hidden. When a user doesUFloat.error_componentsthey get a dictionary whose keys areUAtominstances. I also imagine a json serialization algorithm that uses theUAtomobjects and their uids to generate a human readable serialization of aUFloatobject. In my opinion, this means thatUAtomis part of the public API and needs to be documented. There is an argument to be made that we could ALSO eliminate theerror_componentsproperty/function. ThenUAtomwould be truly hidden and the case for eliminatingderivativeswould be closed. I'm not totally opposed to this idea since I find the use cases forerror_componentsa little strange, especially in light of the re-architecture. In this case users would shift to calculating correlations and covariances between differentUFloatobects to figure out how much the uncertainty inxcontributes to the uncertainty iny. This is the definition of covariance.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wshanks curious for your thoughts on deprecating
UFloat.derivatives. I think it has a poor data model that is very far removed for the4.xdata model ofUFloatas random variable. The issue is that theUFloat.derivativesattribute doesn't make sense for any combination of twoUFloatinstances. Inuncertainties < 4.0UFloat.derivativesat least relatedUFloatinstances but the relation wasn't symmetric. You could calculateddy/dx, but notdx/dy.In
uncertainties > 4.0it would relateUFloatinstances toUAtominstances which is even more awkward.I think the symmetric relationship between variables that we really want is
covarianceorcorrelationas I bring up in #345.@wshanks Can you share your methodology for finding usages of
UFloat.derivativeson github and share some of the examples you found? I would be curious to look at them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I just wanted to respond to newville's comment here about the action at a distance being more of a wart than a feature. When you keep the standard deviations inside the
Variable/UAtom, you leave open the ability for the user to mutateUAtom.std_devand change the error of downstreamUFloats. I just wanted to say that I still would not want that to be a supported feature but the user could do that if they wanted.Yes, I do think of the
Variable/UAtomobjects as privileged. I think this relates back to your previous comment:I don't think Uncertainties needs to be a tool for tracking derivatives of all computations, but the derivatives relative to the underlying independent variables do make sense to track to me because fundamentally we have these random variables of some size and then weights for how much those different random variables contribute to downstream calculations. Maybe derivatives is too generic a term because really what is is the weights of the independent random variables.
Right. What I meant was that when the user uses
ufloatto generate initial values with uncertainty they get objects that are the same type as all further calculated results instead of needing to care that at first they getVariableand then later those transform intoUFloatwhen they do something like multiply by 2. The rest of your point here is similar to the way the discussion is going in #336. We could decide that tracking the underlying independent variables is not interesting enough and the user just needs to use covariance methods to find those weights themselves if they want them.I gave a search string in #336 and commented some there. I actually could not find non-trivial usage of
.derivatives, onlyerror_components.I think I am coming around on the idea of dropping
derivativesanderror_components. If we did that, would we also deprecatetag? It is only useful for labeling theUAtoms which would then be hidden. On the flip side, I could see keeping #262 working the way it does but tagging thestd_devonto theUAtomobjects. Then it would be there in case a user did want to work out the weights of all theUAtoms like derivatives provides, but we might decide that that is niche enough that the user just needs to keep track of theufloats corresponding to independent variables themself.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wshanks You have lots of good interesting responses I want to get back to between this thread and the other one but my time is short. I want to just give a quick response now
I feel strongly about dropping
UFloat.derivatives, I will explain in more detail as I have time. I would prefer to dropUFloat.error_componentsandUAtom.tagbut, as you point out, there is a use case involving tags anderror_componentsthat we would then lose. This use case is outlined in the documentation. Again, as you point out, the user could keep track of this on their own and calculate covariances. I think we should provide a code snippet how the transition fromerror_componentstocovariancewould look. Your research revealed thatscinumis a serious library using this functionality. We could reach out to them directly to try to get their thoughts on this change.I think that if we drop
error_componentswe should also drop tags.