New library: Reverse-mode automatic differentiation #1302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Working differentiation
added linear regression example
added autodiff reverse docs
@mborland @ckormanyos I wrote the first draft of the docs. It took a bit longer than I thought it would. Please take a look and let me know if there are any changes/additions needed. It's my first time working with the Boost documentation, but building the docs locally seems to change a bunch of HTML files for me. I didn't want to have 500+ files changed in the PR, so I've been doing
to ignore these changes, but some still go through. Is there a better way of doing this? Also @jzmaddock, thank you for taking a look at the library. I have a few responses to your comments below.
Unfortunately this seems to happen quite often. See lines 64, 68, and 215 of bessel.hpp, or 88/89 of detail/bessel_jy_asym.hpp. I'd need to take a more dedicated crack at
The special cases that return a constant result are indeed a big problem. I actually have a mechanism in my code for detecting expression types, so I think your solution makes sense.
Totally agree with you.
Ah, got you. In Multiprecision we fix this with:
So that if any special function is called with an expression template, then it (internally/magically) converts that to the "real" type. This means that in user code, if someone writes:
then it all still works, even though
@ckormanyos Some good news on the expression template on/off macro. I have a working version under the add_no_expression_template_support branch. With the change, the following program:

```cpp
#define BOOST_MATH_ET_OFF

#include <boost/math/differentiation/autodiff_reverse.hpp>
#include <boost/math/special_functions/bessel.hpp>
#include <boost/math/special_functions/gamma.hpp>
#include <iostream>

namespace rdiff = boost::math::differentiation::reverse_mode;

int main()
{
    constexpr std::size_t N = 1;
    using rvar = rdiff::rvar<double, N>;

    double x    = 2.1;
    rvar   x_ad = x;

    auto g = boost::math::tgamma(x_ad + x_ad);
    auto j = boost::math::cyl_bessel_j(0.25, x_ad + x_ad / 2);

    std::cout << "tgamma(x + x) = " << g << "\n";
    std::cout << "J_0.25(x) = " << j << "\n";

    auto& tape = rdiff::get_active_tape<double, 1>();

    g.backward();
    std::cout << "d/dx tgamma(x+x), autodiff = " << x_ad.adjoint()
              << " expected = " << 2 * boost::math::tgamma(2 * x) * boost::math::digamma(2 * x)
              << std::endl;

    tape.zero_grad();
    j.backward();
    std::cout << "d/dx J_0.25(x+x/2), autodiff = " << x_ad.adjoint() << " expected = "
              << 3.0 / 4.0
                     * (boost::math::cyl_bessel_j(-0.75, 3. * x / 2.)
                        - boost::math::cyl_bessel_j(1.25, 3. * x / 2.))
              << std::endl;

    return 0;
}
```

produces the expected gradients.
Wow, this seems very positive. I've also followed your discussion with John. I am of the opinion that, maybe down the road, we will get most or at least some of specfun working with both ET on as well as off.
Hmmm, I was wondering if you would like to use a PP-definition that has one more word in it. Maybe someday there will be several kinds of ETs in Math. What do you think about
I think that the
I'm fine with that. Would
That's a tough question. In Multiprecision, we use a templated enumeration type. In other words, one of the template parameters to the
Do you think it might be better to use a template parameter? It could default to either ET on or off, but that way, things might be more flexible or similar to the Multiprecision way? In Multiprecision, we default to
This would require a significant restructure of the code, I think. Right now, ET on and ET off simply decide which header file with function overloads to include.
That'll work. If you ever decide to change down the road, you can probably still remain compatible by using a default parameter value. So the simpler evolutionary step you suggest seems just fine for the moment.
These seem fine from above; if you're happy with them, I'm happy with them.
Co-authored-by: Matt Borland <[email protected]>
Co-authored-by: Matt Borland <[email protected]>
@mborland updated the docs based on your suggestions
Is there anything else needed from my end to get this merged?
I don't know why it was still necessary to approve your workflow runs (CI/CD on GHA). But I just did that, after overlooking the button for a week or so. I was sure we had approved them initially, but I guess not. Let's see how it runs and shakes out on GHA. Other than that, from my side (Christopher), it's good to go. So it might be a good idea to see how the code coverage looks. And if we don't like it, we can add tests either prior to or any time after the merge. Cc: @jzmaddock and @mborland
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##          develop    #1302    +/-   ##
===========================================
+ Coverage    90.47%   95.10%   +4.62%
===========================================
  Files          191      796     +605
  Lines        23771    67115   +43344
===========================================
+ Hits         21507    63830   +42323
- Misses        2264     3285    +1021
```

... and 667 files with indirect coverage changes.
Other than the main CMakeLists.txt inadvertently being deleted, I am good with this PR, assuming it runs green.
Strange. I added it back in.
Does it still make sense for the ET on / ET off change to be a separate PR? It's something that I have working. I guess it would be good to compartmentalize the discussion.
I think so. I'll merge this in, and then you can do more targeted feature branches.
Thanks @demroz, nice work. I'm excited to see how this area of Math evolves. Thanks also Matt and John.
Hi @demroz, is it appropriate or useful to add a phrase or extension of this sentence in the main docs?
I'm sorry, I don't understand what you mean exactly. Are you asking if it's appropriate to add a sentence about cost-function minimization to the autodiff docs? Or adding a sentence about autodiff to the README file?
This pull request introduces a new library for reverse-mode automatic differentiation. It's a tape-based reverse-mode autodiff: the idea is to build a computational graph and then call backward on the entire graph to compute all the gradients at once.
Currently it supports all the basic operations (+, -, *, /), everything listed in the conceptual requirements for real number types, and Boost calls to `erf`, `erfc`, `erf_inv`, and `erfc_inv`. Everything is tested up to the 4th derivative. The list of tests:

- test_reverse_mode_autodiff_basic_math_ops.cpp
- test_reverse_mode_autodiff_comparison_operators.cpp
- test_reverse_mode_autodiff_constructors.cpp
- test_reverse_mode_autodiff_error_functions.cpp
- test_reverse_mode_autodiff_flat_linear_allocator.cpp
- test_reverse_mode_autodiff_stl_support.cpp
There are also two examples in the example directory:

- reverse_mode_linear_regression_example.cpp -> a simple linear regression that demonstrates how this library can be used for optimization
- autodiff_reverse_black_scholes.cpp -> a rewrite of the forward-mode equivalent

Important notes
- `f` in this case is not actually of type `rvar`, but `add_expr<rvar, mult_expr<rvar, rvar>>`
- `new` for memory allocations. This is a deliberate design choice: the flat_linear_allocator destructor explicitly calls the destructors of the individual elements, so explicit calls to `delete` shouldn't be needed here.

Thank you, and looking forward to your feedback