Note: I have looked for issues/PRs mentioning SIMD, but they are all about using SIMD to accelerate operations on the partials. Here the goal is for the primal itself to be a SIMD vector, which is quite different.
I am combining explicit vectorization with SIMD.jl and AD of small functions (a few real inputs, a single real output) with ForwardDiff.
```julia
import ForwardDiff as FD
import SIMD
x = SIMD.Vec(randn(4)...)
xx = FD.Dual(x,one(x))
yy = xx/exp(xx*xx)
```
A typical use is computing partial derivatives of that function for each value in a large array. The above works if I commit the following type piracy:
```julia
const SVec{F,N} = SIMD.Vec{N,F}

# Allow SIMD.Vec as the value type of a Dual whenever its element type qualifies
@inline FD.can_dual(::Type{SIMD.Vec{N, F}}) where {N, F} = FD.can_dual(F)

# Products of a partial and a value: vector-vector and mixed vector-scalar
@inline FD._mul_partial(partial::SIMD.Vec, x::SIMD.Vec) = partial * x
@inline FD._mul_partial(partial::SVec{F}, x::F) where F = partial * x
@inline FD._mul_partial(partial::F, x::SVec{F}) where F = partial * x

# Scaling Partials by a SIMD vector
@inline Base.:*(x::SIMD.Vec, partials::FD.Partials) = partials*x
@inline function Base.:*(partials::FD.Partials, x::SIMD.Vec)
    return FD.Partials(FD.scale_tuple(partials.values, x))
end

# Return value for unary definitions
@inline function FD.dual_definition_retval(::Val{T}, val::S, deriv::S, partial::FD.Partials{M,S}) where {T,F,N,M, S<:SVec{F,N}}
    return FD.Dual{T}(val, deriv*partial)
end

# Return values for binary definitions: vector-scalar, scalar-vector and vector-vector
@inline function FD.dual_definition_retval(::Val{T}, val::S, deriv1::S, partial1::FD.Partials{M,S}, deriv2::F, partial2::FD.Partials{M,F}) where {T,F,N,M, S<:SVec{F,N}}
    return FD.Dual{T}(val, FD._mul_partials(partial1, partial2, deriv1, deriv2))
end
@inline function FD.dual_definition_retval(::Val{T}, val::S, deriv1::F, partial1::FD.Partials{M,F}, deriv2::S, partial2::FD.Partials{M,S}) where {T,F,N,M, S<:SVec{F,N}}
    return FD.Dual{T}(val, FD._mul_partials(partial1, partial2, deriv1, deriv2))
end
@inline function FD.dual_definition_retval(::Val{T}, val::S, deriv1::S, partial1::FD.Partials{M,S}, deriv2::S, partial2::FD.Partials{M,S}) where {T,F,N,M, S<:SVec{F,N}}
    return FD.Dual{T}(val, FD._mul_partials(partial1, partial2, deriv1, deriv2))
end
```
I also use SIMDMathFunctions to vectorize math functions such as cos and exp. Overall this gives pretty good performance. In particular, I don't have to give up SIMD when computing derivatives, which would incur a big performance hit.
If there is interest in this feature, I can work out a PR. From there I would need some guidance to implement appropriate tests.
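
As a possible starting point for tests, here is a minimal sketch of a scalar reference that a SIMD result could be compared against lane by lane. It only uses the public `ForwardDiff.derivative` on plain `Float64` inputs and does not exercise the SIMD path. The names `f`, `df_exact`, `xs` and `dfs` are illustrative, not part of any package, and the closed form assumes the example function `x/exp(x*x)` from above.

```julia
import ForwardDiff as FD

# Example function from above, written for a generic argument type
f(x) = x / exp(x * x)

# Closed-form derivative of x*exp(-x^2), used as an independent check
df_exact(x) = (1 - 2 * x^2) * exp(-x^2)

# Scalar reference: one derivative per array element, no SIMD involved
xs = randn(1024)
dfs = FD.derivative.(f, xs)

# The AD result should match the closed form up to rounding
@assert dfs ≈ df_exact.(xs)
```

A test of the SIMD path could then load the same `xs` in chunks, run the vectorized dual computation, and check each lane of the partials against the corresponding entry of `dfs`.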