docs/src/models/advanced.md  (+16 -9)
@@ -18,8 +18,8 @@ function (m::CustomModel)(x)
  return m.chain(x) + x
end

-# Call @functor to allow for training. Described below in more detail.
-Flux.@functor CustomModel
+# Call @layer to allow for training. Described below in more detail.
+Flux.@layer CustomModel
```

You can then use the model like:
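As an aside (not part of the diff), here is a minimal sketch of how the updated snippet is used end to end; the `Chain` sizes below are illustrative, chosen only so the skip connection `m.chain(x) + x` is shape-compatible:

```julia
using Flux

struct CustomModel{T <: Chain}   # the docs example wraps a Chain
  chain::T
end

# Forward pass with a skip connection; requires matching input/output sizes.
(m::CustomModel)(x) = m.chain(x) + x

Flux.@layer CustomModel          # lets Flux collect parameters and move them with gpu/f32

model = CustomModel(Chain(Dense(10 => 10, relu), Dense(10 => 10)))
model(rand(Float32, 10))         # 10-element output
```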
@@ -39,15 +39,15 @@ Taking reference from our example `Affine` layer from the [basics](@ref man-basi
By default all the fields in the `Affine` type are collected as its parameters; however, in some cases it may be desirable to hold other metadata in our "layers" that is not needed for training, and should hence be ignored while the parameters are collected. With Flux, the way to mark some fields of our layer as trainable is through overloading the `trainable` function:
-It is also possible to further restrict what fields are seen by writing `@functor Affine (W,)`. However, this is not recommended. This requires the `struct` to have a corresponding constructor that accepts only `W` as an argument, and the ignored fields will not be seen by functions like `gpu` (which is usually undesired).
+The exact same method of `trainable` can also be defined using the macro, for convenience:
+
+```julia
+Flux.@layer Affine trainable=(W,)
+```
+
+There is a second, more severe, kind of restriction possible. This is not recommended, but is included here for completeness. Calling `Functors.@functor Affine (W,)` means that no exploration of the model will ever visit the other fields: they will not be moved to the GPU by [`gpu`](@ref), and their precision will not be changed by `f32`. This requires the `struct` to have a corresponding constructor that accepts only `W` as an argument.
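To make the two restriction mechanisms above concrete, here is a small sketch, assuming the `Affine` layer from the basics page with fields `W` and `b`; it is an illustration, not text from the diff:

```julia
using Flux

struct Affine
  W
  b
end
Affine(in::Int, out::Int) = Affine(randn(Float32, out, in), zeros(Float32, out))
(m::Affine)(x) = m.W * x .+ m.b

# Milder restriction: b is still seen by gpu/f32, but only W is optimised.
Flux.@layer Affine trainable=(W,)
# Equivalent hand-written form (do not define both):
# Flux.trainable(m::Affine) = (; W = m.W)

m = Affine(3, 2)
Flux.trainable(m)   # (W = ...,) — b is excluded from training

# Severer restriction (not recommended): `Functors.@functor Affine (W,)`
# would hide b from gpu/f32 as well.
```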
Notice that we parameterized the type of the `paths` field. This is necessary for fast Julia code; in general, `T` might be a `Tuple` or `Vector`, but we don't need to pay attention to what it specifically is. The same goes for the `combine` field.
-The next step is to use [`Functors.@functor`](@ref) to make our struct behave like a Flux layer. This is important so that calling `params` on a `Join` returns the underlying weight arrays on each path.
+The next step is to use [`Flux.@layer`](@ref) to make our struct behave like a Flux layer. This is important so that calling `params` on a `Join` returns the underlying weight arrays on each path.
```julia
-Flux.@functor Join
+Flux.@layer Join
```

Finally, we define the forward pass. For `Join`, this means applying each `path` in `paths` to each input array, then using `combine` to merge the results.
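For reference, the complete `Join` definition sketched along the lines the page describes (the `combine` function and layer sizes in the usage line are illustrative):

```julia
using Flux

struct Join{T, F}
  combine::F
  paths::T
end

# Allow Join(op, m1, m2, ...) in addition to Join(op, (m1, m2, ...)).
Join(combine, paths...) = Join(combine, paths)

Flux.@layer Join   # so the weights inside each path are visible to Flux

# Forward pass: apply each path to its own input, then merge with `combine`.
(m::Join)(xs::Tuple) = m.combine(map((f, x) -> f(x), m.paths, xs)...)
(m::Join)(xs...) = m(xs)

model = Join(vcat, Dense(2 => 3, relu), Dense(4 => 1))
model((rand(Float32, 2), rand(Float32, 4)))   # 4-element output (3 + 1)
```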
@@ -194,7 +201,7 @@ model(xs)
Our custom `Split` layer will accept a single input, then pass the input through a separate path to produce multiple outputs.

-We start by following the same steps as the `Join` layer: define a struct, use [`Functors.@functor`](@ref), and define the forward pass.
+We start by following the same steps as the `Join` layer: define a struct, use [`@layer`](@ref), and define the forward pass.
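And a corresponding sketch for `Split` (again illustrative, with arbitrary path sizes):

```julia
using Flux

struct Split{T}
  paths::T
end

Split(paths...) = Split(paths)

Flux.@layer Split

# Forward pass: feed the same input through every path, return a tuple of outputs.
(m::Split)(x::AbstractArray) = map(f -> f(x), m.paths)

model = Split(Dense(4 => 2), Dense(4 => 1, sigmoid))
y1, y2 = model(rand(Float32, 4))   # a 2-element and a 1-element output
```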
docs/src/models/basics.md  (+7 -2)
@@ -257,8 +257,8 @@ m(5) # => 26
There is still one problem with this `Affine` layer: Flux does not know to look inside it. This means that [`Flux.train!`](@ref) won't see its parameters, nor will [`gpu`](@ref) be able to move them to your GPU. These features are enabled by the [`@layer`](@ref Flux.@layer) macro:

-```
-Flux.@functor Affine
+```julia
+Flux.@layer Affine
```

Finally, most Flux layers make bias optional, and allow you to supply the function used for generating random weights. We can easily add these refinements to the `Affine` layer as follows, using the helper function [`create_bias`](@ref Flux.create_bias):
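A sketch of what those refinements can look like (the keyword names and default initialiser below are illustrative, not quoted from the page):

```julia
using Flux

struct Affine
  W
  b
end
(m::Affine)(x) = m.W * x .+ m.b

Flux.@layer Affine

# Optional bias and a configurable weight initialiser.
function Affine((in, out)::Pair; bias = true, init = Flux.glorot_uniform)
  W = init(out, in)
  b = Flux.create_bias(W, bias, out)   # zeros(out), `false`, or a user-supplied vector
  return Affine(W, b)
end

Affine(3 => 2)(rand(Float32, 3))                 # trainable bias vector
Affine(3 => 2, bias = false)(rand(Float32, 3))   # bias fixed at zero; `.+ false` is a no-op
```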
src/Flux.jl  (+4 -1)
@@ -9,6 +9,7 @@ using MacroTools: @forward
@reexport using NNlib
using MLUtils
+const stack = MLUtils.stack  # now exported by Base
import Optimisers: Optimisers, trainable, destructure  # before v0.13, Flux owned these functions
using Optimisers: freeze!, thaw!, adjust!
using Random: default_rng
@@ -69,14 +70,16 @@ include("functor.jl")
# Pirate error to catch a common mistake.
Functors.functor(::Type{<:MLUtils.DataLoader}, x) = error("`DataLoader` does not support Functors.jl, thus functions like `Flux.gpu` will not act on its contents.")
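As a hedged illustration of the mistake this guards against (usage assumed, not taken from the diff): a `DataLoader` is not a Functors.jl tree, so fmap-based utilities cannot reach the arrays inside it, and the usual pattern is to move each batch as it is consumed:

```julia
using Flux, MLUtils

loader = DataLoader((rand(Float32, 2, 100), rand(Float32, 1, 100)), batchsize = 10)

# Walking `loader` with Functors-based tools now raises the informative error
# defined above instead of silently leaving its contents untouched.
for (x, y) in loader
    x, y = gpu(x), gpu(y)   # move one batch at a time
    # ... forward pass, loss, gradient step ...
end
```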