Deep and convolutional neural networks with feedback operations in Flux.
This package implements deep neural networks with feedback. This means that the output of higher/later layers can serve as an input to lower/earlier layers at the next timestep.
Most deep learning frameworks do not support this form of recurrence in a straightforward manner. Usually recurrence is limited to a single layer, implemented as an RNN cell. This package essentially turns the whole network into a single RNN cell with support for arbitrary connectivity.
The package can be installed using
    using Pkg
    Pkg.add("FeedbackNets")
or using the REPL shorthand
    ] add FeedbackNets
The package depends on Flux and on CuArrays for GPU support. For more details on Julia package management, see the Pkg documentation.
Once the package is installed, you can load it in a Julia session:
    using FeedbackNets
Typically, you'll want to load Flux as well for its network layers:
    using Flux
The core of the package is the FeedbackChain, a type that behaves much like a normal Flux.Chain. It treats normal Flux layers as one would expect. However, it can contain two additional elements: Splitters and Mergers. These two types are used to structure feedback in a network, i.e., to enable higher levels of the chain to provide input to lower levels in the next timestep.
A Splitter marks a point in the forward stream from which feedback is provided. When a FeedbackChain traverses the feedforward stream, it records the intermediate output at each Splitter and adds it to a state dictionary.
A Merger marks a location at which feedback is folded back into the feedforward stream. Each Merger contains the name of the Splitter from which it gets feedback, an operation (e.g., a ConvTranspose or a Chain) to apply to the feedback, and a binary operation (e.g., +) which it applies to combine forward and feedback input.
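The interplay of the two element types described above can be illustrated with a toy model in plain Julia. This is only a sketch of the idea, not the package's implementation: ToySplitter, ToyMerger and toy_step are hypothetical names invented for this example, and the argument order passed to the binary operation is an assumption.

```julia
# Toy stand-ins for illustration only; these are NOT the FeedbackNets types.
struct ToySplitter
    name::String   # key under which the intermediate output is stored
end

struct ToyMerger
    name::String   # name of the splitter that supplies the feedback
    fb             # operation applied to the feedback signal
    op             # binary operation combining forward and feedback input
end

# Traverse the layers once (one timestep). `state` holds the splitter outputs
# of the previous timestep; the returned dictionary holds the new ones.
function toy_step(layers, state::Dict, x)
    newstate = Dict{String,Any}()
    for layer in layers
        if layer isa ToySplitter
            newstate[layer.name] = x                      # record intermediate output
        elseif layer isa ToyMerger
            x = layer.op(x, layer.fb(state[layer.name]))  # fold in previous feedback
        else
            x = layer(x)                                  # ordinary layer: apply it
        end
    end
    return newstate, x
end

# A merger fed by splitter "s", one ordinary layer, then the splitter itself.
layers = (ToyMerger("s", v -> 2 .* v, +), v -> v .+ 1, ToySplitter("s"))
state = Dict{String,Any}("s" => [0.0, 0.0])

state, out1 = toy_step(layers, state, [1.0, 1.0])  # feedback is still zero: out1 == [2.0, 2.0]
state, out2 = toy_step(layers, state, [1.0, 1.0])  # feedback from step 1:   out2 == [6.0, 6.0]
```

The same input produces a different output on the second call because the state recorded at the splitter in the first step re-enters through the merger.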
For example, a simple FeedbackChain may contain a Dense layer that maps ten input units to five outputs and a feedback path that has another Dense layer with the inverse connectivity.
    net = FeedbackChain(
        Merger("fork1", Dense(5, 10, relu), +),
        Dense(10, 5, relu),
        Splitter("fork1")
    )
At each timestep, this network will take the previous state of fork1, pass it through the 5-to-10 unit Dense layer and add it to the 10-unit input. The result is then passed through the 10-to-5 Dense layer to produce the output of the network, which is stored for the next timestep by the Splitter.
In order to apply net to an input, we need to pass it a dictionary with the current / initial state of fork1:
    x = randn(10)
    h = Dict("fork1" => zeros(5))
    h, out = net(h, x)
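Because the call returns the updated state dictionary along with the output, a sequence of inputs can presumably be processed by threading that dictionary through successive calls. The following is a minimal sketch that assumes only the calling convention shown above; unroll and xs are hypothetical names introduced for illustration and are not part of the package.

```julia
using Flux
using FeedbackNets

net = FeedbackChain(
    Merger("fork1", Dense(5, 10, relu), +),
    Dense(10, 5, relu),
    Splitter("fork1")
)

# Hypothetical helper: apply `net` to each input in turn, feeding the state
# returned by one call into the next call.
function unroll(net, h, xs)
    outs = Any[]
    for x in xs
        h, out = net(h, x)   # same calling convention as in the example above
        push!(outs, out)
    end
    return h, outs
end

xs = [randn(10) for _ in 1:3]                        # three timesteps of input
h, outs = unroll(net, Dict("fork1" => zeros(5)), xs)
```

Each element of outs is the network output for one timestep, and h is the state left behind after the last one.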
A FeedbackChain can be wrapped in a Flux.Recur in order to have it handle the state internally. This requires that an initial state dictionary is provided.
    net = Flux.Recur(net, h)
    out = net(x)
The project is MIT licensed. See LICENSE for details.