ReverseDiffSource.jl

Reverse automated differentiation from source
Popularity
47 Stars
Updated Last
2 Months Ago
Started In
February 2014

ReverseDiffSource.jl

Reverse automated differentiation from an expression or a function

Julia 0.3 Julia 0.4 Julia 0.5 master (on nightly + release) Coverage
ReverseDiffSource ReverseDiffSource ReverseDiffSource Build Status Coverage Status

This package provides a function rdiff() that generates valid Julia code for the calculation of derivatives up to any order for a user supplied expression or generic function. Install with Pkg.add("ReverseDiffSource"). Package documentation and examples can be found here.

This version of automated differentiation operates at the source level (provided either in an expression or a generic function) to output Julia code calculating the derivatives (in a expression or a function respectively). Compared to other automated differentiation methods it does not rely on method overloading or new types and should, in principle, produce fast code.

Usage examples:

  • derivative of x³
    julia> rdiff( :(x^3) , x=Float64)  # 'x=Float64' indicates the type of x to rdiff
    :(begin
        (x^3,3 * x^2.0)  # expression calculates a tuple of (value, derivate)
        end)
  • first 10 derivatives of sin(x) (notice the simplifications)
    julia> rdiff( :(sin(x)) , order=10, x=Float64)  # derivatives up to order 10
    :(begin
            _tmp1 = sin(x)
            _tmp2 = cos(x)
            _tmp3 = -_tmp1
            _tmp4 = -_tmp2
            _tmp5 = -_tmp3
            (_tmp1,_tmp2,_tmp3,_tmp4,_tmp5,_tmp2,_tmp3,_tmp4,_tmp5,_tmp2,_tmp3)
        end)
  • works on functions too
	julia> rosenbrock(x) = (1 - x[1])^2 + 100(x[2] - x[1]^2)^2   # function to be derived
	julia> rosen2 = rdiff(rosenbrock, (Vector{Float64},), order=2)       # orders up to 2
		(anonymous function)
  • gradient calculation of a 3 hidden layer neural network for backpropagation
    # w1-w3 are the hidden layer weight matrices, x1 the input vector
    function ann(w1, w2, w3, x1)
        x2 = w1 * x1
        x2 = log(1. + exp(x2))   # soft RELU unit
        x3 = w2 * x2
        x3 = log(1. + exp(x3))   # soft RELU unit
        x4 = w3 * x3
        1. / (1. + exp(-x4[1]))  # sigmoid output
    end

    w1, w2, w3 = randn(10,10), randn(10,10), randn(1,10)
    x1 = randn(10)
    dann = m.rdiff(ann, (Matrix{Float64}, Matrix{Float64}, Matrix{Float64}, Vector{Float64}))
    dann(w1, w2, w3, x1) # network output + gradient on w1, w2, w3 and x1