QuasiMonteCarlo.jl

Lightweight and easy generation of quasi-Monte Carlo sequences with a ton of different methods on one API for easy parameter exploration in scientific machine learning (SciML)
Popularity
75 Stars
Updated Last
1 Year Ago
Started In
December 2019

QuasiMonteCarlo.jl

Join the chat at https://julialang.zulipchat.com #sciml-bridged Global Docs

codecov Build Status

ColPrac: Contributor's Guide on Collaborative Practices for Community Packages SciML Code Style

This is a lightweight package for generating Quasi-Monte Carlo (QMC) samples using various different methods.

Tutorials and Documentation

For information on using the package, see the stable documentation. Use the in-development documentation for the version of the documentation, which contains the unreleased features.

Example

using QuasiMonteCarlo, Distributions
lb = [0.1,-0.5]
ub = [1.0,20.0]
n = 5
d = 2

s = QuasiMonteCarlo.sample(n,lb,ub,GridSample([0.1,0.5]))
s = QuasiMonteCarlo.sample(n,lb,ub,UniformSample())
s = QuasiMonteCarlo.sample(n,lb,ub,SobolSample())
s = QuasiMonteCarlo.sample(n,lb,ub,LatinHypercubeSample())
s = QuasiMonteCarlo.sample(n,lb,ub,LatticeRuleSample())
s = QuasiMonteCarlo.sample(n,lb,ub,HaltonSample([10,3], false))

The output s is a matrix, so one can use things like @uview from UnsafeArrays.jl for a stack-allocated view of the ith point:

using UnsafeArrays
@uview s[:,i]

API

Everything has the same interface:

A = QuasiMonteCarlo.sample(n,lb,ub,sample_method)

where:

  • n is the number of points to sample.
  • lb is the lower bound for each variable. The length determines the dimensionality.
  • ub is the upper bound.
  • sample_method is the quasi-Monte Carlo sampling strategy.

Additionally, there is a helper function for generating design matrices:

k=2
As = QuasiMonteCarlo.generate_design_matrices(n,lb,ub,sample_method,k)

which returns As which is an array of k design matrices A[i] that are all sampled from the same low-discrepancy sequence.

Available Sampling Methods

Sampling methods SamplingAlgorithm are divided into two subtypes

  • DeterministicSamplingAlgorithm
    • GridSample(dx) where the grid is given by lb:dx[i]:ub in the ith direction.
    • SobolSample for the Sobol' sequence.
    • FaureSample for the Faure sequence.
    • LatticeRuleSample for a randomly-shifted rank-1 lattice rule.
    • HaltonSample(base) where base[i] is the base in the ith direction.
    • GoldenSample for a Golden Ratio sequence.
    • KroneckerSample(alpha, s0) for a Kronecker sequence, where alpha is an length-d vector of irrational numbers (often sqrt(d)) and s0 is a length-d seed vector (often 0).
  • RandomSamplingAlgorithm
    • UniformSample for uniformly distributed random numbers.
    • LatinHypercubeSample for a Latin Hypercube.
    • Additionally, any Distribution can be used, and it will be sampled from.

Adding a new sampling method

Adding a new sampling method is a two-step process:

  1. Add a new SamplingAlgorithm type.
  2. Overload the sample function with the new type.

All sampling methods are expected to return a matrix with dimension d by n, where d is the dimension of the sample space and n is the number of samples.

Example

struct NewAmazingSamplingAlgorithm{OPTIONAL} <: SamplingAlgorithm end

function sample(n,lb,ub,::NewAmazingSamplingAlgorithm)
    if lb isa Number
        ...
        return x
    else
        ...
        return reduce(hcat, x)
    end
end

Randomization of QMC sequences

Most of the previous methods are deterministic i.e. sample(n, d, Sampler()::DeterministicSamplingAlgorithm) always produces the same sequence $X = (X_1, \dots, X_n)$. The API to randomize sequence is either

  • Directly use QuasiMonteCarlo.sample(n, d, DeterministicSamplingAlgorithm(R = RandomizationMethod())) or sample(n, lb, up, DeterministicSamplingAlgorithm(R = RandomizationMethod()))
  • Or given any matrix $X$ ($d\times n$) of $n$ points all in dimension $d$ in $[0,1]^d$ one can do randomize(x, ::RandomizationMethod())

There are the following randomization methods:

  • Scrambling methods ScramblingMethods(base, pad, rng) where base is the base used to scramble and pad the number of bits in b-ary decomposition. pad is generally chosen as $\gtrsim \log_b(n)$. The implemented ScramblingMethods are
    • DigitalShift
    • MatousekScramble a.k.a Linear Matrix Scramble.
    • OwenScramble a.k.a Nested Uniform Scramble is the most understood theoretically but is more costly to operate.
  • Shift(rng) a.k.a. Cranley Patterson Rotation.

For numerous independent randomization, use generate_design_matrices(n, d, ::DeterministicSamplingAlgorithm), ::RandomizationMethod, num_mats) where num_mats is the number of independent X generated.

Example

Randomization of a Faure sequence with various methods.

using QuasiMonteCarlo
m = 4
d = 3
b = QuasiMonteCarlo.nextprime(d)
N = b^m # Number of points
pad = m # Length of the b-ary decomposition number = sum(y[k]/b^k for k in 1:pad)

# Unrandomized low discrepency sequence
x_faure = QuasiMonteCarlo.sample(N, d, FaureSample())

# Randomized version
x_nus = randomize(x_faure, OwenScramble(base = b, pad = pad)) # equivalent to sample(N, d, FaureSample(R = OwenScramble(base = b, pad = pad)))
x_lms = randomize(x_faure, MatousekScramble(base = b, pad = pad))
x_digital_shift = randomize(x_faure, DigitalShift(base = b, pad = pad))
x_shift = randomize(x_faure, Shift())
x_uniform = rand(d, N) # plain i.i.d. uniform
using Plots
# Setting I like for plotting
default(fontfamily="Computer Modern", linewidth=1, label=nothing, grid=true, framestyle=:default)

Plot the resulting sequences along dimensions 1 and 3.

d1 = 1
d2 = 3 # you can try every combination of dimension (d1, d2)
sequences = [x_uniform, x_faure, x_shift, x_digital_shift, x_lms, x_nus]
names = ["Uniform", "Faure (deterministic)", "Shift", "Digital Shift", "Matousek Scramble", "Owen Scramble"]
p = [plot(thickness_scaling=1.5, aspect_ratio=:equal) for i in sequences]
for (i, x) in enumerate(sequences)
    scatter!(p[i], x[d1, :], x[d2, :], ms=0.9, c=1, grid=false)
    title!(names[i])
    xlims!(p[i], (0, 1))
    ylims!(p[i], (0, 1))
    yticks!(p[i], [0, 1])
    xticks!(p[i], [0, 1])
    hline!(p[i], range(0, 1, step=1 / 4), c=:gray, alpha=0.2)
    vline!(p[i], range(0, 1, step=1 / 4), c=:gray, alpha=0.2)
    hline!(p[i], range(0, 1, step=1 / 2), c=:gray, alpha=0.8)
    vline!(p[i], range(0, 1, step=1 / 2), c=:gray, alpha=0.8)
end
plot(p..., size=(1500, 900))

Different randomize methods of the same initial set of points