Implementation of different parameter estimators that take in measures under uncertainty and produce a probability distribution over the parameters.
# observation function with multivariate observations
f(x, p) = [(x + 1)^2 - sum(p);
           (x + 1)^3 + diff(p)[1]]
# true parameter (to be estimated) and a prior belief
θtrue = [1.0, 2.0]
paramprior = MvNormal(zeros(2), 4.0 * I)
# observation noise
obsnoises = [rand()/10 * I(2) * MvNormal(zeros(2), I) for _ in eachindex(xs)]
noisemodel = UncorrGaussianNoiseModel(obsnoises)
# noisy observations x and y
xs = rand(5)
ysmeas = f.(xs, [θtrue]) .+ rand.(noises)
# find a probabilistic description of θ either as samples or as a distribution
# currently we provide three methods
for est in [MCMCEstimator(),
            LinearApproxEstimator(),
            LSQEstimator()]
    # either
    samples =  predictsamples(est, f, xs, ysmeas, paramprior, noisemodel, 100)
    # or
    dist    =  predictdist(est, f, xs, ysmeas, paramprior, noisemodel; nsamples=100)
endWe assume parameters 
Given that we have uncertainty in the observations, we are interested in constructing a probabilistic description predictsamples(est, f, xs, ys, paramprior, noisemodel, nsamples) and predictdist(est, f, xs, ys, paramprior, noisemodel), respectively.
The conversion between samples and a distribution can be done automatically via sampling or fitting a multivariate normal distribution.
The MCMCEstimator simply phrases the problem as a Monte-Carlo Markov-Chain inference problem, which we solve using the NUTS algorithm provided by Turing.jl.
Therefore predictdist(::MCMCEstimator, f, xs, ys, paramprior, noisemodel, nsamples) will create nsamples samples (after skipping a number of warmup steps).
predictdist(::MCMCEstimator, f, xs, ys, paramprior, noisemodel, nsamples) will do the same, and then fit a MvNormal distribution.
The LSQEstimator works by sampling noise paramprior is used to sample initial guesses for 
Therefore predictsamples(::LSQEstimator, f, xs, ys, paramprior, noisemodel, nsamples) will solve nsamples optimization problems and return a sample each.
predictdist(::LSQEstimator, f, xs, ys, paramprior, noisemodel, nsamples) will do the same, and then fit a MvNormal distribution.
The LinearApproxEstimator solves the optimization problem above just once, and then constructs a multivariate normal distribution centered at the solution.
The covariance is constructed by computing the Jacobian of paramprior is used to sample initial guesses for 
Therefore predictdist(::LinearApproxEstimator, f, xs, ys, paramprior, noisemodel, nsamples) will solve one optimization problem and compute one Jacobian, yielding a MvNormal and making it very efficient.
predictsamples(::LinearApproxEstimator, f, xs, ys, paramprior, noisemodel, nsamples) will simply sample nsample times from this distribution, which is also very fast.
We are currently considering three different possible noise models.
Consider again how we may have an observation 
If there is no correlation between observations, we can use the noise models:
- UncorrGaussianNoiseModel: A vector of (possibly multivariate) Gaussian noise distributions, one for each observation.
- UncorrProductNoiseModel: A vector of univariate noise distributions of any kind. Can not model correlations within a single observation.
If there is correlation between observations, we can provide a single multivariate Gaussian noise model.
- CorrGaussianNoiseModel: A single multivariate normal distribution, with a noise component for each component in each observation. Multivariate observations are therefore flattened to correspond to the noise model.
Here are some examples:
## UncorrGaussianNoiseModel
xs = rand(5)
# one (uni- or multivariate) normal distribution per observation
noises = [MvNormal(zeros(2), I) for _ in eachindex(xs)]
noisemodel = UncorrGaussianNoiseModel(noises)
θtrue = [1.0, 2.0]
ysmeas = f.(xs, [θtrue]) .+ rand.(noises)
## UncorrProductNoiseModel
xs = eachcol(rand(2,30))
# one univariate distribution per observation
productnoise = [truncated(0.1*Normal(), 0, Inf) for _ in 1:length(xs)]
noisemodel = UncorrProductNoiseModel(productnoise)
θtrue = [5., 10]
ysmeas = f.(xs, [θtrue]) .+ rand.(productnoise)
## CorrGaussianNoiseModel
xs = 5 * eachcol(rand(2,30))
n = length(xs)
# one large Gaussian relating all observations
corrnoise = MvNormal(zeros(n), I(n) + 1/5*hermitianpart(rand(n, n)))
noisemodel = CorrGaussianNoiseModel(corrnoise)
θtrue = [5., 10]
ysmeas = f.(xs, [θtrue]) .+ rand(corrnoise)