LinearRegression
This is a simple package. See below for a list of more complex packages for linear regression in Julia.
Why this package?
Because I keep finding myself thinking, "I need some simple linear regression here...", and missing the level of abstraction halfway in between X \ y and GLM.jl, without lots of additional dependencies.
I keep running into this Discourse thread and wishing I could just be using LinearRegression.
Alistair.jl would fit the bill, but
hasn't been maintained and doesn't work with Julia 1+.
What this package supports:
Linear regression based on vector and matrix inputs and outputs:
lr = linregress(X, y)
X can be a vector (1D inputs, each element is one observation) or a matrix (multivariate inputs, each row is one observation, columns represent features).
y can be a vector (1D outputs, each element is one observation) or a matrix (multivariate outputs, each row is one observation, columns represent targets).
Weighted linear regression:
lr = linregress(X, y, weights)
weights is a vector containing one weight per observation.
Intercept/bias term: By default, implicitly adds a column of ones to account for the intercept term.
You can disable this and force the linear regression to go through the origin by passing the intercept=false keyword argument.
Choice of solver:
By default, uses QR factorization (X \ y) to solve the linear system.
You can explicitly choose a solver by passing the method keyword argument.
Currently implemented choices are method=SolveQR() (using QR factorization, the default) and method=SolveCholesky() (using Cholesky factorization; can be faster, but numerically less accurate).
Predicting:
ytest = lr(Xtest)
Extracting coefficients:
β = coef(lr)
which includes the intercept/bias in the last position if intercept=true (the default).
You can explicitly obtain slopes and intercept/bias by calling
LinearRegression.slope(lr)
LinearRegression.bias(lr)
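To make the options above concrete, here is a minimal sketch of the underlying linear algebra using only Julia's standard LinearAlgebra library. It is not this package's implementation: the helper name fit_sketch and its cholesky_solver keyword are hypothetical, and handling weights by scaling rows with the square root of each weight is an assumption (a standard way to set up weighted least squares), not something taken from the package source. The coefficient layout mirrors the description above, with the intercept in the last position.

```julia
using LinearAlgebra

# Hypothetical helper for illustration only; not part of LinearRegression.jl.
function fit_sketch(X::AbstractVecOrMat, y::AbstractVecOrMat,
                    w::Union{Nothing,AbstractVector}=nothing;
                    intercept::Bool=true, cholesky_solver::Bool=false)
    # A vector of 1D inputs becomes an n×1 matrix (one row per observation).
    A = X isa AbstractVector ? reshape(X, :, 1) : X
    # Append a column of ones so the last coefficient acts as the intercept.
    A = intercept ? hcat(A, ones(size(A, 1))) : A
    if w !== nothing
        # Assumption: scaling rows by sqrt(w) minimises sum(w .* residuals.^2).
        s = sqrt.(w)
        A = s .* A
        y = s .* y
    end
    if cholesky_solver
        # Normal equations (A'A) β = A'y, solved via Cholesky factorization.
        return cholesky(Symmetric(A' * A)) \ (A' * y)
    else
        # For a tall matrix, `\` solves the least-squares problem via QR.
        return A \ y
    end
end

x = [0.0, 1.0, 2.0, 3.0]
y = 2 .* x .+ 1                      # exact line: slope 2, intercept 1

β_qr     = fit_sketch(x, y)                              # ≈ [2.0, 1.0]
β_chol   = fit_sketch(x, y; cholesky_solver=true)        # ≈ [2.0, 1.0]
β_origin = fit_sketch(x, 2 .* x; intercept=false)        # ≈ [2.0]

# A zero weight removes the outlier in the last observation from the fit.
y_outlier = [1.0, 3.0, 5.0, 100.0]
β_w = fit_sketch(x, y_outlier, [1.0, 1.0, 1.0, 0.0])     # ≈ [2.0, 1.0]

# Prediction: apply the same ones-column convention to new inputs.
xtest = [4.0, 5.0]
ypred = hcat(xtest, ones(length(xtest))) * β_qr          # ≈ [9.0, 11.0]
```

Forming A'A squares the condition number of the problem, which is the usual reason the Cholesky route is less accurate than QR on ill-conditioned inputs, even though it can be faster.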
I'm happy to receive issue reports and pull requests, though I am likely to say no to proposals that would significantly increase the scope of this package (see below for other packages with more features).
What this package does not do (aka Alternatives):

Be as comprehensive as SciML's LinearSolve.jl (on the other hand, fewer dependencies).

Ridge regression (use MultivariateStats.jl instead, or convince me it really should be part of LinearRegression.jl as well).

Handling of DataFrames (use GLM.jl instead).

Lots of regression statistics (use GLM.jl instead).

Different (non-Gaussian) observation models (use GLM.jl instead).

Sparse regression (use SparseRegression.jl instead).

Bayesian linear regression (use BayesianLinearRegressors.jl instead).

Online estimation (use OnlineStats.jl instead).
Want to suggest another package to recommend here? Feel free to open a pull request! (: