OptimalMatrixCompletion is a Julia package which solves low-rank matrix completion problems to certifiable optimality via a custom branch-and-bound scheme. It is the implementation of the following paper: Optimal Low-Rank Matrix Completion: Semidefinite Relaxations and Eigenvector Disjunctions.
If you use this in your work, please cite this paper as follows:
@article{bertsimas2023optimal,
author = {Bertsimas, Dimitris and Cory-Wright, Ryan and Lo, Sean and Pauphilet, Jean},
year = {2023},
month = {05},
pages = {},
title = {Optimal Low-Rank Matrix Completion: Semidefinite Relaxations and Eigenvector Disjunctions},
doi = {10.13140/RG.2.2.20630.11841}
}
To install the package:
julia> Pkg.install("OptimalMatrixCompletion")
To perform matrix completion on the matrix A with observed indices indices:
julia> using OptimalMatrixCompletion
julia> (k, m, n) = (1, 50, 50);
julia> A = randn(Float64, (m, n)); indices = BitMatrix(rand([0,1], (m, n)));
julia> (γ, λ) = (80.0, 0.0);
julia> noise = true;
julia> (solution, printlist, instance) = OptimalMatrixCompletion.branchandbound_frob_matrixcomp(
k, A, indices, γ, λ, noise,
;
node_selection = "bestfirst",
disjunctive_cuts_type = "linear",
disjunctive_cuts_breakpoints = "smallest_1_eigvec",
time_limit = 3600,
)
Here, solution is a dictionary with the following fields:
-
"X_initial": the solution obtained after an alternating minimization procedure at the root node. -
"Y_initial","U_initial": obtained fromX_initialsuch that$(Y, U, X)$ are feasible for the master problem. -
"objective_initial": the objective value obtained forX_initial. -
"MSE_in_initial": the in-sample MSE ofX_initialcompared to the input matrixA -
"MSE_out_initial": the out-of-sample MSE ofX_initialcompared to the input matrixA(this is only relevant if one has access to the unobserved values inA) -
"MSE_all_initial": the overall MSE ofX_initialcompared to the input matrixA(this is only relevant if one has access to the unobserved values inA) -
"X": the solution obtained after the branch-and-bound algorithm. -
"Y","U": obtained fromXsuch that$(Y, U, X)$ are feasible for the master problem. -
"objective": the objective value obtained forX. -
"MSE_in": the in-sample MSE ofXcompared to the input matrixA -
"MSE_out": the out-of-sample MSE ofXcompared to the input matrixA(this is only relevant if one has access to the unobserved values inA) -
"MSE_all": the overall MSE ofXcompared to the input matrixA(this is only relevant if one has access to the unobserved values inA)
printlist is a Vector{String} that contains the logged output of the algorithm. instance is a dictionary with the following fields:
"run_log": a log of the branch-and-bound algorithm, documenting the number of explored and total nodes, the incumbent lower and upper bounds, the current optimality gap, and the time taken so far."run_details": a dictionary consisting of the parameters tobranchandbound_frob_matrixcomp(), together with measurements on the time taken for various parts of the algorithm, and number of nodes in branch-and-bound of various categories.
branchandbound_frob_matrixcomp() has the following required parameters:
-
k::Int, the rank constraint on the imputed matrix$X$ . -
A::Array{Float64, 2}, the observed data matrix$A \in \mathbb{R}^{m \times n}$ . -
indices::BitMatrix, the observed indices in$A$ as a 0-1 matrix with 1 denoting the positions of observed entries. -
γ::Float64, the regularization parameter$\gamma > 0$ on the imputed matrix$X$ . A larger value indicates less regularization, while a value closer to 0 indicates more regularization. -
λ::Float64, the penalty parameter$\lambda$ on the rank of the matrix. This is typically set to 0. -
noise::Bool, whether the entries in$A$ are observed with noise or noiselessly.
See below for a full list of optional paramters.
branching_type::Union{String, Nothing} = nothing: in the situation withuse_disjunctive_cuts = false, determining which coordinate to branch on: either "lexicographic" or "bounds" or "gradient".branch_point::Union{String, Nothing} = nothing: in the situation withuse_disjunctive_cuts = false, determine which value to branch on: either "midpoint" or "current_point".node_selection::String = "breadthfirst": the node selection strategy to use: either "breadthfirst" or "bestfirst" or "depthfirst" or "bestfirst_depthfirst".bestfirst_depthfirst_cutoff::Int = 10000: in the situation withnode_selection = "bestfirst_depthfirst", the number of nodes in the queue before the algorithm switches from"bestfirst"to"depthfirst".gap::Float64 = 1e-4: relative optimality gap for branch-and-bound algorithm.use_disjunctive_cuts::Bool = true: whether to use eigenvector disjunctions, highly recommended to betrue.disjunctive_cuts_type::Union{String, Nothing} = nothing: number of pieces in piecewise linear upper-approximation; either "linear" (2 pieces) or "linear2" (3 pieces) or "linear3" (4 pieces).disjunctive_cuts_breakpoints::Union{String, Nothing} = nothing: number of eigenvectors to use in constructing separation oracle, either "smallest_1_eigvec" (most negative eigenvector) or "smallest_2_eigvec" (combination of first and second most negative eigenvectors).presolve::Bool = false: in the noiseless setting (noise = false), whether to perform presolve, highly recommended to betrue.add_basis_pursuit_valid_inequalities::Bool = false: in the noiseless setting (noise = false), whether to impose valid inequalities from determinant minors.add_Shor_valid_inequalities::Bool = false: whether to add Shor SDP inequalities to strengthen SDP relaxation at each node.Shor_valid_inequalities_noisy_rank1_num_entries_present::Vector{Int} = [1, 2, 3, 4]: ifadd_Shor_valid_inequalitiesis true, the set of 2-by-2 determinant minors to model with Shor SDP inequalities, based on the number of filled entries (should be some subset of[1, 2, 3, 4]).add_Shor_valid_inequalities_fraction::Float64 = 1.0: ifadd_Shor_valid_inequalitiesis true, the proportion of 2-by-2 determinant minors to model with Shor SDP inequalities.add_Shor_valid_inequalities_iterative::Bool = false: ifadd_Shor_valid_inequalitiesis true, whether to add them iteratively from parent node to child node.max_update_Shor_indices_probability::Float64 = 1.0: ifadd_Shor_valid_inequalities_iterativeis true, the maximum probability of adding inequalities at a node.min_update_Shor_indices_probability::Float64 = 0.1, ifadd_Shor_valid_inequalities_iterativeis true, the minimum probability of adding inequalities at a node.update_Shor_indices_probability_decay_rate::Float64 = 1.1: ifadd_Shor_valid_inequalities_iterativeis true, the base of the exponential decay of the probability of adding inequalities at a node, as a function of depth in the tree.update_Shor_indices_n_minors::Int = 100: ifadd_Shor_valid_inequalities_iterativeis true, the number of Shor SDP inequalities to add at a node whenever adding is performed.root_only::Bool = false: if true, only solves relaxation at root nodealtmin_flag::Bool = true: whether to perform alternating minimization at nodes in the branch-and-bound tree, highly recommended to betrue.max_altmin_probability::Float64 = 1.0: ifaltmin_flagis true, the maximum probability of performing alternating minimization at a node.min_altmin_probability::Float64 = 0.005: ifaltmin_flagis true, the minimum probability of performing alternating minimization at a node.altmin_probability_decay_rate::Float64 = 1.1: ifaltmin_flagis true, the base of the exponential decay of the probability of performing alternating minimization at a node, as a function of depth in the tree.use_max_steps::Bool = false: whether to terminate the algorithm based on the number of branch-and-bound nodes explored.max_steps::Int = 1000000: ifuse_max_stepsis true, the upper limit on number of branch-and-bound nodes explored.time_limit::Int = 3600: time limit in seconds.update_step::Int = 1000: number of branch-and-bound nodes explored per printed update.
- It is highly recommended to set the parameter
use_disjunctive_cutsandaltmin_flagto true (the method in the paper), which implements eigenvector disjunctions and alternating minimization at branch-and-bound nodes respectively. - For basis pursuit problems (the noiseless setting), it's highly recommended to set
presolveto true, which fills in some entries based on a determinant condition on the$(k+1)$ -by-$(k+1)$ minors. - The regularization parameter
γshould be tuned according to some paramter tuning and cross-validation procedure. Bear in mind that large values ofγ, corresponding to less regularized problems, usually corresponds to longer solution times and more nodes explored. - Take care in deciding the
add_Shor_valid_inequalitiesparameter. If set to true, convex relaxations at each node in general take much longer to solve (especially with a large set inShor_valid_inequalities_noisy_rank1_num_entries_present). This can greatly increase solution time. However, depending on the sparsity and rank regime, judicious choices ofShor_valid_inequalities_noisy_rank1_num_entries_presentcan result in a much stronger relaxation that is tight, resulting in few if any branch-and-bound nodes required. - Don't forget to set the
time_limitto a reasonable value!