Recently Andreas Tillmann presented the poster “An Infeasible-Point Subgradient Algorithm and a Computational Solver Comparison for l1-Minimization” at SPARS11. This poster summarized some results of the project SPEAR on sparse exact and approximate recovery of Marc Pfetsch and myself. We used this as an opportunity to release a draft of the accompanying paper with the same title. Although this draft is not totally ready to be submitted yet, I will already summarize its content here.

In this paper we considered the Basis Pursuit problem (beware: the linked Wikipedia page is a stub at this time) from a pure optimization point of view. The Basis Pursuit problem is: For a given matrix {A\in{\mathbb R}^{m\times n}} (with {m<n}) and a vector {b\in{\mathbb R}^m}, find the solution to

\displaystyle \min_{x} \|x\|_1\quad\text{s.t.}\quad Ax = b. \ \ \ \ \ (1)

Hence, we mostly neglected all the interesting features of Basis Pursuit, such as reproducing the sparsest solution of an underdetermined linear system, and concentrated solely on its solution as an optimization problem.

The paper has three somewhat separate contributions:

  • The new algorithm ISAL1: The problem (1) is a convex, nonsmooth, constrained optimization problem. Marc and Andreas are optimizers, and they wondered how the most basic method for this class of problems would perform: the projected subgradient method. For solving

    \displaystyle \min_x f(x)\quad\text{s.t.}\quad x\in C

    take steps along some negative subgradient and project back to {C}: {x^{k+1} = P_C(x^k - \alpha_k h^k)}. For (1), subgradients are readily available, e.g. {h^k = \text{sgn}(x^k)} (taken coordinate-wise). However, projecting onto the constraint {Ax=b} is not that easy. Denoting the projection simply by {P}, we can give a closed-form expression (assuming that {A} has full rank) as

    \displaystyle P(z) = (I - A^T (AA^T)^{-1} A) z + A^T(AA^T)^{-1}b,

    This has the drawback that one needs to explicitly invert a matrix (which, however, is just {m\times m} and hence usually not too large, since we assume {m\ll n}). Therefore, we proposed to replace the exact projection by an approximate one: in each step we solve for the projection by a truncated conjugate gradient method. While we expected that one would have to increase the accuracy of the approximate projection by increasing the number of CG steps during the iteration, surprisingly that is not true: throughout the iteration, a fixed small number of CG steps (say {5} for matrices of size {1000\times 4000}, but mainly independent of the size) suffices to obtain convergence (and especially feasibility of the iterates). In this paper we give a proof of convergence of the method under several assumptions on the step sizes and projection accuracies, building on our previous paper in which we analyzed this method in the general case. Moreover, we described several ways to speed up and stabilize the subgradient method. Finally, we called this method “Infeasible point subgradient algorithm for {\ell^1}”: ISAL1. A Matlab implementation can be found on the SPEAR website, and a rough sketch of the basic iteration is given at the end of this post.

  • HSE, the heuristic support evaluation: That’s a pretty neat device which can be integrated into any Basis Pursuit solver (beware: not Basis Pursuit denoising; we want the equality constraint). The idea is based on the following small lemma:

    Lemma 1 A feasible vector {\bar x} (i.e. {A\bar x = b}) is optimal for (1) if and only if there is {w\in{\mathbb R}^m} such that {A^Tw \in\partial\|\bar x\|_1}.

    The proof basically consists of noting that the normal cone of the constraint set {\{Ax=b\}} is the image space of {A^T} and hence, the condition is equivalent to saying that this normal cone intersects the subdifferential {\partial\| \bar x\|_1}, which is necessary and sufficient for {\bar x} being optimal. In practice the HSE does the following:

    • deduce candidate (approx.) support {S} from a given {x}
    • compute an approximate solution {\hat{w}} of {A_{S}^T w = \text{sgn}(x_{S})} via {\hat w = (A_S^T)^\dagger\text{sgn}(x_S)} with the help of CG
    • if {\|A^T \hat{w}\|_\infty \approx 1} check existence of a {\hat{x}} with {A_{S} \hat{x}_{S} = b} and {\hat{x}_i = 0} {\forall\, i \notin S}
    • if that {\hat x} exists, check if the relative duality gap {(\|\hat{x}\|_1 + b^T (-\hat{w}))/\|\hat{x}\|_1} is small and return “success” if so, i.e. take {\hat x} as an optimal solution

    Again, CG usually performs great here and only very few iterations (say {5}) are needed. In practice this method never returned a vector {\hat x} that was marked as optimal but actually was not. A rough sketch of the procedure is given at the end of this post.

  • Computational comparison: We faced the challenge of a computational comparison for Basis Pursuit solvers.
    The first step was to design a test set. We constructed 100 matrices (74 of which are dense, 26 are sparse) by several constructions and concatenations (see Section 5 in the draft). More complicated was the construction of appropriate right-hand sides. Why? Because we wanted to have unique solutions! The reason is that we wanted to use the norm difference {\|x^*-\hat x\|_2} between the optimal and the computed solution as a measure for both optimality and feasibility. In the first place we used the ERC due to Joel Tropp (e.g. described in this blog post of Bob’s blog). However, this does not only guarantee uniqueness of solutions but also that the minimum {1}-norm solution is also the sparsest. Since that is probably too much to ask of the solutions (i.e. they have to be very sparse), we constructed some more right-hand sides using L1TestPack: Construct an {x} such that there is a {w} with {A^T w \in\partial \|x\|_1} and use {b = Ax}. This also leads to unique solutions for Basis Pursuit if {A} is injective when restricted to the columns corresponding to the entries in which {(A^T w)_i = \pm 1}, but allows for much larger supports (a sketch of this construction is given at the end of this post).

    For the results of the comparison of ISAL1, SPGL1, YALL1, {\ell^1}-MAGIC, SparseLab, the homotopy solver of Salman Asif and CPLEX, check the paper. However, some things are interesting:

    1. The homotopy solver is the overall winner (which is somewhat expected for the instances constructed with the ERC, but not for the others). Great work, Salman!
    2. ISAL1 is quite good (although it is the simplest among all methods).
    3. HSE works great: Including it e.g. in SPGL1 produces “better” solutions in less time.
    4. CPLEX is remarkably good (we used the dual simplex). So: How come so many people keep saying that standard LP solvers do not work well for Basis Pursuit? That is simply not true for the dual simplex! (However, the interior point method in CPLEX was not competitive at all.)

    We plan to make a somewhat deeper evaluation of our computational results before submitting the paper, in order to draw some more detailed conclusions on the performance of the solvers on the different instances.
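
For the curious, here is a rough Python sketch of the projected subgradient iteration with the truncated-CG projection from the first item above. This is only an illustration of the idea and not our Matlab implementation: the function names, the simple {1/k} step size and the fixed five CG steps are placeholder choices, not the ones analyzed in the paper.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

def approx_project(A, b, z, n_cg=5, y0=None):
    # Approximate projection of z onto {x : A x = b}: solve
    # (A A^T) y = A z - b with a few CG steps and return z - A^T y.
    # Running CG to convergence would give the exact projection.
    m = A.shape[0]
    AAt = LinearOperator((m, m), matvec=lambda v: A @ (A.T @ v))
    y, _ = cg(AAt, A @ z - b, x0=y0, maxiter=n_cg)
    return z - A.T @ y, y

def isal1_sketch(A, b, n_iter=2000, n_cg=5):
    # Plain projected subgradient loop for min ||x||_1 s.t. A x = b:
    # step along a negative subgradient of the 1-norm, project back
    # (only approximately), and keep the best iterate seen so far.
    x, y = approx_project(A, b, np.zeros(A.shape[1]), n_cg=n_cg)
    best, best_val = x, np.abs(x).sum()
    for k in range(1, n_iter + 1):
        h = np.sign(x)                       # a subgradient of ||.||_1 at x
        x, y = approx_project(A, b, x - h / k, n_cg=n_cg, y0=y)
        if np.abs(x).sum() < best_val:
            best, best_val = x, np.abs(x).sum()
    return best
```

Of course, the interesting point (and the reason a convergence proof is needed at all) is precisely that the small, fixed number of CG steps does not destroy convergence or feasibility in the limit.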
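
Similarly, here is a sketch of the heuristic support evaluation from the second item, again only to illustrate the idea: the threshold for deducing the support, the tolerances and the least-squares solve for the primal candidate are ad hoc choices here.

```python
import numpy as np
from scipy.sparse.linalg import cg, lsqr, LinearOperator

def hse_check(A, b, x, n_cg=5, tol=1e-6):
    # 1) deduce a candidate support S from the current iterate x
    S = np.abs(x) > tol * np.max(np.abs(x))
    AS, s = A[:, S], np.sign(x[S])

    # 2) approximate solution of A_S^T w = sgn(x_S) via a few CG steps
    #    on the normal equations (A_S A_S^T) w = A_S sgn(x_S)
    m = A.shape[0]
    ASASt = LinearOperator((m, m), matvec=lambda v: AS @ (AS.T @ v))
    w, _ = cg(ASASt, AS @ s, maxiter=n_cg)

    # 3) w should be (nearly) dual feasible: ||A^T w||_inf about 1
    if np.max(np.abs(A.T @ w)) > 1 + tol:
        return None

    # 4) look for a feasible x_hat supported on S (A_S x_S = b, zero elsewhere)
    x_hat = np.zeros_like(x)
    x_hat[S] = lsqr(AS, b)[0]
    if np.linalg.norm(A @ x_hat - b) > tol * np.linalg.norm(b):
        return None

    # 5) accept x_hat if the relative duality gap is small
    gap = (np.abs(x_hat).sum() - b @ w) / np.abs(x_hat).sum()
    return x_hat if abs(gap) < tol else None
```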
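
Finally, a small sketch of how right-hand sides with known solutions can be built from a dual certificate, as described in the third item. This is just the principle and not the actual L1TestPack code; the prescribed support, the random signs and the magnitudes are arbitrary choices (and, as said above, uniqueness additionally needs injectivity of {A} on the relevant columns).

```python
import numpy as np

def try_construct_instance(A, support, rng=None, tol=1e-10):
    # Prescribe a support and a sign pattern and try to find w with
    # A_S^T w = sgn(x_S) and |A^T w| <= 1 everywhere, i.e. a vector
    # with A^T w in the subdifferential of ||.||_1 at x.  If such a w
    # exists, then x solves the Basis Pursuit problem for b = A x.
    rng = np.random.default_rng(0) if rng is None else rng
    m, n = A.shape
    S = np.zeros(n, dtype=bool)
    S[support] = True
    s = rng.choice([-1.0, 1.0], size=S.sum())

    # least-squares solution of the (typically underdetermined) system A_S^T w = s
    w = np.linalg.lstsq(A[:, S].T, s, rcond=None)[0]
    if np.linalg.norm(A[:, S].T @ w - s) > tol:
        return None        # no exact certificate for this support/sign pattern
    if np.max(np.abs(A[:, ~S].T @ w)) > 1 + tol:
        return None        # |A^T w| exceeds 1 off the support

    x = np.zeros(n)
    x[S] = s * rng.uniform(0.5, 1.5, size=S.sum())   # any magnitudes with these signs work
    return A @ x, x, w
```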