STAT 603 – ACTIVE CONSTRAINT METHODS
1. The set-up
We consider the nonnegatively constrained least-squares problem

(NNLS)   minimize  ‖ A x − b ‖²
         subject to  x ≥ 0  (componentwise) ,

where A ∈ R^{m×n} and b ∈ R^m are given and x ∈ R^n . We shall assume
that the columns of A are linearly independent.
The algorithm we pursue is an active constraint method, after Lawson
and Hanson (1974). An oldie but goodie. First, we describe the first
few steps of the algorithm, and then give a more formal presentation
of the algorithm.
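As an aside, MATLAB ships an implementation of this very algorithm as
the built-in lsqnonneg, which is handy later for checking one's own code:

```matlab
x = lsqnonneg( A, b ) ;   % MATLAB's built-in Lawson-Hanson NNLS solver
```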
The algorithm is iterative in nature, but it terminates in a
finite number of steps. The starting point is an initial guess for the
solution of (NNLS), say x1 = ( 1, 1, · · · , 1 )ᵀ, the vector of all ones, as well as an
initial guess as to which constraints are active, based on the current
guess for the solution. Thus, P1 = { } is the set of indices j for which
x1j = 0. So, P1 is empty initially.
Both the guess for the solution and the guess for the active constraints
will change in the course of the computation.
Now, the actual work starts. Compute z2 as the solution to

(1)   minimize  ‖ A x − b ‖²
      subject to  xj = 0  for all j ∈ P1 .

Exactly how this is to be achieved is another question, but note we must
set to zero the gradient with respect to the xi that are “free”. So, we
must solve the system of equations

(2)   [ Aᵀ( A x − b ) ]i = 0 ,   i ∉ P1 ,
      xi = 0 ,                   i ∈ P1 .
Now, there are two possibilities : either z2 satisfies z2 ≥ 0 or it does not.
If it does not, then determine t ∈ ( 0 , 1 ) such that

(3)   x2 = x1 + t ( z2 − x1 )   (by definition)

is nonnegative and has at least one component equal to zero. Thus, x2
is obtained by moving from x1 toward z2 until one hits the boundary of the
constraint set x ≥ 0. Finally, update the active constraint set,

P2 = { j : x2j = 0 } .
Now, repeat the above : Let z3 be the solution to

(4)   minimize  ‖ A x − b ‖²
      subject to  xj = 0  for all j ∈ P2 ,

ETC !
But what if z2 ≥ 0 ? Then, set x2 = z2 . Thus, x2 is the solution to
(1) with the added constraint x ≥ 0. For definiteness, x2 solves

(5)   minimize  ‖ A x − b ‖²
      subject to  x ≥ 0 ,
                  xj = 0  for all j ∈ P1 .
The question now is whether x2 is the solution to (NNLS). Now, we
know that an x ∈ R^n solves (NNLS) if it satisfies the complementarity
conditions

(6)   x ≥ 0 ,   Aᵀ( A x − b ) ≥ 0 ,
      xi [ Aᵀ( A x − b ) ]i = 0 ,   i = 1, 2, · · · , n .

(These are the Karush–Kuhn–Tucker conditions; since the objective is
convex, they are sufficient for optimality.)
Surely, x2 ≥ 0. Also, by (2),

(7)   [ Aᵀ( A x2 − b ) ]i = 0   for all i ∉ P1 .

It follows that for each i, either x2i or [ Aᵀ( A x2 − b ) ]i equals zero (or
both), so

x2i [ Aᵀ( A x2 − b ) ]i = 0 ,   i = 1, 2, · · · , n .
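In MATLAB, the conditions (6) are easy to check numerically for a
candidate x. A minimal sketch, assuming a small tolerance tol (say
1e-10) to absorb roundoff:

```matlab
% numerical check of the complementarity conditions (6)
grad = A' * ( A * x - b ) ;
ok = all( x >= 0 ) && all( grad >= - tol ) && all( abs( x .* grad ) <= tol ) ;
```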
All that is left is to verify whether Aᵀ( A x2 − b ) ≥ 0 . If this holds, then
x2 solves (NNLS). If not, then our guess for the active constraint set is not
correct. We change it by choosing an index j with [ Aᵀ( A x2 − b ) ]j < 0 ,
and removing j from the set P1 ,

(8)   P2 = P1 − { j } .

Now, repeat the above process, by solving the problem (4). ETC !
The formalized algorithm is given as Algorithm 1.
Comment on (1). How does one solve (1) ? If we partition A by
columns,

A = [ a1  a2  · · ·  an ] ,

then if the constraint xi = 0 is active, the column ai is dropped. Thus, the
inactive constraint list consists of those indices i for which P(i) > 0 (in the
encoding described below), and these are the indices of
the columns that stay. This results in the matrix B, and we must solve
the problem

minimize  ‖ B y − b ‖² .
If the columns of A are linearly independent, so are those of B, and
the solution is given by y = ( BᵀB )⁻¹ Bᵀ b. The components of y must
then be put in x, in their proper locations.
This is also the place to comment on the active constraint sets Pk .
We implement it as an array P of length n. Initially,

P(j) = j ,   j = 1, 2, · · · , n .
Whenever P(j) = +j, the constraint xj ≥ 0 is not active. If
P(j) = −j, then the constraint xj ≥ 0 is active, i.e., xj = 0. The
command inact = find( P > 0 ) returns the list of inactive constraints.
The matrix B is then computed as B = A( :, inact ), and the rest is
smooth sailing.
The function lszero( A, b, P ) solves the least squares problem
with the constraints encoded in P. An inefficient way to do this would be

function [ x ] = lszero( A, b, P )
inact = find( P > 0 ) ;            % the free components
y = A( :, inact ) \ b ;            % least squares via backslash
% put the components of y in their proper places in x
x = zeros( length( P ), 1 ) ;
x( inact ) = y ;
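A quick sanity check of lszero, on a small made-up problem (the
numbers are arbitrary):

```matlab
A = [ 1 0 ; 0 1 ; 1 1 ] ;  b = [ 1 ; 2 ; 2 ] ;
P = [ 1 , -2 ] ;                 % the constraint x2 = 0 is active
x = lszero( A, b, P )            % x = [ 1.5 ; 0 ]
```

Only the first column of A is used here, so y = (aᵀb)/(aᵀa) = 3/2, and
the second component of x stays at zero.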
%
function [ x ] = myNNLS( A, b, tol )
%
% Initialization
%
dims = size( A ) ; n = dims(2) ;
%
x = ones( n, 1 ) ;
P = [ 1 : n ] ;                      % all constraints inactive
% the outer loop
while ( 1 )
  z = lszero( A, b, P ) ;
  % Is z feasible ?
  dec = find( z - x < 0 ) ;          % only these components limit the step
  tees = - x( dec ) ./ ( z( dec ) - x( dec ) ) ;
  [ t, jj ] = min( tees ) ;          % t is empty if dec is empty
  t = min( [ 1, t ] ) ;
  if ( t == 1 )
    % z is feasible
    x = z ;
    grad = A' * ( A * x - b ) ;
    [ stuff, jgr ] = min( grad ) ;
    if ( stuff > - tol )
      % done : x is the solution of (NNLS)
      break
    else
      % remove the offending constraint from the active set
      P( jgr ) = + jgr ;
    end
  else
    % z is not feasible ; move to the boundary
    jt = dec( jj ) ;                 % the component that hits zero
    x = x + t * ( z - x ) ;
    P( jt ) = - jt ;                 % constraint jt becomes active
  end
end
% this is it
Algorithm 1. Active Constraint Method for NNLS.
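A sanity check of Algorithm 1 against the built-in lsqnonneg, on a
random problem (the dimensions are arbitrary):

```matlab
A = rand( 10, 4 ) ;  b = rand( 10, 1 ) ;
x1 = myNNLS( A, b, 1e-10 ) ;
x2 = lsqnonneg( A, b ) ;
norm( x1 - x2 )                   % should be of the order of roundoff
```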
The inefficiency arises from the fact that we solve the system of
equations from scratch each time, even though its matrix is a rank one
update or downdate of the matrix from the previous call of this function.
Comment on (3). The determination of t is somewhat tedious. We
need to look at each component of x1 + t ( z2 − x1 ). If z2j − x1j ≥ 0 , then
the j-th component causes no problems. If it is negative, then t cannot
be larger than

tj = − x1j / ( z2j − x1j ) .

So, the final t equals

t = min { tj : z2j − x1j < 0 }
(or 1, whichever is smallest). The MATLAB code snippet that achieves
this is
tees = - x_1 ./ ( z_2 - x_1 ) ;
t = min( tees( find( tees >= 0 ) ) ) ;
(As always, if you do not know what find does, type help find at
the MATLAB command prompt.) Actually, t might not be defined : it
might be empty if all components of tees were negative. In this case
no constraint limits the step, and t should equal 1 ; to avoid trouble,
add the MATLAB line

t = min( [ 1 , t ] ) ;
Since this is somewhat arcane, try it out in MATLAB, with tees =
-rand(5,1).
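Concretely, with all of tees negative the selection comes back empty,
and the extra line rescues t:

```matlab
tees = - rand( 5, 1 ) ;                  % all components negative
t = min( tees( find( tees >= 0 ) ) ) ;   % t is the empty matrix [ ]
t = min( [ 1 , t ] ) ;                   % now t equals 1
```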
An application. An application is provided by nonparametric monotone
regression. Suppose we have the model

Yi = fo( Xi ) + εi ,   i = 1, 2, · · · , n ,

where

ε = ( ε1 , ε2 , · · · , εn )ᵀ ∼ N( 0 , σ² I ) ,

for some unknown σ². We assume that X1 < X2 < · · · < Xn .
The basic assumption is that the function fo is monotone increasing.
We might know this because fo represents, say, a growth curve ; or, if
we are not quite sure that fo is increasing, we might wish to compare
the monotone estimator to a spline estimator.
The estimation problem is

minimize  Σ_{i=1}^{n} | f( Xi ) − Yi |²
subject to  f increasing .
To get a finite dimensional model, let

bi = f( Xi ) ,   i = 1, 2, · · · , n .

Then f increasing means b1 ≤ b2 ≤ · · · ≤ bn . To get to a nonnegatively
constrained problem, write

bi = Σ_{j=1}^{i} zj ,

where the zj are nonnegative. Then,

b = V z ,

where b = ( b1 , b2 , · · · , bn )ᵀ, z = ( z1 , z2 , · · · , zn )ᵀ, and
V  =  | 1  0  0  · · ·  0 |
      | 1  1  0  · · ·  0 |
      | 1  1  1  · · ·  0 |
      | ·  ·  ·         · |
      | 1  1  1  · · ·  1 |  ,

the n × n lower triangular matrix of all ones.
So the problem is

minimize  ‖ V z − Y ‖²
subject to  z ≥ 0 .
Try it, with such examples as fo(x) = 1 − e^(−x) , 0 ≤ x ≤ 10 , or some
such.
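A sketch of such an experiment, using myNNLS from Algorithm 1 (the
sample size and noise level are arbitrary choices):

```matlab
n = 50 ;
X = linspace( 0, 10, n )' ;
Y = 1 - exp( - X ) + 0.05 * randn( n, 1 ) ;
V = tril( ones( n ) ) ;               % lower triangular matrix of ones
z = myNNLS( V, Y, 1e-10 ) ;
f = V * z ;                           % the fitted monotone values
plot( X, Y, 'o', X, f, '-' )
```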