Subject: Self-Duality of the second order cone
Date: Wed, 4 Sep 2002 22:44:58 -0500 (CDT)

Hello all,

I am reproducing my proof regarding the self-duality of the second order cone.

As Prof. Yin Zhang pointed out in class, there was a minor point I had to take care of. Luckily, this can be done; I fixed this minor detail, and here is the complete proof. If you have any questions, please do let me know.

To show that the second order cone is self dual :

L_n = {x = (x_0;\bar{x}) \in R^n with x_0 >= ||\bar{x}||}

Show that L_n \subseteq (L_n)* ---- (A)

Consider x = (x_0,\bar{x})and any y = (y_0,\bar{y}) in L_n. We have x_0 >= ||\bar{x}||, and y_0 >= ||\bar{y}|| --- (1)

Thus x^Ty = x_0y_0 + \bar{x}^T\bar{y} >= ||\bar{x}|| ||\bar{y}|| + \bar{x}^T \bar{y} >= -\bar{x}^T\bar{y} + \bar{x}^T\bar{y} = 0

[the first inequality follows from (1), whereas the second inequality follows from the Cauchy-Schwarz's inequality].

Thus x^Ty >= 0 for all y in L_n, implying that x is in (L_n)*

This shows (A), i.e. L_n \subseteq (L_n)*

Show that (L_n)* \subseteq L_n ---- (B) Consider x = (x_0;\bar{x}) in (L_n)* It is easy to see that x_0 >= 0 [to see this consider the vector (1;0), here 0 is a (n-1)*1 dimensional 0 vector; since (1;0) is in (L_n), we must have x_0 >=0].

Now what happens when \bar{x} = 0. This was the point Prof. Yin Zhang mentioned in class today. Well in this case x = (x_0,\bar{x}) clearly belongs to L_n since x_0 >= 0 = ||\bar{x}||

If \bar{x} is not 0, then set y = (||\bar{x}||,-\bar{x}) as was done in class. Clearly y belongs to (L_n), and hence x^Ty >= 0 since we are assuming that x=(x_0,||\bar{x}||) is in (L_n)* Thus we have x_0 ||\bar{x}|| - ||\bar{x}||^2 >= 0 --- (2)

and since ||\bar{x}|| is not zero, but strictly positive, we can divide throughout by ||\bar{x}|| in (2) to obtain x_0 >= ||\bar{x}|| implying that x is in (L_n)

Thus we have (B), i.e. (L_n)* \subseteq (L_n)

(A) and (B) together give (L_n) = (L_n)* I hope this is clear. Please forgive my Latex like notation. I will present this proof again in class this Friday.

Cheers,
Kartik

-------------------------------------------------------------------------------

Subject: Optimization over certain convex cones is not easy
Date: Sat, 7 Sep 2002 12:41:07 -0500 (CDT)

Hello all,

This is intended to serve as a discussion of the points Prof. Richard Tapia addressed in class on Friday.

As I mentioned in class each convex cone 'K' has something called the universal self-concordant barrier function 'F'. Yurii Nesterov and Arkadii Nemirovskii proved the existence of such a function for all convex cones K.

Each function 'F' has a parameter called gamma [I have will more to say on this, when we cover interior point methods for these convex programs], but the important result is that you can solve an optimization problem over such cones in O(sqrt(gamma) log(1/epsilon)) Newton iterations, where epsilon is the accuracy to which you want to solve the problem (say 1e-6).

When I mean I can solve an optimization problem over the cone 'K' in polynomial time, I mean that I can solve the overall problem in a polynomial number of arithmetic operations.

So the problem can be solved in polynomial time, if each Newton iteration can itself be carried out in polynomial time.

The Newton iteration requires that the gradient and the Hessian of the self concordant barrier functional F be computed efficiently, and unfortunately this cannot be done in most cases, since the function F is complicated, and is given by a complicated integral.

If we could do this for all convex cones 'K', we could show P = NP [and win a million bucks for resolving the P = NP conjecture :)]

As an example, consider the case when 'K' is a copositive cone. The copositive cone contains all matrices that are symmetric and copositive.

A matrix M is copositive if M is symmetric, and y^TMy >= 0, for all y >= 0

It is similiar to the cone of psd matrices, except that you require psd-ness only w.r.t nonnegative vectors y. Every matrix M that is psd is copositive, but not the other way around.

In other words psd-ness is only a sufficient condition for copositivity, not a necessary condition.

There is a paper by DeKlerk and Paschenik which shows that you can formulate the maximum stable set problem on a graph [an NP-hard problem] exactly as a copositive program, i.e. minimizing a linear function over the intersection of an affine subspace, and the copositive cone.

So optimizing over the copositive cone is NP-hard, even though we have a convex problem.

The paper is available on the readings list, and also at http://epubs.siam.org/sam-bin/dbq/article/38324

and you can take a look at it. I will have a simpler and a more intriguing proof for this in my lecture notes, which I will distribute in class on Wednesday

So the bottom line is not all convex programs are easy to solve; we will look only at certain well-structured convex programs, which include among others, the LP cone, the SOCP cone, and the SDP cone, and these we will see can actually be solved in a polynomial no of arithmetic operations.

Hope this helps!.

Thank you.
Kartik

-------------------------------------------------------------------------------

Subject: Corrected proof for the rank of extreme matrices in SDP
Date: Sun, 24 Nov 2002 11:59:17 -0600 (CST)

Hello all,
Here is a corrected proof for the rank of extreme matrices in SDP, which I discussed in class this Friday. I have interspaced some comments to highlight the important points in the proof.

What we want to show is the following :
Consider the SDP

Min C.X
s.t. A_i.X = b_i, i=1,...,m
X psd

where . denotes the regular matrix dot product. Also let S denote the feasible set of SDP, i.e. {X : A_i.X = b_i, i=1,...,m, X psd}. Since we have an SDP, S is some convex subset of S^n, the space of n*n symmetric matrices.

We call a convex subset F of S [S defined above] a face of S if x \in F, y,z \in S, and x = 1/2(y+z) implies that y and z must both be in F. A vertex (extreme point) is a face of dimension 0, and is also a face consisting of a single element. In this case, x,y,z are the same element x (say). Also, note that in the SDP we are minimizing a linear functional in X over a convex set S, and so the optimal solution should be attained at an extreme point solution, i.e. at a face F of dimension 0.

Let's say F is any face of dimension 'd' of S. We say that a face of dimension d, if there are d+1 affinely independent points on this face. Thus in the case of an extreme point we have only one affinely independent point, and hence a face of dimension 0.

----------------------------------------------------------------------------------------------------------------

Consider any matrix X sitting on this face F, and let r be the rank of X. We want a bound on the rank 'r' of this matrix in terms of the dimension of the face 'd', and the number of functional constraints in the SDP 'm'.

The theorem says that t(r) = r(r+1)/2 <= m+d ---- (1)

Proof :-

Since X = Q Lambda Q^T, where Q \in R^{n*r}, Lambda \in S^r, and Lambda positive definite (pd). [Note : X = Q \Lambda Q^T is NOT a spectral factorization of X, since Lambda is not required to be a diagonal matrix here. This was a major source of confusion in class on Friday].

So
A_i.X = A_i.(Q Lambda Q^T) = (Q^TA_iQ).Lambda = b_i, i=1,..,m [where we have used trace(AB) = trace(BA)].

Thus we have (Q^TA_iQ).Lambda = b_i, i=1,...,m ---- (2) We have m equations, and t(r) = r(r+1)/2 unknowns in (2). Note that Lambda is a positive definite matrix of size r, and hence is nonsingular with rank r [If we had assumed Lambda to be a diagonal matrix, then we would really have r unknowns, since there are only r degrees of freedom here. In fact any matrix of rank r can be written as X = Q Lambda Q^T, with Lambda p.d. of size r. This is a crucial observation, that is needed to proceed further with the proof].

We will now prove (1) by contradiction. Assume instead that t(r) > m+d, i.e. the smallest value of t(r) = m+d+1. From the fundamental theorem of linear algebra on system (2), there exist Delta_1,....,Delta_{d+1} \in S^r linearly independent matrices sitting in the null space of [Q^TA_1Q; Q^TA_2Q; .....; Q^TA_mQ], which is at least of dimension d+1.

Thus (Q^TA_iQ).Delta_j = 0 (i=1,...,m, j=1,..,d+1) ---- (3)

Now since Lambda is positive definite, there exists an epsilon > 0, such that (Lambda + epsilon Delta_j), and (Lambda - epsilon Delta_j) are positive semidefinite (psd), j=1,...,d+1

Set Lambda_{j,1} = Lambda + epsilon Delta_j, and Lambda_{j,2} = Lambda - epsilon Delta_j, and let X_{j,1} = Q Lambda_{j,1} Q^T, and X_{j,2} = Q Lambda_{j,2} Q^T ---- (4)

The matrices X_{j,1}, X_{j,2} are in S [the feasible set for SDP], since they are psd, and they satisfy the primal constraints A_i.X = b_i, i=1,...,m

Also X = 1/2(X_{j,1} + X_{j,2}) (from (4)).

From the definition of a face above, we must have X_{j,1} and X_{j,2} are in F, j=1,...,d+1

Since the matrices Delta_1,...,Delta_{d+1} are linearly independent, the matrices (Lambda_{j,1} - Lambda) are linearly independent for j=1,...,d+1 (from (4)).

Thus the matrices Lambda, Lambda_{1,1}, ..., Lambda_{d+1,1} are affinely independent [if a set of points x^1,...,x^m is affinely independent, then (x^2-x^1),(x^3-x^1),....,(x^m-x^1) are linearly independent].

From (4), this implies that X, X_{1,1},....,X_{d+1,1} are affinely independent, and so we have (d+2) affinely independent points on the face F, and so the dim(F) >= d+1, i.e. the dimension of F should be at least (d+1).

However we assumed dim(F) = d at start, and this is a contradiction.

Thus we must have (1), i.e. r(r+1)/2 <= m+d

In particular for an basic feasible solution (of which the optimal solution is a special case), d = 0, and so we have r(r+1)/2 <= m, i.e. r is O(sqrt{m}).

--------------------------------------------------------------------------------------------------------------------------------------

The crucial observation here is that a matrix X of rank r can be expressed as X = Q Lambda Q^T, where Lambda is a positive definite matrix of size r.
This is NOT TO BE CONFUSED with the spectral factorization of X which is only a SPECIAL CASE of the above factorization, with Lambda also required to be DIAGONAL. This was the major source of confusion in class on Friday.

Please let me know if you have any questions.

Thank you.

Kartik