Definition of matrix norms

In my previous post, I introduced various definitions of matrix norms on \(\mathbb{R}^{n \times n}\) based on the corresponding vector norms on \(\mathbb{R}^n\). Moreover, the equivalence of different vector norms, together with the metrics and topologies they induce on \(\mathbb{R}^n\), carries over to \(\mathbb{R}^{n \times n}\). In this article, we show why the matrix norms defined there are indeed valid.

Generally, the definition of a matrix norm on \(\mathbb{R}^{n \times n}\) should satisfy the following four conditions:

  1. Positive definiteness: for all \(A \in \mathbb{R}^{n \times n}\), \(\norm{A} \geq 0\), and \(\norm{A} = 0\) if and only if \(A = 0\).
  2. Absolute homogeneity: for all \(\alpha \in \mathbb{R}\) and \(A \in \mathbb{R}^{n \times n}\), \(\norm{\alpha A} = \abs{\alpha} \norm{A}\).
  3. Triangle inequality: for all \(A, B \in \mathbb{R}^{n \times n}\), \(\norm{A + B} \leq \norm{A} + \norm{B}\).
  4. Sub-multiplicativity: for all \(A, B \in \mathbb{R}^{n \times n}\), \(\norm{AB} \leq \norm{A} \norm{B}\).

Therefore, to show that the induced definition meets the above requirements, we need to prove the following theorem.

Theorem Let \(\norm{\cdot}\) be a norm on \(\mathbb{R}^n\). Then the function \(\zeta: \mathbb{R}^{n \times n} \rightarrow \mathbb{R}\) defined by
\[
\zeta(A) = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A \vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \in \mathbb{R}^n, \norm{\vect{x}}=1} \norm{A \vect{x}}
\]
is a matrix norm on \(\mathbb{R}^{n \times n}\), called the matrix norm induced by \(\norm{\cdot}\).
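As a quick numerical illustration (a minimal sketch assuming NumPy is available; the matrix size, sample count, and the choice of the 2-norm are arbitrary), the supremum can be approached from below by sampling random unit vectors and compared with NumPy's built-in induced 2-norm:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Each ratio ||A x|| / ||x|| is a lower bound on zeta(A), so the maximum
# over many random unit vectors approaches the supremum from below.
xs = rng.standard_normal((100_000, 4))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)      # normalize each sample
approx = np.max(np.linalg.norm(xs @ A.T, axis=1))    # max ||A x||_2 over samples

exact = np.linalg.norm(A, 2)   # NumPy's induced 2-norm (largest singular value)
print(f"sampled: {approx:.6f}  exact: {exact:.6f}")  # sampled value never exceeds exact
```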

Proof a) Positive definiteness and absolute homogeneity are inherited directly from the vector norm: clearly \(\zeta(A) \geq 0\); if \(\zeta(A) = 0\), then \(\norm{A\vect{x}} = 0\), i.e. \(A\vect{x} = 0\), for every \(\vect{x}\), so \(A = 0\); and \(\norm{\alpha A \vect{x}} = \abs{\alpha} \norm{A \vect{x}}\) gives \(\zeta(\alpha A) = \abs{\alpha} \zeta(A)\).
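A small sketch (again assuming NumPy, with an arbitrary test matrix) that spot-checks these two inherited properties for the induced 1-, 2-, and \(\infty\)-norms:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
alpha = -3.7

for p in (1, 2, np.inf):
    # positive definiteness: norms are non-negative and vanish for the zero matrix
    assert np.linalg.norm(A, p) > 0 and np.linalg.norm(np.zeros((5, 5)), p) == 0
    # absolute homogeneity: ||alpha * A|| == |alpha| * ||A||
    assert np.isclose(np.linalg.norm(alpha * A, p), abs(alpha) * np.linalg.norm(A, p))
```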

b) The triangle inequality can be proved as follows:
\[
\begin{aligned}
\zeta(A + B) &= \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{(A + B) \vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x} + B\vect{x}}}{\norm{\vect{x}}} \\
& \leq \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}} + \norm{B\vect{x}}}{\norm{\vect{x}}} \leq \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}}{\norm{\vect{x}}} + \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} \\
&= \zeta(A) + \zeta(B).
\end{aligned}
\]
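The inequality can also be spot-checked numerically (a sketch assuming NumPy; the matrix size, sample count, and tolerance are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(1000):
    A = rng.standard_normal((5, 5))
    B = rng.standard_normal((5, 5))
    for p in (1, 2, np.inf):
        # triangle inequality for the induced 1-, 2-, and inf-norms
        assert np.linalg.norm(A + B, p) <= np.linalg.norm(A, p) + np.linalg.norm(B, p) + 1e-12
```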

c) For sub-multiplicativity, note first that if \(B\vect{x} = 0\), then \(AB\vect{x} = 0\) as well, so such \(\vect{x}\) contribute nothing to the supremum. For every \(\vect{x}\) with \(B\vect{x} \neq 0\),
\[
\begin{aligned}
\frac{\norm{AB\vect{x}}}{\norm{\vect{x}}} &= \frac{\norm{AB\vect{x}}}{\norm{B\vect{x}}} \cdot \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} \\
&\leq \sup_{\vect{y} \in \mathbb{R}^n, \vect{y} \neq 0} \frac{\norm{A\vect{y}}}{\norm{\vect{y}}} \cdot \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{B\vect{x}}}{\norm{\vect{x}}} = \zeta(A) \cdot \zeta(B).
\end{aligned}
\]
Taking the supremum over all nonzero \(\vect{x}\) gives \(\zeta(AB) \leq \zeta(A) \zeta(B)\).
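And a matching sketch for sub-multiplicativity (same assumptions as above):

```python
import numpy as np

rng = np.random.default_rng(3)
for _ in range(1000):
    A = rng.standard_normal((5, 5))
    B = rng.standard_normal((5, 5))
    for p in (1, 2, np.inf):
        # ||A B|| <= ||A|| * ||B|| for the induced 1-, 2-, and inf-norms
        assert np.linalg.norm(A @ B, p) <= np.linalg.norm(A, p) * np.linalg.norm(B, p) + 1e-12
```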
d) Prove \(\zeta(A) = \sup_{\vect{x} \in \mathbb{R}^n, \norm{\vect{x}} = 1} \norm{A\vect{x}}\).

Note that \(\frac{1}{\norm{\vect{x}}}\) is a scalar in \(\mathbb{R}\); then, with the already proved absolute homogeneity, we have
\[
\zeta(A) = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}}{\norm{\vect{x}}} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \left\Vert A \cdot \frac{\vect{x}}{\norm{\vect{x}}} \right\Vert.
\]
Letting \(\vect{x}' = \frac{\vect{x}}{\norm{\vect{x}}}\), and noting that every unit vector arises this way, the right-hand side equals \(\sup_{\norm{\vect{x}'} = 1} \norm{A\vect{x}'}\), which proves this part.
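A small sketch (assuming NumPy; the matrix and samples are arbitrary) illustrating that, sample by sample, the ratio form and the unit-vector form coincide, which is exactly the homogeneity argument above:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
xs = rng.standard_normal((10_000, 4))                     # nonzero with probability 1

norms = np.linalg.norm(xs, axis=1, keepdims=True)
ratios = np.linalg.norm(xs @ A.T, axis=1) / norms[:, 0]   # ||A x||_2 / ||x||_2
units = np.linalg.norm((xs / norms) @ A.T, axis=1)        # ||A (x / ||x||)||_2

# Homogeneity makes the two values equal for every sample, so their suprema agree.
assert np.allclose(ratios, units)
```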

Summarizing a) to d), \(\zeta\) is indeed a matrix norm, namely the one induced by the corresponding vector norm.

Next, we prove the validity of the explicit formulations of the matrix norms (a numerical cross-check appears after the list), i.e.

  1. 1-norm: \(\norm{A}_1 = \max_{1 \leq j \leq n} \sum_{i=1}^n \abs{a_{ij}}\), which is the maximum column sum;
  2. 2-norm: \(\norm{A}_2 = \sqrt{\rho(A^T A)}\), where \(\rho\) represents the spectral radius, i.e. the maximum eigenvalue of \(A^T A\);
  3. \(\infty\)-norm: \(\norm{A}_{\infty} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\), which is the maximum row sum.
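The promised cross-check (a sketch assuming NumPy; the test matrix is arbitrary) compares each closed-form expression above with NumPy's corresponding induced norm:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 6))

col_sum = np.max(np.sum(np.abs(A), axis=0))               # maximum column sum
row_sum = np.max(np.sum(np.abs(A), axis=1))               # maximum row sum
spectral = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))   # sqrt of spectral radius of A^T A

print(np.isclose(col_sum, np.linalg.norm(A, 1)))          # True
print(np.isclose(spectral, np.linalg.norm(A, 2)))         # True
print(np.isclose(row_sum, np.linalg.norm(A, np.inf)))     # True
```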

a) 1-norm: Because
\[
\begin{aligned}
\norm{A\vect{x}}_1 &= \sum_{i=1}^n \left\vert \sum_{j=1}^n a_{ij} x_j \right\vert \leq \sum_{i=1}^n \sum_{j=1}^n \abs{a_{ij} x_j} = \sum_{j=1}^n \left( \abs{x_j} \sum_{i=1}^n \abs{a_{ij}} \right) \\
&\leq \left( \sum_{j=1}^n \abs{x_j} \right) \cdot \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right),
\end{aligned}
\]
we have, for every \(\vect{x} \neq 0\),
\[
\frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} \leq \frac{\left( \sum_{j=1}^n \abs{x_j} \right) \cdot \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right)}{\sum_{j=1}^n \abs{x_j}} = \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right),
\]
and hence \(\norm{A}_1 = \sup_{\vect{x} \neq 0} \frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} \leq \max_{1 \leq j \leq n} \sum_{i=1}^n \abs{a_{ij}}\). Then, we need to show that the maximum value on the right-hand side is attainable.

Assume that \(\sum_{i=1}^n \abs{a_{ij}}\) attains its maximum at \(j = j_0\). If this value is zero, then \(A\) is the zero matrix and the definition of the matrix 1-norm is trivially true. If it is not zero, by letting \(\vect{x} = (\delta_{i j_0})_{i=1}^n\) with \(\delta_{i j_0}\) being the Kronecker delta, i.e. the \(j_0\)-th standard basis vector with \(\norm{\vect{x}}_1 = 1\), we have
\[
\frac{\norm{A\vect{x}}_1}{\norm{\vect{x}}_1} = \frac{\sum_{i=1}^n \abs{a_{ij_0}}}{1} = \max_{1 \leq j \leq n} \left( \sum_{i=1}^n \abs{a_{ij}} \right),
\]
so the supremum is attained and the definition of the matrix 1-norm is valid.
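The attainment argument can be mirrored numerically (a sketch assuming NumPy; the matrix is an arbitrary example with nonzero columns):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5))

j0 = np.argmax(np.sum(np.abs(A), axis=0))   # column with the largest absolute sum
x = np.zeros(5)
x[j0] = 1.0                                  # the j0-th standard basis vector

ratio = np.linalg.norm(A @ x, 1) / np.linalg.norm(x, 1)
print(np.isclose(ratio, np.linalg.norm(A, 1)))   # True: the supremum is attained
```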
b) 2-norm: The proof for this part needs the inner product \(\langle \cdot, \cdot \rangle\) of vectors in \(\mathbb{R}^n\), from which the vector 2-norm is induced. Then we have
\[
\norm{A}_2 = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \sqrt{\frac{\langle A\vect{x}, A\vect{x} \rangle}{\langle \vect{x}, \vect{x} \rangle}} = \sup_{\vect{x} \in \mathbb{R}^n, \vect{x} \neq 0} \sqrt{\frac{\langle A^*A\vect{x}, \vect{x} \rangle}{\langle \vect{x}, \vect{x} \rangle}},
\]
where \(A^*\) is the adjoint operator, i.e. the transpose of \(A\). Therefore, \(A^*A\) is a real symmetric, positive semi-definite matrix which has \(n\) real eigenvalues \(\{\lambda_i\}_{i=1}^n\) with \(0 \leq \lambda_1 \leq \cdots \leq \lambda_n\) and \(n\) corresponding orthonormal eigenvectors \(\{\vect{v}_i\}_{i=1}^n\) (N.B. there may be duplicates among the eigenvalues). Every \(\vect{x} \in \mathbb{R}^n\) can be expanded as \(\vect{x} = \sum_{i=1}^n a_i \vect{v}_i\), so that \(A^*A\vect{x} = \sum_{i=1}^n a_i A^*A \vect{v}_i = \sum_{i=1}^n a_i \lambda_i \vect{v}_i\). Then we have
\[
\begin{aligned}
\langle A^*A\vect{x}, \vect{x} \rangle &= \left\langle \sum_{i=1}^n a_i \lambda_i \vect{v}_i, \sum_{j=1}^n a_j \vect{v}_j \right\rangle = \sum_{i=1}^n \sum_{j=1}^n \lambda_i a_i a_j \langle \vect{v}_i, \vect{v}_j \rangle \\
&= \sum_{i=1}^n \sum_{j=1}^n \lambda_i a_i a_j \delta_{ij} = \sum_{i=1}^n \lambda_i a_i^2.
\end{aligned}
\]
Meanwhile,
\[
\langle \vect{x}, \vect{x} \rangle = \left\langle \sum_{i=1}^n a_i \vect{v}_i, \sum_{j=1}^n a_j \vect{v}_j \right\rangle = \sum_{i=1}^n \sum_{j=1}^n a_i a_j \langle \vect{v}_i, \vect{v}_j \rangle = \sum_{i=1}^n a_i^2.
\]
Therefore,
\[
\frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} \leq \sqrt{\frac{\lambda_n \sum_{i=1}^n a_i^2}{\sum_{i=1}^n a_i^2}} = \sqrt{\lambda_n}.
\]
By letting \(a_1 = a_2 = \cdots = a_{n-1} = 0\) and \(a_n = 1\), i.e. \(\vect{x} = \vect{v}_n\), we have \(\frac{\norm{A\vect{x}}_2}{\norm{\vect{x}}_2} = \sqrt{\lambda_n}\). Hence,
\[
\norm{A}_2 = \sqrt{\lambda_n} = \sqrt{\rho(A^*A)} = \sqrt{\rho(A^T A)}
\]
(since \(A^* = A^T\) for real \(A\)), and the definition of the matrix 2-norm is valid.
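The same chain of equalities can be traced numerically (a sketch assuming NumPy; eigh returns the eigenvalues of the symmetric matrix \(A^T A\) in ascending order):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 5))

lam, V = np.linalg.eigh(A.T @ A)   # ascending eigenvalues and orthonormal eigenvectors
v_n = V[:, -1]                      # eigenvector for the largest eigenvalue lambda_n

ratio = np.linalg.norm(A @ v_n, 2) / np.linalg.norm(v_n, 2)
print(np.isclose(ratio, np.sqrt(lam[-1])))                  # the ratio attains sqrt(lambda_n)
print(np.isclose(np.sqrt(lam[-1]), np.linalg.norm(A, 2)))   # which equals the induced 2-norm
```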

c) \(\infty\)-norm:
\[
\begin{aligned}
\norm{A\vect{x}}_{\infty} &= \max_{1 \leq i \leq n} \left\vert \sum_{j=1}^n a_{ij} x_j \right\vert \leq \max_{1 \leq i \leq n} \left( \sum_{j=1}^n \abs{a_{ij}} \cdot \abs{x_j} \right) \\
&\leq \max_{1 \leq i \leq n} \left( \left( \sum_{j=1}^n \abs{a_{ij}} \right) \cdot \left( \max_{1 \leq j \leq n} \abs{x_j} \right) \right) = \left( \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}} \right) \cdot \left( \max_{1 \leq j \leq n} \abs{x_j} \right), \\
\norm{\vect{x}}_{\infty} &= \max_{1 \leq i \leq n} \abs{x_i}.
\end{aligned}
\]
Therefore, \(\frac{\norm{A\vect{x}}_{\infty}}{\norm{\vect{x}}_{\infty}} \leq \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\). Then, we need to prove that this maximum value is attainable.

Assume that \(\sum_{j=1}^n \abs{a_{ij}}\) attains its maximum at \(i = i_0\). If this value is zero, \(A\) is the zero matrix and the definition of the matrix \(\infty\)-norm is trivially true. If it is not zero, by letting \(\vect{x} = (\operatorname{sgn}(a_{i_0 1}), \cdots, \operatorname{sgn}(a_{i_0 n}))^{\rm T}\), we have \(\norm{\vect{x}}_{\infty} = 1\) (the \(i_0\)-th row has at least one nonzero entry) and \(\norm{A\vect{x}}_{\infty} = \sum_{j=1}^n \abs{a_{i_0 j}} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\). Hence, \(\frac{\norm{A\vect{x}}_{\infty}}{\norm{\vect{x}}_{\infty}} = \max_{1 \leq i \leq n} \sum_{j=1}^n \abs{a_{ij}}\) and the definition of the \(\infty\)-norm is valid.
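And the sign-vector construction above can be checked directly (a sketch assuming NumPy; with probability 1 the random matrix has no zero entries, so the sign vector has unit \(\infty\)-norm):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 5))

i0 = np.argmax(np.sum(np.abs(A), axis=1))   # row with the largest absolute sum
x = np.sign(A[i0, :])                        # the sign vector from the proof

ratio = np.linalg.norm(A @ x, np.inf) / np.linalg.norm(x, np.inf)
print(np.isclose(ratio, np.linalg.norm(A, np.inf)))   # True: the supremum is attained
```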

Original post: https://www.cnblogs.com/peabody/p/10257742.html