Orthogonal and Affine Projection
Orthogonal projection
Suppose \mathcal{M} is a subspace of the vector space \mathcal{V}. Since \mathcal{M}^{\perp} is the orthogonal complement of \mathcal{M}, we have \mathcal{M} \oplus \mathcal{M}^{\perp} = \mathcal{V}.
For v \in \mathcal{V}, let v = m + n, where m \in \mathcal{M} and n \in \mathcal{M}^{\perp}. We define a linear operator \mathbf{P}_{\mathcal{M}} \in \mathbb{C}^{n \times n} such that
\mathbf{P}_{\mathcal{M}} v = m,
where
m is the orthogonal projection of v onto \mathcal{M} along \mathcal{M}^{\perp},
\mathbf{P}_{\mathcal{M}} is the orthogonal projection matrix onto \mathcal{M} along \mathcal{M}^{\perp}.
Properties of orthogonal projection
Suppose m is the orthogonal projection of v onto the subspace \mathcal{M},
m = \mathbf{P}_{\mathcal{M}} v.
(orthogonal-projection-property-1)=
m always exists and is unique.
(orthogonal-projection-property-2)=
Orthogonality of the difference: v - m is orthogonal to every vector in \mathcal{M},
\langle v - m, x \rangle = 0, \quad \forall x \in \mathcal{M}.
(orthogonal-projection-property-3)=
Closest point theorem: m is the closest point in \mathcal{M} to v with respect to the norm induced by the inner product:
m = \arg\min_{x \in \mathcal{M}} \lVert v - x \rVert.
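The closest point theorem follows from the orthogonality property above. A short derivation sketch, assuming the norm is the one induced by the inner product: for any x \in \mathcal{M}, the vector m - x lies in \mathcal{M} and is therefore orthogonal to v - m, so

```latex
\begin{aligned}
\lVert v - x \rVert^{2}
& = \lVert (v - m) + (m - x) \rVert^{2} \\
& = \lVert v - m \rVert^{2} + \lVert m - x \rVert^{2} & [\langle v - m, m - x \rangle = 0] \\
& \geq \lVert v - m \rVert^{2},
\end{aligned}
```

with equality if and only if x = m.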
Orthogonal projection matrix
If \mathbf{P}_{\mathcal{M}} is the orthogonal projection matrix onto the subspace \mathcal{M}, and the r columns of \mathbf{M} \in \mathbb{C}^{n \times r} form a basis for \mathcal{M} (so that \mathbf{M}^{H} \mathbf{M} is invertible), then
\mathbf{P}_{\mathcal{M}} = \mathbf{M} (\mathbf{M}^{H} \mathbf{M})^{-1} \mathbf{M}^{H}.
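As a minimal numerical sketch of this formula, assuming NumPy is available: the helper name `projection_matrix` and the example basis and vector are made up for illustration and are not part of the original notes.

```python
import numpy as np

def projection_matrix(M):
    """Return M (M^H M)^{-1} M^H; the columns of M are assumed linearly independent."""
    M = np.asarray(M, dtype=complex)
    MH = M.conj().T
    # Solve (M^H M) X = M^H instead of forming the inverse explicitly.
    return M @ np.linalg.solve(MH @ M, MH)

# Illustrative basis for a 2-dimensional subspace of C^3.
M = np.array([[1, 0],
              [1, 1],
              [0, 1]])
P = projection_matrix(M)

v = np.array([1.0, 2.0, 3.0])
m = P @ v          # orthogonal projection of v onto R(M)
residual = v - m   # orthogonal to every column of M, up to rounding
```

Solving the linear system rather than calling an explicit inverse is a design choice for numerical stability; either route evaluates the same formula.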
Properties of orthogonal projection matrix
(orthogonal-projection-matrix-property-1)=
A projection matrix \mathbf{P} (that is, \mathbf{P}^{2} = \mathbf{P}) is an orthogonal projection matrix if and only if
R (\mathbf{P}) = N (\mathbf{P})^{\perp}.
(orthogonal-projection-matrix-property-2)=
A projection matrix \mathbf{P} (that is, \mathbf{P}^{2} = \mathbf{P}) is an orthogonal projection matrix if and only if
\mathbf{P}^{H} = \mathbf{P}.
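As a quick consistency check, the matrix \mathbf{P}_{\mathcal{M}} = \mathbf{M} (\mathbf{M}^{H} \mathbf{M})^{-1} \mathbf{M}^{H} from the previous section is both idempotent and Hermitian. The sketch below assumes the columns of \mathbf{M} are linearly independent, so that \mathbf{M}^{H} \mathbf{M} is invertible.

```latex
\begin{aligned}
\mathbf{P}_{\mathcal{M}}^{2}
& = \mathbf{M} (\mathbf{M}^{H} \mathbf{M})^{-1} (\mathbf{M}^{H} \mathbf{M}) (\mathbf{M}^{H} \mathbf{M})^{-1} \mathbf{M}^{H}
= \mathbf{M} (\mathbf{M}^{H} \mathbf{M})^{-1} \mathbf{M}^{H}
= \mathbf{P}_{\mathcal{M}}, \\
\mathbf{P}_{\mathcal{M}}^{H}
& = \mathbf{M} \left( (\mathbf{M}^{H} \mathbf{M})^{-1} \right)^{H} \mathbf{M}^{H}
= \mathbf{M} \left( (\mathbf{M}^{H} \mathbf{M})^{H} \right)^{-1} \mathbf{M}^{H}
= \mathbf{M} (\mathbf{M}^{H} \mathbf{M})^{-1} \mathbf{M}^{H}
= \mathbf{P}_{\mathcal{M}}.
\end{aligned}
```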
Application: least squares problem
Consider the problem of solving a system of linear equations for \mathbf{x} \in \mathbb{C}^{n} given \mathbf{y} \in \mathbb{C}^{m} and \mathbf{A} \in \mathbb{C}^{m \times n},
\mathbf{y} = \mathbf{A} \mathbf{x}.
This problem has a solution only when \mathbf{y} \in R (\mathbf{A}). When it has no solution, the objective is changed to solving the least squares problem
\mathbf{x}^{*} = \arg\min_{\mathbf{x} \in \mathbb{C}^{n}} \lVert \mathbf{y} - \mathbf{A} \mathbf{x} \rVert_{2}^{2}
so that \mathbf{A} \mathbf{x} can be as close to \mathbf{y} as possible.
Solving the least squares problem is the same as solving an orthogonal projection problem,
\begin{aligned} \mathbf{x}^{*} & = \arg\min_{\mathbf{x} \in \mathbb{C}^{n}} \lVert \mathbf{y} - \mathbf{A} \mathbf{x} \rVert_{2}^{2} \\ & = \arg\min_{\mathbf{x} \in \mathbb{C}^{n}} \lVert \mathbf{y} - \mathbf{A} \mathbf{x} \rVert_{2} & [\lVert \mathbf{y} - \mathbf{A} \mathbf{x} \rVert_{2} \geq 0] \\ \mathbf{z}^{*} & = \arg\min_{\mathbf{z} \in R (\mathbf{A})} \lVert \mathbf{y} - \mathbf{z} \rVert_{2} & [\mathbf{z} = \mathbf{A} \mathbf{x}, \mathbf{z}^{*} = \mathbf{A} \mathbf{x}^{*}], \end{aligned}
which is the problem of finding the point in R (\mathbf{A}) closest to \mathbf{y},
\begin{aligned} \mathbf{z}^{*} & = \mathbf{P}_{R (\mathbf{A})} \mathbf{y} \\ \mathbf{A} \mathbf{x}^{*} & = \mathbf{P}_{R (\mathbf{A})} \mathbf{y}. \end{aligned}
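Since \mathbf{A} \mathbf{x}^{*} \in R (\mathbf{A}), it equals the projection exactly when the residual \mathbf{y} - \mathbf{A} \mathbf{x}^{*} is orthogonal to R (\mathbf{A}), that is, lies in R (\mathbf{A})^{\perp} = N (\mathbf{A}^{H}). This gives the system of normal equations: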
\mathbf{A} \mathbf{x}^{*} = \mathbf{P}_{R (\mathbf{A})} \mathbf{y} \iff \mathbf{A}^{H} \mathbf{A} \mathbf{x}^{*} = \mathbf{A}^{H} \mathbf{y}.
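The equivalence above can be checked numerically. The sketch below assumes NumPy and a made-up line-fitting problem whose matrix \mathbf{A} has full column rank (so that \mathbf{A}^{H} \mathbf{A} is invertible); the data values are illustrative only.

```python
import numpy as np

# Illustrative overdetermined system: fit y ~ c0 + c1 * t at four sample points.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0, 4.0])
A = np.column_stack([np.ones_like(t), t])   # 4 x 2, full column rank

# Normal equations: A^H A x* = A^H y.
AH = A.conj().T
x_star = np.linalg.solve(AH @ A, AH @ y)

# Orthogonal projection of y onto R(A); A x* should coincide with it.
P = A @ np.linalg.solve(AH @ A, AH)
z_star = P @ y

# Reference solution from NumPy's least-squares routine.
x_ref, *_ = np.linalg.lstsq(A, y, rcond=None)
```

For this data the normal equations give x_star = (0.9, 0.9), and the residual y - A x_star is orthogonal to both columns of A.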
(affine-projection)=
Affine projection
Affine space
A set of vectors in the vector space \mathcal{V} forms an affine space \mathcal{A} if they are the sums of the vectors in a subspace \mathcal{M} \subset \mathcal{V} and a non-zero vector v \in \mathcal{V},
\mathcal{A} = v + \mathcal{M}.
That is, all vectors in \mathcal{A} are sums of vectors in \mathcal{M} and v.
\mathcal{A} is NOT necessarily a subspace, as it does not necessarily contain the zero vector.
\mathcal{A} can be visualized as the subspace \mathcal{M} translated away from the origin by v.
Affine projection
Although affine spaces are not subspaces in general, the concept of orthogonal projection can also be applied to affine spaces.
Given a vector b \in \mathcal{V} and an affine space \mathcal{A}, the affine projection a \in \mathcal{A} is the orthogonal projection of b onto the affine space \mathcal{A} and can be expressed as
a = v + \mathbf{P}_{\mathcal{M}} (b - v),
where \mathbf{P}_{\mathcal{M}} is the orthogonal projection matrix onto \mathcal{M}.
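A minimal sketch of this formula, assuming NumPy, real-valued vectors, and a subspace \mathcal{M} given as the column space of a matrix with linearly independent columns. The helper name `affine_projection` and the example line in \mathbb{R}^{3} are hypothetical.

```python
import numpy as np

def affine_projection(b, v, M):
    """Project b onto the affine space v + R(M) (real case, independent columns assumed)."""
    M = np.asarray(M, dtype=float)
    # P_M (b - v) = M (M^T M)^{-1} M^T (b - v), computed without forming P_M.
    coeffs = np.linalg.solve(M.T @ M, M.T @ (b - v))
    return v + M @ coeffs

# Illustrative line in R^3: all points v + s * d with d = (0, 1, 1).
v = np.array([1.0, 0.0, 0.0])
M = np.array([[0.0], [1.0], [1.0]])
b = np.array([3.0, 2.0, 0.0])

a = affine_projection(b, v, M)   # (1, 1, 1) for this example
```

Here b - a = (2, 1, -1) is orthogonal to the direction d, which is the affine analogue of the orthogonality property of projections onto subspaces.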
Hyperplanes
An affine space \mathcal{H} = \mathbf{v} + \mathcal{M} \subseteq \mathbb{R}^{n} for which \dim (\mathcal{M}) = n - 1 is called a hyperplane, and is usually expressed as the set
\mathcal{H} = \{ \mathbf{x} \mid \mathbf{w}^{T} \mathbf{x} = \beta \},
where \beta is a scalar and \mathbf{w} is a non-zero vector.
In this case, the hyperplane can be viewed as the subspace
\mathcal{M} = \mathbf{w}^{\perp}
translated by the vector
\mathbf{v} = \frac{\beta}{\mathbf{w}^{T} \mathbf{w}} \mathbf{w}.
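A brief check of this decomposition, using only the definitions above: the vector \mathbf{v} satisfies the hyperplane equation, and a point \mathbf{x} lies in \mathcal{H} exactly when \mathbf{x} - \mathbf{v} is orthogonal to \mathbf{w}.

```latex
\begin{aligned}
\mathbf{w}^{T} \mathbf{v}
& = \frac{\beta}{\mathbf{w}^{T} \mathbf{w}} \mathbf{w}^{T} \mathbf{w} = \beta, \\
\mathbf{x} \in \mathcal{H}
& \iff \mathbf{w}^{T} \mathbf{x} = \beta = \mathbf{w}^{T} \mathbf{v}
\iff \mathbf{w}^{T} (\mathbf{x} - \mathbf{v}) = 0
\iff \mathbf{x} - \mathbf{v} \in \mathbf{w}^{\perp}.
\end{aligned}
```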
The orthogonal projection \mathbf{a} of a point \mathbf{b} \in \mathbb{R}^{n} onto the hyperplane \mathcal{H} is given by
\mathbf{a} = \mathbf{b} - \left( \frac{ \mathbf{w}^{T} \mathbf{b} - \beta }{ \mathbf{w}^{T} \mathbf{w} } \right) \mathbf{w}.
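This expression is the affine projection formula \mathbf{a} = \mathbf{v} + \mathbf{P}_{\mathcal{M}} (\mathbf{b} - \mathbf{v}) specialized to \mathcal{M} = \mathbf{w}^{\perp}, for which \mathbf{P}_{\mathcal{M}} = \mathbf{I} - \mathbf{w} \mathbf{w}^{T} / (\mathbf{w}^{T} \mathbf{w}), using \mathbf{w}^{T} \mathbf{v} = \beta. A small sketch in plain Python follows; the function name `project_onto_hyperplane` and the example hyperplane are illustrative assumptions.

```python
def project_onto_hyperplane(b, w, beta):
    """Project b onto the hyperplane {x : w^T x = beta}; w must be non-zero."""
    wTw = sum(wi * wi for wi in w)
    if wTw == 0:
        raise ValueError("w must be a non-zero vector")
    # Signed multiple of w that has to be removed from b.
    scale = (sum(wi * bi for wi, bi in zip(w, b)) - beta) / wTw
    return [bi - scale * wi for bi, wi in zip(b, w)]

# Illustrative hyperplane x1 + 2*x2 + 2*x3 = 9 in R^3.
w = [1.0, 2.0, 2.0]
beta = 9.0
b = [2.0, 4.0, 4.0]

a = project_onto_hyperplane(b, w, beta)   # [1.0, 2.0, 2.0]
```

In this example b - a equals w, so the displacement is along the normal direction and the projected point satisfies w^T a = 9.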