Projective Geometry ^{1} is a term used to describe properties of projections of a given geometric shape. When a shape is projected onto $\mathbb{R}^2$(commonly known as the 2D real space), it is called a Planar Projection. This idea can be extended to a shape being projected as $H : \mathbb{R}^m \to \mathbb{R}^n$where $H$ is a homogenous matrix. To understand concepts, we'll use Planar geometry because that is the easiest to visualize.
From elementary geometry, we know that a line in the plane is represented by $ax + by + c = 0$. In vector form, a line can also be written as $(a,b,c)^T$. As can be easily seen, any multiple of this vector will also represent the same line. Hence, we can define the equivalence: $(a,b,c)^T \equiv k(a,b,c)^T$. The class of such vectors is know as the homogenous vector. The set of all such equivalence classes in $\mathbb{R}^3 - (0,0,0)^T$ form the Projective Space $\mathbb{P}^2$ with $(0,0,0)^T$ excluded because it does not represent a line.
From the same equation above, the point $(x,y)^T$ lies on the line $(a,b,c)^T$ and hence can be rewritten as
Now, observe here that for convenience we represent $(x,y)^T$ in $\mathbb{R}^2$ as $(x,y,1)^T$ in $\mathbb{P}^2$. Generalizing it further, let us say we represent points in $\mathbb{P}^2$ as $(x_1,x_2,x_3)^T$, which when accounted for scale (read equivalence class), can be rewritten as the homogenous vector $(x_1/x_3,x_2/x_3, 1)$ in $\mathbb{P}^2$ and represents $(x_1/x_3,x_2/x_3)$ in $\mathbb{R}^2$.
Now that we have basic definitions in place, let us see how this representation helps.
Let us consider two lines $\textbf{l} = (a,b,c)^T$ and $\textbf{l}^\prime = (a^{\prime},b^{\prime},c^{\prime})^T$.
The point of intersection of two lines is given by
It is easy to see why the above formula works. But, in $\mathbb{R}^2$, there is one special case - parallel lines. We know about the notion that parallel lines meets at infinity but infinity is just another way of saying that something is not defined. Instead in $\mathbb{P}^2$, we'll see that there exists no such special case.
Consider lines $x = 1$ and $x = 2$ in $\mathbb{R}^2$. These are represented by lines $(1,0,-1)$ and $(1,0,-2)$ in $\mathbb{P}^2$. The point of intersection will be given by the cross product -
The vector $\textbf{x} = (0,-1,0)^T$ is very well defined and hence we did not need to handle parallel lines as a special case in $\mathbb{P}^2$. But what point does this homogenous vector describe in $\mathbb{R}^2$? By definition, it should be $(\frac{0}{0}, \frac{-1}{0})^T$ which is not defined and hence, the lines meet at infinity as expected. So, we satisfy definitions of parallel lines and their intersection in both spaces.
We define such points where $x_3 = 0$ as ideal points and belong to the line at infinity. For any general ideal point $(a,b,0)^T$, it is easy to see that the equivalence class of the line at infinity can be represented by $\textbf{l}_{\infty} = (0,0,1)^T$. This fact that parallel lines do intersect at well-defined points in $\mathbb{P}^2$ will have a very important implication in projections.
Geometry is the study of properties invariant under a given set of transformation. For the purpose of this illustration, we will see 2D projective geometry, which is the study of properties of the projective plane $\mathbb{P}^2$.
Definition 1: We define a projectivity as an invertible mapping $h: \mathbb{P}^2 \to \mathbb{P}^2$ such that three points $\textbf{x}_1, \textbf{x}_2, \textbf{x}_3$ are collinear iff $h(\textbf{x}_1), h(\textbf{x}_2), h(\textbf{x}_3)$ are collinear. It is alternatively known as homography.
There is an alternative algebraic way to write this.
Definition 2: A mapping $h: \mathbb{P}^2 \to \mathbb{P}^2$ is called a projectivity iff there exists a $3 \times 3$ non-invertible matrix such that $\textbf{x}^{\prime} = H\textbf{x}$.
To see how both of these definitions are equivalent, consider points $x_1,x_2,x_3$ on line $\textbf{l}$, which by definition would imply -
Introducing $H$ and some manipulations,
Hence, the matrix $H$ transforms the set of collinear points $\textbf{x}_i$ on $\textbf{l}$ to points ${\textbf{x}_i}^\prime$ on $\textbf{l}^\prime$ proving the equivalence of Definition 1 and Definition 2.
Now that we have convinced ourselves about the algebraic nature of projections, let us take a look some special cases of Projective Transformations. Note that each transformation discussed below has a given set of invariants. The ability to assess and reconstruct such patterns is the essence of scene geometry understanding in Computer Vision.
These are transformations of the plane $\mathbb{R}^2$ that preserve the Euclidean distance.
This transformation can be written in block form as:
where $\epsilon = \pm 1$ for orientation preservation, $\textbf{R}$ is the orthogonal rotation matrix and $\textbf{t}$ is the translation vector.
This transformation is an extension of isometry with scaling involved. Hence, instead of the Euclidean distance being preserved, the ratios like between two lengths or two areas are preserved. In essence, shapes are preserved.
This transformation can be written in block form as:
where $s \in \mathbb{R}$ represents the scaling factor.
Affine transformations are an extension to the similarity transform but with an added deformity factor. This preserves parallelism.
This transformation can be written in block form as:
where $A$ represents a general linear part. It is easy to observe that lines at infinity remain at infinity.
These are the most generalized set of transformations and can be written in block form as:
The invariant property here, other than the previously discussed collinearity is something known as a cross-ratio which is defined as a ratio of ratios -
where $\begin{vmatrix} x_ix_j \end{vmatrix} = det \begin{vmatrix} x_\text{i1} & x_\text{i2} \\ x_\text{j1}& x_\text{j2} \end{vmatrix}$ in $\mathbb{P}^1$ (can be extended). Note that the order of the points will change the ratio (which have a simple relationship) but as long as the definition is consistent, it is an invariant.
Coming back to an earlier discussion on homogenous points, consider an ideal point transformed by a Projective Transformation.
where $\textbf{v}^T \neq \textbf{0}^T$. It can be observed that lines at infinity DONOT remain at infinity.
which will not remain an ideal point anymore due to the $\textbf{v}^T\textbf{x}$ component and hence helps model vanishing points - imagine two parallel rail tracks meeting at the edge of the horizon as seen by the eyes.
A projective transformation can be decomposed into the following set of transformations -
$H_P$ moves the line at infinity, $H_A$ makes an affine transformation and preserves the invariants of $H_P$, and $H_S$ is the similarity transform and preserves the invariants of both $H_P$ and $H_A$.
All the above equips us with a fundamental understanding of what matrices need to be computed when trying to reconstruct geometry from given scene and keypoints. I will discuss one of the simpler rectification problems here in theory.
We will see that identifying the line at infinity will allow us to recover the affine properties of an image. Remember from the above discussion, that
and that a line at infinity is only preserved when H is an affinity. Effectively, what we need to find a projective matrix which transforms an identified ${\textbf{l}^\prime}_\infty$ in the image to the canonical position of $\textbf{l}_\infty = (0,0,1)^T$.
Consider a given image where we have identified 2 pairs of parallel lines in homogenous coordinates - $\textbf{l}_1 \parallel \textbf{l}_2$ and $\textbf{m}_1 \parallel \textbf{m}_2$. To calculate the line of infinity in the image space,
where $\textbf{p}_l$ and $\textbf{p}_m$ are the ideal points from respective pair of parallel lines. ${\textbf{l}^\prime}_\infty$ is the line of infinity constructed via the ideal points. Using this line, we need to find a projective matrix which converts it into the canonical line at infinity $(0,0,1)^T$.
where $(l_1,l_2,l_3)^T = {\textbf{l}^\prime}_\infty$. It is verifiable that ${H_1}^\text{-T}{\textbf{l}^\prime}_\infty = \textbf{l}_\infty$.
If we now apply the same transformation matrix to all points on the image, the image will be rectified to actually show parallel edges as parallel.
The above discussion introduces the basic matrices and invariant properties which we look for while processing images for geometric scene understanding. A more practical approach to the rectification problem above and more complex problems like 2D Homography and Stereo Matching will be discussed in another post.
Harltey, A., & Zisserman, A. (2003). Multiple view geometry in computer vision (2. ed.). ↩