A Primer on Projective Geometry

Understanding the basics behind 2D projections

🧮 math

Table of Contents

Projective Geometry 1 is a term used to describe properties of projections of a given geometric shape. When a shape is projected onto R2\mathbb{R}^2(commonly known as the 2D real space), it is called a Planar Projection. This idea can be extended to a shape being projected as H:RmRnH : \mathbb{R}^m \to \mathbb{R}^nwhere HH is a homogenous matrix. To understand concepts, we’ll use Planar geometry because that is the easiest to visualize.

Homogenous Representation of Points and Lines

From elementary geometry, we know that a line in the plane is represented by ax+by+c=0ax + by + c = 0. In vector form, a line can also be written as (a,b,c)T(a,b,c)^T. As can be easily seen, any multiple of this vector will also represent the same line. Hence, we can define the equivalence: (a,b,c)Tk(a,b,c)T(a,b,c)^T \equiv k(a,b,c)^T. The class of such vectors is know as the homogenous vector. The set of all such equivalence classes in R3(0,0,0)T\mathbb{R}^3 - (0,0,0)^T form the Projective Space P2\mathbb{P}^2 with (0,0,0)T(0,0,0)^T excluded because it does not represent a line.

From the same equation above, the point (x,y)T(x,y)^T lies on the line (a,b,c)T(a,b,c)^T and hence can be rewritten as

(x,y,1)T(a,b,c)xTl=0(x,y,1)^T(a,b,c) \equiv \textbf{x}^T\textbf{l} = 0

Now, observe here that for convenience we represent (x,y)T(x,y)^T in R2\mathbb{R}^2 as (x,y,1)T(x,y,1)^T in P2\mathbb{P}^2. Generalizing it further, let us say we represent points in P2\mathbb{P}^2 as (x1,x2,x3)T(x_1,x_2,x_3)^T, which when accounted for scale (read equivalence class), can be rewritten as the homogenous vector (x1/x3,x2/x3,1)(x_1/x_3,x_2/x_3, 1) in P2\mathbb{P}^2 and represents (x1/x3,x2/x3)(x_1/x_3,x_2/x_3) in R2\mathbb{R}^2.

Now that we have basic definitions in place, let us see how this representation helps.

Intersection of Lines

Let us consider two lines l=(a,b,c)T\textbf{l} = (a,b,c)^T and l=(a,b,c)T\textbf{l}^\prime = (a^{\prime},b^{\prime},c^{\prime})^T.

The point of intersection of two lines is given by

x=l×l\textbf{x} = \textbf{l} \times \textbf{l}^{\prime}

It is easy to see why the above formula works. But, in R2\mathbb{R}^2, there is one special case - parallel lines. We know about the notion that parallel lines meets at infinity but infinity is just another way of saying that something is not defined. Instead in P2\mathbb{P}^2, we’ll see that there exists no such special case.

Consider lines x=1x = 1 and x=2x = 2 in R2\mathbb{R}^2. These are represented by lines (1,0,1)(1,0,-1) and (1,0,2)(1,0,-2) in P2\mathbb{P}^2. The point of intersection will be given by the cross product -

[ijk101102]=(010)\begin{bmatrix} i & j & k \\ 1 & 0 & -1 \\ 1 & 0 & -2 \end{bmatrix} = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}

The vector x=(0,1,0)T\textbf{x} = (0,-1,0)^T is very well defined and hence we did not need to handle parallel lines as a special case in P2\mathbb{P}^2. But what point does this homogenous vector describe in R2\mathbb{R}^2? By definition, it should be (00,10)T(\frac{0}{0}, \frac{-1}{0})^T which is not defined and hence, the lines meet at infinity as expected. So, we satisfy definitions of parallel lines and their intersection in both spaces.

We define such points where x3=0x_3 = 0 as ideal points and belong to the line at infinity. For any general ideal point (a,b,0)T(a,b,0)^T, it is easy to see that the equivalence class of the line at infinity can be represented by l=(0,0,1)T\textbf{l}_{\infty} = (0,0,1)^T. This fact that parallel lines do intersect at well-defined points in P2\mathbb{P}^2 will have a very important implication in projections.

Projective Transformations

Geometry is the study of properties invariant under a given set of transformation. For the purpose of this illustration, we will see 2D projective geometry, which is the study of properties of the projective plane P2\mathbb{P}^2.

Definition 1: We define a projectivity as an invertible mapping h:P2P2h: \mathbb{P}^2 \to \mathbb{P}^2 such that three points x1,x2,x3\textbf{x}_1, \textbf{x}_2, \textbf{x}_3 are collinear iff h(x1),h(x2),h(x3)h(\textbf{x}_1), h(\textbf{x}_2), h(\textbf{x}_3) are collinear. It is alternatively known as homography.

There is an alternative algebraic way to write this.

Definition 2: A mapping h:P2P2h: \mathbb{P}^2 \to \mathbb{P}^2 is called a projectivity iff there exists a 3×33 \times 3 non-invertible matrix such that x=Hx\textbf{x}^{\prime} = H\textbf{x}.

To see how both of these definitions are equivalent, consider points x1,x2,x3x_1,x_2,x_3 on line l\textbf{l}, which by definition would imply -

xiTl=0i1,2,3{\textbf{x}_i}^T\textbf{l} = 0 \kern{3em} \forall i \in {1,2,3}

Introducing HH and some manipulations,

xiT(H1H)Tl=0\Rightarrow \textbf{x}_i^T(H^{-1}H)^T\textbf{l} = 0 xiTHTHTl=0\Rightarrow \textbf{x}_i^TH^TH^{-T}\textbf{l} = 0 (Hxi)T(HTl)=0\Rightarrow (H\textbf{x}_i)^T(H^{-T}\textbf{l}) = 0 (xi)Tl=0\Rightarrow (\textbf{x}_i^\prime)^T\textbf{l}^\prime = 0

Hence, the matrix HH transforms the set of collinear points xi\textbf{x}_i on l\textbf{l} to points xi{\textbf{x}_i}^\prime on l\textbf{l}^\prime proving the equivalence of Definition 1 and Definition 2.

Class of Projections

Now that we have convinced ourselves about the algebraic nature of projections, let us take a look some special cases of Projective Transformations. Note that each transformation discussed below has a given set of invariants. The ability to assess and reconstruct such patterns is the essence of scene geometry understanding in Computer Vision.

Isometries

These are transformations of the plane R2\mathbb{R}^2 that preserve the Euclidean distance.

(xy1)=[ϵcosθsinθtxϵsinθcosθty001](xy1)\begin{pmatrix} x^\prime \\ y^\prime \\ 1 \end{pmatrix} = \begin{bmatrix} \epsilon\cos\theta & - \sin\theta & t_x \\ \epsilon\sin\theta & \cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}

This transformation can be written in block form as:

HE=[Rt0T1]H_E = \begin{bmatrix} \textbf{R} & \textbf{t} \\ \textbf{0}^T & 1 \end{bmatrix}

where ϵ=±1\epsilon = \pm 1 for orientation preservation, R\textbf{R} is the orthogonal rotation matrix and t\textbf{t} is the translation vector.

Similarity Transformations

This transformation is an extension of isometry with scaling involved. Hence, instead of the Euclidean distance being preserved, the ratios like between two lengths or two areas are preserved. In essence, shapes are preserved.

(xy1)=[scosθssinθtxssinθscosθty001](xy1)\begin{pmatrix} x^\prime \\ y^\prime \\ 1 \end{pmatrix} = \begin{bmatrix} s\cos\theta & - s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}

This transformation can be written in block form as:

HS=[sRt0T1]H_S = \begin{bmatrix} s\textbf{R} & \textbf{t} \\ \textbf{0}^T & 1 \end{bmatrix}

where sRs \in \mathbb{R} represents the scaling factor.

Affine Transformations

Affine transformations are an extension to the similarity transform but with an added deformity factor. This preserves parallelism.

(xy1)=[a_11a_12txa_21a_22ty001](xy1)\begin{pmatrix} x^\prime \\ y^\prime \\ 1 \end{pmatrix} = \begin{bmatrix} a\_\text{11} & a\_\text{12} & t_x \\ a\_\text{21} & a\_\text{22} & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}

This transformation can be written in block form as:

HA=[At0T1]H_A = \begin{bmatrix} \textbf{A} & \textbf{t} \\ \textbf{0}^T & 1 \end{bmatrix}

where AA represents a general linear part. It is easy to observe that lines at infinity remain at infinity.

Projective Transformations

These are the most generalized set of transformations and can be written in block form as:

x=HPx=[AtvTw]\textbf{x}^\prime = H_P\textbf{x} = \begin{bmatrix} \textbf{A} & \textbf{t} \\ \textbf{v}^T & w \end{bmatrix}

The invariant property here, other than the previously discussed collinearity is something known as a cross-ratio which is defined as a ratio of ratios -

Cross(x1,x2,x3,x4)=x1x2x3x4x1x3x2x4Cross(x_1,x_2,x_3,x_4) = \frac{ \begin{vmatrix} x_1x_2 \end{vmatrix} \begin{vmatrix} x_3x_4 \end{vmatrix} }{ \begin{vmatrix} x_1x_3 \end{vmatrix} \begin{vmatrix} x_2x_4 \end{vmatrix} }

where xixj=detxi1xi2xj1xj2\begin{vmatrix} x_ix_j \end{vmatrix} = det \begin{vmatrix} x_\text{i1} & x_\text{i2} \\ x_\text{j1}& x_\text{j2} \end{vmatrix} in P1\mathbb{P}^1 (can be extended). Note that the order of the points will change the ratio (which have a simple relationship) but as long as the definition is consistent, it is an invariant.

Coming back to an earlier discussion on homogenous points, consider an ideal point transformed by a Projective Transformation.

x=HPx=[AtvTw]\textbf{x}^\prime = H_P\textbf{x} = \begin{bmatrix} \textbf{A} & \textbf{t} \\ \textbf{v}^T & w \end{bmatrix}

where vT0T\textbf{v}^T \neq \textbf{0}^T. It can be observed that lines at infinity DONOT remain at infinity.

[AtvTw](x0)=(AxvTx)\begin{bmatrix} \textbf{A} & \textbf{t} \\ \textbf{v}^T & w \end{bmatrix} \begin{pmatrix} \textbf{x} \\ 0 \end{pmatrix} = \begin{pmatrix} A\textbf{x} \\ \textbf{v}^T\textbf{x} \end{pmatrix}

which will not remain an ideal point anymore due to the vTx\textbf{v}^T\textbf{x} component and hence helps model vanishing points - imagine two parallel rail tracks meeting at the edge of the horizon as seen by the eyes.

Application

A projective transformation can be decomposed into the following set of transformations -

H=HSHAHP=[Rt0T1][K00T1][I0vTw]H = H_S H_A H_P = \begin{bmatrix} \textbf{R} & \textbf{t} \\ \textbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \textbf{K} & \textbf{0} \\ \textbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \textbf{I} & \textbf{0} \\ \textbf{v}^T & w \end{bmatrix}

HPH_P moves the line at infinity, HAH_A makes an affine transformation and preserves the invariants of HPH_P, and HSH_S is the similarity transform and preserves the invariants of both HPH_P and HAH_A.

All the above equips us with a fundamental understanding of what matrices need to be computed when trying to reconstruct geometry from given scene and keypoints. I will discuss one of the simpler rectification problems here in theory.

Affine Rectification of images

We will see that identifying the line at infinity will allow us to recover the affine properties of an image. Remember from the above discussion, that

l=HA-Tl=[A-T0t-1A-T1](001)=(001)=l\textbf{l}^\prime_\infty = H_A^\text{-T}\textbf{l}_\infty = \begin{bmatrix} \textbf{A}^\text{-T} & \textbf{0} \\ - \textbf{t}^\text{-1}\textbf{A}^\text{-T} & 1 \end{bmatrix} \begin{pmatrix}0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix}0 \\ 0 \\ 1 \end{pmatrix} = \textbf{l}_\infty

and that a line at infinity is only preserved when H is an affinity. Effectively, what we need to find a projective matrix which transforms an identified l{\textbf{l}^\prime}_\infty in the image to the canonical position of l=(0,0,1)T\textbf{l}_\infty = (0,0,1)^T.

Consider a given image where we have identified 2 pairs of parallel lines in homogenous coordinates - l1l2\textbf{l}_1 \parallel \textbf{l}_2 and m1m2\textbf{m}_1 \parallel \textbf{m}_2. To calculate the line of infinity in the image space,

pl=l1×l2\textbf{p}_l = \textbf{l}_1 \times \textbf{l}_2 pm=m1×m2\textbf{p}_m = \textbf{m}_1 \times \textbf{m}_2 l=pl×pm{\textbf{l}^\prime}_\infty = \textbf{p}_l \times \textbf{p}_m

where pl\textbf{p}_l and pm\textbf{p}_m are the ideal points from respective pair of parallel lines. l{\textbf{l}^\prime}_\infty is the line of infinity constructed via the ideal points. Using this line, we need to find a projective matrix which converts it into the canonical line at infinity (0,0,1)T(0,0,1)^T.

H1=[100010l1l2l3]H_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ l_1 & l_2 & l_3 \end{bmatrix}

where (l1,l2,l3)T=l(l_1,l_2,l_3)^T = {\textbf{l}^\prime}_\infty. It is verifiable that H1-Tl=l{H_1}^\text{-T}{\textbf{l}^\prime}_\infty = \textbf{l}_\infty.

If we now apply the same transformation matrix to all points on the image, the image will be rectified to actually show parallel edges as parallel.

The above discussion introduces the basic matrices and invariant properties which we look for while processing images for geometric scene understanding. A more practical approach to the rectification problem above and more complex problems like 2D Homography and Stereo Matching will be discussed in another post.

Footnotes

  1. Harltey, A., & Zisserman, A. (2003). Multiple view geometry in computer vision (2. ed.).