GEOMETRY

# Projective Transformations

## Overview

Perspective projection is a particular type of projection where all the rays of the projection pass through a single point. This puts constrains on the form of the matrix $$\mathbf{P}$$.

A perspective projection has the form:

$$\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \mathbf{P} \begin{bmatrix}X\\Y\\Z\\S\end{bmatrix}$$

where:
$$(x_1,x_2,x_3)^\top$$ are the homogeneous coordinates of a point in the image plane $$\mathbf{P}$$
$$\mathbf{P}$$ is a 3x4 matrix
$$(X, Y, Z, S)^\top$$ are the homogeneous coordinates of a point in the world

## Plane To Image Projectivity

$$\rho \begin{bmatrix} q_1 \\ q_2 \\ 1 \end{bmatrix} = \mathbf{T} \begin{bmatrix} p_1 \\ p_2 \\ 1 \end{bmatrix} = \begin{bmatrix} A&B&C \\ D&E&F \\ G&H&1 \end{bmatrix} \begin{bmatrix} p_1 \\ p_2 \\ 1 \end{bmatrix}$$

where:
$$\rho$$ is the scale factor

This gives two linear equations with 8 unknowns:

$$Ap_1 + Bp_2 + C - Gp_1q_1 - Hp_2q_1 = q_1$$
$$Dp_1 + Ep_2 + F - Gp_1q_2 - Hp_2q_2 = q_2$$

Four pairs of points $$\hat{p} = (p_x, p_y), \hat{q} = (q_x, q_y)$$ (four points from each image which are paired together) give eight linear equations and then $$A, B, ..., H$$ can be solved. One condition is that no three of the four points can be collinear (i.e. lie on the same line).

The above projection algorithm can be used to perform “quad-to-quad” projection between 2 2D spaces (or two 2D coordinate systems). A quadrilateral (four sided polygon) is defined both in the input space and the output space. It is guaranteed that there is exactly one transformation that will map points from the input space to the output space as defined by the quadrilaterals.

Quadrilateral restrictions: No three of the four points in the quadrilateral can be collinear (i.e. lie on the same line). That is the same as saying that the quadrilateral must have four distinct edges.

The order in which you define the four vertices is not important. However, I define them in counter-clockwise order as this seem to be a common convention in industry.

We want to find the matrix $$\mathbf{T}$$ that projects an input point $$\hat{p}$$ to an output point $$\hat{q}$$, such that:

$$\hat{q} = \mathbf{T}\hat{p}$$

We can find the individual numbers $$A, B, ... , H$$ that make up $$\mathbf{T}$$ by forming some linear equations. For each point $$\hat{p} = (p_x, p_y)$$ that maps to point $$\hat{q} = (q_x, q_y)$$ we can form two linear equations:

$$Ap_x + Bp_y + C - Gp_xq_x - Hp_yq_x = q_x$$
$$Dp_x + Ep_y + F - Gp_xq_y - Hp_yq_y = q_y$$

The four corners of the input quad map to the four corners of the output quad, so this gives us 8 equations with 8 unknowns. To solve these linear equations, we can use matrices. If we pull all of the coefficients of $$A, B, ..., H$$ into a matrix $$\mathbf{A}$$, we can write the equation in the form $$\mathbf{A}\hat{x} = \mathbf{B}$$:

$$\small{\begin{bmatrix} p_{1,x} & p_{1,y} & 1 & 0 & 0 & 0 & -p_{1,x}q_{1,x} & -p_{1,y}q_{1,x} \\ 0 & 0 & 0 & p_{1,x} & p_{1,y} & 1 & -p_{1,x}q_{1,y} & -p_{1,y}q_{1,y} \\ p_{2,x} & p_{2,y} & 1 & 0 & 0 & 0 & -p_{2,x}q_{2,x} & -p_{2,y}q_{2,x} \\ 0 & 0 & 0 & p_{2,x} & p_{2,y} & 1 & -p_{2,x}q_{2,y} & -p_{2,y}q_{2,y} \\ p_{3,x} & p_{3,y} & 1 & 0 & 0 & 0 & -p_{3,x}q_{3,x} & -p_{3,y}q_{3,x} \\ 0 & 0 & 0 & p_{3,x} & p_{3,y} & 1 & -p_{3,x}q_{3,y} & -p_{3,y}q_{3,y} \\ p_{4,x} & p_{4,y} & 1 & 0 & 0 & 0 & -p_{4,x}q_{4,x} & -p_{4,y}q_{4,x} \\ 0 & 0 & 0 & p_{4,x} & p_{4,y} & 1 & -p_{4,x}q_{4,y} & -p_{4,y}q_{4,y} \\ \end{bmatrix} \begin{bmatrix} A\\B\\C\\D\\E\\F\\G\\H \end{bmatrix} = \begin{bmatrix} q_{1,x}\\q_{1,y}\\q_{2,x}\\q_{2,y}\\q_{3,x}\\q_{3,y}\\q_{4,x}\\q_{4,y}\end{bmatrix}}$$

This can then be re-arranged to solve for $$\hat{x}$$ (which is our vector of coefficients $$A, B, ... H$$ which we will eventually put back into the transformation matrix $$\mathbf{T}$$).

$$\hat{x} = A^{-1}B$$

Once $$\hat{x}$$ has been found, $$\mathbf{T}$$ can be made from the values in $$\hat{x}$$. You can then convert points from your input coordinate space to the output coordinate space using:

$$\hat{q} = \mathbf{T}\hat{p}$$

Worked Example

For example, take the square (which is a simple quadrilateral) defined by (0, 0), (1, 0), (1, 1), (0, 1) (the blue square below) and a second quadrilateral defined by (1, 2), (1, 4), (3, 4), (3, 2) (the red square below). Find the transformation matrix $$T$$ which maps points from $$P$$ to $$Q$$.

We can then pair the points together, i.e. $$p1 = (0, 0)$$ maps to $$q1 = (1, 2)$$.

$$\small{\begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & 1 & 0 & -4 \\ 1 & 1 & 1 & 0 & 0 & 0 & -3 & -3 \\ 0 & 0 & 0 & 1 & 1 & 1 & -4 & -4 \\ 1 & 0 & 1 & 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & -2 & 0 \\ \end{bmatrix} \begin{bmatrix} A\\B\\C\\D\\E\\F\\G\\H \end{bmatrix} = \begin{bmatrix} 1\\2\\1\\4\\3\\4\\3\\2 \end{bmatrix}}$$

Solving for vector $$\hat{x}$$ gives:

$$\hat{x} = \begin{bmatrix} 2\\0\\1\\0\\2\\2\\0\\0 \end{bmatrix}$$

These are our values $$A, B, ..., H$$ that we need to build the transformation matrix $$\mathbf{T}$$!

$$\mathbf{T} = \begin{bmatrix} 2&0&1\\0&2&2\\0&0&1 \end{bmatrix}$$

Note that the last element in $$\mathbf{T}$$ is always 1! We can now transform any point from out input space to our output space using:

$$\hat{q} = \mathbf{T}\hat{p}$$
$$\begin{bmatrix}q_x\\q_y\\1\end{bmatrix} = \begin{bmatrix} 2&0&1\\0&2&2\\0&0&1 \end{bmatrix} \begin{bmatrix}p_x\\p_y\\1\end{bmatrix}$$

Let’s select point $$\hat{p} = (0.5, 0.5)$$. This is in the middle of our input polygon, so we should expect it to be transformed to the middle of our output polygon, at $$\hat{q} = (2, 3)$$.

$$\begin{bmatrix}q_x\\q_y\\1\end{bmatrix} = \begin{bmatrix} 2&0&1\\0&2&2\\0&0&1 \end{bmatrix} \begin{bmatrix}0.5\\0.5\\1\end{bmatrix}$$
$$q_x = 2*0.5 + 0*0.5 + 1*1 = 2 \\ q_y = 0*0.5 + 2*0.5 + 2*1 = 3$$

… which is what we expected!

Of course, Quad-to-Quad transformations do not have to use simple square, any four sided polygon can be used in the transformation, as shown in the below image:

Qt provides a QTranform::quadToQuad() method which can be used to create a transformation object that can then be applied to things such as images. Note how ever that that when transforming an image, “extra” translation is removed from the output so that the translated image is contained within the smallest number of pixels possible. Qt also specifies the structure of the transformation matrix slightly differently, with the order of each element being different from “standard” (it looks like it has been mirrored around the leading diagonal).
Python’s image library (PIL) provides a transform() function with can do perspective transformations, along with interpolation (this is what is used to transform the above “Hello” image).