Interactive Exploration of Homogeneous Coordinates and Rational Splines

by Dr.-Ing. Johannes S. Mueller-Roemer

Introduction

Homogeneous coordinates are a core concept in computer graphics, computer vision, and other fields such as robotics. Furthermore, they simplify the definition and understanding of rational splines.

The aim of this page is to help gain an intuitive understanding of this mathematical concept through the use of interactive 3D diagrams. All diagrams can be rotated freely in three dimensions. Additionally, all points can be edited by clicking the corresponding node and dragging the arrow manipulators or by editing the entries in the homogeneous coordinate tables below each diagram.

Homogeneous Coordinates

Homogeneous (or projective) coordinates map each \(d\)-dimensional Cartesian coordinate \(\mathbf{p}\) to a set of \((d+1)\)-dimensional vectors \(\mathbf{p}_H\) in projective space:

\[\mathbf{p} \in \mathbb{R}^d \equiv \begin{pmatrix}w\mathbf{p}\\w\end{pmatrix} \in \mathbb{R}^{d + 1}\ \forall\ w \in \mathbb{R} \setminus \{0\} \]

Therefore, every point in Cartesian space maps to a line in projective space. When mapping from a Cartesian coordinate to a homogeneous coordinate, \(w\) is typically chosen to be 1. However, any value of \(w\) — except 0 — can be used. The inverse mapping corresponds to a simple division.

\[\mathbf{p} = \begin{pmatrix}p_1\\\vdots\\p_d\end{pmatrix} \rightarrow \begin{pmatrix}p_1\\\vdots\\p_d\\1\end{pmatrix} \qquad \mathbf{p}_H = \begin{pmatrix}p_{H,1}\\\vdots\\p_{H,d}\\w\end{pmatrix} \rightarrow \frac{1}{w} \begin{pmatrix}p_{H,1}\\\vdots\\p_{H,d}\end{pmatrix} \]
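The two mappings above can be sketched in a few lines of Python. This is only a minimal illustration: the function names `to_homogeneous` and `to_cartesian` are made up for this example, and points are assumed to be plain lists of numbers.

```python
def to_homogeneous(p, w=1.0):
    """Map a Cartesian point to one of its homogeneous representatives (w*p, w)."""
    if w == 0:
        raise ValueError("w must be non-zero")
    return [w * c for c in p] + [w]


def to_cartesian(p_h):
    """Map a homogeneous coordinate back to Cartesian space by dividing by w."""
    *head, w = p_h
    if w == 0:
        raise ValueError("cannot divide by w = 0")
    return [c / w for c in head]


print(to_homogeneous([3, 2], w=2))  # [6, 4, 2]
print(to_cartesian([6, 4, 2]))      # [3.0, 2.0]
```

Any other non-zero `w` gives a different homogeneous vector that projects back to the same Cartesian point, which is exactly the line of equivalent coordinates described above.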

Usually, four-dimensional homogeneous coordinates are used to represent points in 3D space. However, four-dimensional coordinates are notoriously hard to grasp and visualize. Therefore, let us have a look at three-dimensional homogeneous coordinates representing points in the 2D plane. Figure 1 is a simple example showing a single homogeneous coordinate and the corresponding projected Cartesian coordinate. Try…

…rotating the diagram
…editing the point in 3D
…editing the point using the provided text fields

Also consider the following: if every point in Cartesian coordinates corresponds to a line of homogeneous coordinates with \(w \neq 0\), what could a homogeneous coordinate with \(w = 0\) (but \(\|(p_{H,1}, \dots, p_{H,d})^T\|_2 > 0\)) signify?

Figure 1: A single point with initial homogeneous coordinates \(\mathbf{p}_H = (6, 4, 2)^T\), corresponding to the point \(\mathbf{p} = (3, 2)^T\) in Cartesian space. Try rotating the diagram and clicking the point to edit it. Both the two-dimensional Cartesian point and the three-dimensional homogeneous coordinate are shown, as well as the line of equivalent homogeneous coordinates.

Now that we have a grasp of how homogeneous coordinates work, let us move on to the why. Why should you introduce more complexity by adding an additional dimension, just to divide the additional coordinate away? One central reason we use homogeneous coordinates in computer graphics is that affine transformations become just as easily representable and composable as linear transformations in Cartesian coordinates. In 2D Cartesian coordinates, any linear transformation (rotation and scaling around the origin, as well as shears and reflections) can be represented using a 2×2 matrix \(\mathbf{A}\):

\[\mathbf{\hat{p}} = \mathbf{A}\mathbf{p}\]

Furthermore, any series of linear transformations \(\mathbf{A}_1, \ldots, \mathbf{A}_n\) can simply be composed by multiplying the corresponding matrices:

\[\mathbf{\hat{p}} = \mathbf{A}_n\ldots\mathbf{A}_1\mathbf{p}\]

However, when working with 2D — or 3D — points, affine transformations such as translations by a vector \(\mathbf{b}\), rotation/scaling around points other than the origin, etc. are frequently required. While these can be represented as

\[\mathbf{\hat{p}} = \mathbf{A}\mathbf{p} + \mathbf{b},\]

affine transformations are more difficult to compose and you have to keep track of two separate pieces of data — the matrix \(\mathbf{A}\) and the vector \(\mathbf{b}\).

In homogeneous coordinates, any affine transformation can be represented as

\[\mathbf{\hat{p}}_H = \mathbf{A}_H\mathbf{p}_H,\ \text{where}\ \mathbf{A}_H = \begin{pmatrix}\mathbf{A} & \mathbf{b}\\\mathbf{0}^T & 1\end{pmatrix}.\]

As affine transformations in Cartesian coordinates are simple linear transformations in homogeneous coordinates, they are both easier to store and easier to compose. When \(\mathbf{A}_H\) is set up as above and \(\mathbf{p}_H\) is set up with \(w = 1\), you could even omit the division when converting back to Cartesian coordinates.
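As a small worked example of this composability, the following sketch builds a rotation by 90° around the point \((1, 1)^T\) from three homogeneous matrices: translate the pivot to the origin, rotate, and translate back. The helper names (`affine_h`, `matmul`, `apply`) and the chosen pivot and test point are assumptions made for illustration only.

```python
def affine_h(A, b):
    """Build the 3x3 homogeneous matrix [[A, b], [0^T, 1]] from a 2x2 A and a 2-vector b."""
    return [[A[0][0], A[0][1], b[0]],
            [A[1][0], A[1][1], b[1]],
            [0.0, 0.0, 1.0]]


def matmul(X, Y):
    """Product of two 3x3 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(M, p_h):
    """Apply a 3x3 matrix to a homogeneous 3-vector."""
    return [sum(M[i][k] * p_h[k] for k in range(3)) for i in range(3)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ROT90 = [[0.0, -1.0], [1.0, 0.0]]  # 90 degree counter-clockwise rotation

# Translate the pivot to the origin, rotate, translate back.
# The rightmost matrix is applied first.
M = matmul(affine_h(IDENTITY, [1.0, 1.0]),
           matmul(affine_h(ROT90, [0.0, 0.0]),
                  affine_h(IDENTITY, [-1.0, -1.0])))

print(apply(M, [3.0, 2.0, 1.0]))  # [0.0, 3.0, 1.0]
print(apply(M, [6.0, 4.0, 2.0]))  # [0.0, 6.0, 2.0], the same Cartesian point (0, 3)
```

The whole chain collapses into the single matrix `M`, and because its last row is \((0, 0, 1)\), the \(w\) of the input is carried through unchanged.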

Figure 2 allows you to experiment with transformations in homogeneous coordinates. Try setting up various translations, scalings, and 90° rotations. What kind of three-dimensional linear transform does a pure 2D translation

\[\mathbf{A}_H = \begin{pmatrix}1 & 0 & x\\0 & 1 & y\\0 & 0 & 1\end{pmatrix}\]

correspond to?

You can also try setting

\[\mathbf{A}_H = \begin{pmatrix}1 & 0 & 0\\0 & 0 & 0\\0 & 1 & 0\end{pmatrix}.\]

Is the resulting transformation affine? Look at the transformed point in the XY-view while moving around the node, especially when moving it along X at different values of Y. How would you describe what is happening? Can you think of anything similar that could be useful in 3D graphics?

Figure 2: A single point with initial homogeneous coordinates \(\mathbf{p}_H = (6, 4, 2)^T\), corresponding to the point \(\mathbf{p} = (3, 2)^T\) in Cartesian space, as in Figure 1. Additionally, you can edit the transformation matrix \(\mathbf{A}_H\) applied to the homogeneous coordinate, which is initialized to a translation of \((1, 1)^T\). The transformed homogeneous coordinate (and the corresponding projected coordinate and projection line) are shown transparently.

At this point, you probably have a good grasp of what homogeneous coordinates are, why they are useful, and how to use them — from an intuitive point of view and not only a theoretical one.

Rational Splines

Another area of computer graphics in which homogeneous coordinates are very useful is splines. Splines are mathematical functions that emulate physical splines (mostly called flexible curves today): thin strips of flexible wood (or other materials) that are fixed at specific points and create a smooth curve in between. Splines are designed to have local support, i.e. the effect of a single control point is limited to a small part of the entire function. This is usually achieved using piecewise polynomial parametric curves.

One of the most popular splines in computer graphics is the B-spline. Given a knot sequence \(t_i\), i.e. the values of the parameter \(u\) where the polynomial pieces meet, and a set of control points \(\mathbf{p}_i\), the B-spline basis functions can be recursively defined as

\[B_{i,1}(u) \coloneqq \left\{ \begin{matrix}1 & \text{if} & t_i \leq u < t_{i + 1} \\0 & & \text{otherwise}\end{matrix}\right.\]

\[B_{i,k+1}(u) \coloneqq \omega_{i,k}(u) B_{i,k}(u) + (1 - \omega_{i+1,k}(u))B_{i+1,k}(u),\]

where

\[\omega_{i,k}(u) \coloneqq \left\{ \begin{matrix}\frac{u - t_i}{t_{i+k} - t_i} & \text{if} & t_{i+k} \neq t_i \\ 0 & & \text{otherwise} \end{matrix}\right.. \]

Then, the order-\(n\) B-spline is evaluated as a simple weighted sum

\[S_{n}(u) = \sum_i \mathbf{p}_i B_{i,n}(u).\]
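The recursion and the weighted sum translate almost literally into code. The following is a minimal, unoptimized sketch rather than a production implementation: it assumes the order-1 basis uses the half-open interval \(t_i \leq u < t_{i+1}\) (so that exactly one piece is active at each knot), zero-based indices, and a made-up set of six control points that are not the ones used in the figures.

```python
def omega(i, k, u, t):
    """omega_{i,k}(u) as defined above; 0 when the knot span is empty."""
    denom = t[i + k] - t[i]
    return (u - t[i]) / denom if denom != 0 else 0.0


def basis(i, order, u, t):
    """B_{i,order}(u), evaluated with the recursive definition above."""
    if order == 1:
        return 1.0 if t[i] <= u < t[i + 1] else 0.0
    k = order - 1
    return (omega(i, k, u, t) * basis(i, k, u, t)
            + (1.0 - omega(i + 1, k, u, t)) * basis(i + 1, k, u, t))


def bspline(points, order, u, t):
    """S_order(u): sum of the control points weighted by their basis functions."""
    dim = len(points[0])
    return [sum(p[d] * basis(i, order, u, t) for i, p in enumerate(points))
            for d in range(dim)]


# Hypothetical example: 6 control points, order 3, uniform knots 0..8.
points = [(0, 0), (1, 2), (2, 0), (3, 2), (4, 0), (5, 2)]
knots = list(range(len(points) + 3))  # order-n splines need len(points) + n knots

print([basis(i, 3, 2.5, knots) for i in range(6)])  # [0.125, 0.75, 0.125, 0.0, 0.0, 0.0]
print(bspline(points, 3, 2.5, knots))               # [1.0, 1.5]
```

Only three of the six basis functions are non-zero at \(u = 2.5\), which is the local support mentioned earlier. With these knots, the basis functions sum to 1 only for \(2 \leq u < 6\), so that is the usable parameter range of this example.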

Figure 3 shows a third-order B-spline with 6 control points. The spline shown is a cardinal — or uniform — B-spline, i.e. the knots are evenly spaced. Try moving the control points and observe which parts of the spline are affected.

Figure 3: Third-order cardinal B-spline with 6 control points. Camera navigation is locked in this figure and \(w\) is fixed at 1. The \(x\) and \(y\) coordinates of each control point can be edited as usual. In addition to the smooth spline, dashed straight lines connect consecutive control points.

While B-splines are already quite useful, many shapes, such as circles, cannot be represented exactly using B-splines. Also, every control point has equal importance, so explicitly creating a smoother or less smooth part of the spline is not possible without adding control points and/or editing the knots. By evaluating the B-spline with \((d + 1)\)-dimensional homogeneous coordinates and then projecting the result into the Cartesian coordinate space, you get a rational B-spline.

As you can try out for yourself in Figure 4, this has the effect of turning the \(w\)-coordinate into a weight. Control points with \(w > 1\) have a larger effect on the resulting curve than those with \(w < 1\). As good a reason as any to call our vector components \(x\), \(y\), \(z\), and \(w\) — in that order — despite everything we learned in our ABCs.
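To see the weighting effect numerically, here is a minimal sketch that evaluates a B-spline on homogeneous control points and divides by \(w\) afterwards. It makes several assumptions for illustration: a half-open interval in the order-1 basis, zero-based indices, uniform knots, and made-up control points and weights that do not match the figures.

```python
def basis(i, order, u, t):
    """B_{i,order}(u) via the recursive definition (half-open order-1 intervals)."""
    if order == 1:
        return 1.0 if t[i] <= u < t[i + 1] else 0.0
    k = order - 1

    def omega(j):
        denom = t[j + k] - t[j]
        return (u - t[j]) / denom if denom != 0 else 0.0

    return omega(i) * basis(i, k, u, t) + (1.0 - omega(i + 1)) * basis(i + 1, k, u, t)


def rational_bspline(points_h, order, u, t):
    """Evaluate the B-spline in homogeneous coordinates, then project by dividing by w."""
    dim = len(points_h[0])
    s_h = [sum(p[d] * basis(i, order, u, t) for i, p in enumerate(points_h))
           for d in range(dim)]
    return [c / s_h[-1] for c in s_h[:-1]]


def with_weights(points, weights):
    """Lift Cartesian control points (x, y) to homogeneous (w*x, w*y, w)."""
    return [(w * x, w * y, w) for (x, y), w in zip(points, weights)]


# Hypothetical control points; only the weight of the third point differs.
points = [(0, 0), (1, 2), (2, 0), (3, 2), (4, 0), (5, 2)]
knots = list(range(len(points) + 3))

plain = rational_bspline(with_weights(points, [1, 1, 1, 1, 1, 1]), 3, 2.5, knots)
pulled = rational_bspline(with_weights(points, [1, 1, 4, 1, 1, 1]), 3, 2.5, knots)

print(plain)   # [1.0, 1.5], identical to the non-rational B-spline
print(pulled)  # roughly [1.27, 1.09], pulled towards the heavier point (2, 0)
```

With all weights equal to 1, the division has no effect and the curve matches the ordinary B-spline. Raising a single weight pulls the curve towards that control point without moving any control point in Cartesian space.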

Figure 4: Third-order uniform rational B-spline with 6 control points. While the initial Cartesian positions of the control points match those in Figure 3, \(\mathbf{p}_2\) and \(\mathbf{p}_4\) have an initial \(w > 1\). If you rotate the diagram out of the XY-view, you will see the projection surface resulting from the projection of the three-dimensional B-spline to the two-dimensional rational B-spline.

I hope this article and especially the interactive diagrams therein have been useful to you in learning about homogeneous coordinates and rational splines. Should you have any questions or suggestions for improvements, feel free to contact me.