\(
\newcommand{\BE}{\begin{equation}}
\newcommand{\EE}{\end{equation}}
\newcommand{\BA}{\begin{eqnarray}}
\newcommand{\EA}{\end{eqnarray}}
\newcommand\CC{\mathbb{C}}
\newcommand\FF{\mathbb{F}}
\newcommand\NN{\mathbb{N}}
\newcommand\QQ{\mathbb{Q}}
\newcommand\RR{\mathbb{R}}
\newcommand\ZZ{\mathbb{Z}}
\newcommand{\va}{\hat{\mathbf{a}}}
\newcommand{\vb}{\hat{\mathbf{b}}}
\newcommand{\vn}{\hat{\mathbf{n}}}
\newcommand{\vt}{\hat{\mathbf{t}}}
\newcommand{\bx}{\mathbf{x}}
\newcommand{\bv}{\mathbf{v}}
\newcommand{\bg}{\mathbf{g}}
\newcommand{\bn}{\mathbf{n}}
\newcommand{\by}{\mathbf{y}}
\)
Rotation with Quaternions
Overview
Unit quaternions as a rotation formalism can be most intuitively constructed from the principal rotation vector (PRV) discussed in the previous section of this multi-page article on rotation formalisms. However, before we do so, we offer a very basic introduction to quaternions in general from an algebraic perspective and then try to build some intuition for why conjugation with unit quaternions can be interpreted as a computationally efficient rotation in three-dimensional space. The unit quaternion, written in scalar-vector notation, is sometimes also referred to as the Euler parameters (see, for instance, this page on the Wolfram MathWorld website).
Definition and Basic Algebraic Properties
Quaternions are an extension of complex numbers. They are obtained by introducing additional “imaginary” numbers beyond the \(i=\sqrt{-1}\) of the complex numbers. These additional “imaginary” numbers are denoted by \(\mathbf{j}\) and \(\mathbf{k}\) and, together with \(\mathbf{i}\), are called the basic quaternions (boldface is typically used). A general quaternion is then represented as
\begin{equation}
q = q_0 + q_1\mathbf{i} + q_2\mathbf{j} + q_3\mathbf{k}
\end{equation}
where \(q_0, q_1, q_2\), and \(q_3\) are real numbers. \(q_0\) is called the scalar part of the quaternion and \(\mathbf{q}:=(q_1, q_2, q_3)\) is called the vector part. (While a quaternion can be viewed as a vector in four dimensional space, it is often just viewed as a vector in three-dimensional space, using its vector part.)
The explanations below are rather technical. The speedy reader not interested in the algebraic properties of quaternions can skip the rest of this section and fast forward to the section where relationships to other rotation formalisms are discussed. For our applications, we will create quaternions from the PRV and convert them into a DCM if we need to rotate a vector. However, while sound from an application perspective, by taking this shortcut, the reader will bypass even the most basic definition of quaternions.
Basic Relations and Operations
The basic quaternions obey the relations
\begin{equation}
\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{ijk} = -1,
\end{equation}
an equation that William Rowan Hamilton carved into a stone of Brougham Bridge when he came up with it.
There are some immediate consequences from the above relations:
\begin{eqnarray}
\mathbf{ij}&=&\mathbf{k}\\
\mathbf{ji}&=&-\mathbf{k}\\
\mathbf{jk}&=&\mathbf{i}\\
\mathbf{kj}&=&-\mathbf{i}\\
\mathbf{ki}&=&\mathbf{j}\\
\mathbf{ik}&=&-\mathbf{j}
\end{eqnarray}
It is obvious from the above that quaternion multiplication is noncommutative. It is the only property that prevents quaternions from being a field, in the sense of group theory and abstract algebra (real and complex numbers are fields). Quaternions form a four-dimensional associative normed division algebra.
Intermezzo 1: Beyond Quaternions
One can extend complex numbers even further than quaternions, but at each step one gives up something. Octonions are the next extension, but one loses associativity. Beyond octonions are the sedenions, which contain zero divisors and cannot be normed; they find application in machine learning as part of neural networks. Octonions and sedenions—even their basic introduction—are beyond this article.
As a side note, one already loses something when introducing complex numbers as an extension of real numbers (while gaining other benefits): real numbers can be ordered, while complex numbers cannot.
Intermezzo 2: Applications of Quaternions
Quaternions rose to importance in mathematics after Hamilton’s discovery, but were eventually replaced in many applications by vector analysis, which is conceptually easier to understand and to write down. They found a resurgence, however, in applications involving spatial rotations, including computer graphics and attitude control, which is why we discuss them here. Unlike Euler angles, they do not suffer from coordinate singularities (gimbal lock), though they are a bit more difficult to visualize.
Indeed, when Hamilton discovered quaternions, he was looking for a way to describe points in three dimensional space similarly to how complex numbers can be used to describe points in two dimensional space (with complex numbers being interpreted as points in the complex plane – we have seen, for instance, in our complex analysis primer, how powerful the complex number formalism can be even in physics in describing incompressible, irrotational two-dimensional flows).
Discussing the algebraic properties of quaternions in depth goes beyond this brief introductory article; we have only touched on the very basics. We encourage the reader to consult the Wikipedia article on quaternions and references therein. What we will do next is to study how quaternions can be interpreted as rotations, which is far from obvious from the above. To this end, we will introduce the unit quaternion, which is used to describe spatial rotations, and we shall do so based on the PRV, which we have discussed earlier.
Addition and Multiplication of Quaternions
From the above relations, the definitions for quaternion addition and multiplication follow (to be performed similarly as we have done for complex numbers). Addition happens component wise for the four quaternion components. For the multiplication, one must evaluate the products of the basic quaternions according to their fundamental computation rules. For two quaternions \(p=(p_0, p_1, p_2, p_3)\) and \(q=(q_0, q_1, q_2, q_3)\), written as four-dimensional vectors over the real numbers, this results in:
\begin{eqnarray}
p+q &=&
\begin{pmatrix}
p_0+q_0\\
p_1+q_1\\
p_2+q_2\\
p_3+q_3
\end{pmatrix}\\
pq &=&
\begin{pmatrix}
p_0q_0-p_1q_1-p_2q_2-p_3q_3\\
p_1q_0+p_0q_1-p_3q_2+p_2q_3\\
p_2q_0+p_3q_1+p_0q_2-p_1q_3\\
p_3q_0-p_2q_1+p_1q_2+p_0q_3
\end{pmatrix}
\end{eqnarray}
Note that the addition operation is commutative, while the multiplication is not, i.e. in general \(p+q=q+p\) but \(pq\not=qp\).
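As a minimal sketch of these two operations, the following Python snippet stores a quaternion as a plain tuple `(q0, q1, q2, q3)` and multiplies out the basic quaternions according to their defining relations. The function names `qadd` and `qmul` are illustrative choices, not part of any library.

```python
def qadd(p, q):
    """Component-wise sum of two quaternions given as (scalar, i, j, k) tuples."""
    return tuple(a + b for a, b in zip(p, q))


def qmul(p, q):
    """Hamilton product pq, expanded using i^2 = j^2 = k^2 = ijk = -1."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


# The basic quaternions as four-tuples.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# The defining relations hold.
assert qmul(I, I) == qmul(J, J) == qmul(K, K) == MINUS_ONE
assert qmul(qmul(I, J), K) == MINUS_ONE

# Multiplication is not commutative: ij = k but ji = -k.
print(qmul(I, J))  # (0, 0, 0, 1)
print(qmul(J, I))  # (0, 0, 0, -1)
```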
It is easy to see from the above that quaternion multiplication can be written as a matrix multiplication, if we use the following construction:
\begin{equation}
pq=
\begin{pmatrix}
p_0 & -p_1 & -p_2 & -p_3 \\
p_1 & p_0 & -p_3 & p_2 \\
p_2 & p_3 & p_0 & -p_1 \\
p_3 & -p_2 & p_1 & p_0
\end{pmatrix}
\begin{pmatrix}
q_0\\
q_1\\
q_2\\
q_3
\end{pmatrix}
\end{equation}
The result is a four-tuple of numbers representing the quaternion \(pq\). (Below we will encounter a similarly looking matrix formalism which will represent quaternions always as real \(4\times4\) matrices.)
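The matrix form can be checked numerically. The sketch below builds the \(4\times4\) matrix from \(p\), applies it to the four-tuple \(q\), and compares the result with the component formula; `left_matrix` and `matvec` are hypothetical helper names chosen for this illustration.

```python
def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def left_matrix(p):
    """4x4 real matrix that performs left-multiplication by p."""
    p0, p1, p2, p3 = p
    return [
        [p0, -p1, -p2, -p3],
        [p1, p0, -p3, p2],
        [p2, p3, p0, -p1],
        [p3, -p2, p1, p0],
    ]


def matvec(m, v):
    """Plain matrix-vector product for nested lists."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in m)


p = (1, 2, 3, 4)
q = (5, 6, 7, 8)
assert matvec(left_matrix(p), q) == qmul(p, q)
print(matvec(left_matrix(p), q))  # (-60, 12, 30, 24)
```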
It turns out that multiplication of two quaternions \(p\) and \(q\) can also be written with the help of the dot (\(\ \cdot\ \)) and cross (\(\ \times\ \)) product between the vector parts \(\mathbf{p}=(p_1, p_2, p_3)\) and \(\mathbf{q}=(q_1, q_2, q_3)\) of the two quaternions \(p\) and \(q\):
\begin{equation}
pq = p_0q_0 - \mathbf{p} \cdot \mathbf{q} + p_0\mathbf{q} + q_0\mathbf{p} + \mathbf{p} \times \mathbf{q}.
\end{equation}
These are all different ways to achieve the same multiplication result.
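The scalar-vector form translates almost literally into code. The following sketch (with the illustrative name `qmul_sv`) splits each quaternion into its scalar and vector part and assembles the product from a dot product and a cross product.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def qmul_sv(p, q):
    """Quaternion product via scalar part and vector part."""
    p0, pv = p[0], p[1:]
    q0, qv = q[0], q[1:]
    scalar = p0 * q0 - dot(pv, qv)
    c = cross(pv, qv)
    vector = tuple(p0 * qi + q0 * pi + ci for pi, qi, ci in zip(pv, qv, c))
    return (scalar,) + vector


print(qmul_sv((1, 2, 3, 4), (5, 6, 7, 8)))  # (-60, 12, 30, 24)
```

For two pure quaternions (zero scalar part) the product reduces to \(-\mathbf{p}\cdot\mathbf{q}\) in the scalar part and \(\mathbf{p}\times\mathbf{q}\) in the vector part, which is one way to see where the non-commutativity comes from.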
Conjugate and Norm
The unit quaternion (also called a versor) which describes a rotation in three-dimensional space, as we shall see shortly, is a quaternion with norm \(||q||=1\). For this to make sense, we need to introduce a norm on quaternions first. The norm is defined as:
\begin{equation}
||q|| = \sqrt{q\overline{q}} = \sqrt{q_0^2+q_1^2+q_2^2+q_3^2},
\end{equation}
where the overlined quantity \(\overline{q}\) is the conjugate quaternion:
\begin{equation}
\overline{q} = q_0 - q_1\mathbf{i} - q_2\mathbf{j} - q_3\mathbf{k}.
\end{equation}
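A short sketch of conjugate and norm in Python follows. The helper `normalize`, which rescales a non-zero quaternion to unit norm, is an addition for illustration and is not discussed in the text.

```python
import math


def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def conj(q):
    """Conjugate: flip the sign of the vector part."""
    return (q[0], -q[1], -q[2], -q[3])


def norm(q):
    """Euclidean norm of the four components."""
    return math.sqrt(sum(c * c for c in q))


def normalize(q):
    """Rescale a non-zero quaternion to unit norm (illustrative helper)."""
    n = norm(q)
    return tuple(c / n for c in q)


q = (1.0, 2.0, 3.0, 4.0)
# q times its conjugate is real and equals the squared norm.
print(qmul(q, conj(q)))  # (30.0, 0.0, 0.0, 0.0)
print(norm(normalize(q)))  # approximately 1.0
```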
Matrix Representation of Quaternions
Similarly to how we have arrived at a matrix representation of complex numbers in our complex analysis primer, and turned complex number addition and multiplication into standard matrix addition and multiplication of real \(2\times2\) matrices of a special form, one can do the same for quaternions. Quaternions can be represented either as complex \(2\times2\) matrices or as real \(4\times4\) matrices, where the latter representation is not unique.
As a complex \(2\times2\) matrix, we have for a quaternion \(q=q_0+ q_1\mathbf{i}+q_2\mathbf{j}+q_3\mathbf{k}\):
\begin{equation}
\begin{pmatrix}
q_0 + iq_1 & q_2 + iq_3 \\
-q_2+iq_3 & q_0 - iq_1
\end{pmatrix}
\end{equation}
where the \(i\) of complex numbers here is distinct from the \(\mathbf{i}\) of the quaternions.
Alternatively, we can also represent quaternion \(q=(q_0, q_1, q_2, q_3)\) as a real \(4\times4\) matrix:
\begin{equation}
\begin{pmatrix}
q_0 & -q_1 & -q_2 & -q_3 \\
q_1 & q_0 & -q_3 & q_2 \\
q_2 & q_3 & q_0 & -q_1 \\
q_3 & -q_2 & q_1 & q_0
\end{pmatrix}
\end{equation}
(The quaternion can also be recovered from such a \(4\times4\) matrix by comparing the entries.) The four basic quaternions 1, \(\mathbf{i}\), \(\mathbf{j}\), and \(\mathbf{k}\) then take the form
\begin{eqnarray}
1 &=&
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}\\
\mathbf{i} &=&
\begin{pmatrix}
0 & -1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 \\
0 & 0 & 1 & 0
\end{pmatrix}\\
\mathbf{j} &=&
\begin{pmatrix}
0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0
\end{pmatrix}\\
\mathbf{k} &=&
\begin{pmatrix}
0 & 0 & 0 & -1 \\
0 & 0 & -1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}
\end{eqnarray}
It is easy to verify that these matrices satisfy the defining equations for basic quaternions introduced earlier. This lets us perform all quaternion additions and multiplications as matrix operations between \(4\times4\) matrices with real entries. (It also gives us a hint that direct quaternion mathematics may be computationally more efficient than the part of vector/matrix mathematics it is capable of encoding.)
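The verification can also be delegated to a few lines of code. The sketch below builds the real \(4\times4\) matrix for each basic quaternion and checks the defining relations with ordinary matrix multiplication; `to_matrix` and `matmul` are illustrative helper names.

```python
def to_matrix(q):
    """Real 4x4 matrix representing the quaternion q = (q0, q1, q2, q3)."""
    q0, q1, q2, q3 = q
    return [
        [q0, -q1, -q2, -q3],
        [q1, q0, -q3, q2],
        [q2, q3, q0, -q1],
        [q3, -q2, q1, q0],
    ]


def matmul(a, b):
    """Plain matrix product for square nested lists."""
    n = len(a)
    return [
        [sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
        for r in range(n)
    ]


I = to_matrix((0, 1, 0, 0))
J = to_matrix((0, 0, 1, 0))
K = to_matrix((0, 0, 0, 1))
MINUS_ONE = to_matrix((-1, 0, 0, 0))

# i^2 = j^2 = k^2 = ijk = -1, now as matrix identities.
assert matmul(I, I) == matmul(J, J) == matmul(K, K) == MINUS_ONE
assert matmul(matmul(I, J), K) == MINUS_ONE
assert matmul(I, J) == K

# The first column of a product matrix holds the components of the product.
P = to_matrix((1, 2, 3, 4))
Q = to_matrix((5, 6, 7, 8))
print([row[0] for row in matmul(P, Q)])  # [-60, 12, 30, 24]
```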
The conjugate and the norm of a quaternion can be obtained from these matrices as well with standard matrix computation tools like transposition and determinant. Since we will not use any of this here further, we refer the reader to the Wikipedia article on quaternions for further information.
Rotations Using Quaternions
Unit quaternions, i.e. quaternions with norm \(||q||=1\), can be used to describe rotations in three-dimensional space. We shall first describe “A-style rotations” with them (i.e. the active rotation of a vector with respect to a fixed basis; see the section on the rotation matrix formalism on a separate page for an ad hoc definition of this term), and later comment on how to use them for “B-style rotations”, which rotate the basis underneath a vector perceived as fixed in space (basis transformation of the coordinate matrix of the vector to a different basis which is rotated compared to the original one).
Reminder: Rotations in 2D using Complex Numbers
Before proceeding with discussing how quaternions allow us to perform rotations in three-dimensional space, let us remind ourselves how complex numbers are used to perform rotations in two-dimensional space. The complex plane can be viewed as representing a two-dimensional Euclidean space.
Imagine that we have a point in this two-dimensional space given by the coordinate pair \((a,b)\). In order to rotate this point counterclockwise by angle \(\theta\), we can perform the following mathematical trick. We write \((a, b)\) as a complex number \(z=a+ib\), and simply multiply it by the complex number \(c:=\cos\theta +i \sin\theta\) (note that \(c\) defined this way has unit norm, because \(||c||^2=\cos^2\theta+\sin^2\theta=1\)). The resulting point in the complex plane, \(z_2\), with
\begin{equation}
z_2 = cz = (\cos\theta +i \sin\theta) (a+ib) = (a \cos\theta - b \sin\theta) + i (b \cos\theta + a \sin\theta)
\end{equation}
is the original point \(z\) rotated by an angle of \(\theta\) counterclockwise. We can again separate out the real and imaginary parts of \(z_2\) to view it as a coordinate pair \((a_2, b_2): = (a \cos\theta - b \sin\theta, b \cos\theta + a \sin\theta)\) in \(\RR^2\). We then also discover that the mathematical manipulation achieved by the complex multiplication with a unit complex number \(c\) of the above form is the same as accomplished by a 2D rotation matrix \(A\) applied to vector \((a, b)^T\), with
\begin{equation}
A=
\begin{pmatrix}
\cos\theta & -\sin\theta\\
\sin \theta & \cos\theta
\end{pmatrix},
\end{equation}
i.e. \((a_2, b_2)^T = A(a, b)^T\). The beauty of the above complex formalism is that we do not need to deal with vectors and matrices; we can just work with complex numbers. (Note that if we used a complex number \(c\) which does not have norm 1, then the norm of \(z\) would be stretched by the norm of \(c\) during the multiplication.)
More generally, multiplying two complex numbers adds their angles and multiplies their norms (which is why we picked a complex number \(c\) with norm \(||c||=1\) above, such that we only get a rotation with no stretching, leaving the norm of \(z\) unchanged by the multiplication with \(c\)).
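The 2D case can be tried directly with Python's built-in complex numbers. The following sketch compares the complex multiplication with the rotation matrix \(A\); the function names are illustrative.

```python
import math


def rotate2d_complex(a, b, theta):
    """Rotate (a, b) counterclockwise by theta via complex multiplication."""
    c = complex(math.cos(theta), math.sin(theta))  # unit complex number
    z2 = c * complex(a, b)
    return (z2.real, z2.imag)


def rotate2d_matrix(a, b, theta):
    """Same rotation written out with the 2D rotation matrix A."""
    return (
        a * math.cos(theta) - b * math.sin(theta),
        a * math.sin(theta) + b * math.cos(theta),
    )


a2, b2 = rotate2d_complex(3.0, 4.0, 0.7)
m2 = rotate2d_matrix(3.0, 4.0, 0.7)
print(abs(a2 - m2[0]) < 1e-12 and abs(b2 - m2[1]) < 1e-12)  # True
print(math.hypot(a2, b2))  # approximately 5.0, the norm is preserved
```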
Quaternions accomplish a similar rotation feat in three dimensions, except that they are a little more complicated than complex numbers (as we have just seen) and that we will have to use quaternion conjugation instead of simple multiplication to achieve the rotation (as we shall see next). Due to these complications, some of the elegance is lost; quaternion math is usually not viewed as simpler than vector/matrix calculations, and generally actually less intuitive. But quaternions have other computational benefits over using a rotation matrix or other rotation formalisms, so they find widespread use in attitude control, computer graphics, etc.
Rotations in 3D using Quaternions
Rotation using quaternions is accomplished as follows. For a rotation in three dimensions, it is not sufficient to just specify an angle. We also have to specify an axis. Let \(\Phi\) be our angle of rotation and let the axis be specified by the three-dimensional unit vector \(\mathbf{e}=(e_1, e_2, e_3)\). We construct the rotation quaternion \(q\) (it is the equivalent of the complex number \(c\) in the 2D case) by
\begin{equation}
q:= \cos \frac{\Phi}{2} + \sin\frac{\Phi}{2} (e_1\mathbf{i}+e_2\mathbf{j}+e_3\mathbf{k})
\end{equation}
or written more compactly as a scalar and vector part, \(q=(\cos(\Phi/2), \sin(\Phi/2)\mathbf{e})\). The reason why only half the angle appears will become clear shortly. Note that the \(q\) constructed this way has unit norm, which is why it is said that 3D rotations are accomplished by unit quaternions (just like 2D rotations are accomplished by unit 2D complex numbers).
Now, imagine we have a point (or vector) \(\mathbf{p}=(p_1, p_2, p_3)\) in space (equivalent to \((a,b)\) in the two dimensional case above). We can turn it into a quaternion with zero scalar part, i.e. \(p=(0, p_1, p_2, p_3)\). In order to obtain the rotation of \(\mathbf{p}\) by angle \(\Phi\) around axis \(\mathbf{e}\) we compute the conjugation of \(p\) with \(q\):
\begin{equation}
r = q p \overline{q}
\end{equation}
(This is akin to the multiplication \(z_2=cz\) obtained earlier in the 2 dimensional case with complex numbers, but a little different.) The physical, 3-dimensional part of the resulting quaternion \(r=(r_0,r_1,r_2,r_3)\) is recovered by just looking at the vector part \(\mathbf{r}=(r_1, r_2, r_3)\) (besides, \(r_0=0\)).
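A minimal sketch of this recipe in Python is shown below. It assumes that `axis` is already a unit vector and that angles are given in radians; the names `rotation_quaternion` and `rotate` are illustrative.

```python
import math


def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def conj(q):
    """Conjugate quaternion."""
    return (q[0], -q[1], -q[2], -q[3])


def rotation_quaternion(axis, angle):
    """Unit quaternion for a rotation by `angle` about the unit vector `axis`."""
    e1, e2, e3 = axis
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), e1 * s, e2 * s, e3 * s)


def rotate(v, axis, angle):
    """Active rotation of the 3-vector v: vector part of q p conj(q)."""
    q = rotation_quaternion(axis, angle)
    p = (0.0,) + tuple(v)  # embed v as a pure quaternion
    r = qmul(qmul(q, p), conj(q))
    return r[1:]  # the scalar part r[0] is zero up to rounding


# Rotating the x unit vector by 90 degrees about the z axis gives the y unit vector.
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
```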
Intuition for the Above Construction of a Rotation
Why half the angle and why conjugation instead of just multiplication? It seems plausible (though not obvious and we have not proven it here by any means) that one needs to take only half the angle \(\Phi\) in the trigonometric functions, when defining the rotation unit quaternion \(q\), because \(q\) appears twice in the conjugation with \(p\), when accomplishing the rotation.
But we can do better. First, let us note that unit complex numbers rotated by multiplication in the complex plane, while we want unit quaternions to do a 3D rotation, not 4D. So there is a difference. To further develop some intuition, let us study what multiplication and conjugation with \(\mathbf{i}\) does to a pure quaternion, i.e. to one with no scalar part. Somehow, we would want the action of \(\mathbf{i}\) on some 3D vector \(\mathbf{p}\) in space to provide a rotation around the \(\mathbf{i}\)-axis (coordinate direction of first component) by 90 degrees. But we will see that multiplication rotates in two directions in 4D space and we will have to undo one of them by conjugation.
Let us use three distinct pictures, the second of which will consist of two frames:
- Picture 1: Let us visualize the 3D space spanned by \(\mathbf{i}\), \(\mathbf{j}\), and \(\mathbf{k}\). The scalar part of a quaternion (i.e. the real part), is not depicted here, it is somehow off-world or lifted into a fourth, invisible dimension.
- Picture 2: Let us perform two sections through the 4D quaternion space to obtain two 2D frames. One such section contains the real axis (off-world axis) and one spatial axis spanned by \(\mathbf{i}\) (i.e. the \(1\mathbf{i}\) plane). Since we will associate \(\mathbf{i}\) with a spatial direction though, let us pick the \(\mathbf{i}\) axis to be horizontal and the real axis to be vertical (opposite to how we usually visualize complex numbers). The second frame is the \(\mathbf{jk}\) plane spanned by the other two basic quaternions.
- Picture 3: We can also form a partial synthesis of the two aforementioned pictures, by a 3D picture which shows the \(\mathbf{ij}\) plane horizontally and the real axis vertically. In this picture, the dimension spanned by \(\mathbf{k}\) is missing (we can regard the two dimensional horizontal plane as a 3D hyperplane, and if we need to visualize the third dimension explicitly, we can take a look at Frame 2 of Picture 2).
The claim is now going to be that left-multiplication by \(\mathbf{i}\) causes a combined rotation around two axes, which we will be able to visualize in the two frames of Picture 2:
- Rotation 1: One rotation is a 90-degree rotation in the \(1\mathbf{i}\) plane, just like we had with complex numbers. This is because of \(\mathbf{i}^2=-1\). Except that we shall not see it as a rotation in real 2D space. Here, this rotation is unwanted, because we do not see this 2D plane as two spatial dimensions, only as one and some invisible off-world dimension. This rotation somehow “lifts” a point on the \(\mathbf{i}\) axis into the fourth dimension, and only the remaining projection along the \(\mathbf{i}\) axis will remain visible (this projection is actually zero for a 90 degree rotation). From a perspective of desiring a rotation in real space, this process is unwanted: it makes a vector along the \(\mathbf{i}\) axis simply shorter in the visible world.
- Rotation 2: The second rotation is due to the relations \(\mathbf{ij}=\mathbf{k}\). In the second frame of Picture 2, the multiplication of a quaternion from the left by \(\mathbf{i}\) causes a 90-degree rotation in the \(\mathbf{jk}\) plane. It does so around the \(\mathbf{i}\)-axis according to the right hand rule: \(\mathbf{j}\) goes into \(\mathbf{k}\) if multiplied from the left by \(\mathbf{i}\), and \(\mathbf{k}\) goes into \(-\mathbf{j}\).
We wish to come up with an operation involving \(\mathbf{i}\) that causes only the second rotation above, in the \(\mathbf{jk}\) plane around the \(\mathbf{i}\) axis. To this end, we observe the following. Right-multiplication by \(\mathbf{i}\) acts on Frame 1 of Picture 2 exactly as left-multiplication does, because \(1\) and \(\mathbf{i}\) both commute with \(\mathbf{i}\). However, because \(\mathbf{ij}=-\mathbf{ji}\), the direction of the rotation in the \(\mathbf{jk}\) plane is reversed and now follows the left-hand rule. Right multiplication with \(-\mathbf{i}\), on the other hand, changes both directions, i.e. the rotation in the \(1\mathbf{i}\) plane is reversed and the rotation in the \(\mathbf{jk}\) plane becomes the same as the rotation due to left-multiplication by \(\mathbf{i}\).
Assume we want to rotate a point \(\mathbf{p}\) in space around the \(\mathbf{i}\) axis. We can write \(\mathbf{p}\) as a pure quaternion \(p=(0, \mathbf{p})\) with zero scalar part. It follows from the above that if we, instead of just multiplying \(\mathbf{i}p\), sandwich \(p\) into a conjugation \(\mathbf{i}p(-\mathbf{i})\), there will be no undesired lifting action in the \(1\mathbf{i}\) plane, and we will get twice the rotation (i.e. by 180 degrees) in the \(\mathbf{jk}\) plane around the \(\mathbf{i}\) axis.
The same applies to conjugation with the quaternions \(\mathbf{j}\) and \(\mathbf{k}\) and linear combinations. In general, therefore, conjugation by a quaternion \(q=(q_0, q_1, q_2, q_3)\) leads to a rotation around the axis determined by \(\mathbf{q}=(q_1, q_2, q_3)\). We also only need to take a minus sign in front of the vector part of the quaternion during the multiplication from the right, so rather than using \(-q\) from the right, we shall use the conjugate \(\overline{q}\).
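The effect of one-sided multiplication versus conjugation with \(\mathbf{i}\) can be made concrete with a small numerical experiment. The pure quaternion \(p=(0,1,2,3)\) below is an arbitrary example chosen for illustration.

```python
def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


I = (0, 1, 0, 0)
NEG_I = (0, -1, 0, 0)
p = (0, 1, 2, 3)  # pure quaternion for the point (1, 2, 3)

# Left-multiplication: the i component is "lifted" into the scalar part,
# and the (j, k) components are rotated by +90 degrees.
print(qmul(I, p))  # (-1, 0, -3, 2)

# Right-multiplication by i: same lifting, but (j, k) rotates the other way.
print(qmul(p, I))  # (-1, 0, 3, -2)

# Sandwich i p (-i): the lifting cancels and the (j, k) rotation doubles,
# i.e. a 180-degree rotation about the i axis.
print(qmul(qmul(I, p), NEG_I))  # (0, 1, -2, -3)
```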
But how do we make sure that there is no stretching and how do we control the rotation angle? The angle control happens by “lifting” the quaternion \(q\) off the \(\mathbf{ijk}\) three-dimensional hyperplane by giving it a non-zero scalar part. This makes intuitive sense, since we saw that conjugation by \(\mathbf{i}\) causes a 180-degree rotation. Likewise, conjugation by quaternion \(1\) causes no rotation at all. It would be therefore more appropriate to speak of “lowering” the quaternion towards the \(\mathbf{ijk}\) plane in some direction determined by unit vector \(\mathbf{e}=(e_1, e_2, e_3)\) from its default position at \(q_0=1\) and \(\mathbf{q}=\mathbf{0}\) by some lowering angle \(\theta\).
Ensuring that there is no stretching is done by making \(q\) a unit quaternion, i.e. one with \(||q||^2=q_0^2+||\mathbf{q}||^2=1\). To preserve the norm, the “lowering” angle \(\theta\) has to be implemented by giving the scalar part of the quaternion a prefactor of \(\cos\theta\), while giving the vector part a prefactor of \(\sin\theta\), just like you would to the angle of a complex number in the complex plane. We shall keep in mind that the ensuing rotation around the \(\mathbf{e}\) axis will be by \(2\theta\) because of the two sides of the conjugation adding up. So if our desired rotation angle is \(\Phi\), we must choose \(\theta=\Phi/2\).
We therefore construct our desired rotation unit quaternion for a rotation by an angle \(\Phi\) around an axis given by unit vector \(\mathbf{e}=(e_1, e_2, e_3)\) as
\begin{equation}
q=
\begin{pmatrix}
\cos\frac{\Phi}{2}\\
e_1\sin\frac{\Phi}{2}\\
e_2\sin\frac{\Phi}{2}\\
e_3\sin\frac{\Phi}{2}\\
\end{pmatrix}
\end{equation}
This construction automatically also tells us how to construct quaternions from a principal rotation vector, because \(\mathbf{e}\) and \(\Phi\) are exactly the rotation parameters of the PRV formalism.
The above is not a mathematically rigorous proof by any means. The train of thought is merely designed to develop some intuition. We now know why the rotation happens around the \(\mathbf{e}\) axis, why we need to use conjugation with a quaternion instead of multiplication (to rid ourselves of an unwanted rotation in the direction of the axis), and why we need to take only half the angle, because the two parts of the conjugation which cancel the rotation along (not around) the rotation axis double the rotation around the axis.
This may have been somewhat hard to follow in written form. For an elaborate visual illustration of this (including projecting the entire circle of Frame 2 of Picture 2 onto the \(\mathbf{i}\) axis by means of stereographic projection), we recommend you watch the following three videos on YouTube, in the following order:
- Quaternions and 3D rotation, explained interactively (by 3blue1brown)
- Visualizing quaternions (4d numbers) with stereographic projection (by 3blue1brown)
- How quaternions produce 3D rotation (by PenguinMaths). Illustrates why conjugation is needed to create a rotation and why one needs to take half the angle.
Short Rotation \((q_0>0)\) and Long Rotation \((q_0<0)\)
We will use the above pictures to understand how quaternions distinguish between short rotations (less than 180 degrees) and long rotations (more than 180 degrees) resulting in the same final direction/attitude.
Let us start with quaternion \(q=(1,0,0,0)\). The quaternion has “lowering angle” \(\theta=0\), and therefore no rotation happens, because \(\Phi=2\theta\). This unit quaternion is located above the spatial hyperplane in Picture 3. We pick a 3D rotation axis \(\mathbf{e}\) and start lowering the quaternion towards the spatial 3D hyperplane by increasing the “lowering angle” \(\theta\). By the time we have lowered the quaternion into the hyperplane, i.e. \(\theta=90^\circ\), we have performed a 180-degree rotation around spatial axis \(\mathbf{e}\), because \(\Phi=2\theta\). During this part, we had \(q_0>0\) (with \(q_0=0\) exactly when the quaternion lies in the spatial hyperplane and has no scalar part).
As we increase \(\theta\) further than 90 degrees and thus get below the spatial hyperplane into the \(q_0<0\) region of 4D quaternion space, our spatial rotation by angle \(\Phi\) will become more than 180 degrees, eventually reaching 360 degrees by the time \(\theta=180^\circ\) and \(q=(-1,0,0,0)\). We have thus performed a full 360-degree rotation.
Let us now reverse the unit vector giving the rotation axis, i.e. pick \(-\mathbf{e}\) and start again at \(q=(1,0,0,0)\). Then the rotation will happen in the opposite direction. For \(q_0>0\) we will again have a short rotation. Its attitude (i.e. final point without caring in which direction we rotated to get there) will be the same as for the quaternion with \(q_0<0\) and axis vector \(\mathbf{e}\).
We therefore see that the quaternions \(q\) and \(-q\) describe the same final attitude (even though the direction in which the rotation happened and the degrees traveled during the rotation are different/opposite). And we can read off whether it was a long or short rotation by the sign of the \(q_0\) entry.
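The equivalence of \(q\) and \(-q\) is easy to observe numerically. In the sketch below, the example rotation (90 degrees about the third axis) and the helper names are illustrative; `rotation_angle_deg` simply evaluates \(\Phi=2\arccos(q_0)\).

```python
import math


def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def conj(q):
    """Conjugate quaternion."""
    return (q[0], -q[1], -q[2], -q[3])


def rotate(q, v):
    """Vector part of q v conj(q) for a unit quaternion q."""
    return qmul(qmul(q, (0.0,) + tuple(v)), conj(q))[1:]


def rotation_angle_deg(q):
    """Rotation angle Phi = 2 arccos(q0) in degrees (q0 clamped to [-1, 1])."""
    q0 = max(-1.0, min(1.0, q[0]))
    return math.degrees(2.0 * math.acos(q0))


half = math.radians(90.0) / 2.0
q = (math.cos(half), 0.0, 0.0, math.sin(half))  # 90 degrees about the k axis
neg_q = tuple(-c for c in q)
v = (1.0, 0.0, 0.0)

print(rotate(q, v))      # approximately (0, 1, 0)
print(rotate(neg_q, v))  # same final vector
print(rotation_angle_deg(q), rotation_angle_deg(neg_q))  # about 90 and 270
```

The quaternion \(-q\) has \(q_0<0\) and encodes the long way round: 270 degrees about \(-\mathbf{e}\), ending at the same attitude.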
Basis Rotation Underneath a Fixed Vector ("\(B\)-style Rotation")
The above is an “\(A\)-style rotation”, rotating a vector in a fixed coordinate system. Similarly, we can perform a “\(B\)-style rotation”, where the vector is fixed and the basis is rotated underneath (basis transformation of the coordinate matrix of a fixed vector to a new, rotated basis). To do this, we can express a fixed vector \(\mathbf{v}\) with respect to a new basis \(\mathcal{B}\) which is rotated with respect to the original basis \(\mathcal{A}\) by angle \(\Phi\) around unit vector \(\mathbf{e}\) by
\begin{equation}
\mathbf{v}_B=\overline{q}\mathbf{v}_A q
\end{equation}
The formula is similar as for the “\(A\)-style rotation”, but the place of the conjugate has changed, because the rotation is performed in the opposite direction.
This latter computation corresponds to the same rotation as accomplished by the rotation matrix \(B\) defined previously in the rotation matrix section of this article by the relation \(\mathbf{v}_B=B\mathbf{v}_A\). Instead of applying matrix multiplication with a rotation matrix \(B\), we apply conjugation with a quaternion \(q\) that represents the same rotation. Both yield the same result. The quaternion stores the rotation in four numbers instead of the nine entries of the DCM, \(B\), and chaining rotations is cheaper (one quaternion product takes 16 real multiplications, a \(3\times3\) matrix product takes 27); for rotating a single vector, however, the matrix-vector product needs fewer operations than a naive evaluation of the conjugation. We shall give an expression of \(B\) in terms of quaternion coefficients \((q_0, q_1, q_2, q_3)\), and its reverse relation, further below.
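As a sketch of the “\(B\)-style rotation”, the snippet below evaluates \(\overline{q}\,\mathbf{v}_A\,q\) for a basis rotated by 90 degrees about the third axis. The function name `to_rotated_basis` is an illustrative choice, and the axis is assumed to be a unit vector.

```python
import math


def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def conj(q):
    """Conjugate quaternion."""
    return (q[0], -q[1], -q[2], -q[3])


def to_rotated_basis(v_a, axis, angle):
    """Coordinates of a fixed vector in a basis rotated by `angle` about `axis`."""
    s = math.sin(angle / 2.0)
    q = (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)
    r = qmul(qmul(conj(q), (0.0,) + tuple(v_a)), q)  # conj(q) v q
    return r[1:]


# The vector along the old first axis points along the negative second axis
# of a basis that was rotated by +90 degrees about the third axis.
print(to_rotated_basis((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
```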
Consecutive Rotations
The sequential application of two consecutive rotations is simple in the quaternion formalism. If \(p\) and \(q\) are the quaternions of two rotations applied consecutively (with \(p\) being applied first), the whole rotation can be performed as one rotation by conjugation with the quaternion \(qp\), i.e. the “addition” of two rotations corresponds to quaternion multiplication of the quaternions of the two individual rotations. The non-commutativity of quaternion multiplication reflects the fact that in three dimensions the order in which consecutive rotations are performed matters.
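The composition rule can be checked with two 90-degree rotations about different axes. In this sketch, `p_rot` plays the role of \(p\) (applied first) and `q_rot` the role of \(q\); all names are illustrative and the axes are assumed to be unit vectors.

```python
import math


def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def conj(q):
    """Conjugate quaternion."""
    return (q[0], -q[1], -q[2], -q[3])


def axis_angle(axis, angle):
    """Unit quaternion from a unit axis and an angle in radians."""
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def rotate(q, v):
    """Active rotation: vector part of q v conj(q)."""
    return qmul(qmul(q, (0.0,) + tuple(v)), conj(q))[1:]


p_rot = axis_angle((1.0, 0.0, 0.0), math.pi / 2)  # applied first
q_rot = axis_angle((0.0, 0.0, 1.0), math.pi / 2)  # applied second
v = (0.0, 1.0, 0.0)

step_by_step = rotate(q_rot, rotate(p_rot, v))
combined = rotate(qmul(q_rot, p_rot), v)  # one conjugation with qp
swapped = rotate(qmul(p_rot, q_rot), v)   # wrong order, different result

print(step_by_step)  # approximately (0, 0, 1)
print(combined)      # approximately (0, 0, 1)
print(swapped)       # approximately (-1, 0, 0)
```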
Relations between Quaternions and other Rotation Formalisms
In this section we relate unit quaternions to other rotation formalisms. The relation to the PRV follows essentially automatically from the above construction, and could even be taken as the definition of unit quaternions, if one were content to dispense with the above algebraic contemplations and the computational efficiency of performing quaternion conjugation to achieve a rotation, and simply resorted to vector algebra. Indeed, for our practical applications, we could have avoided most of the insights above and constructed the quaternions from the PRV. However, even if one chooses to go eventually through the rotation matrix, parametrizing attitude with quaternions retains many of its advantages.
Relation between Quaternions and PRV
The relation between unit quaternions and the PRV follows directly from how we constructed the rotation quaternion \(q\) above, its scalar part being the cosine of half the rotation angle and its vector part being the sine of half the rotation angle times the unit rotation axis vector \(\mathbf{e}\). We therefore obtain:
\begin{eqnarray}
q_0 &=& \cos\left(\frac{\Phi}{2}\right) \\
q_1 &=& e_1\sin\left(\frac{\Phi}{2}\right) \\
q_2 &=& e_2\sin\left(\frac{\Phi}{2}\right) \\
q_3 &=& e_3\sin\left(\frac{\Phi}{2}\right)
\end{eqnarray}
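Likewise, one can compute the rotation axis \({\bf e}=(e_1, e_2, e_3)^T\) and rotation angle \(\Phi\) of the PRV from the following relations (provided \(\sin(\Phi/2)\neq 0\); for \(\Phi=0\) there is no rotation and the axis is undetermined):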
\begin{eqnarray}
\Phi &=& 2 \arccos(q_0) \\
e_1 &=& \frac{q_1}{\sin\left(\frac{\Phi}{2}\right)}\\
e_2 &=& \frac{q_2}{\sin\left(\frac{\Phi}{2}\right)}\\
e_3 &=& \frac{q_3}{\sin\left(\frac{\Phi}{2}\right)}
\end{eqnarray}
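A minimal round-trip sketch of these two conversions follows. The function names, the tolerance `eps`, and the choice to return the first coordinate axis when \(\sin(\Phi/2)\) vanishes are assumptions made for this illustration.

```python
import math


def prv_to_quat(e, phi):
    """Unit quaternion from unit axis e and rotation angle phi (radians)."""
    s = math.sin(phi / 2.0)
    return (math.cos(phi / 2.0), e[0] * s, e[1] * s, e[2] * s)


def quat_to_prv(q, eps=1e-12):
    """Axis e and angle phi from a unit quaternion q."""
    q0 = max(-1.0, min(1.0, q[0]))  # guard acos against rounding overshoot
    phi = 2.0 * math.acos(q0)
    s = math.sin(phi / 2.0)
    if abs(s) < eps:
        # No net rotation: the axis is undetermined, so an arbitrary
        # placeholder axis is returned (assumption of this sketch).
        return (1.0, 0.0, 0.0), phi
    return (q[1] / s, q[2] / s, q[3] / s), phi


e = (0.0, 0.6, 0.8)
q = prv_to_quat(e, 1.2)
e_back, phi_back = quat_to_prv(q)
print(e_back, phi_back)  # approximately (0.0, 0.6, 0.8) and 1.2
```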
Relation between Quaternions and Rotation Matrix
Sometimes it is more convenient just to use linear algebra with matrix operations than to bother with quaternion operations. Or we may want to create the rotation matrix to translate to another rotation formalism. Whatever the motivation, if we want to avoid the quaternion conjugation to express a rotation and instead use the rotation matrix on vectors (i.e. computing \(\mathbf{v}_{\mathcal{B}}=B\mathbf{v}_{\mathcal{A}}\) instead of \(\mathbf{v}_{\mathcal{B}}=\overline{q}\mathbf{v}_{\mathcal{A}}q\)), we need to construct \(B\) first from a given unit quaternion \(q\), which accomplishes the rotation. This is done with the following formula:
\begin{equation}
B = \begin{pmatrix}
q_0^2+q_1^2-q_2^2-q_3^2 & 2(q_1q_2+q_0q_3) & 2(q_1q_3-q_0q_2) \\
2(q_1q_2-q_0q_3) & q_0^2-q_1^2+q_2^2-q_3^2 & 2(q_2q_3+q_0q_1) \\
2(q_1q_3+q_0q_2) & 2(q_2q_3-q_0q_1) & q_0^2-q_1^2-q_2^2+q_3^2
\end{pmatrix}
\end{equation}
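The following sketch builds \(B\) from a unit quaternion and cross-checks it against the conjugation \(\overline{q}\mathbf{v}q\) for an arbitrary test vector. The function names and the test values are illustrative.

```python
import math


def qmul(p, q):
    """Hamilton product pq in component form."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
        p1 * q0 + p0 * q1 - p3 * q2 + p2 * q3,
        p2 * q0 + p3 * q1 + p0 * q2 - p1 * q3,
        p3 * q0 - p2 * q1 + p1 * q2 + p0 * q3,
    )


def conj(q):
    """Conjugate quaternion."""
    return (q[0], -q[1], -q[2], -q[3])


def quat_to_dcm(q):
    """DCM B with v_B = B v_A for the unit quaternion q."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]


def matvec(m, v):
    """Plain matrix-vector product for nested lists."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in m)


n = math.sqrt(30.0)
q = (1.0 / n, 2.0 / n, 3.0 / n, 4.0 / n)  # an arbitrary unit quaternion
v = (0.3, -1.2, 2.5)

via_matrix = matvec(quat_to_dcm(q), v)
via_quaternion = qmul(qmul(conj(q), (0.0,) + v), q)[1:]
print(all(abs(a - b) < 1e-12 for a, b in zip(via_matrix, via_quaternion)))  # True
```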
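The inverse transformation from the DCM, \(B\), back to the unit quaternion \(q\) accomplishing the rotation is found by combining suitable matrix entries in the above (valid for \(q_0\neq0\), i.e. for rotation angles other than 180 degrees; the two signs of \(q_0\) give \(q\) and \(-q\), which describe the same attitude):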
\begin{eqnarray}
q_0 &=& \pm \frac{1}{2}\sqrt{B_{11} + B_{22} + B_{33} + 1} \\
q_1 &=& \frac{B_{23}-B_{32}}{4q_0} \\
q_2 &=& \frac{B_{31}-B_{13}}{4q_0} \\
q_3 &=& \frac{B_{12}-B_{21}}{4q_0}
\end{eqnarray}
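A round-trip sketch of the inverse relation is given below. It picks the positive root for \(q_0\) (the short rotation) and raises an error near \(q_0=0\), where the division breaks down; both choices, like the function names and the tolerance, are assumptions of this illustration rather than the only way to handle that case.

```python
import math


def quat_to_dcm(q):
    """DCM B with v_B = B v_A for the unit quaternion q."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]


def dcm_to_quat(B, eps=1e-8):
    """Unit quaternion with q0 > 0 from a DCM (B[r][c] is entry B_{r+1,c+1})."""
    trace = B[0][0] + B[1][1] + B[2][2]
    q0 = 0.5 * math.sqrt(max(0.0, trace + 1.0))
    if q0 < eps:
        raise ValueError("q0 is close to zero (180-degree rotation)")
    return (
        q0,
        (B[1][2] - B[2][1]) / (4.0 * q0),
        (B[2][0] - B[0][2]) / (4.0 * q0),
        (B[0][1] - B[1][0]) / (4.0 * q0),
    )


n = math.sqrt(30.0)
q = (1.0 / n, 2.0 / n, 3.0 / n, 4.0 / n)
q_back = dcm_to_quat(quat_to_dcm(q))
print(all(abs(a - b) < 1e-12 for a, b in zip(q, q_back)))  # True
```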
Quaternion Differential Kinematic Equations
The kinematic differential equations for quaternions are given by
\begin{equation}
\begin{pmatrix}\dot q_0 \\ \dot q_1 \\ \dot q_2 \\ \dot q_3\end{pmatrix}
=\frac{1}{2}\begin{pmatrix}
q_0 & -q_1 & -q_2 & -q_3 \\
q_1 & q_0 & -q_3 & q_2 \\
q_2 & q_3 & q_0 & -q_1 \\
q_3 & -q_2 & q_1 & q_0
\end{pmatrix}
\begin{pmatrix}0 \\ \omega_1 \\ \omega_2 \\ \omega_3\end{pmatrix}
\end{equation}
We state this result without derivation. The components of \(\boldsymbol{\omega}\) are assumed to be given in the non-inertial axis system whose motion the quaternion \(q(t)\) describes, i.e. in the body frame, \((\omega_1, \omega_2, \omega_3)=(p,q,r)\), if \(q=(q_0,q_1,q_2,q_3)\) describes the attitude of the aircraft (i.e. the attitude of the body axis system). Note that the pitch rate \(q\) in \((p,q,r)\) is unrelated to the quaternion \(q\).
This equation is the quaternion equivalent of the matrix \(\mathcal{L}\) taking \((p, q, r)\) to \(\dot \psi\), \(\dot \theta\), \(\dot \phi\) for Euler angles. The \(\dot q_i\)’s play the role of the Euler angle time derivatives. A computational advantage of the above equation is that it does not require the evaluation of any trigonometric functions, unlike the matrix \(\mathcal{L}\) for Euler angles, which we have encountered earlier.
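To illustrate how the kinematic equation is used, the sketch below integrates it with a simple explicit Euler scheme and renormalizes after every step to stay on the unit sphere. The integration scheme, step count, and function names are assumptions chosen for brevity; a real simulation would typically use a higher-order integrator.

```python
import math


def quat_rate(q, omega):
    """Time derivative of q for body-frame angular velocity omega."""
    q0, q1, q2, q3 = q
    w1, w2, w3 = omega
    return (
        0.5 * (-q1 * w1 - q2 * w2 - q3 * w3),
        0.5 * (q0 * w1 - q3 * w2 + q2 * w3),
        0.5 * (q3 * w1 + q0 * w2 - q1 * w3),
        0.5 * (-q2 * w1 + q1 * w2 + q0 * w3),
    )


def integrate(q, omega, duration, steps):
    """Explicit Euler integration with renormalization after each step."""
    dt = duration / steps
    for _ in range(steps):
        dq = quat_rate(q, omega)
        q = tuple(c + dt * d for c, d in zip(q, dq))
        n = math.sqrt(sum(c * c for c in q))
        q = tuple(c / n for c in q)
    return q


# Constant rate of 1 rad/s about the third body axis for pi/2 seconds
# should end at a 90-degree rotation: q = (cos 45deg, 0, 0, sin 45deg).
q_end = integrate((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2, 1000)
print(q_end)
```

Note that no trigonometric function is evaluated inside the integration loop, in line with the remark above.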