Understanding Linear Transformations Intuitively
Last Update: May 18, 2024
Notations
In order to avoid possible ambiguity, let's first specify all the notations we need.
Linear Space: Since the goal of this paper is to give an intuitive picture, linear spaces $V$ are Euclidean spaces $\mathbb{R}^n$ by default. By convention, the vectors in $\mathbb{R}^n$ are column vectors. At the same time, this paper tries to maintain generality: any statement that does not rely on special properties of $\mathbb{R}^n$ extends immediately to more general linear spaces.
Basis: A basis is made up of a set of linearly independent vectors that can generate the whole space. We use Greek letters to represent a basis, such as $\eta=\left(\eta_1,\eta_2,\cdots,\eta_n\right)$, where $\eta_i\in \mathbb{R}^n(i=1,2,\cdots,n)$ is called the $i$-th basis vector of $\eta$.
Orthonormal basis: $\epsilon=(\epsilon_1,\epsilon_2,\cdots,\epsilon_n)$, where the $i$-th component of $\epsilon_i\in\mathbb{R}^n\left(i=1,2,\cdots,n\right)$ is $1$ while other components equal $0$.
Coordinates: Since we will consider multiple bases, it is necessary to distinguish the concepts of coordinates and vectors. Given a vector $v=(v_1,v_2,\cdots,v_n )^T$, the coordinates of $v$ under the basis $\eta$ form an $n$-dimensional vector $v^\eta=(v^\eta_1, v^\eta_2,\cdots, v^\eta_n)^T$, which can be determined by

$$v=(\eta_1,\eta_2,\cdots,\eta_n)\,v^\eta=\sum_{i=1}^n v^\eta_i\,\eta_i.$$
In particular, we have $v=(\epsilon_1 , \epsilon_2 , \cdots, \epsilon_n )\,v^\epsilon$, so $v^\epsilon=v$.
Basis Matrix: $M_{\eta}$ $=( \eta_1^{\epsilon} \; \eta_2^\epsilon \; \cdots \; \eta_n^\epsilon)$. Note that the coordinates of the basis vectors are column vectors, so the only natural way to get the basis matrix is to put them together by column. In fact, since $\eta_i^\epsilon=\eta_i$, the above discussion gives

$$v=(\eta_1\;\eta_2\;\cdots\;\eta_n)\,v^\eta=M_\eta v^\eta.$$
It can be seen that the basis matrix of orthonormal basis is the identity matrix $I$ and $v^\eta=M_\eta^{-1}v$.
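The relation $v^\eta=M_\eta^{-1}v$ can be checked numerically. Below is a minimal Python sketch in $\mathbb{R}^2$; the basis and vector values are hypothetical illustration choices:

```python
# Minimal sketch: recover coordinates under a basis eta in R^2.
# v = M_eta @ v_eta, hence v_eta = M_eta^{-1} @ v.

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

def inv2(M):
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [[ M[1][1]/det, -M[0][1]/det],
            [-M[1][0]/det,  M[0][0]/det]]

# Hypothetical basis eta_1 = (2,1)^T, eta_2 = (1,1)^T, stored by column.
M_eta = [[2, 1],
         [1, 1]]
v = [3, 2]                       # vector in the standard basis
v_eta = mat_vec(inv2(M_eta), v)  # coordinates of v under eta
print(v_eta)                     # [1.0, 1.0], since v = eta_1 + eta_2
```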
Linear Transformation: The linear transformation $\mathcal{A}$ on $\mathbb{R}^n$ is a mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$ such that $\forall v_1,v_2\in\mathbb{R}^n$, $\forall \lambda _1, \lambda _2\in\mathbb{R}$,

$$\mathcal{A}(\lambda_1 v_1+\lambda_2 v_2)=\lambda_1\mathcal{A}(v_1)+\lambda_2\mathcal{A}(v_2).$$
$\mathcal{A} (v)$ can be denoted by $ \mathcal{A} v$ for simplicity.
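Any matrix map $v\mapsto Av$ satisfies this linearity condition; a minimal Python spot check on one instance (the numbers are arbitrary illustration values):

```python
# Sketch: a matrix map v -> A v satisfies A(l1 v1 + l2 v2) = l1 A v1 + l2 A v2.

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

A = [[1, -1], [1, 1]]
v1, v2 = [1, 2], [3, -1]
l1, l2 = 2, -3

lhs = mat_vec(A, [l1*v1[k] + l2*v2[k] for k in range(2)])              # A(l1 v1 + l2 v2)
rhs = [l1*mat_vec(A, v1)[k] + l2*mat_vec(A, v2)[k] for k in range(2)]  # l1 A v1 + l2 A v2
print(lhs, rhs)  # [-14, 0] [-14, 0]
```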
Matrix Representation of Linear Transformation
We will show that a linear transformation can be expressed as a matrix under a given basis. Conversely, given a basis, a matrix can also determine a linear transformation.
One way to understand a linear space is to imagine that the space is linearly generated by a basis. The basis is a basic framework and vectors can “grow” on this framework. It is easier to consider the action of $\mathcal{A}$ on the basis than its action on all of $\mathbb{R}^n$. Once we know its action on the basis, its action on the vectors growing on this basis becomes explicit. More specifically, suppose $v$ is generated as follows:

$$v=\sum_{i=1}^n v^\eta_i\,\eta_i,$$

that is, the coordinates of $v$ under the basis $\eta$ are $v^\eta$. Then by linearity we have

$$\mathcal{A}v=\sum_{i=1}^n v^\eta_i\,\mathcal{A}\eta_i.$$
That means the coordinates of $ \mathcal{A} v $ in the basis $ \mathcal{A} \eta = (\mathcal{A} \eta_1, \cdots, \mathcal{A} \eta_n) $ are also $ v^\eta $.
From this, we can see that although $ v $ and $ \mathcal{A} v $ grow on different bases ($ \eta $ and $ \mathcal{A} \eta$), their growth patterns (coordinates) are the same.
We assert that if we know the result of the action of $ \mathcal{A} $ on the old basis $ \eta $, or equivalently the coordinates of the new basis $ \mathcal{A} \eta $ under the old basis $ \eta $, then all information about $ \mathcal{A} $ is known. In fact, suppose

$$\mathcal{A}\eta_i=\sum_{j=1}^n a_{ji}\,\eta_j\quad(i=1,2,\cdots,n),$$

namely

$$(\mathcal{A}\eta_1,\mathcal{A}\eta_2,\cdots,\mathcal{A}\eta_n)=(\eta_1,\eta_2,\cdots,\eta_n)\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nn}\end{pmatrix}.$$

The matrix formed by arranging, column by column, the $n$ coordinate vectors of $\mathcal{A} \eta$ in the basis $\eta$ is denoted as

$$A^\eta=\left((\mathcal{A}\eta_1)^\eta\;(\mathcal{A}\eta_2)^\eta\;\cdots\;(\mathcal{A}\eta_n)^\eta\right)=(a_{ij})_{n\times n},$$
which is called the matrix representation of $\mathcal{A}$ in the basis $\eta$. This naming is reasonable because we will soon see that $A^\eta$, along with the basis $\eta$, determines the result of $\mathcal{A}$ acting on any vector in $\mathbb{R}^n$, thereby fully determining $\mathcal{A}$. For any $v \in \mathbb{R}^n$, we have

$$\mathcal{A}v=\mathcal{A}\left(\sum_{i=1}^n v^\eta_i\,\eta_i\right)=\sum_{i=1}^n v^\eta_i\,\mathcal{A}\eta_i=(\mathcal{A}\eta_1,\cdots,\mathcal{A}\eta_n)\,v^\eta=(\eta_1,\cdots,\eta_n)\,A^\eta v^\eta,$$
which shows that the coordinates of $\mathcal{A} v$ in the basis $\eta$ are $A^\eta v^\eta$. In particular, in the standard orthonormal basis, $\mathcal{A} v = A^\epsilon v^\epsilon = A^\epsilon v$.
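The fact that $\mathcal{A}$ is fully determined by its action on a basis can be illustrated numerically. The Python sketch below uses a hypothetical basis in $\mathbb{R}^2$ and assumed images of its basis vectors, and computes $\mathcal{A}v$ two ways: directly by linearity, and by expanding the coordinates $A^\eta v^\eta$ back on the basis:

```python
# Sketch: a linear map is fixed by what it does to a basis (hypothetical data).
#   A v = sum_i v_eta[i] * A(eta_i),  and the eta-coordinates of A v are A_eta @ v_eta.

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

eta = [[2, 1], [1, 1]]           # eta[i] is the i-th basis vector
A_eta_images = [[3, 2], [1, 1]]  # assumed: A eta_1 = eta_1 + eta_2, A eta_2 = eta_2
A_eta = [[1, 0],                 # columns: eta-coordinates of A eta_1 and A eta_2
         [1, 1]]

v_eta = [1, 2]                   # eta-coordinates of v = eta_1 + 2*eta_2

# A v by linearity from the images of the basis:
Av = [v_eta[0]*A_eta_images[0][k] + v_eta[1]*A_eta_images[1][k] for k in range(2)]

# A v reconstructed from its eta-coordinates A_eta @ v_eta:
Av_eta = mat_vec(A_eta, v_eta)
Av_check = [Av_eta[0]*eta[0][k] + Av_eta[1]*eta[1][k] for k in range(2)]

print(Av, Av_check)  # [5, 4] [5, 4]
```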
Furthermore, it can be proven that this one-to-one correspondence between matrices and linear transformations preserves linear operations, multiplication (composition of mappings), and the identity element; it is therefore simultaneously a linear isomorphism, a ring isomorphism, and an isomorphism of associative algebras.
Understanding Linear Transformations Intuitively
Now let’s turn our attention to the concrete Euclidean plane. Suppose we choose the standard orthonormal basis $\epsilon$ on $\mathbb{R}^2$, where the coordinates of vectors in the basis are equal to the vectors themselves. Then the matrix

$$A^\epsilon=\begin{pmatrix}1&-1\\1&1\end{pmatrix}$$
represents a linear transformation. According to the conclusion we obtained earlier, the $i$th column of $A^\epsilon$ is the new vector resulting from the linear transformation acting on the $i$th basis vector of the standard orthonormal basis, so the $i$th column of $A^\epsilon$ equals $A^\epsilon \epsilon_i$. Of course, we can also see this directly from the relation

$$A^\epsilon\epsilon_1=\begin{pmatrix}1&-1\\1&1\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1\\1\end{pmatrix},\qquad A^\epsilon\epsilon_2=\begin{pmatrix}1&-1\\1&1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}-1\\1\end{pmatrix}.$$
Because $\epsilon_1 = (1,0)^T$ is mapped to $A^\epsilon \epsilon_1 = (1,1)^T$ and $\epsilon_2 = (0,1)^T$ is mapped to $A^\epsilon \epsilon_2 = (-1,1)^T$, we know that the linear transformation represented by $A^\epsilon$ is a $45^\circ$ counterclockwise rotation followed by a $\sqrt{2}$ scaling.
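As a quick numeric sanity check, the matrix can be compared entrywise against $\sqrt{2}$ times the standard $45^\circ$ rotation matrix; a minimal Python sketch:

```python
import math

# Check that A = [[1, -1], [1, 1]] equals sqrt(2) times rotation by 45 degrees.
theta = math.pi / 4
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
A = [[1.0, -1.0],
     [1.0,  1.0]]
ok = all(abs(math.sqrt(2) * R[i][j] - A[i][j]) < 1e-12
         for i in range(2) for j in range(2))
print(ok)  # True
```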
Basis Transformation and Similar Matrices
Next, we illustrate an important fact: given a linear transformation, its matrices in different bases are similar.
Let $\eta, \zeta$ be two bases. Define the linear transformation $\mathcal{T}$ by

$$\mathcal{T}\eta_i=\zeta_i\quad(i=1,2,\cdots,n),$$
which maps $\eta_i$ to $\zeta_i$ for $i = 1, 2, \cdots, n$. In other words, the basis $\zeta$ equals the basis $\mathcal{T} \eta$. $\mathcal{T}$ is called the transition transformation from the basis $\eta$ to the basis $\zeta$, and the matrix representation of $\mathcal{T}$ in the basis $\eta$, denoted as $T^\eta$, is called the transition matrix from the basis $\eta$ to the basis $\zeta$.
Given a linear transformation $\mathcal{A}$, note first that $\zeta=(\mathcal{T}\eta_1,\cdots,\mathcal{T}\eta_n)=(\eta_1,\cdots,\eta_n)\,T^\eta$, i.e., $\zeta=\eta\,T^\eta$. On the one hand,

$$\mathcal{A}\zeta=\zeta A^\zeta=\eta\,T^\eta A^\zeta.$$

On the other hand,

$$\mathcal{A}\zeta=\mathcal{A}(\eta\,T^\eta)=(\mathcal{A}\eta)\,T^\eta=\eta\,A^\eta T^\eta.$$

Thus $\eta\,T^\eta A^\zeta=\eta\,A^\eta T^\eta$, and since the basis vectors are linearly independent and $T^\eta$ is invertible, we obtain

$$A^\zeta=(T^\eta)^{-1}A^\eta T^\eta,$$
which means that the matrices of the same linear transformation in different bases are similar.
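A minimal numeric sketch of this fact, taking the old basis to be the standard one so that the transition matrix simply lists the new basis vectors by column (all values hypothetical): the new-basis coordinates of $\mathcal{A}v$ must agree with $A^\zeta$ applied to the new-basis coordinates of $v$.

```python
# Sketch: matrices of one transformation in two bases are similar,
# A_new = T^{-1} A_old T, with T the transition matrix from old to new.

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

def mat_mul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [[ M[1][1]/det, -M[0][1]/det],
            [-M[1][0]/det,  M[0][0]/det]]

A_old = [[1, 2], [-1, 4]]                    # matrix in the old (standard) basis
T = [[1, 1], [0, 1]]                         # columns: new basis vectors (hypothetical)
A_new = mat_mul(inv2(T), mat_mul(A_old, T))  # T^{-1} A_old T

# Consistency: new-basis coords of A v equal A_new @ (new-basis coords of v).
v = [1, 2]
lhs = mat_vec(inv2(T), mat_vec(A_old, v))
rhs = mat_vec(A_new, mat_vec(inv2(T), v))
print(lhs, rhs)  # [-2.0, 7.0] [-2.0, 7.0]
```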
Basis Transformation and Coordinate Transformation
Let $\mathcal{T}$ be the transition transformation from the basis $\eta$ to the basis $\zeta$, and let $T^\eta$ be the corresponding transition matrix. If the coordinates of $v$ in the old basis $\eta$ are

$$v^\eta=(v^\eta_1,v^\eta_2,\cdots,v^\eta_n)^T,$$

then the coordinates of $v$ in the new basis $\zeta$ are

$$v^\zeta=(v^\zeta_1,v^\zeta_2,\cdots,v^\zeta_n)^T.$$

Thus, we have

$$v=\eta\,v^\eta=\zeta\,v^\zeta=\eta\,T^\eta v^\zeta.$$

By comparing the coefficients, we see

$$v^\eta=T^\eta v^\zeta.$$

So we obtain the relationship between the coordinates of the vector $v$ in the new basis $\zeta$ and the old basis $\eta$:

$$v^\zeta=(T^\eta)^{-1}v^\eta,$$
which is also called the coordinate transformation formula.
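The coordinate transformation formula can be checked numerically. In the sketch below the old basis is the standard one, so the transition matrix has the (hypothetical) new basis vectors as its columns; rebuilding $v$ from the new coordinates recovers the old ones:

```python
# Sketch of v^zeta = (T^eta)^{-1} v^eta with the old basis standard.

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

def inv2(M):
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [[ M[1][1]/det, -M[0][1]/det],
            [-M[1][0]/det,  M[0][0]/det]]

T = [[2, 1], [1, 1]]          # columns: new basis vectors zeta_1=(2,1), zeta_2=(1,1)
v = [3, 2]                    # coordinates in the old (standard) basis
v_zeta = mat_vec(inv2(T), v)  # coordinates in the new basis

# Sanity check: v_zeta[0]*zeta_1 + v_zeta[1]*zeta_2 recovers v.
recon = [v_zeta[0]*T[0][k] + v_zeta[1]*T[1][k] for k in range(2)]
print(v_zeta, recon)  # [1.0, 1.0] [3.0, 2.0]
```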
Note the commutative diagram below. (A commutative diagram means that two composite morphisms are equal if their starting points and endpoints are the same. In this case, commutativity means $\mathrm{coor}^\zeta\circ\mathrm{id}=(T^\eta)^{-1}\circ\mathrm{coor}^\eta$.)
From the perspective of the linear space itself, a change of basis is just a trivial identity transformation. However, from the viewpoint of the coordinate space, coordinate transformation is a non-trivial linear transformation.
Eigenvalues and Eigenvectors
Given a linear transformation $\mathcal{A}$, if we can find a basis $\zeta$ such that the matrix representation of $\mathcal{A}$ in $\zeta$, $A^\zeta$, is a diagonal matrix $\Lambda$, i.e.,

$$A^\zeta=\Lambda=\begin{pmatrix}\lambda_1&&\\&\ddots&\\&&\lambda_n\end{pmatrix},$$
then we have

$$\mathcal{A}\zeta_i=\lambda_i\,\zeta_i\quad(i=1,2,\cdots,n).$$
That is, the $n$ diagonal elements of the diagonal matrix are the eigenvalues of the linear transformation $\mathcal{A}$, and $\zeta_1, \zeta_2, \cdots, \zeta_n$ are the corresponding eigenvectors. Geometrically, if we take the eigenvectors corresponding to the $n$ eigenvalues as the basis, the action of the linear transformation on the basis vectors is merely a simple scaling.
To calculate a specific example, let us again turn our attention to the Euclidean plane with the standard orthonormal basis $\epsilon$. We know that the matrix

$$A^\epsilon=\begin{pmatrix}1&2\\-1&4\end{pmatrix}$$
represents a linear transformation that maps $\epsilon_1 = (1,0)^T$ to $A^\epsilon \epsilon_1 = (1,-1)^T$ and $\epsilon_2 = (0,1)^T$ to $A^\epsilon \epsilon_2 = (2,4)^T$.
This transformation process is a bit hard to visualize. But if we choose a basis consisting of the eigenvectors $\zeta_1 = (2,1)^T$ and $\zeta_2 = (1,1)^T$ and calculate the action of the linear transformation represented by the matrix $A^\epsilon$ on the basis $\zeta_1, \zeta_2$, we find that

$$A^\epsilon\zeta_1=\begin{pmatrix}4\\2\end{pmatrix}=2\zeta_1,\qquad A^\epsilon\zeta_2=\begin{pmatrix}3\\3\end{pmatrix}=3\zeta_2.$$
This shows that the linear transformation represented by the matrix $A^\epsilon$ can be regarded as a scaling in the directions of $\zeta_1$ and $\zeta_2$.
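The example can be verified mechanically; the short Python sketch below multiplies the matrix from the example against the two eigenvectors:

```python
# Verify: A scales zeta_1 = (2,1) by 2 and zeta_2 = (1,1) by 3,
# so 2 and 3 are eigenvalues of A with these eigenvectors.

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

A = [[1, 2], [-1, 4]]
z1, z2 = [2, 1], [1, 1]
print(mat_vec(A, z1))  # [4, 2] == 2 * z1
print(mat_vec(A, z2))  # [3, 3] == 3 * z2
```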
Copyright Notice: All articles in this blog are licensed under CC BY-SA 4.0 unless otherwise declared.