Last Update: August 30, 2025

Notations

In order to avoid possible ambiguity, let’s specify all the notations that we need first.

Linear Space: Since the goal of this paper is to give an intuitive picture, linear spaces $V$ are Euclidean spaces $\mathbb{R}^n$ by default. By convention, the vectors in $\mathbb{R}^n$ are column vectors. That said, this paper tries to stay general: statements that do not rely on special properties of $\mathbb{R}^n$ extend immediately to more general linear spaces.
Basis: A basis is a set of linearly independent vectors that spans the whole space. We use Greek letters to represent a basis, such as $\eta=\left(\eta_1,\eta_2,\cdots,\eta_n\right)$, where $\eta_i\in \mathbb{R}^n$ $(i=1,2,\cdots,n)$ is called the $i$-th basis vector of $\eta$.
Orthonormal basis: We fix the standard orthonormal basis $\epsilon=(\epsilon_1,\epsilon_2,\cdots,\epsilon_n)$, where the $i$-th component of $\epsilon_i\in\mathbb{R}^n$ $(i=1,2,\cdots,n)$ is $1$ and all other components are $0$.
Coordinates: Since we will consider multiple bases, it is necessary to distinguish between a vector and its coordinates. Given a vector $v=(v_1,v_2,\cdots,v_n )^T$, the coordinates of $v$ under the basis $\eta$ form an $n$-dimensional vector $v^\eta=(v^\eta_1, v^\eta_2,\cdots, v^\eta_n)^T$, which is determined by
$v^\eta_1 \eta _1+\cdots+ v^\eta_n \eta _n = ( \eta_1 \; \eta_2 \;\cdots\; \eta_n)\begin{pmatrix} v^\eta_1 \\ v^\eta_2 \\ \vdots\\ v^\eta_n \end{pmatrix} =v.$
In particular, we have $v^\epsilon =(\epsilon_1 \; \epsilon_2 \; \cdots \; \epsilon_n )v^\epsilon=v$.
Basis Matrix: $M_{\eta}$ $=( \eta_1^{\epsilon} \; \eta_2^\epsilon \; \cdots \; \eta_n^\epsilon)$. Note that the coordinates of the basis vectors are column vectors, so the only natural way to get the basis matrix is to put them together by column. In fact, according to the above discussion, we have
$M_\eta=(\eta_1 \;\eta_2 \; \cdots \; \eta_n).$
It follows that the basis matrix $M_\epsilon$ of the standard orthonormal basis is the identity matrix $I$, and that $v^\eta=M_\eta^{-1}v$ for any basis $\eta$.
Linear Transformation: A linear transformation $\mathcal{A}$ on $\mathbb{R}^n$ is a mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$ such that $\forall v_1,v_2\in\mathbb{R}^n$, $\forall \lambda _1, \lambda _2\in\mathbb{R}$,
$\mathcal{A} (\lambda_1 v_1+\lambda_2 v_2)= \lambda_1 \mathcal{A} (v_1)+\lambda_2 \mathcal{A} (v_2).$
$\mathcal{A} (v)$ can be denoted by $ \mathcal{A} v$ for simplicity.
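
The relation $v^\eta=M_\eta^{-1}v$ can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical basis $\eta_1=(2,1)^T$, $\eta_2=(1,1)^T$ of $\mathbb{R}^2$ and a hypothetical vector $v=(3,2)^T$; the helper names `inverse_2x2` and `mat_vec` are illustrative and not part of the notation above.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns are linearly dependent, so this is not a basis")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


# Assumed example basis: eta_1 = (2, 1)^T and eta_2 = (1, 1)^T are the columns of M_eta.
M_eta = [[2, 1],
         [1, 1]]
v = [3, 2]

v_eta = mat_vec(inverse_2x2(M_eta), v)  # coordinates of v under eta
v_back = mat_vec(M_eta, v_eta)          # recombining the basis vectors gives v again

print(v_eta, v_back)
```

Here `v_eta` comes out as $(1,1)^T$, which matches $v=\eta_1+\eta_2$.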

Matrix Representation of Linear Transformation

We will show that a linear transformation can be expressed as a matrix under a given basis. Conversely, given a basis, a matrix can also determine a linear transformation.

One way to understand a linear space is to imagine that the space is linearly generated by a basis. The basis is a basic framework, and vectors can “grow” on this framework. It is easier to consider the action of $\mathcal{A}$ on the basis than its action on all of $\mathbb{R}^n$. Once we know its action on the basis, its action on the vectors growing on this basis becomes explicit. More specifically, if $v$ is generated as follows:

$v=v^\eta_1 \eta_1+\cdots+ v^\eta_n \eta_n ,$

that is, if the coordinates of $v$ under the basis $\eta$ are $v^\eta$, then we have

$\mathcal{A} v= \mathcal{A} ( v^\eta_1 \eta_1+\cdots+ v^\eta_n \eta_n )=v^\eta_1 \mathcal{A} \eta_1 +\cdots + v^\eta_n \mathcal{A} \eta_n.$

That means the coordinates of $ \mathcal{A} v $ in the basis $ \mathcal{A} \eta = (\mathcal{A} \eta_1, \cdots, \mathcal{A} \eta_n) $ are also $ v^\eta $.

From this, we can see that although $ v $ and $ \mathcal{A} v $ grow on different bases ($ \eta $ and $ \mathcal{A} \eta$), their growth patterns (coordinates) are the same.

We assert that if we know the result of the action of $ \mathcal{A} $ on the old basis $ \eta $, or if we know the coordinates of the new basis $ \mathcal{A} \eta $ under the old basis $ \eta $, then all information about $ \mathcal{A} $ is known. In fact, suppose

$\mathcal{A} \eta_i = (\eta_1\; \eta_2\;\cdots\; \eta_n )\begin{pmatrix}a_{1i}\\a_{2i}\\ \vdots\\a_{ni} \end{pmatrix}, \quad i=1,2,\cdots,n,$

namely

$( \mathcal{A} \eta_1 \; \mathcal{A} \eta_2 \;\cdots\; \mathcal{A} \eta_n) = (\eta_1\; \eta_2\;\cdots\; \eta_n ) \begin{pmatrix} a_{11}& a_{12} &\cdots&a_{1n}\\ a_{21}& a_{22} &\cdots&a_{2n}\\ \vdots& \vdots &&\vdots\\ a_{n1}& a_{n2} &\cdots&a_{nn} \end{pmatrix}.$

The matrix whose columns are the coordinate vectors of $\mathcal{A} \eta_1, \cdots, \mathcal{A} \eta_n$ in the basis $\eta$ is denoted as

$A^\eta = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} = \left( (\mathcal{A} \eta_1)^\eta \; (\mathcal{A} \eta_2)^\eta \; \cdots \; (\mathcal{A} \eta_n)^\eta \right),$

which is called the matrix representation of $\mathcal{A}$ in the basis $\eta$. This naming is reasonable because we will soon see that $A^\eta$, along with the basis $\eta$, determines the result of $\mathcal{A}$ acting on any vector in $\mathbb{R}^n$, thereby fully determining $\mathcal{A}$. For any $v \in \mathbb{R}^n$, we have

$\mathcal{A} v = v^\eta_1 \mathcal{A} \eta_1 + \cdots + v^\eta_n \mathcal{A} \eta_n = (\mathcal{A} \eta_1 \; \mathcal{A} \eta_2 \; \cdots \; \mathcal{A} \eta_n) \begin{pmatrix} v^\eta_1 \\ v^\eta_2 \\ \vdots \\ v^\eta_n \end{pmatrix} = (\eta_1 \; \eta_2 \; \cdots \; \eta_n) A^\eta \begin{pmatrix} v^\eta_1 \\ v^\eta_2 \\ \vdots \\ v^\eta_n \end{pmatrix},$

which shows that the coordinates of $\mathcal{A} v$ in the basis $\eta$ are $A^\eta v^\eta$. In particular, in the standard orthonormal basis, where $v^\epsilon = v$, we get $\mathcal{A} v = A^\epsilon v$.
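
To make the construction concrete, the sketch below builds $A^\eta$ column by column from the $\eta$-coordinates of $\mathcal{A}\eta_i$ and then checks that the $\eta$-coordinates of $\mathcal{A}v$ equal $A^\eta v^\eta$. The transformation, the basis $\eta_1=(1,0)^T$, $\eta_2=(1,1)^T$, the vector $v$, and all helper names are assumptions chosen only for illustration.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def column(m, i):
    """Return column i of a matrix stored as a list of rows."""
    return [row[i] for row in m]


A_eps = [[1, -1], [1, 1]]   # assumed transformation, written in the standard basis
M_eta = [[1, 1], [0, 1]]    # assumed basis: columns are eta_1 and eta_2
M_eta_inv = inverse_2x2(M_eta)

# Column i of A^eta holds the eta-coordinates of A applied to eta_i.
cols = [mat_vec(M_eta_inv, mat_vec(A_eps, column(M_eta, i))) for i in range(2)]
A_eta = [[cols[j][i] for j in range(2)] for i in range(2)]  # store columns as rows

v = [3, 1]
v_eta = mat_vec(M_eta_inv, v)
coords_of_Av = mat_vec(M_eta_inv, mat_vec(A_eps, v))  # eta-coordinates of A v
predicted = mat_vec(A_eta, v_eta)                     # A^eta times v^eta

print(A_eta, coords_of_Av, predicted)
```

Both `coords_of_Av` and `predicted` evaluate to $(-2,4)^T$, so the two ways of computing the coordinates agree for this example.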

Furthermore, it can be proven that, for a fixed basis, this one-to-one correspondence between matrices and linear transformations preserves linear operations, multiplication (composition of mappings), and the identity element. The correspondence is therefore an isomorphism of linear spaces, of rings, and of associative algebras.

Understand Linear Transformation Intuitively

Now let’s turn our attention to the concrete Euclidean plane. Suppose we choose the standard orthonormal basis $\epsilon$ on $\mathbb{R}^2$, where the coordinates of vectors in the basis are equal to the vectors themselves. Then the matrix

$A^\epsilon = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$

represents a linear transformation. According to the conclusion we obtained earlier, the $i$th column of $A^\epsilon$ is the new vector resulting from the linear transformation acting on the $i$th basis vector of the standard orthonormal basis, so the $i$th column of $A^\epsilon$ equals $A^\epsilon \epsilon_i$. Of course, we can also see this directly from the relation

$(\mathcal{A} \epsilon_1 \; \mathcal{A} \epsilon_2 \; \cdots \; \mathcal{A} \epsilon_n) = (\epsilon_1 \; \epsilon_2 \; \cdots \; \epsilon_n) A^\epsilon = I A^\epsilon = A^\epsilon.$

Because $\epsilon_1 = (1,0)^T$ is mapped to $A^\epsilon \epsilon_1 = (1,1)^T$ and $\epsilon_2 = (0,1)^T$ is mapped to $A^\epsilon \epsilon_2 = (-1,1)^T$, we know that the linear transformation represented by $A^\epsilon$ is a $45^\circ$ counterclockwise rotation followed by a scaling by a factor of $\sqrt{2}$.
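
The rotation angle and the scaling factor can also be read off numerically from the image of $\epsilon_1$. The snippet below is a small sketch using only Python's `math` module; the variable names are illustrative.

```python
import math

A_eps = [[1, -1],
         [1, 1]]

# The first column of A_eps is the image of epsilon_1 = (1, 0)^T.
image_e1 = [A_eps[0][0], A_eps[1][0]]
scale = math.hypot(image_e1[0], image_e1[1])                     # length of the image
angle_deg = math.degrees(math.atan2(image_e1[1], image_e1[0]))  # counterclockwise angle

# Rebuild "rotate by the angle, then scale" and compare it with A_eps entry by entry.
theta = math.radians(angle_deg)
rebuilt = [
    [scale * math.cos(theta), -scale * math.sin(theta)],
    [scale * math.sin(theta), scale * math.cos(theta)],
]
matches = all(
    math.isclose(rebuilt[i][j], A_eps[i][j], abs_tol=1e-12)
    for i in range(2)
    for j in range(2)
)

print(scale, angle_deg, matches)
```

Up to floating-point rounding, `scale` is $\sqrt{2}$ and `angle_deg` is $45$, and the rebuilt matrix agrees with $A^\epsilon$.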

Basis Transformation and Similar Matrices

Next, we illustrate an important fact: given a linear transformation, its matrices in different bases are similar.

Let $\eta, \zeta$ be two bases. Define the linear transformation $\mathcal{T}$ by

$\mathcal{T}(\lambda_1 \eta_1 + \cdots + \lambda_n \eta_n) = \lambda_1 \zeta_1 + \cdots + \lambda_n \zeta_n,$

which maps $\eta_i$ to $\zeta_i$ for $i = 1, 2, \cdots, n$. In other words, the basis $\zeta$ equals the basis $\mathcal{T} \eta$. $\mathcal{T}$ is called the transition transformation from the basis $\eta$ to the basis $\zeta$, and the matrix representation of $\mathcal{T}$ in the basis $\eta$, denoted as $T^\eta$, is called the transition matrix from the basis $\eta$ to the basis $\zeta$.

Given a linear transformation $\mathcal{A}$, we have on the one hand

$(\mathcal{A} \zeta_1 \; \mathcal{A} \zeta_2 \; \cdots \; \mathcal{A} \zeta_n) = (\zeta_1 \; \zeta_2 \; \cdots \; \zeta_n) A^\zeta = (\mathcal{T} \eta_1 \; \mathcal{T} \eta_2 \; \cdots \; \mathcal{T} \eta_n) A^\zeta = (\eta_1 \; \eta_2 \; \cdots \; \eta_n) T^\eta A^\zeta.$

On the other hand,

$(\mathcal{A} \zeta_1 \; \mathcal{A} \zeta_2 \; \cdots \; \mathcal{A} \zeta_n) = (\mathcal{A} \mathcal{T} \eta_1 \; \mathcal{A} \mathcal{T} \eta_2 \; \cdots \; \mathcal{A} \mathcal{T} \eta_n) = (\eta_1 \; \eta_2 \; \cdots \; \eta_n) A^\eta T^\eta.$

Thus, we obtain

$A^\zeta = (T^\eta)^{-1} A^\eta T^\eta,$

which means that the matrices of the same linear transformation in different bases are similar.
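
The similarity relation can be verified on a small example. The sketch below assumes a transformation given in the standard basis and two hypothetical bases $\eta$ and $\zeta$ of $\mathbb{R}^2$. It uses the fact that the columns of the matrix in a basis are the coordinates of the images of the basis vectors, and that $M_\zeta = M_\eta T^\eta$; the helper names are illustrative only.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matrix_in_basis(a_eps, m_basis):
    """Columns are the basis-coordinates of A applied to each basis vector."""
    return mat_mul(inverse_2x2(m_basis), mat_mul(a_eps, m_basis))


A_eps = [[1, -1], [1, 1]]   # assumed transformation in the standard basis
M_eta = [[1, 1], [0, 1]]    # assumed old basis: eta_1 = (1,0)^T, eta_2 = (1,1)^T
M_zeta = [[2, 1], [1, 1]]   # assumed new basis: zeta_1 = (2,1)^T, zeta_2 = (1,1)^T

A_eta = matrix_in_basis(A_eps, M_eta)
A_zeta = matrix_in_basis(A_eps, M_zeta)

# Transition matrix: column i holds the eta-coordinates of zeta_i.
T_eta = mat_mul(inverse_2x2(M_eta), M_zeta)

similar = mat_mul(inverse_2x2(T_eta), mat_mul(A_eta, T_eta))
print(A_zeta, similar)
```

For these choices `A_zeta` and `similar` are the same matrix, as the formula $A^\zeta = (T^\eta)^{-1} A^\eta T^\eta$ predicts.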

Basis Transformation and Coordinate Transformation

Let $\mathcal{T}$ be the transition transformation from the basis $\eta$ to the basis $\zeta$, and let $T^\eta$ be the transition matrix from the basis $\eta$ to the basis $\zeta$. If the coordinates of $v$ in the old basis $\eta$ are $v^\eta$, that is,

$v = (\eta_1 \; \eta_2 \; \cdots \; \eta_n) \begin{pmatrix} v^\eta_1 \\ v^\eta_2 \\ \vdots \\ v^\eta_n \end{pmatrix},$

and the coordinates of $v$ in the new basis $\zeta$ are $v^\zeta$, that is,

$v= ( \zeta _1 \; \zeta _2 \;\cdots\; \zeta _n) \begin{pmatrix} v^\zeta_1 \\ v^\zeta_2 \\ \vdots\\ v^\zeta_n \end{pmatrix},$

then we have

$v= (\mathcal{T} \eta _1 \; \mathcal{T}\eta _2 \;\cdots\; \mathcal{T}\eta _n) \begin{pmatrix} v^\zeta_1 \\ v^\zeta_2 \\ \vdots\\ v^\zeta_n \end{pmatrix}= (\eta _1 \; \eta _2 \;\cdots\; \eta _n)T^\eta \begin{pmatrix} v^\zeta_1 \\ v^\zeta_2 \\ \vdots\\ v^\zeta_n \end{pmatrix}.$

Since the coordinates of $v$ in the basis $\eta$ are unique, comparing the coefficients gives

$\begin{pmatrix} v^\eta_1 \\ v^\eta_2 \\ \vdots\\ v^\eta_n \end{pmatrix} =T^\eta \begin{pmatrix} v^\zeta_1 \\ v^\zeta_2 \\ \vdots\\ v^\zeta_n \end{pmatrix}.$

So we obtain the relationship between the coordinates of the vector $v$ in the new basis $\zeta$ and the old basis $\eta$:

$\begin{pmatrix} v^\zeta_1 \\ v^\zeta_2 \\ \vdots\\ v^\zeta_n \end{pmatrix} =(T^\eta)^{-1} \begin{pmatrix} v^\eta_1 \\ v^\eta_2 \\ \vdots\\ v^\eta_n \end{pmatrix},$

which is also called the coordinate transformation formula.
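
As a quick check of the formula, here is a small worked example in $\mathbb{R}^2$; the bases and the vector are chosen purely for illustration. Take $\eta_1=(1,0)^T$, $\eta_2=(1,1)^T$ and $\zeta_1=(2,1)^T$, $\zeta_2=(1,1)^T$. Since $\zeta_1=\eta_1+\eta_2$ and $\zeta_2=\eta_2$, the transition matrix and its inverse are

$T^\eta=\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \quad (T^\eta)^{-1}=\begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}.$

For $v=(3,2)^T=\eta_1+2\eta_2$ we have $v^\eta=(1,2)^T$, so

$v^\zeta=(T^\eta)^{-1}v^\eta=\begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix}=\begin{pmatrix} 1 \\ 1 \end{pmatrix},$

and indeed $\zeta_1+\zeta_2=(3,2)^T=v$.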

Note the commutative diagram below. (A commutative diagram means that two composite morphisms are equal if their starting points and endpoints are the same. In this case, commutativity means $\mathrm{coor}^\zeta\circ\mathrm{id}=(T^\eta)^{-1}\circ\mathrm{coor}^\eta$.)

From the perspective of the linear space itself, a change of basis is just a trivial identity transformation. However, from the viewpoint of the coordinate space, coordinate transformation is a non-trivial linear transformation.

Eigenvalues and Eigenvectors

Given a linear transformation $\mathcal{A}$, if we can find a basis $\zeta$ such that the matrix representation of $\mathcal{A}$ in $\zeta$, $A^\zeta$, is a diagonal matrix $\Lambda$, i.e.,

$A^\zeta = \Lambda = \begin{pmatrix} \lambda_{1} & & & \\ & \lambda_{2} & & \\ & & \ddots & \\ & & & \lambda_{n} \end{pmatrix},$

then we have

$(\mathcal{A} \zeta_1 \; \mathcal{A} \zeta_2 \; \cdots \; \mathcal{A} \zeta_n) = (\zeta_1 \; \zeta_2 \; \cdots \; \zeta_n) \begin{pmatrix} \lambda_{1} & & & \\ & \lambda_{2} & & \\ & & \ddots & \\ & & & \lambda_{n} \end{pmatrix} = (\lambda_{1} \zeta_1 \; \lambda_{2} \zeta_2 \; \cdots \; \lambda_{n} \zeta_n).$

That is, the $n$ diagonal elements $\lambda_{1}, \lambda_{2}, \cdots, \lambda_{n}$ of the diagonal matrix are the eigenvalues of the linear transformation $\mathcal{A}$, and $\zeta_1, \zeta_2, \cdots, \zeta_n$ are the corresponding eigenvectors. Geometrically, if we take the eigenvectors corresponding to the $n$ eigenvalues as the basis, the action of the linear transformation on the basis vectors is merely a simple scaling.

To calculate a specific example, let us again turn our attention to the Euclidean plane with the standard orthonormal basis $\epsilon$. We know that the matrix

$A^\epsilon = \begin{pmatrix} 1 & 2 \\ -1 & 4 \end{pmatrix}$

represents a linear transformation that maps $\epsilon_1 = (1,0)^T$ to $A^\epsilon \epsilon_1 = (1,-1)^T$ and $\epsilon_2 = (0,1)^T$ to $A^\epsilon \epsilon_2 = (2,4)^T$.

This transformation process is a bit hard to visualize. But if we choose a basis consisting of eigenvectors $\zeta_1 = (2,1)^T$ and $\zeta_2 = (1,1)^T$ and calculate the action of the linear transformation represented by the matrix $A^\epsilon$ on the basis $\zeta_1, \zeta_2$, we find that

$A^\epsilon \zeta_1 = \begin{pmatrix} 4 \\ 2 \end{pmatrix} = 2 \zeta_1, \quad A^\epsilon \zeta_2 = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3 \zeta_2.$

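This shows that the linear transformation represented by the matrix $A^\epsilon$ can be regarded as a scaling by a factor of $2$ in the direction of $\zeta_1$ and by a factor of $3$ in the direction of $\zeta_2$. Equivalently, its matrix in the basis $\zeta$ is $A^\zeta = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$.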
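
The example can be reproduced in a few lines of plain Python. The sketch below first obtains the eigenvalues of the $2\times 2$ matrix from its characteristic polynomial $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$, and then changes to the eigenvector basis to confirm that the matrix becomes diagonal. The helper names are illustrative, and the quadratic-formula shortcut only applies to the $2\times 2$ case with real roots.

```python
import math
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A_eps = [[1, 2],
         [-1, 4]]

# Eigenvalues of a 2x2 matrix are the roots of lambda^2 - trace*lambda + det = 0.
trace = A_eps[0][0] + A_eps[1][1]
det = A_eps[0][0] * A_eps[1][1] - A_eps[0][1] * A_eps[1][0]
root = math.sqrt(trace ** 2 - 4 * det)  # real here because the discriminant is positive
eigenvalues = sorted([(trace - root) / 2, (trace + root) / 2])

# Columns of M_zeta are the eigenvectors zeta_1 = (2,1)^T and zeta_2 = (1,1)^T.
M_zeta = [[2, 1],
          [1, 1]]
A_zeta = mat_mul(inverse_2x2(M_zeta), mat_mul(A_eps, M_zeta))

print(eigenvalues, A_zeta)
```

The eigenvalues come out as $2$ and $3$, and `A_zeta` is the diagonal matrix with these values on its diagonal.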
