
3 Elementary Matrix Operations and Systems of Linear Equations

3.1 Elementary Matrix Operations and Elementary Matrices

Elementary Matrix Operations

Definitions: Elementary Matrix Operations

Let A be an m×n matrix. Any one of the following three operations on the rows [columns] of A is called an elementary row [column] operation:

  1. Interchanging any two rows [columns] of A;
  2. Multiplying any row [column] of A by a nonzero scalar;
  3. Adding any scalar multiple of a row [column] of A to another row [column].

Any of these three operations is called an elementary operation. Elementary operations are of type 1, type 2, or type 3 depending on whether they are obtained by (1), (2), or (3).

Example 3.1.1

Let

A = [ 1  2  3  4
      2  1  1  3
      4  0  1  2 ]

Interchanging the second row of A with the first row is an example of an elementary row operation of type 1. The resulting matrix is

B = [ 2  1  1  3
      1  2  3  4
      4  0  1  2 ]

Multiplying the second column of A by 3 is an example of an elementary column operation of type 2. The resulting matrix is

C = [ 1  6  3  4
      2  3  1  3
      4  0  1  2 ]

Adding 4 times the third row of A to the first row is an example of an elementary row operation of type 3. In this case, the resulting matrix is

M = [ 17  2  7  12
       2  1  1   3
       4  0  1   2 ]
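The three operations are easy to experiment with in code. The following is a minimal sketch in plain Python, applied to the matrix A of this example (the helper names are our own, not standard library functions):

```python
# The three elementary row/column operations of the definition, applied to the
# matrix A of Example 3.1.1. Plain Python lists; helper names are our own.

def swap_rows(M, i, j):                 # type 1: interchange rows i and j
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale_column(M, j, c):              # type 2: multiply column j by c != 0
    return [[c * x if k == j else x for k, x in enumerate(row)] for row in M]

def add_row_multiple(M, src, dst, c):   # type 3: add c * (row src) to row dst
    M = [row[:] for row in M]
    M[dst] = [x + c * y for x, y in zip(M[dst], M[src])]
    return M

A = [[1, 2, 3, 4],
     [2, 1, 1, 3],
     [4, 0, 1, 2]]

B = swap_rows(A, 0, 1)                  # interchange rows 1 and 2
C = scale_column(A, 1, 3)               # multiply column 2 by 3
M = add_row_multiple(A, 2, 0, 4)        # add 4 times row 3 to row 1

assert B == [[2, 1, 1, 3], [1, 2, 3, 4], [4, 0, 1, 2]]
assert C == [[1, 6, 3, 4], [2, 3, 1, 3], [4, 0, 1, 2]]
assert M == [[17, 2, 7, 12], [2, 1, 1, 3], [4, 0, 1, 2]]
```

Each helper returns a new matrix, so A itself is left unchanged and can be reused for each of the three operations.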


Definition: elementary matrix

An n×n elementary matrix is a matrix obtained by performing an elementary operation on I_n. The elementary matrix is said to be of type 1, 2, or 3 according to whether the elementary operation performed on I_n is a type 1, 2, or 3 operation, respectively.

Theorem 3.1

Let A ∈ M_{m×n}(F), and suppose that B is obtained from A by performing an elementary row [column] operation. Then there exists an m×m [n×n] elementary matrix E such that B = EA [B = AE]. In fact, E is obtained from I_m [I_n] by performing the same elementary row [column] operation as that which was performed on A to obtain B. Conversely, if E is an elementary m×m [n×n] matrix, then EA [AE] is the matrix obtained from A by performing the same elementary row [column] operation as that which produces E from I_m [I_n].

The proof is omitted; it amounts to verifying Theorem 3.1 for each type of elementary row operation. The proof for column operations can then be obtained by using the matrix transpose to convert a column operation into a row operation.

Example 3.1.2

Consider the matrices A and B in Example 3.1.1. In this case, B is obtained from A by interchanging the first two rows of A. Performing this same operation on I_3, we obtain the elementary matrix

E = [ 0  1  0
      1  0  0
      0  0  1 ]

Note that EA = B. (Left-multiplying by E performs the corresponding row operation; this is immediate from the definition of matrix multiplication.)

In the second part of Example 3.1.1, C is obtained from A by multiplying the second column of A by 3. Performing this same operation on I_4, we obtain the elementary matrix

E = [ 1  0  0  0
      0  3  0  0
      0  0  1  0
      0  0  0  1 ]

Observe that AE=C.
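Theorem 3.1 can be checked directly on both parts of this example. A small sketch in plain Python (the helpers are our own):

```python
# Theorem 3.1 on Example 3.1.2: a row operation on I_3 gives E with E A = B,
# and a column operation on I_4 gives E with A E = C. Helpers are our own.

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

A = [[1, 2, 3, 4], [2, 1, 1, 3], [4, 0, 1, 2]]

E1 = identity(3)                 # type 1: interchange rows 1 and 2 of I_3
E1[0], E1[1] = E1[1], E1[0]
assert matmul(E1, A) == [[2, 1, 1, 3], [1, 2, 3, 4], [4, 0, 1, 2]]   # = B

E2 = identity(4)                 # type 2: multiply column 2 of I_4 by 3
E2[1][1] = 3
assert matmul(A, E2) == [[1, 6, 3, 4], [2, 3, 1, 3], [4, 0, 1, 2]]   # = C
```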

Theorem 3.2: inverse of elementary matrix

Elementary matrices are invertible, and the inverse of an elementary matrix is an elementary matrix of the same type.
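A quick sketch of Theorem 3.2 for one type 3 elementary matrix (plain Python; helpers are our own): the operation "add 4 times row 3 to row 1" is undone by "add −4 times row 3 to row 1", which is again a type 3 operation.

```python
# Theorem 3.2, checked for a type 3 elementary matrix: E adds 4 times row 3
# to row 1; its inverse adds -4 times row 3 to row 1 (again type 3).

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

E = identity(3)
E[0][2] = 4            # add 4 * (row 3) to row 1
Einv = identity(3)
Einv[0][2] = -4        # add -4 * (row 3) to row 1

assert matmul(E, Einv) == identity(3)
assert matmul(Einv, E) == identity(3)
```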

3.2 The Rank of a Matrix and Matrix Inverses

Definition: rank

If A ∈ M_{m×n}(F), we define the rank of A, denoted rank(A), to be the rank of the linear transformation L_A : F^n → F^m.

Theorem 3.3

Let T : V → W be a linear transformation between finite-dimensional vector spaces, and let β, γ be ordered bases for V, W, respectively. Then rank(T) = rank([T]_β^γ).

Every matrix A is the matrix representation of the linear transformation L_A with respect to the appropriate standard ordered bases. Thus the rank of the linear transformation L_A is the same as the rank of one of its matrix representations, namely A.

Theorem 3.4

Let A be an m×n matrix. If P and Q are invertible m×m and n×n matrices, respectively, then

  • rank(AQ)=rank(A),
  • rank(PA)=rank(A), and therefore,
  • rank(PAQ)=rank(A).

Corollary

Let V, W be finite-dimensional vector spaces and T : V → W be an isomorphism. Let V_0 be a subspace of V. Then

  • T(V0) is a subspace of W;
  • dim(V0)=dim(T(V0)).

Corollary

Elementary row and column operations on a matrix are rank-preserving.

Theorem 3.5

The rank of any matrix equals the maximum number of its linearly independent columns; that is, the rank of a matrix is the dimension of the subspace generated by its columns.

Example 3.2.1 (Rank Determination)

Let

A = [ 1  0  1
      0  1  1
      1  0  1 ].

Observe that the first and second columns of A are linearly independent and that the third column is a linear combination of the first two. Thus,

rank(A) = dim(span{ (1, 0, 1)^t, (0, 1, 0)^t, (1, 1, 1)^t }) = 2.

Example 3.2.2

Let

A = [ 1  2  1
      1  0  3
      1  1  2 ].

If we subtract the first row of A from rows 2 and 3 (type 3 elementary row operations), the result is

[ 1  2  1
  0 -2  2
  0 -1  1 ].

If we now subtract twice the first column from the second and subtract the first column from the third (type 3 elementary column operations), we obtain

[ 1  0  0
  0 -2  2
  0 -1  1 ].

It's now obvious that the maximum number of linearly independent columns of this matrix is 2. Hence the rank of A is 2.
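Since elementary operations are rank-preserving, the rank can be computed by row-reducing and counting pivots. A minimal sketch using exact fractions (the function name is our own):

```python
# Rank by counting pivots of a row reduction; elementary operations preserve
# rank, so the pivot count is rank(A). The name `rank` is our own.
from fractions import Fraction

def rank(A):
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols, r = len(M), len(M[0]) if M else 0, 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]       # type 1: bring a nonzero entry up
        M[r] = [x / M[r][c] for x in M[r]]        # type 2: make the pivot 1
        for i in range(rows):             # type 3: clear the column
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

assert rank([[1, 2, 1], [1, 0, 3], [1, 1, 2]]) == 2   # Example 3.2.2
assert rank([[1, 0, 1], [0, 1, 1], [1, 0, 1]]) == 2   # Example 3.2.1
```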

Theorem 3.6

Let A be an m×n matrix of rank r. Then r ≤ m, r ≤ n, and, by means of a finite number of elementary row and column operations, A can be transformed into the matrix

D = [ I_r  O_1
      O_2  O_3 ],

where I_r is the r×r identity matrix and O_1, O_2, O_3 are zero matrices. Thus D_ii = 1 for i ≤ r and D_ij = 0 otherwise.

Example 3.2.3


Conclusion about rank

Corollary 1

Let A be an m×n matrix of rank r. Then there exist invertible matrices B (of size m×m) and C (of size n×n) such that D = BAC, where

D = [ I_r  O_1
      O_2  O_3 ]

is the m×n matrix in which O_1, O_2, O_3 are zero matrices.

Corollary 2

Let A be an m×n matrix. Then

  • rank(A^t) = rank(A).
  • The rank of any matrix equals the maximum number of its linearly independent rows; that is, the rank of a matrix is the dimension of the subspace generated by its rows.
  • The rows and columns of any matrix generate subspaces of the same dimension, numerically equal to the rank of the matrix.

Corollary 3

Every invertible matrix is a product of elementary matrices.

Theorem 3.7

Let T : V → W and U : W → Z be linear transformations on finite-dimensional vector spaces V, W, Z, and let A and B be matrices such that the product AB is defined. Then

  • rank(UT) ≤ rank(U)
  • rank(UT) ≤ rank(T)
  • rank(AB) ≤ rank(A)
  • rank(AB) ≤ rank(B)

Example 3.2.4


The Inverse of a Matrix

Definition: augmented matrix

Let A and B be m×n and m×p matrices, respectively. By the augmented matrix (A|B), we mean the m×(n+p) matrix (A B), that is, the matrix whose first n columns are the columns of A, and whose last p columns are the columns of B.

Let A be an invertible n×n matrix, and consider the n×2n augmented matrix C = (A|I_n). By Exercise 15, we have

A^{-1}C = (A^{-1}A | A^{-1}I_n) = (I_n | A^{-1}).    (1)

By Corollary 3 to Theorem 3.6, A^{-1} is a product of elementary matrices, say A^{-1} = E_p E_{p-1} ⋯ E_1. Thus (1) becomes

E_p E_{p-1} ⋯ E_1 (A|I_n) = A^{-1}C = (I_n | A^{-1}).

Because multiplying a matrix on the left by an elementary matrix transforms the matrix by an elementary row operation (Theorem 3.1, p. 149), we have the following result: If A is an invertible n×n matrix, then it is possible to transform the matrix (A|I_n) into the matrix (I_n|A^{-1}) by means of a finite number of elementary row operations.

Conversely, suppose that A is invertible and that, for some n×n matrix B, the matrix (A|I_n) can be transformed into the matrix (I_n|B) by a finite number of elementary row operations. Let E_1, E_2, …, E_p be the elementary matrices associated with these elementary row operations as in Theorem 3.1; then

E_p E_{p-1} ⋯ E_1 (A|I_n) = (I_n|B).    (2)

Letting M = E_p E_{p-1} ⋯ E_1, we have from (2) that (MA | MI_n) = (I_n | B). Hence MA = I_n and M = B, so M = A^{-1} and therefore B = A^{-1}. Thus we have the following result: If A is an invertible n×n matrix, and the matrix (A|I_n) is transformed into a matrix of the form (I_n|B) by means of a finite number of elementary row operations, then B = A^{-1}.

If, on the other hand, A is an n×n matrix that is not invertible, then rank(A) < n. Hence any attempt to transform (A|I_n) into a matrix of the form (I_n|B) by means of elementary row operations must fail, for otherwise A could be transformed into I_n by the same row operations. This is impossible, however, because elementary row operations preserve rank. In fact, A can be transformed into a matrix with a row containing only zero entries, yielding the following result: If A is an n×n matrix that is not invertible, then any attempt to transform (A|I_n) into a matrix of the form (I_n|B) produces a row whose first n entries are zeros.
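The whole procedure can be sketched as follows (a minimal pure-Python version with exact fractions; not intended as a robust implementation):

```python
# Invert A by row-reducing (A | I_n): reach (I_n | A^{-1}), or report failure
# when some pivot is missing (rank(A) < n). A minimal sketch, exact fractions.
from fractions import Fraction

def inverse(A):
    n = len(A)
    M = [[Fraction(x) for x in row] +
         [Fraction(1 if i == j else 0) for j in range(n)]
         for i, row in enumerate(A)]          # the augmented matrix (A | I_n)
    for c in range(n):
        piv = next((i for i in range(c, n) if M[i][c] != 0), None)
        if piv is None:
            return None                       # A is not invertible
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]             # the right half is A^{-1}

assert inverse([[1, 1, 2], [0, 1, 2], [0, 0, 1]]) == \
    [[1, -1, 0], [0, 1, -2], [0, 0, 1]]       # the matrix of Example 3.2.7
assert inverse([[1, 2, 1], [2, 1, -1], [1, 5, 4]]) is None   # Example 3.2.6
```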

Example 3.2.5


Example 3.2.6

We determine whether the matrix

A = [ 1  2  1
      2  1 -1
      1  5  4 ]

is invertible, and if it is, we compute its inverse. Using a strategy similar to the one used in Example 3.2.5, we attempt to use elementary row operations to transform

(A|I) = [ 1  2  1 | 1  0  0
          2  1 -1 | 0  1  0
          1  5  4 | 0  0  1 ]

into a matrix of the form (I|B). We first add -2 times row 1 to row 2 and -1 times row 1 to row 3. We then add row 2 to row 3. The result,

[ 1  2  1 | 1  0  0        [ 1  2  1 |  1  0  0        [ 1  2  1 |  1  0  0
  2  1 -1 | 0  1  0    →     0 -3 -3 | -2  1  0    →     0 -3 -3 | -2  1  0
  1  5  4 | 0  0  1 ]        0  3  3 | -1  0  1 ]        0  0  0 | -3  1  1 ],

is a matrix with a row whose first 3 entries are zeros. Therefore A is not invertible.

Example 3.2.7 [core]

Let T:P2(R)P2(R) be defined by

T(f(x)) = f(x) + f'(x) + f''(x),

where f'(x) and f''(x) denote the first and second derivatives of f(x). We use Corollary 1 of Theorem 2.18 to test T for invertibility and to compute the inverse if T is invertible. Taking β to be the standard ordered basis of P_2(R), we have

[T]_β = [ 1  1  2
          0  1  2
          0  0  1 ].

Using the method of Examples 3.2.5 and 3.2.6, we find that [T]_β is invertible with inverse

([T]_β)^{-1} = [ 1 -1  0
                 0  1 -2
                 0  0  1 ].

Thus T is invertible, and ([T]_β)^{-1} = [T^{-1}]_β. Hence by Theorem 2.14 (p. 91), we have

[T^{-1}(a_0 + a_1x + a_2x^2)]_β = [ 1 -1  0 ] [ a_0 ]   [ a_0 - a_1  ]
                                  [ 0  1 -2 ] [ a_1 ] = [ a_1 - 2a_2 ]
                                  [ 0  0  1 ] [ a_2 ]   [ a_2        ]

Therefore

T^{-1}(a_0 + a_1x + a_2x^2) = (a_0 - a_1) + (a_1 - 2a_2)x + a_2x^2.

3.3 Systems of Linear Equations - Theoretical Aspects

The system of equations

a_11 x_1 + a_12 x_2 + ⋯ + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ⋯ + a_2n x_n = b_2
        ⋮
a_m1 x_1 + a_m2 x_2 + ⋯ + a_mn x_n = b_m

where the a_ij and b_i are scalars in a field F and x_1, x_2, …, x_n are n variables taking values in F, is called a system (S) of m linear equations in n unknowns over the field F.

The m×n matrix

A = [ a_11  a_12  …  a_1n
      a_21  a_22  …  a_2n
        ⋮     ⋮          ⋮
      a_m1  a_m2  …  a_mn ]

is called the coefficient matrix of the system (S).

We can write the system as a single matrix equation

Ax=b,

where

x = (x_1, x_2, …, x_n)^t,   b = (b_1, b_2, …, b_m)^t.

To exploit the results that we have developed, we often consider a system of linear equations as a single matrix equation.

A solution to the system (S) is an n-tuple

s = (s_1, s_2, …, s_n) ∈ F^n

such that As=b. The set of all solutions to the system (S) is called the solution set of the system. System (S) is consistent if its solution set is nonempty; otherwise it is inconsistent.

Example 3.3.1

(a) Consider the system

x_1 + x_2 = 3
x_1 - x_2 = 1

The solution is unique: x_1 = 2, x_2 = 1.

In matrix form:

[ 1  1 ] [ x_1 ]   [ 3 ]
[ 1 -1 ] [ x_2 ] = [ 1 ].

(b) Consider the system

2x_1 + 3x_2 + x_3 = 1
x_1 - x_2 + 2x_3 = 6

which has infinitely many solutions, such as s = (-6, 2, 7) and s' = (8, -4, -3).

(c) Consider

x_1 + x_2 = 0
x_1 + x_2 = 1

that is

[ 1  1 ] [ x_1 ]   [ 0 ]
[ 1  1 ] [ x_2 ] = [ 1 ]

It's evident that this system has no solutions. Thus we see that a system of linear equations can have one, many, or no solutions.


Definitions: homogeneous

A system Ax = b is said to be homogeneous if b = 0; otherwise it is nonhomogeneous.

Theorem 3.8

Let Ax = 0 be a homogeneous system of m linear equations in n unknowns over a field F. Let K denote the set of all solutions to Ax = 0. Then K = N(L_A); hence K is a subspace of F^n of dimension n - rank(L_A) = n - rank(A).

Corollary

If m<n, the system Ax=0 has a nonzero solution.

Example 3.3.2 [core]

(a) Consider the system

x_1 + 2x_2 + x_3 = 0
x_1 - x_2 - x_3 = 0

Let

A = [ 1  2  1
      1 -1 -1 ]

be the coefficient matrix of this system. It is clear that rank(A) = 2. If K is the solution set of this system, then dim(K) = 3 - 2 = 1. Thus any nonzero solution constitutes a basis for K. For example, since

(1, -2, 3)

is a solution to the given system, it is a basis for K. Thus any vector in K is of the form

(t, -2t, 3t)

where t ∈ R.

(b) Consider the system x_1 - 2x_2 + x_3 = 0 of one equation in three unknowns. If A = [ 1 -2 1 ] is the coefficient matrix, then rank(A) = 1. Hence if K is the solution set, then dim(K) = 3 - 1 = 2. Note that

(2, 1, 0) and (-1, 0, 1)

are linearly independent vectors in K. Thus they constitute a basis for K, so that

K = { t_1(2, 1, 0) + t_2(-1, 0, 1) : t_1, t_2 ∈ R }.
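Both parts of this example can be reproduced mechanically: row-reduce A, then read off one basis vector of K per free variable. A sketch in plain Python using exact fractions (the function name is our own):

```python
# A basis for K = N(L_A): row-reduce A, then build one basis vector per free
# (non-pivot) column. Exact fractions; the function name is our own.
from fractions import Fraction

def null_space_basis(A):
    rows, cols = len(A), len(A[0])
    M = [[Fraction(x) for x in row] for row in A]
    pivots, r = [], 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[free] = Fraction(1)        # set the free variable to 1 ...
        for i, p in enumerate(pivots):
            v[p] = -M[i][free]       # ... and solve for the pivot variables
        basis.append(v)
    return basis

# Example 3.3.2(b): dim(K) = 3 - 1 = 2, with basis {(2, 1, 0), (-1, 0, 1)}.
assert null_space_basis([[1, -2, 1]]) == [[2, 1, 0], [-1, 0, 1]]

# Example 3.3.2(a): dim(K) = 3 - 2 = 1; the vector found spans the same line
# as (1, -2, 3).
(v,) = null_space_basis([[1, 2, 1], [1, -1, -1]])
assert [3 * x for x in v] == [1, -2, 3]
```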

Theorem 3.9

Let K be the solution set of a system Ax = b, and let K_H be the solution set of the corresponding homogeneous system Ax = 0. Then for any solution s to Ax = b,

K = { s } + K_H = { s + k : k ∈ K_H }.

Example 3.3.3

(a) Consider the system

x_1 + 2x_2 + x_3 = 7
x_1 - x_2 - x_3 = -4.

The corresponding homogeneous system is the system in Example 3.3.2(a). It is easily verified that

s = (1, 1, 4)

is a solution to the preceding nonhomogeneous system. So the solution set of the system is

K = { (1, 1, 4) + t(1, -2, 3) : t ∈ R }

by Theorem 3.9.

(b) Consider the system

x_1 - 2x_2 + x_3 = 4.

The corresponding homogeneous system is the system in Example 3.3.2(b). Since

s = (4, 0, 0)

is a solution to the given system, the solution set K can be written as

K = { (4, 0, 0) + t_1(2, 1, 0) + t_2(-1, 0, 1) : t_1, t_2 ∈ R }.

Theorem 3.10


Let Ax=b be a system of n linear equations in n unknowns. If A is invertible, then the system has exactly one solution, namely,

x = A^{-1}b.

Conversely, if the system has exactly one solution, then A is invertible.

Example 3.3.4

Consider the system:

2x_2 + 4x_3 = 2
2x_1 + 4x_2 + 2x_3 = 3
3x_1 + 3x_2 + x_3 = 1

Using the inverse of the coefficient matrix, the unique solution is

(x_1, x_2, x_3) = A^{-1}b = (-7/8, 5/4, -1/8).
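As a quick check (a sketch; exact fractions avoid any floating-point doubt), this solution indeed satisfies Ax = b:

```python
# Checking the unique solution of Example 3.3.4: A is invertible, so
# x = A^{-1} b is the only solution, and it must satisfy A x = b.
from fractions import Fraction

A = [[0, 2, 4], [2, 4, 2], [3, 3, 1]]
b = [2, 3, 1]
x = [Fraction(-7, 8), Fraction(5, 4), Fraction(-1, 8)]

assert [sum(a * xi for a, xi in zip(row, x)) for row in A] == b
```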

Theorem 3.11


Let Ax=b be a system of linear equations. Then the system is consistent if and only if

rank(A)=rank(A|b).

Example 3.3.5

Recall the system of equations

x_1 + x_2 = 0
x_1 + x_2 = 1

in Example 3.3.1(c). Since

A = [ 1  1       (A|b) = [ 1  1 | 0
      1  1 ],              1  1 | 1 ],

so rank(A) = 1 and rank(A|b) = 2.

Because the two ranks are unequal, the system has no solutions.

Example 3.3.6

We can use Theorem 3.11 to determine whether (3, 3, 2) is in the range of the linear transformation T : R^3 → R^3 defined by

T(a_1, a_2, a_3) = (a_1 + a_2 + a_3, a_1 - a_2 + a_3, a_1 + a_3).

We check whether (3, 3, 2) ∈ R(T) by solving the system

x_1 + x_2 + x_3 = 3,   x_1 - x_2 + x_3 = 3,   x_1 + x_3 = 2.

Since the rank of the coefficient matrix is 2 but the rank of the augmented matrix is 3, this system has no solutions. Hence (3, 3, 2) ∉ R(T).
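The rank test of Theorem 3.11 is easy to mechanize. A sketch with a small rank helper of our own (exact fractions), applied to this example:

```python
# Theorem 3.11 on Example 3.3.6: the system is consistent iff
# rank(A) = rank(A|b). The `rank` helper is our own.
from fractions import Fraction

def rank(A):
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 1, 1], [1, -1, 1], [1, 0, 1]]        # coefficient matrix of T
Ab = [row + [v] for row, v in zip(A, [3, 3, 2])]   # augmented with (3, 3, 2)

assert (rank(A), rank(Ab)) == (2, 3)   # unequal ranks: no solution exists
```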

3.4 Systems of Linear Equations - Computational Aspects

Definition: equivalent

Two systems of linear equations are called equivalent if they have the same solution set.

Theorem 3.13

Let Ax=b be a system of m linear equations in n unknowns, and let C be an invertible m×m matrix. Then (CA)x=Cb is equivalent to Ax=b.

Corollary

If (A'|b') is obtained from (A|b) by a finite number of elementary row operations, then the system A'x = b' is equivalent to the original system Ax = b.

Gaussian Elimination Method Outline:

  1. Form the augmented matrix (A|b).
  2. Use elementary row operations to transform (A|b) into reduced row echelon form.
  3. Solve the system from the reduced form using back substitution.

We now describe a method for solving any system of linear equations.

Consider the system:

3x_1 + 2x_2 + 3x_3 - 2x_4 = 1
x_1 + x_2 + x_3 = 3
x_1 + 2x_2 + x_3 - x_4 = 2.

First, form the augmented matrix:

[ 3  2  3 -2 | 1
  1  1  1  0 | 3
  1  2  1 -1 | 2 ].

Then perform elementary row operations to transform it into reduced row echelon form, using row interchanges, scaling, and elimination as follows.

  1. In the leftmost nonzero column, create a 1 in the first row (here, interchange rows 1 and 3):

     [ 1  2  1 -1 | 2
       1  1  1  0 | 3
       3  2  3 -2 | 1 ]

  2. By means of type 3 row operations, use the first row to obtain zeros in the remaining positions of the leftmost nonzero column:

     [ 1  2  1 -1 |  2
       0 -1  0  1 |  1
       0 -4  0  1 | -5 ]

  3. Create a 1 in the next row in the leftmost possible column, without using previous rows (multiply row 2 by -1):

     [ 1  2  1 -1 |  2
       0  1  0 -1 | -1
       0 -4  0  1 | -5 ]

  4. Use type 3 elementary row operations to obtain zeros below the 1 created in the preceding step (add 4 times row 2 to row 3):

     [ 1  2  1 -1 |  2
       0  1  0 -1 | -1
       0  0  0 -3 | -9 ]

  5. Repeat steps 3 and 4 until no nonzero rows remain (multiply row 3 by -1/3):

     [ 1  2  1 -1 |  2
       0  1  0 -1 | -1
       0  0  0  1 |  3 ]

  6. Work upward, beginning with the last nonzero row, and add multiples of each row to the rows above to create zeros above each leading 1. The reduction is complete after the process has been performed with the second row:

     [ 1  0  1  0 | 1
       0  1  0  0 | 2
       0  0  0  1 | 3 ]
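The steps above can be sketched as a short reduced-row-echelon-form routine (plain Python with exact fractions; names are our own):

```python
# RREF via the forward pass (create leading 1s, clear below) combined with
# clearing above each pivot; applied to the augmented matrix of this example.
from fractions import Fraction

def rref(A):
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue                         # no pivot in this column
        M[r], M[piv] = M[piv], M[r]          # bring a nonzero entry up
        M[r] = [x / M[r][c] for x in M[r]]   # create the leading 1
        for i in range(rows):                # clear the rest of the column
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

aug = [[3, 2, 3, -2, 1],
       [1, 1, 1, 0, 3],
       [1, 2, 1, -1, 2]]

assert rref(aug) == [[1, 0, 1, 0, 1],
                     [0, 1, 0, 0, 2],
                     [0, 0, 0, 1, 3]]
```

The intermediate matrices differ from the hand computation (the code scales row 1 instead of interchanging rows), but by the uniqueness of the reduced row echelon form the final matrix is the same.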

Definition: Reduced row echelon form

A matrix is said to be in reduced row echelon form if it satisfies the following conditions:

(a) Any row containing a nonzero entry precedes any row in which all the entries are zero.

(b) The first nonzero entry in each row is the only nonzero entry in its column.

(c) The first nonzero entry in each row is 1, and it occurs in a column to the right of the first nonzero entry in the preceding row.

The method described with forward elimination and back substitution is known as Gaussian elimination.

Theorem 3.14

Gaussian elimination transforms any matrix into its reduced row echelon form.

Theorem 3.15

Let Ax = b be a system of r nonzero equations in n unknowns. Suppose that rank(A) = rank(A|b) and that (A|b) is in reduced row echelon form. Then:

  • rank(A) = r.
  • If the general solution obtained by the procedure of this section is

s = s_0 + t_1 u_1 + ⋯ + t_{n-r} u_{n-r},

then {u_1, …, u_{n-r}} is a basis for the solution set of the corresponding homogeneous system, and s_0 is a solution to the original system.

Theorem 3.16

Let A be an m×n matrix of rank r>0, and B the reduced row echelon form of A. Then:

(a) The number of nonzero rows in B is r.

(b) For each i = 1, …, r, there is a column b_{j_i} of B such that b_{j_i} = e_i.

(c) The columns of A numbered j_1, …, j_r are linearly independent.

(d) For each k = 1, …, n, if column k of B is d_1 e_1 + ⋯ + d_r e_r, then column k of A is d_1 a_{j_1} + ⋯ + d_r a_{j_r}.

Corollary

The reduced row echelon form of a matrix is unique.

Example 3.4.2

Let

A = [ 2  4  6  2  4
      1  2  3  1  1
      2  4  8  0  0
      3  6  7  5  9 ].

The reduced row echelon form of A is

B = [ 1  2  0  4  0
      0  0  1 -1  0
      0  0  0  0  1
      0  0  0  0  0 ].

Since B has three nonzero rows, the rank of A is 3. The first, third, and fifth columns of B are e_1, e_2, and e_3; so Theorem 3.16(c) asserts that the first, third, and fifth columns of A are linearly independent.

Let the columns of A be denoted a_1, a_2, a_3, a_4, and a_5. Because the second column of B is 2e_1, it follows from Theorem 3.16(d) that a_2 = 2a_1, as is easily checked. Moreover, since the fourth column of B is 4e_1 + (-1)e_2, the same result shows that

a_4 = 4a_1 + (-1)a_3.
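Theorem 3.16 can be replayed on this example in code: the pivot columns of the reduced row echelon form pick out the independent columns of A, and the other columns of B record the dependencies. A sketch (function name is our own):

```python
# RREF plus the list of pivot columns, applied to the matrix A of Example
# 3.4.2. Exact fractions; `rref_pivots` is our own helper.
from fractions import Fraction

def rref_pivots(A):
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols, r, pivots = len(M), len(M[0]), 0, []
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

A = [[2, 4, 6, 2, 4],
     [1, 2, 3, 1, 1],
     [2, 4, 8, 0, 0],
     [3, 6, 7, 5, 9]]
B, pivots = rref_pivots(A)

assert pivots == [0, 2, 4]            # columns 1, 3, 5 of A are independent
assert B == [[1, 2, 0, 4, 0], [0, 0, 1, -1, 0],
             [0, 0, 0, 0, 1], [0, 0, 0, 0, 0]]

cols_A = list(zip(*A))
assert [2 * x for x in cols_A[0]] == list(cols_A[1])                  # a2 = 2a1
assert [4 * x - y for x, y in zip(cols_A[0], cols_A[2])] == list(cols_A[3])  # a4 = 4a1 - a3
```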

Example 3.4.3

The set

S = { 2 + x + 2x^2 + 3x^3, 4 + 2x + 4x^2 + 6x^3, 6 + 3x + 8x^2 + 7x^3, 2 + x + 5x^3, 4 + x + 9x^3 }

generates a subspace V of P_3(R). To find a subset of S that is a basis for V, we consider the subset

S' = { (2, 1, 2, 3), (4, 2, 4, 6), (6, 3, 8, 7), (2, 1, 0, 5), (4, 1, 0, 9) }

consisting of the images of the polynomials in S under the standard representation of P_3(R) with respect to the standard ordered basis. Note that the 4×5 matrix whose columns are the vectors in S' is the matrix A in Example 3.4.2. From the reduced row echelon form of A, which is the matrix B in Example 3.4.2, we see that the first, third, and fifth columns of A are linearly independent and the second and fourth columns of A are linear combinations of the first, third, and fifth columns. Hence

{(2,1,2,3),(6,3,8,7),(4,1,0,9)}

is a basis for the subspace of R4 that is generated by S. It follows that

{ 2 + x + 2x^2 + 3x^3, 6 + 3x + 8x^2 + 7x^3, 4 + x + 9x^3 }

is a basis for the subspace V of P3(R).

Example 3.4.4 [mid-term]

Let

V = { (x_1, x_2, x_3, x_4, x_5) ∈ R^5 : x_1 + 7x_2 + 5x_3 - 4x_4 + 2x_5 = 0 }.

It is easily verified that V is a subspace of R^5 and that

S = { (2, 0, 0, 1, 1), (-1, -1, 2, 1, 1), (-5, 1, 0, 1, 1) }

is a linearly independent subset of V.

To extend S to a basis for V, we first obtain a basis β for V. To do so, we solve the system of linear equations that defines V. Since in this case V is defined by a single equation, we need only write the equation as

x_1 = -7x_2 - 5x_3 + 4x_4 - 2x_5

and assign parametric values to x_2, x_3, x_4, x_5. If x_2 = t_1, x_3 = t_2, x_4 = t_3, and x_5 = t_4, then the vectors in V have the form

(x_1, x_2, x_3, x_4, x_5) = (-7t_1 - 5t_2 + 4t_3 - 2t_4, t_1, t_2, t_3, t_4)
                          = t_1(-7, 1, 0, 0, 0) + t_2(-5, 0, 1, 0, 0) + t_3(4, 0, 0, 1, 0) + t_4(-2, 0, 0, 0, 1).

Hence

β = { (-7, 1, 0, 0, 0), (-5, 0, 1, 0, 0), (4, 0, 0, 1, 0), (-2, 0, 0, 0, 1) }

is a basis for V by Theorem 3.15.

The matrix whose columns are the vectors in S followed by those in β is

[ 2  -1  -5  -7  -5   4  -2
  0  -1   1   1   0   0   0
  0   2   0   0   1   0   0
  1   1   1   0   0   1   0
  1   1   1   0   0   0   1 ]

and its reduced row echelon form is

[ 1  0  0  -1  -1    0   1
  0  1  0   0   0.5  0   0
  0  0  1   1   0.5  0   0
  0  0  0   0   0    1  -1
  0  0  0   0   0    0   0 ].

Thus

{ (2, 0, 0, 1, 1), (-1, -1, 2, 1, 1), (-5, 1, 0, 1, 1), (4, 0, 0, 1, 0) }

is a basis for V containing S.
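The same computation can be done mechanically: row-reduce the matrix whose columns are the vectors of S followed by those of β, and keep the pivot columns. A sketch in plain Python with exact fractions (the helper name and the choice of signs for the vectors are our own reading of the example; the pivot positions do not depend on those sign choices):

```python
# Extend S to a basis of V (Example 3.4.4): the pivot columns of the RREF of
# [S | beta] select S itself plus (4, 0, 0, 1, 0). Helper name is our own.
from fractions import Fraction

def rref_pivots(A):
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols, r, pivots = len(M), len(M[0]), 0, []
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

cols = [(2, 0, 0, 1, 1), (-1, -1, 2, 1, 1), (-5, 1, 0, 1, 1),          # S
        (-7, 1, 0, 0, 0), (-5, 0, 1, 0, 0), (4, 0, 0, 1, 0), (-2, 0, 0, 0, 1)]  # beta
M = [list(row) for row in zip(*cols)]     # the 5x7 matrix with these columns

_, pivots = rref_pivots(M)
assert pivots == [0, 1, 2, 5]   # the three vectors of S plus (4, 0, 0, 1, 0)
```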
