Implementing Gaussian Elimination#
We start with an implementation of Gaussian elimination as described previously. Note that a feature of this implementation is that the input A
and b
are changed by this routine, and on output they reflect the row-echelon
form. This is done to save memory.
We also, by default, will print the steps along the way
import numpy as np
def gauss_elim(A, b, quiet=0):
""" perform gaussian elimination with pivoting, solving A x = b.
A is an NxN matrix, x and b are an N-element vectors. Note: A
and b are changed upon exit to be in upper triangular (row
echelon) form """
assert b.ndim == 1, "ERROR: b should be a vector"
N = len(b)
# A is square, with each dimension of length N
assert A.shape == (N, N), "ERROR: A should be square with each dim of same length as b"
# allocation the solution array
x = np.zeros((N), dtype=A.dtype)
# find the scale factors for each row -- this is used when pivoting
scales = np.max(np.abs(A), 1)
# keep track of the number of times we swapped rows
num_row_swap = 0
if not quiet:
print_Ab(A, b)
# main loop over rows
for k in range(N):
# find the pivot row based on the size of column k -- only consider
# the rows beyond the current row
row_max = np.argmax(A[k:, k]/scales[k:])
if k > 0:
row_max += k # we sliced A from k:, correct for total rows
# swap the row with the largest scaled element in the current column
# with the current row (pivot) -- do this with b too!
if not row_max == k:
A[[k, row_max], :] = A[[row_max, k], :]
b[[k, row_max]] = b[[row_max, k]]
scales[[k, row_max]] = scales[[row_max, k]]
if not quiet:
print("pivoted")
num_row_swap += 1
# do the forward-elimination for all rows below the current
for i in range(k+1, N):
coeff = A[i, k] / A[k, k]
for j in range(k+1, N):
A[i, j] += -A[k, j] * coeff
A[i, k] = 0.0
b[i] += -coeff * b[k]
# check if the row is all zeros -- singular
if A[i, :].min() == 0 and A[i, :].max() == 0:
raise ValueError("matrix is singular")
if not quiet:
print_Ab(A, b)
# back-substitution
# last solution is easy
x[N-1] = b[N-1] / A[N-1, N-1]
for i in reversed(range(N-1)):
bsum = b[i]
for j in range(i+1, N):
bsum += -A[i, j] * x[j]
x[i] = bsum / A[i, i]
return x
This routine prints the augmented matrix \(({\bf A} | {\bf b})\).
def print_Ab(A, b):
""" printout the matrix A and vector b in a pretty fashion. We
don't use the numpy print here, because we want to make them
side by side"""
N = len(b)
# draw pretty brackets
topl = "\u256d"
topr = "\u256e"
side = "\u2502"
botl = "\u2570"
botr = "\u256F"
space = 8*" "
top_str = topl + N*" {:>7.03f} " + topr + space + topl + " {:6.3f} " + topr
sid_str = side + N*" {:>7.03f} " + side + space + side + " {:6.3f} " + side
bot_str = botl + N*" {:>7.03f} " + botr + space + botl + " {:6.3f} " + botr
print(" ")
for i in range(N):
if i == 0:
pstr = top_str
elif i == N-1:
pstr = bot_str
else:
pstr = sid_str
out = tuple(A[i, :]) + (b[i],)
print(pstr.format(*out))
Now we can test this out.
Important: be sure to create your arrays as floating point arrays. If we don’t specify, and just include integers in the initialization, then the array will be an integer array, and we will truncate all divisions and won’t get the right answer.
A = np.array([[1, 1, 1],
[-1, 2, 0],
[2, 0, 1]], dtype=np.float64)
b = np.array([6, 3, 5], dtype=np.float64)
# since our method changes A and b as it works, we'll send in copies
x = gauss_elim(A.copy(), b.copy())
╭ 1.000 1.000 1.000 ╮ ╭ 6.000 ╮
│ -1.000 2.000 0.000 │ │ 3.000 │
╰ 2.000 0.000 1.000 ╯ ╰ 5.000 ╯
╭ 1.000 1.000 1.000 ╮ ╭ 6.000 ╮
│ 0.000 3.000 1.000 │ │ 9.000 │
╰ 0.000 -2.000 -1.000 ╯ ╰ -7.000 ╯
╭ 1.000 1.000 1.000 ╮ ╭ 6.000 ╮
│ 0.000 3.000 1.000 │ │ 9.000 │
╰ 0.000 0.000 -0.333 ╯ ╰ -1.000 ╯
╭ 1.000 1.000 1.000 ╮ ╭ 6.000 ╮
│ 0.000 3.000 1.000 │ │ 9.000 │
╰ 0.000 0.000 -0.333 ╯ ╰ -1.000 ╯
print(x)
[1. 2. 3.]
To ensure that we got the right answer, we can compare \({\bf A}{\bf x}\) and \({\bf b}\). We can use the python matrix multiplication operator, @
:
print(A @ x - b)
[ 0.0000000e+00 -4.4408921e-16 0.0000000e+00]
\(4\times 4\) example#
Let’s try another, larger example
A = np.array([[1., 2., 3., 4.],
[5., 1., 6., -1.],
[10., 2., 13., -2.],
[4., 10., -2., -5.]])
b = np.array([3., 2., 4., 1.])
x = gauss_elim(A.copy(), b.copy())
╭ 1.000 2.000 3.000 4.000 ╮ ╭ 3.000 ╮
│ 5.000 1.000 6.000 -1.000 │ │ 2.000 │
│ 10.000 2.000 13.000 -2.000 │ │ 4.000 │
╰ 4.000 10.000 -2.000 -5.000 ╯ ╰ 1.000 ╯
pivoted
╭ 5.000 1.000 6.000 -1.000 ╮ ╭ 2.000 ╮
│ 0.000 1.800 1.800 4.200 │ │ 2.600 │
│ 0.000 0.000 1.000 0.000 │ │ 0.000 │
╰ 0.000 9.200 -6.800 -4.200 ╯ ╰ -0.600 ╯
pivoted
╭ 5.000 1.000 6.000 -1.000 ╮ ╭ 2.000 ╮
│ 0.000 9.200 -6.800 -4.200 │ │ -0.600 │
│ 0.000 0.000 1.000 0.000 │ │ 0.000 │
╰ 0.000 0.000 3.130 5.022 ╯ ╰ 2.717 ╯
pivoted
╭ 5.000 1.000 6.000 -1.000 ╮ ╭ 2.000 ╮
│ 0.000 9.200 -6.800 -4.200 │ │ -0.600 │
│ 0.000 0.000 3.130 5.022 │ │ 2.717 │
╰ 0.000 0.000 0.000 -1.604 ╯ ╰ -0.868 ╯
╭ 5.000 1.000 6.000 -1.000 ╮ ╭ 2.000 ╮
│ 0.000 9.200 -6.800 -4.200 │ │ -0.600 │
│ 0.000 0.000 3.130 5.022 │ │ 2.717 │
╰ 0.000 0.000 0.000 -1.604 ╯ ╰ -0.868 ╯
x
array([0.47186147, 0.18181818, 0. , 0.54112554])
A @ x - b
array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 6.66133815e-16])
Singular example#
changing A[2, 2]
from 13
to 12
makes the system singular, since row 2 is just double row 1.
A = np.array([[1., 2., 3., 4.],
[5., 1., 6., -1.],
[10., 2., 12., -2.],
[4., 10., -2., -5.]])
b = np.array([3., 2., 4., 1.])
x = gauss_elim(A, b)
╭ 1.000 2.000 3.000 4.000 ╮ ╭ 3.000 ╮
│ 5.000 1.000 6.000 -1.000 │ │ 2.000 │
│ 10.000 2.000 12.000 -2.000 │ │ 4.000 │
╰ 4.000 10.000 -2.000 -5.000 ╯ ╰ 1.000 ╯
pivoted
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[11], line 7
1 A = np.array([[1., 2., 3., 4.],
2 [5., 1., 6., -1.],
3 [10., 2., 12., -2.],
4 [4., 10., -2., -5.]])
5 b = np.array([3., 2., 4., 1.])
----> 7 x = gauss_elim(A, b)
Cell In[2], line 58, in gauss_elim(A, b, quiet)
56 # check if the row is all zeros -- singular
57 if A[i, :].min() == 0 and A[i, :].max() == 0:
---> 58 raise ValueError("matrix is singular")
60 if not quiet:
61 print_Ab(A, b)
ValueError: matrix is singular