Implementing Gaussian Elimination

Implementing Gaussian Elimination#

We start with an implementation of Gaussian elimination as described previously. Note that a feature of this implementation is that the input A and b are changed by this routine, and on output they reflect the row-echelon form. This is done to save memory.

We also, by default, will print the steps along the way

import numpy as np
def gauss_elim(A, b, quiet=0):
    """ perform gaussian elimination with pivoting, solving A x = b.

        A is an NxN matrix, x and b are an N-element vectors.  Note: A
        and b are changed upon exit to be in upper triangular (row
        echelon) form """

    assert b.ndim == 1, "ERROR: b should be a vector"

    N = len(b)

    # A is square, with each dimension of length N
    assert A.shape == (N, N), "ERROR: A should be square with each dim of same length as b"

    # allocation the solution array
    x = np.zeros((N), dtype=A.dtype)

    # find the scale factors for each row -- this is used when pivoting
    scales = np.max(np.abs(A), 1)

    # keep track of the number of times we swapped rows
    num_row_swap = 0

    if not quiet:
        print_Ab(A, b)

    # main loop over rows
    for k in range(N):

        # find the pivot row based on the size of column k -- only consider
        # the rows beyond the current row
        row_max = np.argmax(A[k:, k]/scales[k:])
        if k > 0:
            row_max += k  # we sliced A from k:, correct for total rows

        # swap the row with the largest scaled element in the current column
        # with the current row (pivot) -- do this with b too!
        if not row_max == k:
            A[[k, row_max], :] = A[[row_max, k], :]
            b[[k, row_max]] = b[[row_max, k]]
            scales[[k, row_max]] = scales[[row_max, k]]
            if not quiet:
                print("pivoted")
            num_row_swap += 1

        # do the forward-elimination for all rows below the current
        for i in range(k+1, N):
            coeff = A[i, k] / A[k, k]

            for j in range(k+1, N):
                A[i, j] += -A[k, j] * coeff

            A[i, k] = 0.0
            b[i] += -coeff * b[k]
            
            # check if the row is all zeros -- singular
            if A[i, :].min() == 0 and A[i, :].max() == 0:
                raise ValueError("matrix is singular")

        if not quiet:
            print_Ab(A, b)

    # back-substitution

    # last solution is easy
    x[N-1] = b[N-1] / A[N-1, N-1]

    for i in reversed(range(N-1)):
        bsum = b[i]
        for j in range(i+1, N):
            bsum += -A[i, j] * x[j]
        x[i] = bsum / A[i, i]

    return x

This routine prints the augmented matrix \(({\bf A} | {\bf b})\).

def print_Ab(A, b):
    """ printout the matrix A and vector b in a pretty fashion.  We
        don't use the numpy print here, because we want to make them
        side by side"""

    N = len(b)

    # draw pretty brackets

    topl = "\u256d"
    topr = "\u256e"
    side = "\u2502"
    botl = "\u2570"
    botr = "\u256F"

    space = 8*" "

    top_str = topl + N*" {:>7.03f} " + topr + space + topl + " {:6.3f} " + topr
    sid_str = side + N*" {:>7.03f} " + side + space + side + " {:6.3f} " + side
    bot_str = botl + N*" {:>7.03f} " + botr + space + botl + " {:6.3f} " + botr

    print(" ")
    for i in range(N):
        if i == 0:
            pstr = top_str
        elif i == N-1:
            pstr = bot_str
        else:
            pstr = sid_str
        out = tuple(A[i, :]) + (b[i],)
        print(pstr.format(*out))

Now we can test this out.

Important: be sure to create your arrays as floating point arrays. If we don’t specify, and just include integers in the initialization, then the array will be an integer array, and we will truncate all divisions and won’t get the right answer.

A = np.array([[1, 1, 1],
              [-1, 2, 0],
              [2, 0, 1]], dtype=np.float64)

b = np.array([6, 3, 5], dtype=np.float64)

# since our method changes A and b as it works, we'll send in copies
x = gauss_elim(A.copy(), b.copy())
 
╭   1.000    1.000    1.000 ╮        ╭  6.000 ╮
│  -1.000    2.000    0.000 │        │  3.000 │
╰   2.000    0.000    1.000 ╯        ╰  5.000 ╯
 
╭   1.000    1.000    1.000 ╮        ╭  6.000 ╮
│   0.000    3.000    1.000 │        │  9.000 │
╰   0.000   -2.000   -1.000 ╯        ╰ -7.000 ╯
 
╭   1.000    1.000    1.000 ╮        ╭  6.000 ╮
│   0.000    3.000    1.000 │        │  9.000 │
╰   0.000    0.000   -0.333 ╯        ╰ -1.000 ╯
 
╭   1.000    1.000    1.000 ╮        ╭  6.000 ╮
│   0.000    3.000    1.000 │        │  9.000 │
╰   0.000    0.000   -0.333 ╯        ╰ -1.000 ╯
print(x)
[1. 2. 3.]

To ensure that we got the right answer, we can compare \({\bf A}{\bf x}\) and \({\bf b}\). We can use the python matrix multiplication operator, @:

print(A @ x - b)
[ 0.0000000e+00 -4.4408921e-16  0.0000000e+00]

\(4\times 4\) example#

Let’s try another, larger example

A = np.array([[1., 2., 3., 4.],
              [5., 1., 6., -1.],
              [10., 2., 13., -2.],
              [4., 10., -2., -5.]])
b = np.array([3., 2., 4., 1.])
x = gauss_elim(A.copy(), b.copy())
 
╭   1.000    2.000    3.000    4.000 ╮        ╭  3.000 ╮
│   5.000    1.000    6.000   -1.000 │        │  2.000 │
│  10.000    2.000   13.000   -2.000 │        │  4.000 │
╰   4.000   10.000   -2.000   -5.000 ╯        ╰  1.000 ╯
pivoted
 
╭   5.000    1.000    6.000   -1.000 ╮        ╭  2.000 ╮
│   0.000    1.800    1.800    4.200 │        │  2.600 │
│   0.000    0.000    1.000    0.000 │        │  0.000 │
╰   0.000    9.200   -6.800   -4.200 ╯        ╰ -0.600 ╯
pivoted
 
╭   5.000    1.000    6.000   -1.000 ╮        ╭  2.000 ╮
│   0.000    9.200   -6.800   -4.200 │        │ -0.600 │
│   0.000    0.000    1.000    0.000 │        │  0.000 │
╰   0.000    0.000    3.130    5.022 ╯        ╰  2.717 ╯
pivoted
 
╭   5.000    1.000    6.000   -1.000 ╮        ╭  2.000 ╮
│   0.000    9.200   -6.800   -4.200 │        │ -0.600 │
│   0.000    0.000    3.130    5.022 │        │  2.717 │
╰   0.000    0.000    0.000   -1.604 ╯        ╰ -0.868 ╯
 
╭   5.000    1.000    6.000   -1.000 ╮        ╭  2.000 ╮
│   0.000    9.200   -6.800   -4.200 │        │ -0.600 │
│   0.000    0.000    3.130    5.022 │        │  2.717 │
╰   0.000    0.000    0.000   -1.604 ╯        ╰ -0.868 ╯
x
array([0.47186147, 0.18181818, 0.        , 0.54112554])
A @ x - b
array([0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 6.66133815e-16])

Singular example#

changing A[2, 2] from 13 to 12 makes the system singular, since row 2 is just double row 1.

A = np.array([[1., 2., 3., 4.],
              [5., 1., 6., -1.],
              [10., 2., 12., -2.],
              [4., 10., -2., -5.]])
b = np.array([3., 2., 4., 1.])

x = gauss_elim(A, b)
 
╭   1.000    2.000    3.000    4.000 ╮        ╭  3.000 ╮
│   5.000    1.000    6.000   -1.000 │        │  2.000 │
│  10.000    2.000   12.000   -2.000 │        │  4.000 │
╰   4.000   10.000   -2.000   -5.000 ╯        ╰  1.000 ╯
pivoted
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 7
      1 A = np.array([[1., 2., 3., 4.],
      2               [5., 1., 6., -1.],
      3               [10., 2., 12., -2.],
      4               [4., 10., -2., -5.]])
      5 b = np.array([3., 2., 4., 1.])
----> 7 x = gauss_elim(A, b)

Cell In[2], line 58, in gauss_elim(A, b, quiet)
     56     # check if the row is all zeros -- singular
     57     if A[i, :].min() == 0 and A[i, :].max() == 0:
---> 58         raise ValueError("matrix is singular")
     60 if not quiet:
     61     print_Ab(A, b)

ValueError: matrix is singular