Solving Linear Systems

This script is a tutorial for how to solve linear systems using both direct and iterative methods in Matlab.

A linear system of equations, of the form Ax=b, can be solved both directly and iteratively. Matlab has a number of different methods built into the function "\". To solve the system Ax=b you would call "A\b". This is by far the quickest way to directly solve a linear system since Matlab chooses the best method depending on the classification of the matrix.

N = 10;
A1=rand(N);
b1=rand(N,1);
tic
x=A1\b1;
toc
Elapsed time is 0.014921 seconds.

The other way of solving a linear system is by using iterative methods. This minimizes the residual or error to within a specified tolerance, solving the system. We can use the built in function "bicgstab" to use Biconjugate gradients stabilized method.

tic
bicgstab(A1,b1,1e-5,1000);
toc
bicgstab converged at iteration 17.5 to a solution with relative residual 7e-06.
Elapsed time is 0.056772 seconds.

Consider now if we change N to be much larger.

N = 1000;
A2=rand(N);
b2=rand(N,1);

tic
x=A2\b2;
toc

tic
bicgstab(A2,b2,1e-6,1000);
toc
Elapsed time is 0.034424 seconds.
bicgstab stopped at iteration 195 without converging to the desired tolerance 1e-06
because a scalar quantity became too small or too large to continue computing.
The iterate returned (number 1) has relative residual 0.5.
Elapsed time is 0.211840 seconds.

As you can see, for large dense matrices, the iterative method does not perform anywhere close to the matrix left divide function (in fact, in many cases it does not even converge). However, let's now consider a sparse matrix.

xmin = 0;
xmax = 50;
N = 2^10;
x = linspace(xmin,xmax,N+1);
dx=x(2)-x(1); dx2=dx*dx;
e=ones(N+1,1);
Dxx = spdiags([e -2*e e], -1:1, N+1, N+1);
Dxx=(1/dx2)*Dxx;
b3=ones(size(Dxx,1),1);

This is the second order differentiation matrix from an earlier script. It is sparse with only entries along the diagonal. Now let's try again:

tic
x=Dxx\b3;
toc

tic
bicgstab(Dxx,b3,1e-6,1000);
toc
Elapsed time is 0.000257 seconds.
bicgstab converged at iteration 600.5 to a solution with relative residual 5.3e-07.
Elapsed time is 0.061603 seconds.

In general iterative methods work better on sparse matrices and ones with diagonal dominance. In specific cases and for particular problems iterative methods will be better, but in most cases you should use the built in "backslash" solve since it has been optimized over decades to use the best method for a given matrix.

If you want to see exactly what "\" is doing under the hood, you can change the verbosity so that it prints its choices:

spparms('spumoni',2)
A2\rand(size(A2,1),1);
Dxx\b3;
sparse(rand(N).*round(rand(N)-0.2))\rand(N,1);
sp\: bandwidth = 1+1+1.
sp\: is A diagonal? no.
sp\: is it ok to solve real tridiagonal A without pivoting? yes.
sp\: bandwidth = 1023+1+1022.
sp\: is A diagonal? no.
sp\: is band density (0.299638) > bandden (0.5) to try banded solver? no.
sp\: is A triangular? no.
sp\: is A morally triangular? no.
sp\: is A a candidate for Cholesky (symmetric, real positive diagonal)? no.
sp\: use Unsymmetric MultiFrontal PACKagewith automatic reordering.
UMFPACK V5.4.0 (May 20, 2009), Control:
    Matrix entry defined as: double
    Int (generic integer) defined as: UF_long

    0: print level: 2
    1: dense row parameter:    0.2
        "dense" rows have    > max (16, (0.2)*16*sqrt(n_col) entries)
    2: dense column parameter: 0.2
        "dense" columns have > max (16, (0.2)*16*sqrt(n_row) entries)
    3: pivot tolerance: 0.1
    4: block size for dense matrix kernels: 32
    5: strategy: 0 (auto)
    6: initial allocation ratio: 0.7
    7: max iterative refinement steps: 2
    12: 2-by-2 pivot tolerance: 0.01
    13: Q fixed during numerical factorization: 0 (auto)
    14: AMD dense row/col parameter:    10
       "dense" rows/columns have > max (16, (10)*sqrt(n)) entries
        Only used if the AMD ordering is used.
    15: diagonal pivot tolerance: 0.001
        Only used if diagonal pivoting is attempted.
    16: scaling: 1 (divide each row by sum of abs. values in each row)
    17: frontal matrix allocation ratio: 0.5
    18: drop tolerance: 0
    19: AMD and COLAMD aggressive absorption: 1 (yes)

    The following options can only be changed at compile-time:
    8: BLAS library used:  Fortran BLAS.  size of BLAS integer: 8
    9: compiled for MATLAB
    10: CPU timer is POSIX times ( ) routine.
    11: compiled for normal operation (debugging disabled)
    computer/operating system: unknown
    size of int: 4 UF_long: 8 Int: 8 pointer: 8 double: 8 Entry: 8 (in bytes)

sp\: UMFPACK's factorization was successful.
sp\: UMFPACK's solve was successful.
UMFPACK V5.4.0 (May 20, 2009), Info:
    matrix entry defined as:          double
    Int (generic integer) defined as: UF_long
    BLAS library used: Fortran BLAS.  size of BLAS integer: 8
    MATLAB:                           yes.
    CPU timer:                        POSIX times ( ) routine.
    number of rows in matrix A:       1024
    number of columns in matrix A:    1024
    entries in matrix A:              314193
    memory usage reported in:         16-byte Units
    size of int:                      4 bytes
    size of UF_long:                  8 bytes
    size of pointer:                  8 bytes
    size of numerical entry:          8 bytes

    strategy used:                    unsymmetric
    ordering used:                    colamd on A
    modify Q during factorization:    yes
    prefer diagonal pivoting:         no
    pivots with zero Markowitz cost:               0
    submatrix S after removing zero-cost pivots:
        number of "dense" rows:                    1024
        number of "dense" columns:                 1024
        number of empty rows:                      0
        number of empty columns                    0
        submatrix S square and diagonal preserved
    pattern of square submatrix S:
        number rows and columns                    1024
        symmetry of nonzero pattern:               0.297084
        nz in S+S' (excl. diagonal):               534552
        nz on diagonal of matrix S:                289
        fraction of nz on diagonal:                0.282227
    symbolic factorization defragmentations:       0
    symbolic memory usage (Units):                 703584
    symbolic memory usage (MBytes):                10.7
    Symbolic size (Units):                         2629
    Symbolic size (MBytes):                        0
    symbolic factorization CPU time (sec):         0.02
    symbolic factorization wallclock time(sec):    0.02

    matrix scaled: yes (divided each row by sum of abs values in each row)
    minimum sum (abs (rows of A)):              1.28869e+02
    maximum sum (abs (rows of A)):              1.82775e+02

    symbolic/numeric factorization:      upper bound               actual      %
    variable-sized part of Numeric object:
        initial size (Units)                  642729               641704   100%
        peak size (Units)                    1731846              1504811    87%
        final size (Units)                    540112               520516    96%
    Numeric final size (Units)                545780               525672    96%
    Numeric final size (MBytes)                  8.3                  8.0    96%
    peak memory usage (Units)                1749383              1522348    87%
    peak memory usage (MBytes)                  26.7                 23.2    87%
    numeric factorization flops          7.10325e+08          6.96778e+08    98%
    nz in L (incl diagonal)                   522362               519916   100%
    nz in U (incl diagonal)                   524800               518172    99%
    nz in L+U (incl diagonal)                1046138              1037064    99%
    largest front (# entries)                1029120               984966    96%
    largest # rows in front                     1005                 1004   100%
    largest # columns in front                  1024                  983    96%

    initial allocation ratio used:                 0.7
    # of forced updates due to frontal growth:     3
    nz in L (incl diagonal), if none dropped       519916
    nz in U (incl diagonal), if none dropped       518172
    number of small entries dropped                0
    nonzeros on diagonal of U:                     1024
    min abs. value on diagonal of U:               1.16e-03
    max abs. value on diagonal of U:               2.45e-01
    estimate of reciprocal of condition number:    4.74e-03
    indices in compressed pattern:                 2607
    numerical values stored in Numeric object:     1038406
    numeric factorization defragmentations:        2
    numeric factorization reallocations:           2
    costly numeric factorization reallocations:    2
    numeric factorization CPU time (sec):          0.18
    numeric factorization wallclock time (sec):    0.14
    numeric factorization mflops (CPU time):       3870.99
    numeric factorization mflops (wallclock):      4976.98
    symbolic + numeric CPU time (sec):             0.20
    symbolic + numeric mflops (CPU time):          3483.89
    symbolic + numeric wall clock time (sec):      0.16
    symbolic + numeric mflops (wall clock):        4354.86

    solve flops:                                   6.99340e+06
    iterative refinement steps taken:              1
    iterative refinement steps attempted:          1
    sparse backward error omega1:                  1.60e-16
    sparse backward error omega2:                  0.00e+00
    solve CPU time (sec):                          0.01
    solve wall clock time (sec):                   0.00
    solve mflops (CPU time):                       699.34

    total symbolic + numeric + solve flops:        7.03771e+08
    total symbolic + numeric + solve CPU time:     0.21
    total symbolic + numeric + solve mflops (CPU): 3351.29