Solving Linear Systems
This script is a tutorial on solving linear systems with both direct and iterative methods in MATLAB.
A linear system of equations of the form Ax=b can be solved either directly or iteratively. MATLAB builds a number of direct methods into the backslash operator "\": to solve the system Ax=b you call "A\b". This is usually the quickest way to solve a linear system directly, since MATLAB inspects the matrix and chooses the best method for its structure.
N = 10; A1 = rand(N); b1 = rand(N,1);
tic
x = A1\b1;
toc
Elapsed time is 0.014921 seconds.
The other way to solve a linear system is with an iterative method, which repeatedly refines an approximate solution until the residual falls below a specified tolerance. We can use the built-in function "bicgstab", which implements the biconjugate gradients stabilized method.
tic
bicgstab(A1,b1,1e-5,1000);
toc
bicgstab converged at iteration 17.5 to a solution with relative residual 7e-06.
Elapsed time is 0.056772 seconds.
Consider now if we change N to be much larger.
N = 1000; A2 = rand(N); b2 = rand(N,1);
tic
x = A2\b2;
toc
tic
bicgstab(A2,b2,1e-6,1000);
toc
Elapsed time is 0.034424 seconds.
bicgstab stopped at iteration 195 without converging to the desired tolerance 1e-06 because a scalar quantity became too small or too large to continue computing. The iterate returned (number 1) has relative residual 0.5.
Elapsed time is 0.211840 seconds.
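Rather than parsing the printed messages, you can capture bicgstab's additional outputs and test the convergence flag programmatically. A minimal sketch, reusing the A2 and b2 defined above (when output arguments are requested, bicgstab suppresses its printed messages):

```matlab
% flag = 0 means bicgstab converged to the requested tolerance;
% nonzero values indicate stagnation, breakdown, or reaching maxit.
[x, flag, relres, iter] = bicgstab(A2, b2, 1e-6, 1000);
if flag ~= 0
    fprintf('bicgstab did not converge (flag %d, relres %g)\n', flag, relres);
end
```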
As you can see, for large dense matrices the iterative method performs nowhere near as well as matrix left division (in fact, in many cases it does not even converge). However, let's now consider a sparse matrix.
xmin = 0; xmax = 50; N = 2^10;
x = linspace(xmin,xmax,N+1);
dx = x(2)-x(1); dx2 = dx*dx;
e = ones(N+1,1);
Dxx = spdiags([e -2*e e], -1:1, N+1, N+1);
Dxx = (1/dx2)*Dxx;
b3 = ones(size(Dxx,1),1);
This is the second-order differentiation matrix from an earlier script. It is sparse and tridiagonal: its only nonzero entries lie on the main diagonal and the two diagonals adjacent to it. Now let's try again:
tic
x = Dxx\b3;
toc
tic
bicgstab(Dxx,b3,1e-6,1000);
toc
Elapsed time is 0.000257 seconds.
bicgstab converged at iteration 600.5 to a solution with relative residual 5.3e-07.
Elapsed time is 0.061603 seconds.
In general, iterative methods work best on matrices that are sparse and diagonally dominant. For specific problems an iterative method will win, but in most cases you should use the built-in backslash solve, since it has been optimized over decades to pick the best method for a given matrix (note that even for this sparse system, backslash is still faster).
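When an iterative method does make sense, a preconditioner usually closes much of the gap. As a sketch, reusing the Dxx and b3 defined above, an incomplete LU factorization can be passed to bicgstab as a preconditioner:

```matlab
% Incomplete LU preconditioner for the sparse system above.
% For a tridiagonal matrix, ilu with its default no-fill setup is
% essentially an exact factorization, so the preconditioned iteration
% converges in very few steps.
[L, U] = ilu(Dxx);
tic
x = bicgstab(Dxx, b3, 1e-6, 1000, L, U);
toc
```

For harder sparse problems the preconditioner is rarely this effective, but even an approximate one typically cuts the iteration count substantially.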
If you want to see exactly what "\" is doing under the hood, you can increase the sparse-monitor verbosity so that it prints its decisions (this diagnostic output is produced only for sparse matrices):
spparms('spumoni',2)
A2\rand(size(A2,1),1);
Dxx\b3;
sparse(rand(N).*round(rand(N)-0.2))\rand(N,1); % random sparse matrix, roughly 30% density
sp\: bandwidth = 1+1+1.
sp\: is A diagonal? no.
sp\: is it ok to solve real tridiagonal A without pivoting? yes.
sp\: bandwidth = 1023+1+1022.
sp\: is A diagonal? no.
sp\: is band density (0.299638) > bandden (0.5) to try banded solver? no.
sp\: is A triangular? no.
sp\: is A morally triangular? no.
sp\: is A a candidate for Cholesky (symmetric, real positive diagonal)? no.
sp\: use Unsymmetric MultiFrontal PACKage with automatic reordering.
UMFPACK V5.4.0 (May 20, 2009), Control:
    5: strategy: 0 (auto)
    16: scaling: 1 (divide each row by sum of abs. values in each row)
    ...
sp\: UMFPACK's factorization was successful.
sp\: UMFPACK's solve was successful.
UMFPACK V5.4.0 (May 20, 2009), Info:
    number of rows in matrix A:       1024
    number of columns in matrix A:    1024
    entries in matrix A:              314193
    strategy used:                    unsymmetric
    ordering used:                    colamd on A
    estimate of reciprocal of condition number: 4.74e-03
    numeric factorization wallclock time (sec): 0.14
    total symbolic + numeric + solve CPU time:  0.21
    ...
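Once you have seen enough, the diagnostic output can be switched back off by resetting the sparse-monitor flag to its default:

```matlab
spparms('spumoni', 0) % restore the default (silent) sparse solver behavior
```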