% LU Addendum I was unhappy with my LUNaive script which is not at all careful about flop-count, although from an "idea" standpoint it is quite simple. It has flopcount around 10/3*n^3, about 5 times worse than the (reported) best possible result. So I went out on the internet and found a better algorithm, but was STILL getting flopcounts close to n^3. So I wrote the following script, which is a self-calling function. It creates the first row of U and the first column of L. It then reduces the target matrix to a smaller matrix and calls itself on that. It has the optimum reported 2/3*n^3 flopcount behavior. It is trivial to count the steps in my script directly, which is why I wrote it this way. The number of steps in each function call is easily seen to be n+(n-1)*2(n-1) where n is the size of the matrix at this step. Each function call reduces the size by 1 so if we sum this from n=2 to n=k we get the flopcount. Using summation identities we calculate the exact count to be n*(n+1)/2+n*(n-1)*(2*n-1)/3 -1. %*********************************************************** function [M,L,U] = LUNaive3(M,L,U) % LUNaive: Create an LU decomposition via naive Gauss elimination of matrix M. % Invoke it using a single square matrix argument, as LUNaive3(M). % This will differ from the MATLAB lu function in that it doesn't use pivoting % Because of this the script will break if it encounters % a zero in a pivot position. % input: % M = coefficient matrix % output: % L = Lower Triangular left factor % U = Upper Triangular right factor [m,n] = size(M); if m~=n, error('Matrix A must be square'); end if nargin<2,L=eye(size(M)); end if nargin<2,U=zeros(size(M)); end lsize=length(L); L1=L(lsize-n+1:lsize,lsize-n+1:lsize); U1=U(lsize-n+1:lsize,lsize-n+1:lsize); A=M; c=A(:,1)/A(1,1); L1(2:n,1)=c(2:n); U1(1,:)=A(1,:); for k=2:n U1(k,:)=[0, A(k,2:n)-c(k)*A(1,2:n)]; end L(lsize-n+1:lsize,lsize-n+1:lsize)=L1; U(lsize-n+1:lsize,lsize-n+1:lsize)=U1; M=U(lsize-n+2:lsize,lsize-n+2:lsize); if n>2 [M,L,U]=LUNaive3(M,L,U); end %*********************************************************** %*********************************************************** function [M,L,U,count] = LUNaive3Count(M,L,U,count) % LUNaive: Create an LU decomposition via naive Gauss elimination for matrix M. % Invoke it using a single square matrix argument, as LUNaive3Count(M). % This will differ from the MATLAB lu function in that it doesn't use pivoting % Because of this the script will break if it encounters % a zero in a pivot position. % input: % M = coefficient matrix % output: % L = Lower Triangular left factor % U = Upper Triangular right factor % Check out its behavior by plotting c obtained in the following code. % for w=1:50 % M=rand(w); % [M,L,U,c(w)]=LUNaive3Count(M)/w^3; % end % c [m,n] = size(M); if m~=n, error('Matrix A must be square'); end if nargin<2,L=eye(size(M));,end if nargin<2,U=zeros(size(M));,end if nargin<4,count=0; end lsize=length(L); L1=L(lsize-n+1:lsize,lsize-n+1:lsize); U1=U(lsize-n+1:lsize,lsize-n+1:lsize); A=M; c=A(:,1)/A(1,1); count=count+n; L1(2:n,1)=c(2:n); U1(1,:)=A(1,:); for k=2:n U1(k,:)=[0, A(k,2:n)-c(k)*A(1,2:n)]; count=count+2*(n-1); end L(lsize-n+1:lsize,lsize-n+1:lsize)=L1; U(lsize-n+1:lsize,lsize-n+1:lsize)=U1; M=U(lsize-n+2:lsize,lsize-n+2:lsize); if n>2 [M,L,U,count]=LUNaive3Count(M,L,U,count); end %*********************************************************** % I ran the following: clear c for w=4:50 M=rand(w); [M,L,U,count]=LUNaive3Count(M); c(w)=count/w^3; end c(30:50) % and got 0.6547 0.6550 0.6552 0.6555 0.6557 0.6560 0.6562 0.6564 0.6566 0.6568 0.6570 GREAT!!! However ....... I checked it out with tic toc on a big matrix .... M=rand(500); tic LUNaive(M); a(1)=toc; tic B=LUNaive2(M); a(2)=toc; tic C=LUNaive3(M); a(3)=toc and got the surprising a = 2.3831 0.1516 1.4409 So my min-flop solution was a flop, 10 times slower than the arithmetic-only version which had a 50% higher flopcount!!! My recursive solution is profligate in use of RAM, and apparently read/write to RAM adds up too!! Something to think about. But ... I'm tired of this problem now. But in case you aren't, here is the function I found somewhere on the interweb that is (in a practical sense) fastest, and the flopcount version of my original LU-maker too. %*********************************************************** function [L,U] = LUNaive2(A) % LUNaive: Create an LU decomposition via naive Gauss elimination % This will differ from the MATLAB lu function in that it doesn't use pivoting % Because of this the script will break if it encounters % a zero in a pivot position. % input: % A = coefficient matrix % output: % L = Lower Triangular left factor % U = Upper Triangular right factor % The outer loop calculates L % and the second inner loop calculates U [m,n] = size(A); if m~=n, error('Matrix A must be square'); end L = eye(m); for k=1:m L(k+1:m, k) = A(k+1:m, k)/A(k,k); for p = k+1:m A(p,:) = A(p,:) - L(p,k)*A(k,:); end end U=A; end %*********************************************************** function [c, L,U] = LUNaive2Count(A) % LUNaive: Create an LU decomposition via naive Gauss elimination % This will differ from the MATLAB lu function in that it doesn't use pivoting % Because of this the script will break if it encounters % a zero in a pivot position. % input: % A = coefficient matrix % output: % L = Lower Triangular left factor % U = Upper Triangular right factor % Test the order of the flopcount using the following. % for w=1:50 % M=rand(w); % c(w)=LUNaive2Count(M)/w^3; % end % c [m,n] = size(A); c=0; if m~=n, error('Matrix A must be square'); end L = eye(m); for k=1:m L(k+1:m, k) = A(k+1:m, k)/A(k,k); c=c+sum(length(k+1:m)); for p = k+1:m A(p,:) = A(p,:) - L(p,k)*A(k,:); c=c+2*m; end end U=A; end %*********************************************************** %*********************************************************** function [c,L,U] = LUNaiveCount(A) % LUNaive: Create an LU decomposition via naive Gauss elimination % This will differ from the MATLAB lu function in that it doesn't use pivoting % Because of this the script will break if it encounters % a zero in a pivot position. % input: % A = coefficient matrix % output: % L = Lower Triangular left factor % U = Upper Triangular right factor % The first time through the loop we calculate U % and the second time through we calculate L % Essentially, the second loop calculates the inverse of the % elementary row operations used to find U. c=0; [m,n] = size(A); if m~=n, error('Matrix A must be square'); end Aug = [A eye(size(A))]; % forward elimination to find U for k = 1:n-1 for i = k+1:n factor = Aug(i,k)/Aug(k,k); c=c+1; Aug(i,k:2*n) = Aug(i,k:2*n)-factor*Aug(k,k:2*n); c=c+2*length(k:2*n); end end U=Aug(:,1:n); % forward elimination to find L A=Aug(:,n+1:2*n); Aug = [A eye(size(A))]; % forward elimination for k = 1:n-1 for i = k+1:n factor = Aug(i,k)/Aug(k,k); c=c+1; Aug(i,k:2*n) = Aug(i,k:2*n)-factor*Aug(k,k:2*n); c=c+2*length(k:2*n); end end L=Aug(:,n+1:2*n); end %***********************************************************