% LU Addendum
I was unhappy with my LUNaive script which is not at all careful about flop-count, although from an "idea" standpoint it is quite simple. It has flopcount around 10/3*n^3, about 5 times worse than the (reported) best possible result. So I went out on the internet and found a better algorithm, but was STILL getting flopcounts close to n^3.
So I wrote the following script, which is a self-calling function. It creates the first row of U and the first column of L. It then reduces the target matrix to a smaller matrix and calls itself on that.
It has the optimum reported 2/3*n^3 flopcount behavior.
It is trivial to count the steps in my script directly, which is why I wrote it this way.
The number of steps in each function call is easily seen to be
n+(n-1)*2(n-1)
where n is the size of the matrix at this step. Each function call reduces the size by 1 so if we sum this from n=2 to n=k we get the flopcount. Using summation identities we calculate the exact count to be
n*(n+1)/2+n*(n-1)*(2*n-1)/3 -1.
%***********************************************************
function [M,L,U] = LUNaive3(M,L,U)
% LUNaive: Create an LU decomposition via naive Gauss elimination of matrix M.
% Invoke it using a single square matrix argument, as LUNaive3(M).
% This will differ from the MATLAB lu function in that it doesn't use pivoting
% Because of this the script will break if it encounters
% a zero in a pivot position.
% input:
% M = coefficient matrix
% output:
% L = Lower Triangular left factor
% U = Upper Triangular right factor
[m,n] = size(M);
if m~=n, error('Matrix A must be square'); end
if nargin<2,L=eye(size(M)); end
if nargin<2,U=zeros(size(M)); end
lsize=length(L);
L1=L(lsize-n+1:lsize,lsize-n+1:lsize);
U1=U(lsize-n+1:lsize,lsize-n+1:lsize);
A=M;
c=A(:,1)/A(1,1);
L1(2:n,1)=c(2:n);
U1(1,:)=A(1,:);
for k=2:n
U1(k,:)=[0, A(k,2:n)-c(k)*A(1,2:n)];
end
L(lsize-n+1:lsize,lsize-n+1:lsize)=L1;
U(lsize-n+1:lsize,lsize-n+1:lsize)=U1;
M=U(lsize-n+2:lsize,lsize-n+2:lsize);
if n>2 [M,L,U]=LUNaive3(M,L,U);
end
%***********************************************************
%***********************************************************
function [M,L,U,count] = LUNaive3Count(M,L,U,count)
% LUNaive: Create an LU decomposition via naive Gauss elimination for matrix M.
% Invoke it using a single square matrix argument, as LUNaive3Count(M).
% This will differ from the MATLAB lu function in that it doesn't use pivoting
% Because of this the script will break if it encounters
% a zero in a pivot position.
% input:
% M = coefficient matrix
% output:
% L = Lower Triangular left factor
% U = Upper Triangular right factor
% Check out its behavior by plotting c obtained in the following code.
% for w=1:50
% M=rand(w);
% [M,L,U,c(w)]=LUNaive3Count(M)/w^3;
% end
% c
[m,n] = size(M);
if m~=n, error('Matrix A must be square'); end
if nargin<2,L=eye(size(M));,end
if nargin<2,U=zeros(size(M));,end
if nargin<4,count=0; end
lsize=length(L);
L1=L(lsize-n+1:lsize,lsize-n+1:lsize);
U1=U(lsize-n+1:lsize,lsize-n+1:lsize);
A=M;
c=A(:,1)/A(1,1);
count=count+n;
L1(2:n,1)=c(2:n);
U1(1,:)=A(1,:);
for k=2:n
U1(k,:)=[0, A(k,2:n)-c(k)*A(1,2:n)];
count=count+2*(n-1);
end
L(lsize-n+1:lsize,lsize-n+1:lsize)=L1;
U(lsize-n+1:lsize,lsize-n+1:lsize)=U1;
M=U(lsize-n+2:lsize,lsize-n+2:lsize);
if n>2 [M,L,U,count]=LUNaive3Count(M,L,U,count);
end
%***********************************************************
% I ran the following:
clear c
for w=4:50
M=rand(w);
[M,L,U,count]=LUNaive3Count(M);
c(w)=count/w^3;
end
c(30:50)
% and got
0.6547 0.6550 0.6552 0.6555 0.6557 0.6560 0.6562 0.6564 0.6566 0.6568 0.6570
GREAT!!! However .......
I checked it out with tic toc on a big matrix ....
M=rand(500);
tic
LUNaive(M);
a(1)=toc;
tic
B=LUNaive2(M);
a(2)=toc;
tic
C=LUNaive3(M);
a(3)=toc
and got the surprising
a = 2.3831 0.1516 1.4409
So my min-flop solution was a flop, 10 times slower than the arithmetic-only version which had a 50% higher flopcount!!!
My recursive solution is profligate in use of RAM, and apparently read/write to RAM adds up too!!
Something to think about.
But ... I'm tired of this problem now. But in case you aren't, here is the function I found somewhere on the interweb that is (in a practical sense) fastest, and the flopcount version of my original LU-maker too.
%***********************************************************
function [L,U] = LUNaive2(A)
% LUNaive: Create an LU decomposition via naive Gauss elimination
% This will differ from the MATLAB lu function in that it doesn't use pivoting
% Because of this the script will break if it encounters
% a zero in a pivot position.
% input:
% A = coefficient matrix
% output:
% L = Lower Triangular left factor
% U = Upper Triangular right factor
% The outer loop calculates L
% and the second inner loop calculates U
[m,n] = size(A);
if m~=n, error('Matrix A must be square'); end
L = eye(m);
for k=1:m
L(k+1:m, k) = A(k+1:m, k)/A(k,k);
for p = k+1:m
A(p,:) = A(p,:) - L(p,k)*A(k,:);
end
end
U=A;
end
%***********************************************************
function [c, L,U] = LUNaive2Count(A)
% LUNaive: Create an LU decomposition via naive Gauss elimination
% This will differ from the MATLAB lu function in that it doesn't use pivoting
% Because of this the script will break if it encounters
% a zero in a pivot position.
% input:
% A = coefficient matrix
% output:
% L = Lower Triangular left factor
% U = Upper Triangular right factor
% Test the order of the flopcount using the following.
% for w=1:50
% M=rand(w);
% c(w)=LUNaive2Count(M)/w^3;
% end
% c
[m,n] = size(A);
c=0;
if m~=n, error('Matrix A must be square'); end
L = eye(m);
for k=1:m
L(k+1:m, k) = A(k+1:m, k)/A(k,k);
c=c+sum(length(k+1:m));
for p = k+1:m
A(p,:) = A(p,:) - L(p,k)*A(k,:);
c=c+2*m;
end
end
U=A;
end
%***********************************************************
%***********************************************************
function [c,L,U] = LUNaiveCount(A)
% LUNaive: Create an LU decomposition via naive Gauss elimination
% This will differ from the MATLAB lu function in that it doesn't use pivoting
% Because of this the script will break if it encounters
% a zero in a pivot position.
% input:
% A = coefficient matrix
% output:
% L = Lower Triangular left factor
% U = Upper Triangular right factor
% The first time through the loop we calculate U
% and the second time through we calculate L
% Essentially, the second loop calculates the inverse of the
% elementary row operations used to find U.
c=0;
[m,n] = size(A);
if m~=n, error('Matrix A must be square'); end
Aug = [A eye(size(A))];
% forward elimination to find U
for k = 1:n-1
for i = k+1:n
factor = Aug(i,k)/Aug(k,k);
c=c+1;
Aug(i,k:2*n) = Aug(i,k:2*n)-factor*Aug(k,k:2*n);
c=c+2*length(k:2*n);
end
end
U=Aug(:,1:n);
% forward elimination to find L
A=Aug(:,n+1:2*n);
Aug = [A eye(size(A))];
% forward elimination
for k = 1:n-1
for i = k+1:n
factor = Aug(i,k)/Aug(k,k);
c=c+1;
Aug(i,k:2*n) = Aug(i,k:2*n)-factor*Aug(k,k:2*n);
c=c+2*length(k:2*n);
end
end
L=Aug(:,n+1:2*n);
end
%***********************************************************