% Script 2 for Week 4
% Calculating inverse of square matrix A
% Form [I A] then row-reduce A to I and the result will be [inv(A) I]
% Number of flops to do this: roughly (8/3)n^3.
% Matrix multiplication: A*B nxk and kxm matrices
% Number of flops to do this: nm entries, each needing k multiplications and k-1 additions, so roughly 2knm flops. For two nxn matrices this is roughly 2n^3 flops.
% Multiplying nxn by nx1 takes roughly 2n^2 flops.
% Calculating the determinant of a matrix takes roughly (2/3)n^3 flops.
% Ill-conditioned systems are those for which solving A*X=B using X=inv(A)*B is likely to produce large calculation errors. These occur because in the solution process you will be forced to divide normal-sized numbers by tiny numbers or tiny numbers by huge numbers .... many times.
% We want to know when this can happen and the size of the error this might introduce. The discussion below (eventually) takes us there.
% A norm is a function N:V->R where V is a vector space which satisfies
% N(x) >= 0 with equality only when x=0
% and N(x+y) <= N(x)+N(y) (triangle inequality)
% and N(cx)=|c|N(x) (absolute homogeneity)
% There are various norms for vectors in R^n:
x=[-10 3 7 0 1]
% The 2-norm: square root of the sum of the squared entries.
sqrt(sum(x.^2))
% The ordinary Euclidean norm, calculated in MATLAB as
norm(x)
% The 1-norm: the sum of the absolute values of the entries.
sum(abs(x))
% The city street (taxicab) norm, calculated in MATLAB as
norm(x, 1)
% The max (or infinity) norm:
max(abs(x))
% This last one picks out the largest-magnitude entry, calculated in MATLAB as
norm(x, 'inf')
% Also, there are various norms for matrix vector spaces.
% A sample (non-square) matrix for the matrix-norm examples:
A=[3 -2 8;
5 9 1]
% The maximum row-sum norm (also called the uniform-matrix or
% matrix-infinity norm): the largest sum of absolute values taken
% along any single row. b(1) is the number of rows of A.
b=size(A)
C=zeros(1,b(1));
for row=1:b(1)
C(row)=sum(abs(A(row,:)));
end
max(C)
% MATLAB computes the same quantity directly as
norm(A, 'inf')
% The maximum column sum norm. This one gives the max, over the standard basis vectors e_j of the domain, of the 1-norm of A*e_j (A*e_j is the jth column of A).
% The maximum column-sum norm: the largest sum of absolute values taken
% down any single column. b(2) is the number of columns of A.
b=size(A)
for col=1:b(2)
C(col)=sum(abs(A(:,col)));
end
max(C)
% MATLAB computes the same quantity directly as
norm(A, 1)
% The Frobenius norm: Euclidean norm of A thought of as a sliced-and-stacked member of R^(nm)
% Accumulate the sum of the squares of every entry of A (column by
% column, matching MATLAB's storage order), then take the square root.
c=0;
for col=1:b(2)
for row=1:b(1)
c=c+A(row,col)^2;
end
end
c=sqrt(c)
% MATLAB computes the Frobenius norm directly as
norm(A,'fro')
% The most useful norm of a matrix for many purposes is the operator norm or spectral norm calculated as norm(A) in MATLAB.
% This is the biggest "stretch factor" that A can impose on a vector x in its domain. Specifically, if x is any unit vector then the Euclidean length of A*x can never exceed (but can equal) norm(A).
% You can see this by looking at the singular value decomposition of A.
% Full singular value decomposition of the 2x3 matrix A defined above.
[Left Sing Right]=svd(A)
% Recall that A equals Left*Sing*Right'.
% The columns of Left form an O.N. basis of the "range" space of matrix A thought of as the linear transformation "left multiply by A."
% Sing is diagonal with non-negative entries in non-increasing order.
% The columns of Right form an O.N. basis of the domain space.
% Whenever Sing(i,i) is nonzero, A sends the ith column of Right to Sing(i,i) times the ith column of Left. (These Sing(i,i) are called the singular values of A.)
% This implies that A sends the unit sphere in its domain to an ellipsoid in the range with axes oriented along the columns of Left. The semi-axis length (radius) of the ellipsoid along each of these columns is the corresponding singular value.
% So Sing(1,1), the largest singular value, should equal norm(A).
A=[3 -2 8;5 9 1];
[L S R]=svd(A);
% Compare: the largest singular value ...
S(1,1)
% ... against the operator (spectral) norm of A. They should agree.
norm(A)
% Below we get at this directly using a function that measures the stretch factor for each domain vector. We will use fminsearch to locate the minimum of the negative of this stretch factor.
%***************************************
function z=testfunc3(x,y)
% TESTFUNC3 Negative of the stretch factor A imposes on the unit vector
% [x; y; sqrt(1-x^2-y^2)] on the upper unit hemisphere.
% Returns 0 when (x,y) lies outside the unit disk, so fminsearch cannot
% wander into the region where the sqrt would be complex.
% Negated because MINIMIZING z locates the LARGEST stretch, norm(A).
A=[3 -2 8;5 9 1];
if x.^2+y.^2>1
    z=0;
else
    % Semicolon added: the original printed z on every fminsearch call.
    z=-norm( A*[x;y;sqrt(1-x.^2-y.^2)] );
end
% Terminating "end" added for consistency with funcnr/funcnr3: MATLAB
% requires all functions in a file to be terminated if any one is.
end
%****************************************
% Wrap testfunc3 as a function of a single 2-vector, as fminsearch requires.
f=@(x) testfunc3(x(1),x(2))
% fminsearch minimizes f starting from [.1,.2]. Since f is MINUS the
% stretch factor, w is the (x,y) of the most-stretched domain vector and
% minmag should be -norm(A).
[w,minmag]=fminsearch(f,[.1,.2])
A=[3 -2 8;5 9 1];
% Compare the operator norm against -minmag ...
norm(A)
% ... and confirm the maximizing domain vector really is a unit vector.
norm([w sqrt(1-w(1)^2-w(2)^2)])
% Condition Numbers
% We create the Hilbert Matrix:
% Build the 3x3 Hilbert matrix entry by entry: M(r,s)=1/(r+s-1).
for r=1:3
for s=1:3
M(r,s)=1/(r+s-1);
end
end
% (Actually, we could have created it directly by M=hilb(3).) We then create a new matrix N from M by multiplying each row by the constant needed so that the largest entry in each row is 1. This is called ``conditioning'' M.
% For the Hilbert matrix the largest entry of row r is 1/r, so row r gets scaled by r.
N=M;
for r=2:3
N(r,:)=r*M(r,:);
end
N
M
% Condition numbers of the raw and conditioned matrices, in the default
% (2-) norm and then in the 1, 2, infinity, and Frobenius norms; the
% log10 values estimate how many decimal digits may be lost.
L=[cond(M) cond(N) cond(N,1) cond(N,2) cond(N,'inf') cond(N,'fro')]
log10(L)
% In doing the calculation N*X=B by inv(N)*B=X you might lose (worst case scenario---often the error will be much smaller!) 2 or 3 decimal digits of precision, as explained on pages 295-6 of Chapra.
% Note: on page 296 Chapra uses log to indicate log base 10, not log10! And in the formula on page 295 the matrix deltaA has entries which are the amount of uncertainty in the corresponding entries of A.
% For instance if the (3,4) entry of A is 7.236 +/- 0.002 then the (3,4) entry of deltaA is 0.002. If all the entries of A have the same uncertainty then
% deltaA = 0.002*ones(n)
% where n is the size of square matrix A. The formula on page 295 is:
% ||deltaX|| <= ||X||*cond(A)*||deltaA||/||A||
% Of course this number depends on which norm you use, but will be the same order of magnitude for all the common norms.
%*********************************%**********************************
% Text Problem 12.8
% Solve
% 0=-x^2+x-y+0.75
% 0=x^2-5x*y-y
% G maps R^2 to R^2; a root of G solves the system above.
G=@(x) [-x(1)^2+x(1)-x(2)+0.75; x(1)^2-5*x(1)*x(2)-x(2)]
% The Jacobian (matrix of partial derivatives) of G is
Jac=@(x) [-2*x(1)+1 , -1; 2*x(1)-5*x(2) , -5*x(1)-1]
% The difference between G(x+c) and the linearization G(x) + Jac(x)*c
% goes to zero EVEN IN COMPARISON TO ||c|| as c->[0;0].
% Demonstrate that limit numerically at the base point [1.2;1.2] using a
% sequence of ever-smaller displacements [delx;dely].
for n=1:5
delx=0.1^n; dely=0.2^n;
linearized=G([1.2;1.2])+Jac([1.2;1.2])*[delx;dely];
actual=G([1.2+delx;1.2+dely]);
% Divide by ||c|| so the error is measured RELATIVE to the displacement.
reldiff{n}=(linearized-actual)/sqrt(delx^2+dely^2);
end
celldisp(reldiff)
% Size of the final (smallest) displacement, for comparison.
sqrt(delx^2+dely^2)
%***********************************
function [J,f]=funcnr(x)
% FUNCNR Jacobian J and residual f for text problem 12.8, in the form
% newtmult expects: f(x)=0 at a solution of
%   0 = -x1^2 + x1 - x2 + 0.75
%   0 =  x1^2 - 5*x1*x2 - x2
u=x(1);
v=x(2);
J=[-2*u+1 , -1; 2*u-5*v , -5*u-1];
f= [-u^2+u-v+0.75; u^2-5*u*v-v];
end
%***********************************
% Solve with the multivariate Newton routine (newtmult): x = approximate
% root, f = function value there, ea = approximate error, iter = iterations.
[x,f,ea,iter]=newtmult(@funcnr,[1.2;1.2])
% Sanity check: G at the computed root should be (essentially) zero.
G(x)
% Here is an additional example of a function from R^3 to R^2.
% Where does
% [x^2*y+z;x-y*z^3]=[-43;379]?
% The Jacobian is [2*x*y , x^2, 1; 1, -z^3, -3*y*z^2]
%***********************************
function [J,f]=funcnr3(x)
% FUNCNR3 Jacobian J and residual f for the system
%   x1^2*x2 + x3 = -43
%   x1 - x2*x3^3 = 379
% written as f(x)=0, in the form newtmult expects.
a=x(1);
b=x(2);
c=x(3);
J=[2*a*b a^2 1; 1 -c^3 -3*b*c^2];
f= [a^2*b+c; a-b*c^3]-[-43;379];
end
%***********************************
% Spot-check the Jacobian and residual at one point ...
[J ,f]=funcnr3([3; -2 ; 4])
% ... then find two (different) roots from two different seed points.
[x,f,ea,iter]=newtmult(@funcnr3,[3; -2 ; 4])
[x2,f2, ea, iter]=newtmult(@funcnr3,[7; -5 ; 6])
% Below we will create 100 seed values along a straight-line path between the root x2 and the root x and find the root corresponding to these seeds. Then plot the seeds and the roots in space.
% The errors variable is the sum of the norms of all hundred function values. I wanted to make sure these 100 function values were ALL tiny.
t=linspace(0,1);  % 100 interpolation parameters between 0 and 1
errors=0;
for j=1:length(t)
% Seed j lies on the straight segment from root x (t=0) to root x2 (t=1).
y(:,j)=(1-t(j))*x+t(j)*x2;
% The plot of the y columns is a straight-line path from root x to root x2.
root(:,j)=newtmult(@funcnr3,y(:,j));
% Accumulate ||f(root_j)|| so a single number certifies ALL 100 are roots.
[J,fval]=funcnr3(root(:,j));
errors=errors+norm(fval);
end
errors
% Plot the found roots as circles and the seed path as a dash-dot red line.
plot3(root(1,:),root(2,:),root(3,:),'o')
hold on
plot3( y(1,:),y(2,:),y(3,:),'-.r')
grid on
hold off
% I wonder why newtmult abruptly switched from one curve of solutions to another. I suspect some saddle behavior caused this. A drop of water hitting 1 inch to the West of Snoqualmie pass goes to Elliott Bay. One inch to the East goes to the Columbia River. Maybe the line of seeds went over a pass. I bet using the slice "4-d" visualization aid from Script 1 Week 3, using the norm of the function value as the "coloring" 4th variable, (the set of all solutions will correspond to the lowest color) would give some insight into why that happened.
% Below are examples of the Gauss-Seidel fixed-point method. Be aware that sometimes you will have to rewrite the system (change the order of the equations for instance) in order to make it DIAGONALLY DOMINANT: the absolute value of the diagonal coefficient in each of the equations must be larger than the sum of the absolute values of the other coefficients in the equation for diagonal dominance.
A=[10 -2;-3 12]
b=[8; 9]
% The matrix equation A*X=b corresponds to
% 10x-2y=8
% -3x+12y=9
% Note that this is a diagonally dominant system (|10|>|-2| and |12|>|-3|).
% Had the same equations been listed in different order it would not be.
[x,iter]=GSLR(A,b)
% Note!!! Reversing the order of the equations destroys diagonal
% dominance, and Gauss-Seidel then gives a problem (fails to converge).
A=[-3 12;10 -2]
b=[9; 8]
[x,iter]=GSLR(A,b)
% Here is another example: a diagonally dominant 3x3 system
% (each diagonal entry exceeds the sum of the other magnitudes in its row),
% so Gauss-Seidel converges. The exact solution is [3; -2.5; 7].
A=[3 -0.1 -0.2;
.1 7 -0.3;
0.3 -0.2 10]
b=[7.85 -19.3 71.4]'
[x,iter]=GSLR(A,b)
% Here is another example.
% Fixed-point form x = A*x + b of the previous system: solve equation i
% for its diagonal unknown. From
%   3x - 0.1y - 0.2z  =  7.85
%   0.1x + 7y - 0.3z  = -19.3
%   0.3x - 0.2y + 10z =  71.4
% we get x=(7.85+0.1y+0.2z)/3, y=(-19.3-0.1x+0.3z)/7, z=(71.4-0.3x+0.2y)/10.
% BUG FIX: A(2,1) must be -.1/7 (the original had +.1/7, which does not
% match the sign pattern of the second equation; rows 1 and 3 flip signs).
A=[0 .1/3 .2/3;
-.1/7 0 .3/7;
-.3/10 .2/10 0]
b=[7.85/3 -19.3/7 71.4/10]'
% One scalar update function per component, in the form GSNR expects:
f=@(x) A(1,1)*x(1)+A(1,2)*x(2)+A(1,3)*x(3)+b(1)
g=@(x) A(2,1)*x(1)+A(2,2)*x(2)+A(2,3)*x(3)+b(2)
h=@(x) A(3,1)*x(1)+A(3,2)*x(2)+A(3,3)*x(3)+b(3)
[x,iter]=GSNR([1;1;1],1,f,g,h)
% So we have found a value of x for which x=A*x+b.
% You can interpret this as an x for which
% (A-I)*x=-b.
% The fixed point is the solution [3; -2.5; 7] of the original system.
[f(x);g(x);h(x)]
% A genuinely NONLINEAR fixed-point example: each equation of the system
%   x1^2 + x1*x2 = 10
%   x2 + 3*x1*x2^2 = 57
% is solved for one unknown to get the component update functions w and v.
w=@(x) sqrt(10-x(1)*x(2))
v=@(x) sqrt((57-x(2))/(3*x(1)))
b=[1.5;3.5]  % the seed (initial guess)
[x,iter]=GSNR(b,1,w,v)
% Evaluating both components at x reproduces x, confirming the fixed point:
[w(x);v(x)]
% So we have found a value of x for which x=[w(x);v(x)]
% x is a fixed point of that vector-valued function.
% You can interpret this as an x for which
% [w(x)-x(1) ; v(x)-x(2)]=0.
% Note the following fails: using the raw residuals of the two equations
% as the update functions does not give a convergent iteration.
One = @(x) x(1)^2 + x(1)*x(2) - 10
Two = @(x) x(2)+3*x(1)*x(2)^2 - 57
b=[1.5;3.5]
[x,iter]=GSNR(b,1,One,Two)
% Rewriting as below (solving eq 1 for x(1) via division and eq 2 for the
% lone x(2) term) also fails:
Three = @(x) (10 - x(1)^2)/x(2)
Four = @(x) 57 - 3*x(1)*x(2)^2
[x,iter]=GSNR(b,1,Three,Four)
% But the following works! (Same square-root forms as w and v above.)
% The take-away is: persist!!
Five=@(x) sqrt(10-x(1)*x(2))
Six=@(x) sqrt((57-x(2))/(3*x(1)))