In this assignment, you will implement a Matlab function to decompose a matrix A into the product of two matrices A = QR, where Q has orthonormal columns and R is upper triangular. You will then use your decomposition to solve a shape-fitting problem. You will need the file back_sub.m for Part 4, and generate_data.m and visualize.m for Part 5. These can be found in the zip file provided on the Moodle assignment page.

1 Submission Guidelines

You will submit a zip file that contains the following .m files on Moodle:

• ortho_decomp.m
• my_qr.m
• least_squares.m
• my_pack.m
• my_unpack.m
• design_matrix.m
• affine_fit.m

as well as any helper functions that are needed by your implementation. As with previous assignments, please note the following rules:

1. Each file must be named as specified above.
2. Each function must return the specified value(s).
3. You may use Matlab's array manipulation syntax, e.g. size(A), A(i,:), and [A,B], and basic operations like addition, multiplication, and transpose of matrices and vectors. High-level linear algebra functions such as inv, qr, and A\b are not allowed except where specified. Please contact the instructor with further questions.
4. This is an individual assignment, and no collaboration is allowed. Any in-person or online discussion should stop before you start discussing or designing a solution.

2 Orthogonal decomposition

Suppose you have an orthonormal basis {u1, . . . , un} for a subspace H ⊆ R^m. Any vector v ∈ R^m can be decomposed into the sum of two orthogonal vectors, v̂ ∈ H and v⊥ ∈ H⊥. Furthermore, since v̂ ∈ H, it can be expressed as v̂ = c1 u1 + · · · + cn un for some weights c1, . . . , cn. Implement a function to perform this decomposition.

Specification: function [c, v_perp] = ortho_decomp(U, v)

Input: an m × n matrix U = [u1 · · · un] with orthonormal columns, and a vector v.
Output:
c: the coefficient vector c = c1, . . . ,
cn such that v = c1 u1 + · · · + cn un + v⊥
v_perp: the residual vector v⊥ such that ui · v⊥ = 0 for all i.

Implementation notes: Here v̂ is simply the orthogonal projection of v onto the subspace H. Since we have an orthonormal basis of H, this should be easy to compute. In fact, since the basis is orthonormal, it is possible to compute the projection in just one line of code without any for loops. Hint: What are the entries of U^T v?

Test cases:

1. U = [1 0; 0 0; 0 1; 0 0], v = [2; 3; 4; 5] → c = [2; 4], v⊥ = [0; 3; 0; 5]

2. U = [0.6; 0.8; 0], v = [1; 1; 1] → c = [1.4], v⊥ = [0.16; −0.12; 1]

3. Generate a random orthonormal set of 5 vectors in R^10 using the command [U,~] = qr(randn(10,5),0), and generate another random integer vector v = randi([-2,2], 10,1). Compute your orthogonal decomposition, [c, v_perp] = ortho_decomp(U, v). Verify that U*c + v_perp is nearly the same as v, and U'*v_perp is nearly zero (both to within 10^−12).

3 QR decomposition

Suppose you have an m × n matrix A which is "tall and thin", i.e. with m > n, and the columns of A are linearly independent. The Gram-Schmidt process corresponds to a factorization A = QR, where Q is an m × n matrix with orthonormal columns, and R is an n × n upper triangular matrix.¹

Specification: function [Q, R] = my_qr(A)

Input: an m × n matrix A with m > n.
Output: an m × n matrix Q with orthonormal columns, and an n × n upper triangular matrix R, such that A = QR.

Implementation notes: I recommend starting your implementation by first having your function compute Q correctly. After that, you can modify your code to also compute R. The columns of Q are essentially obtained by performing the Gram-Schmidt process with normalization on the columns of A:

q1 = a1/‖a1‖,
a2⊥ = a2 − proj_H1 a2 where H1 = span{q1},    q2 = a2⊥/‖a2⊥‖,
a3⊥ = a3 − proj_H2 a3 where H2 = span{q1, q2},    q3 = a3⊥/‖a3⊥‖,
. . .

You can carry this out using your orthogonal decomposition function. For each column i = 1, . . .
, n, call ortho_decomp with an appropriate set of inputs, and use the resulting v⊥ to fill in the ith column of Q. In the end, the columns of Q should be an orthonormal set of vectors {q1, . . . , qn} which span Col A.

Once you can compute Q correctly, modify your function to compute R as well. In principle, since Q^T Q = I, you could obtain R simply via Q^T A = Q^T QR = R. However, this is more work than necessary, because you can actually fill in the entries of R as you compute each column of Q. When you compute the ith column of Q, the ortho_decomp call gives you both c and v⊥; can you use these to fill in the i nonzero entries in the ith column of R? Hint: Consider the case when A is already an upper triangular matrix, say

A = [1 2 3; 0 4 5; 0 0 6],

and work out by hand what c and v⊥ will be for each column.

¹ Technically, what we are computing here is known as the "thin" or "reduced" QR decomposition. In the full QR decomposition, Q is square and orthogonal, and R is an m × n upper triangular matrix whose lower (m − n) rows are all zero. In practice, it is also not computed with the Gram-Schmidt process, but with other methods that are more numerically stable.

Test cases:

1. A = [0 3 0; 0 0 −4; −2 0 0] → Q = [0 1 0; 0 0 −1; −1 0 0], R = [2 0 0; 0 3 0; 0 0 4]

2. A = [1 1 0; 1 0 0; 1 1 1; 1 0 1] → Q = [1/2 1/2 −1/2; 1/2 −1/2 −1/2; 1/2 1/2 1/2; 1/2 −1/2 1/2], R = [2 1 1; 0 1 0; 0 0 1]

3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and compute your QR decomposition, [Q, R] = my_qr(A). Verify that istriu(R) is true, Q'*Q is nearly the identity matrix, and Q*R is nearly the same as A (both to within 10^−12).

4 Least-squares problems

Use the QR decomposition to solve the least-squares problem Ax ≈ b.

Specification: function x = least_squares(A, b)

Input: an m × n matrix A and a vector b ∈ R^m.
Output: a vector x ∈ R^n which is the least-squares solution of the overdetermined system Ax ≈ b, i.e. such that Ax − b ∈ (Col A)⊥.
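As a cross-check on the ideas in Parts 2–4, here is a rough NumPy sketch of the same constructions. This is not the required Matlab implementation; the Python names below simply mirror the required functions, and back_sub is my own stand-in for the provided back_sub.m.

```python
import numpy as np

def ortho_decomp(U, v):
    # With orthonormal columns, the projection coordinates are just c = U^T v.
    c = U.T @ v
    v_perp = v - U @ c
    return c, v_perp

def my_qr(A):
    # Gram-Schmidt with normalization, one column at a time.
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        # Decompose column i against the columns of Q found so far.
        c, v_perp = ortho_decomp(Q[:, :i], A[:, i])
        R[:i, i] = c                      # off-diagonal entries of column i of R
        R[i, i] = np.linalg.norm(v_perp)  # diagonal entry
        Q[:, i] = v_perp / R[i, i]
    return Q, R

def back_sub(R, y):
    # Solve Rx = y for upper triangular R (stand-in for back_sub.m).
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
    return x

def least_squares(A, b):
    # One matrix-vector product and one back-substitution.
    Q, R = my_qr(A)
    return back_sub(R, Q.T @ b)

A = np.array([[1.], [1], [0]])
b = np.array([2., 3, 4])
print(least_squares(A, b))   # approximately [2.5], matching test case 1 below
```

The loop fills R column by column exactly as the hint suggests: the c returned by ortho_decomp supplies the entries above the diagonal, and the norm of v⊥ supplies the diagonal entry.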
Implementation notes: Do not solve the normal equations A^T Ax = A^T b directly, as this is a numerically unstable approach. Instead, compute the QR factorization of A and use it to solve the least-squares problem with only a matrix-vector multiplication and a back-substitution, as described in the textbook. Use the back_sub function provided with this assignment to perform the back-substitution.

If you cannot get your my_qr function to work, you may use Matlab's built-in qr function here so you can still attempt Part 5. Call qr(A,0) to get a rectangular Q and square R as desired.

Test cases:

1. A = [1; 1; 0], b = [2; 3; 4] → x = 2.5

2. A = [1 0 0; 1 1 1; 1 2 4; 1 3 9; 1 4 16], b = [0; 3; 4; 3; 0] → x = [0; 4; −1]

3. Generate a random 10 × 5 integer matrix A = randi([-2,2], 10,5) and a random integer vector b = randi([-2,2], 10,1). Compute your least-squares solution, x = least_squares(A, b), and obtain the residual r = b - A*x. Verify that A'*r is nearly zero (to within 10^−12).

5 Best-fitting transformations

An affine transformation is a combination of a linear transformation and a translation (i.e. displacement). It preserves parallelism: parallel lines remain parallel after the transformation. For example, a point (x, y) on A (orange square) in Figure 1 can be transformed to a point (x̄, ȳ) on B (red parallelogram) via a linear transform M = [m11 m12; m21 m22], i.e.,

[x̄; ȳ] = [m11 m12; m21 m22] [x; y].

Note that the parallel lines stay parallel. This transform can be combined with a translation, moving the red parallelogram to the blue parallelogram (C). This composite transformation can be written as

[x̃; ỹ] = [x̄; ȳ] + [tx; ty] = [m11 m12; m21 m22] [x; y] + [tx; ty] = M [x; y] + t,    (1)

where t = [tx; ty] is the translation vector.

Figure 1: Affine transform. (The figure shows the square A mapped by M to the parallelogram B, then translated by t to the parallelogram C.)

In sum, Equation (1) can be written as

[x̃; ỹ] = [m11 x + m12 y + tx; m21 x + m22 y + ty].
(2)

5.1 Linear Equation from a Single Correspondence

Your task is, given many correspondences² (xi, yi) ↔ (x̃i, ỹi), where i is the index of the correspondence, to compute the best affine transform parameters m11, m12, m21, m22, tx, ty (the unknowns). We can rewrite Equation (2) by arranging it with respect to the unknowns for, say, the first correspondence:

p̃1 = [x̃1; ỹ1] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1] [m11; m12; m21; m22; tx; ty],    (3)

where the 2 × 6 matrix is denoted A1 and the vector of unknowns is denoted β; or simply,

p̃1 = A1 β.    (4)

Note that A1 depends on the original position of the point, p1 = [x1; y1]. Each correspondence produces 2 linear equations while the number of unknowns is 6, so at least 3 correspondences are needed to uniquely determine the affine transform parameters.

Implement a function beta = my_pack(M, t) to obtain β from M = [m11 m12; m21 m22] and t = [tx; ty], and its inverse function [M, t] = my_unpack(beta).

function beta = my_pack(M, t)
Input: a 2 × 2 matrix M and a vector t.
Output: a vector β ∈ R^6 containing the entries of M and t.

function [M, t] = my_unpack(beta)
Input: a vector β ∈ R^6.
Output: a 2 × 2 matrix M and a vector t ∈ R^2 such that my_pack(M, t) = β.

Finally, construct the matrix Ai given pi.

function Ai = design_matrix(pi)
Input: a vector pi = [xi; yi] ∈ R^2.
Output: the matrix Ai given by Equation (3).

² A point (x, y) in A is transformed to a point (x̃, ỹ) in C. This pair of points forms a correspondence.

5.2 Solving the Linear System from n Correspondences

Equation (3) can be extended to include n correspondences as follows:

[x̃1; ỹ1; . . . ; x̃n; ỹn] = [x1 y1 0 0 1 0; 0 0 x1 y1 0 1; . . . ; xn yn 0 0 1 0; 0 0 xn yn 0 1] [m11; m12; m21; m22; tx; ty],    (5)

or again simply,

[p̃1; . . . ; p̃n] = [A1; . . . ; An] β.    (6)

If all correspondences are noise-free, Equation (6) will always be satisfied exactly. Due to correspondence noise in practice, it cannot be satisfied exactly, and therefore

[p̃1; . . . ; p̃n] ≈ [A1; . . . ; An] β.    (7)

Now we will compute the best affine transform parameters using Equation (7).
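The packing convention and the design matrix of Equation (3) can be sketched as follows; this is an illustrative NumPy prototype, not the required Matlab code, and the Python names merely mirror the required functions.

```python
import numpy as np

def my_pack(M, t):
    # beta = (m11, m12, m21, m22, tx, ty), matching Equation (3).
    return np.concatenate([M.flatten(), t])

def my_unpack(beta):
    # Inverse of my_pack: first four entries form M row by row, last two form t.
    return beta[:4].reshape(2, 2), beta[4:]

def design_matrix(p):
    # The 2x6 block A_i from Equation (3) for a single point p = (x, y).
    x, y = p
    return np.array([[x, y, 0, 0, 1, 0],
                     [0, 0, x, y, 0, 1]])

M = np.array([[1., 2], [3, 4]])
t = np.array([5., 6])
p = np.array([0.7, -0.3])
# design_matrix(p) @ my_pack(M, t) reproduces the affine map M @ p + t.
print(design_matrix(p) @ my_pack(M, t))
print(M @ p + t)
```

Multiplying the design matrix by the packed parameter vector gives the same result as applying the affine transformation directly, which is exactly the rearrangement that Equation (3) performs.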
function [M, t] = affine_fit(P, P_tilde)
Input: 2 × k correspondence matrices P = [p1 · · · pk] and P̃ = [p̃1 · · · p̃k] containing the reference points and transformed points as columns.
Output: a 2 × 2 matrix M and a vector t ∈ R^2 such that the affine transformation Mx + t best matches the given data.

To compute this, you will have to set up and solve the least-squares problem in Equation (7) to obtain β, then unpack it to get M and t out.

Test cases: For the first two tests, pick an arbitrary M and t, say M = [1 2; 3 4] and t = [5; 6].

1. Check that [M_new, t_new] = my_unpack(my_pack(M, t)) recovers the original values of M and t.

2. Verify that when x = [0; 0], design_matrix(x)*my_pack(M,t) gives back t. Similarly, x = [1; 0] and x = [0; 1] should give you m1 + t and m2 + t respectively, where m1 and m2 are the columns of M.

3. Applying affine_fit to

P = [0 1 1 0; 0 0 1 1],    P̃ = [0.4 1 1.8 1.2; 0.8 0 0.6 1.4]

should give M = [0.6 0.8; −0.8 0.6] and t = [0.4; 0.8]: a rotation and a translation. (A figure shows each point pi alongside its image p̃i; note that p2 = p̃2.)

4. We have provided a function [P, P_tilde, M, t] = generate_data() that produces some random test data. It does so by filling random values in M and t, choosing random points pi, then setting each p̃i to M pi + t plus a small amount of random noise. You can visualize the data by calling the provided function visualize(P, P_tilde). Call affine_fit(P, P_tilde) to obtain your own estimates of M and t. The result should be close to the "ground truth" transformation returned by generate_data, although due to the added noise it will not be exactly identical. Call visualize(P, P_tilde, M, t) to see what your fit looks like.

Note: If you want to test on a smaller data set, just use the first few columns of P and P̃, for example P = P(:, 1:10) and P_tilde = P_tilde(:, 1:10).
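As an overall sanity check, the whole fitting pipeline can be prototyped in a few lines of NumPy. This is a hedged sketch only: your Matlab affine_fit must solve the system with your own least_squares (QR plus back-substitution), whereas here np.linalg.lstsq is used as a stand-in solver.

```python
import numpy as np

def affine_fit(P, P_tilde):
    # Stack one 2x6 design block per correspondence, as in Equation (5),
    # solve the resulting least-squares problem for beta, then unpack.
    blocks = [np.array([[x, y, 0, 0, 1, 0],
                        [0, 0, x, y, 0, 1]]) for x, y in P.T]
    A = np.vstack(blocks)             # 2k x 6 stacked design matrix
    rhs = P_tilde.T.reshape(-1)       # (x~1, y~1, x~2, y~2, ...)
    beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)   # stand-in solver
    return beta[:4].reshape(2, 2), beta[4:]

# Noise-free data from test case 3: a rotation plus a translation.
P = np.array([[0., 1, 1, 0],
              [0, 0, 1, 1]])
P_tilde = np.array([[0.4, 1, 1.8, 1.2],
                    [0.8, 0, 0.6, 1.4]])
M, t = affine_fit(P, P_tilde)
print(M)   # close to [0.6 0.8; -0.8 0.6]
print(t)   # close to [0.4; 0.8]
```

Because this data is noise-free, the stacked system is consistent and the fit recovers the transformation up to floating-point error; with the noisy data from generate_data, the recovered M and t will only be close to the ground truth.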