Powering Through MIT 18.06

Before getting into quantum computing, I decided to review linear algebra first; after all, it has been a year and a half since I last looked at it.
The plan is basically a quick pass through GSLA, which doubles as reading-speed practice.


1 Introduction to Vectors

1.1 Vectors and Linear combinations

  • The basic operations on vectors are scalar multiplication cv and vector addition v + w.
  • The sum of cv and dw is a linear combination of v and w.
  • There is a perfect match between a column vector, the arrow from the origin, and the point where the arrow ends.

1.2 Lengths and Dot Products

  • The dot product \(v \cdot w\) multiplies each component \(v_i\) by \(w_i\) and adds all \(v_iw_i\).
  • The length \(\|v\|\) of a vector is the square root of \(v \cdot v\).
  • Unit vectors u and U at angle \(\theta\) have \(u \cdot U = \cos \theta\).
  • Schwarz inequality: \(|v \cdot w| \le \|v\| \|w\|\).
  • Triangle inequality: \( \|v + w\| \le \|v\| + \|w\| \).
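A quick numerical check of the facts above, as a NumPy sketch (the two vectors are my own choice, not from the book):

```python
import numpy as np

v = np.array([3.0, 4.0])
w = np.array([4.0, 3.0])

dot = v @ w                         # v . w = sum of v_i * w_i   -> 24
len_v = np.sqrt(v @ v)              # ||v|| = sqrt(v . v)        -> 5.0
len_w = np.linalg.norm(w)           # same length via norm       -> 5.0

cos_theta = dot / (len_v * len_w)   # cos(theta) between v and w -> 0.96

# Schwarz inequality |v . w| <= ||v|| ||w||, triangle inequality ||v + w|| <= ||v|| + ||w||
assert abs(dot) <= len_v * len_w
assert np.linalg.norm(v + w) <= len_v + len_w
print(dot, len_v, cos_theta)
```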

1.3 Matrices

  • Ax = b:
    The matrix A acts on the vector x. The output Ax is a combination b of the columns of A (the column picture).
    Ax is also the vector of dot products with the rows of A (the row picture). Both views are checked in the sketch after this list.
  • The relationship between the Difference Matrix and the derivative:
    the Sum Matrix is the inverse of the Difference Matrix,
    just as integration is the inverse of differentiation.
  • Independent columns: Ax = 0 has one solution. A is an invertible matrix.
    Dependent columns: Cx = 0 has many solutions. C is a singular matrix.
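A small sketch of the two views of Ax, plus a singular example (the matrices are chosen here for illustration):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [-1.0, 1.0]])          # a 2x2 difference matrix
x = np.array([4.0, 5.0])

# Column view: Ax is x1*(column 1) + x2*(column 2) -- a combination of the columns
b_columns = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row view: each component of Ax is (row i of A) . x
b_rows = np.array([A[0, :] @ x, A[1, :] @ x])

assert np.allclose(A @ x, b_columns)
assert np.allclose(A @ x, b_rows)

# Dependent columns make C singular: Cx = 0 has many solutions
C = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.matrix_rank(C))      # 1, less than 2, so C is not invertible
```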

2 Solving Linear Equations

2.1 Vectors and Linear Equations

  • Column picture: Ax = b asks for a combination of columns to produce b.
  • Row picture: Each equation in Ax = b gives a line (n = 2) or a plane (n = 3) or a “hyperplane” (n > 3).
    They intersect at the solution or solutions, if any.
  • But Ax should be understood as a combination of the columns of A.

2.2 The idea of Elimination

  • A linear system (Ax = b) becomes upper triangular (Ux = c) after elimination.
  • The upper triangular system is solved by back substitution (starting at the bottom).
  • A zero in the pivot position can be repaired if there is a nonzero below it.
  • When breakdown is permanent, the system has no solution or infinitely many.

2.3 Elimination Using Matrices

  • The Identity Matrix has 1’s on the diagonal and otherwise 0’s.
    Then \(Ib = b\) for all b.
  • The Elementary Matrix or Elimination Matrix \(E_{ij}\) that subtracts a multiple \(l\) of row j from row i has the extra nonzero entry \(-l\) in the i, j position (still 1’s on the diagonal). A small numeric sketch follows this list.
  • The Row Exchange Matrix \(P_{ij}\) is the identity matrix with rows i and j reversed.
    When this Permutation Matrix \(P_{ij}\) multiplies a matrix, it exchanges rows i and j.
  • Elimination does the same row operations to A and to b. We can include b as an extra column and follow it through elimination. The matrix A is enlarged or augmented by the extra column b.
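A minimal sketch of an elimination matrix, a row-exchange matrix, and the augmented matrix (the 2 by 2 example is mine):

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [6.0, 7.0]])

# E_21 subtracts l = 3 times row 1 from row 2: extra entry -3 in position (2, 1)
E21 = np.array([[1.0, 0.0],
                [-3.0, 1.0]])
print(E21 @ A)                       # [[2, 4], [0, -5]] -- upper triangular

# P_12 is the identity with rows 1 and 2 exchanged
P12 = np.array([[0.0, 1.0],
                [1.0, 0.0]])
print(P12 @ A)                       # rows of A swapped

# Augment A by the extra column b; the same E_21 acts on b as well
b = np.array([[2.0], [4.0]])
print(E21 @ np.hstack([A, b]))       # [[2, 4, 2], [0, -5, -2]]
```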

2.4 Rules for Matrix Operations

  • The (i, j) entry of AB is (row i of A) \(\cdot\) (column j of B).
    AB is also the sum of these matrices: (column j of A) times (row j of B).
  • A row times a column is an inner product.
    A column times a row is an outer product.
    These are extreme cases of matrix multiplication.
  • Each column of AB is a combination of the columns of A. (matrix B acts from right)
    Each row of AB is a combination of the rows of B. (matrix A acts from left)
  • Block multiplication is allowed when the block shapes match correctly.
    Block elimination produces the Schur complement \(D - CA^{-1}B\).
  • The associative law for ABC is satisfied.
    The commutative law is usually broken.
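Checking the column-times-row view of matrix multiplication numerically (matrices chosen here):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# Usual rule: (AB)_ij = (row i of A) . (column j of B)
AB = A @ B

# Column-times-row rule: AB is a sum of rank-one outer products
outer_sum = sum(np.outer(A[:, j], B[j, :]) for j in range(A.shape[1]))
assert np.allclose(AB, outer_sum)

# Inner product (row times column) vs. outer product (column times row)
u, v = np.array([1.0, 2.0]), np.array([3.0, 4.0])
print(u @ v)                         # scalar 11.0
print(np.outer(u, v))                # 2x2 rank-one matrix
```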

2.5 Inverse Matrices

  • The inverse matrix gives \(AA^{-1} = I\) and \(A^{-1}A = I\)
  • The algorithm to test invertibility is elimination: A is invertible if and only if it has n pivots (row exchanges allowed).
    The algebra test for invertibility is the determinant of A: det A must not be zero.
    The equation that tests for invertibility is Ax = 0: x = 0 must be the only solution.
  • The inverse of \(AB\) is the reverse product \(B^{-1}A^{-1}\).
  • The Gauss-Jordan method solves \(AA^{-1} = I\) to find the n columns of \(A^{-1}\).
    The augmented matrix \([A I]\) is row-reduced to \([I A^{-1}]\).
  • If A is symmetric or triangular, so is \(A^{-1}\).
  • Diagonally dominant matrices are invertible. Each \(|a_{ii}|\) dominates its row.
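A bare-bones Gauss-Jordan sketch: row-reduce [A I] to [I A^-1]. It assumes A is invertible and that no zero pivot appears (no row exchanges); the test matrix is my own:

```python
import numpy as np

def gauss_jordan_inverse(A):
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])    # augmented matrix [A I]
    for i in range(n):
        M[i] = M[i] / M[i, i]                      # scale so the pivot becomes 1
        for k in range(n):
            if k != i:
                M[k] = M[k] - M[k, i] * M[i]       # clear column i above and below the pivot
    return M[:, n:]                                # right half is now A^-1

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
A_inv = gauss_jordan_inverse(A)
print(A_inv)                                       # [[1, -1], [-1, 2]]
assert np.allclose(A @ A_inv, np.eye(2))
assert np.allclose(A_inv @ A, np.eye(2))
```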

2.6 Elimination = Factorization: A = LU

  • Gaussian elimination (with no row exchanges) factors A into L times U.
  • The lower triangular L contains the multipliers \(l_{ij}\) that multiply pivot rows while going from A to U.
    The product LU adds those rows back to recover A (see the sketch after this list).
  • The original system Ax = b is factored into two triangular systems:
    solve Lc = b (forward) and Ux = c (backward)
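The same idea in numbers: a hand-built A = LU and the two triangular solves (example matrix and right side are mine):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [6.0, 8.0]])
b = np.array([5.0, 22.0])

# Elimination subtracts 3 * row 1 from row 2, so the multiplier l_21 = 3
L = np.array([[1.0, 0.0],
              [3.0, 1.0]])
U = np.array([[2.0, 1.0],
              [0.0, 5.0]])
assert np.allclose(L @ U, A)             # LU adds the row back to recover A

# Solve Lc = b forward, then Ux = c backward
c = np.array([b[0], b[1] - 3.0 * b[0]])  # forward substitution
x = np.empty(2)
x[1] = c[1] / U[1, 1]                    # back substitution, starting at the bottom
x[0] = (c[0] - U[0, 1] * x[1]) / U[0, 0]
assert np.allclose(A @ x, b)
print(x)                                 # [1.8, 1.4]
```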

2.7 Transposes and Permutations

  • The transpose puts the rows of A into the columns of \(A^T\). Then \((A^T)_{ij} = A_{ji}\).
  • The transpose of \(AB\) is \(B^TA^T\). The transpose of \(A^{-1}\) is the inverse of \(A^T\).
  • The dot product is \(x \cdot y = x^Ty\). Then \((Ax)^Ty\) equals the dot product \(x^T(A^Ty)\).
  • When A is symmetric \((A^T = A)\), its LDU factorization is symmetric: \(A = LDL^T\).
  • A Permutation Matrix P has the rows of the identity \(I\) in any order, and \(P^T = P^{-1}\).
  • If A is invertible then a Permutation P will reorder its rows for \(PA = LU\).
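Quick checks of the transpose rules and of PA = LU. This assumes SciPy is available; scipy.linalg.lu returns P, L, U with A = P L U, so \(P^TA = LU\):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 2.0],
              [3.0, 4.0]])               # needs a row exchange before elimination
B = np.array([[1.0, 2.0],
              [5.0, 6.0]])

assert np.allclose((A @ B).T, B.T @ A.T)                      # (AB)^T = B^T A^T
assert np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T))    # (A^-1)^T = (A^T)^-1

P, L, U = lu(A)                           # A = P L U
assert np.allclose(P.T @ A, L @ U)        # a permutation reorders the rows: (P^T) A = L U
assert np.allclose(P.T, np.linalg.inv(P)) # for a permutation matrix, transpose = inverse
```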

3 Vector Spaces and Subspaces

3.1 Spaces of Vectors

  • \(R^n\) contains all column vectors with n real components.
  • A vector space must be closed under linear combinations.
    A subspace containing v and w must contain all their combinations \(cv + dw\).
  • The combinations of the columns of A form the column space \(C(A)\). Then the column space is spanned by the columns.
  • \(Ax = b\) has a solution exactly when b is in the column space of A.

3.2 The Nullspace of A: solving \(Ax = 0\)

  • The nullspace \(N(A)\) is a subspace of \(R^n\). It contains all solutions to \(Ax = 0\).
  • Elimination produces an echelon matrix U, and then a row reduced R, with pivot columns and free columns.
    Every free column of U or R leads to a special solution.
    The complete solution to \(Ax = 0\) is a combination of the special solutions.
  • Suppose \(Ax = 0\) has more unknowns than equations (n > m, more columns than rows).
    Then there are nonzero solutions. There must be free columns.
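A small nullspace computation with SymPy's exact row reduction (the 2 by 3 matrix is mine; more columns than rows, so there must be a free column):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 8]])

R, pivot_cols = A.rref()     # row reduced form R and the pivot column indices
print(R)                     # [[1, 2, 0], [0, 0, 1]]
print(pivot_cols)            # (0, 2): columns 1 and 3 are pivots, column 2 is free

for s in A.nullspace():      # one special solution per free column
    print(s.T)               # [-2, 1, 0]
    assert A * s == Matrix([0, 0])
```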

3.3 The Rank and the Row Reduced Form

  • The rank r of A is the number of pivots.
    The true size of A is given by its rank.
  • The r pivot columns are not combinations of earlier columns.
    The n − r free columns are combinations of earlier columns (pivot columns).
  • Those combinations (using −F taken from R) give the n − r special solutions to \(Ax = 0\) and \(Rx = 0\).
    They are the n − r columns of the nullspace matrix N.

3.4 The Complete Solution to \(Ax = b\)

  • \(Ax = b\) is solvable if and only if the last m − r equations reduce to 0 = 0.
  • One particular solution \(x_p\) has all free variables equal to zero.
    The pivot variables are determined after the free variables are chosen.
  • \(x = x_p + x_n\)
    The particular solution \(x_p\) will be one point on the “line”.
    Adding the nullspace vector \(x_n\) will move us along the “line”.
  • Full column rank r = n means no free variables: exactly one solution when m = n, and one solution or none when m > n (overdetermined).
  • Full row rank r = m means one solution if m = n or infinitely many if m < n (underdetermined). A small example of the underdetermined case follows this list.
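A numerical sketch of x = x_p + x_n on a small consistent system (my own example, continuing the matrix used in the nullspace sketch above):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 8.0]])
b = np.array([6.0, 16.0])

# Particular solution: set the free variable x2 to zero and solve for the pivot
# variables x1 and x3:  x1 + 3*x3 = 6  and  2*x1 + 8*x3 = 16  give  x1 = 0, x3 = 2
x_p = np.array([0.0, 0.0, 2.0])
assert np.allclose(A @ x_p, b)

# Special solution from the free column: x_n solves Ax = 0
x_n = np.array([-2.0, 1.0, 0.0])
assert np.allclose(A @ x_n, np.zeros(2))

# Every x_p + c * x_n also solves Ax = b -- moving along the "line" of solutions
for c in (-1.0, 0.0, 3.5):
    assert np.allclose(A @ (x_p + c * x_n), b)
```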

3.5 Independence, Basis and Dimension

  • The columns of A are independent if x = 0 is the only solution to \(Ax = 0\).
  • The vectors \(v_1, \dots ,v_r\) span a space if their combinations fill that space.
  • A basis consists of linearly independent vectors that span the space.
    Every vector in the space is a unique combination of the basis vectors.
  • All bases for a space have the same number of vectors.
    This number of vectors in a basis is the dimension of the space.
  • The pivot columns are one basis for the column space. The dimension is r.
  • The vectors \(v_1, \dots ,v_n\) are a basis for \(R^n\) exactly when they are the columns of an n by n invertible matrix.
    Thus \(R^n\) has infinitely many different bases.
  • Space Z contains only the zero vector.
    The dimension of this space is zero.
    The empty set is a basis for Z (We never allow the zero vector into a basis).

3.6 Dimension of the Four Subspaces

[Figure: The big picture of linear algebra (1): relationships among the four fundamental subspaces]
  • Fundamental Theorem of Linear Algebra, Part 1:
    The column space \(C(A)\) and the row space \(C(A^T)\) both have dimension r (the rank of A).
    The nullspace \(N(A)\) has dimension n − r. The left nullspace \(N(A^T)\) has dimension m − r.
  • Elimination produces bases for the row space and nullspace of A: They are the same as for R.
  • Elimination often changes the column space and left nullspace (but dimensions don’t change).
  • Rank one matrices: \(A = uv^T = \text{column times row}\): \(C(A)\) has basis u, \(C(A^T)\) has basis v.
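Checking the dimension count on a rank-one example (matrix chosen here):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank one: column times row
m, n = A.shape
r = np.linalg.matrix_rank(A)             # 1

print("dim C(A)    =", r)                # column space:   1
print("dim C(A^T)  =", r)                # row space:      1
print("dim N(A)    =", n - r)            # nullspace:      2
print("dim N(A^T)  =", m - r)            # left nullspace: 1

# A = u v^T, so u spans C(A) and v spans C(A^T)
u = np.array([1.0, 2.0])
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(A, np.outer(u, v))
```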

4 Orthogonality

4.1 Orthogonality of the Four Subspaces

[Figure: The big picture of linear algebra (2): relationships among the four fundamental subspaces]
  • Fundamental Theorem of Linear Algebra, Part 2:
    \(N(A)\) is the orthogonal complement of the row space \(C(A^T)\) (in \(R^n\)).
    \(N(A^T)\) is the orthogonal complement of the column space \(C(A)\) (in \(R^m\)).
  • Subspaces V and W are orthogonal if every v in V is orthogonal to every w in W.
  • V and W are orthogonal complements if W contains all vectors perpendicular to V (and vice versa).
    Inside \(R^n\), the dimensions of complements V and W add to n.
    Orthogonality is impossible when dim V + dim W > dimension of whole space.
  • When a vector is in two orthogonal subspaces, it is perpendicular to itself, so it must be the zero vector.
  • Any n independent vectors in \(R^n\) will span \(R^n\).
  • Every x in \(R^n\) has a nullspace component \(x_n\) and a row space component \(x_r\).
  • Every vector b in column space comes from one and only one vector in the row space.
    From the row space to the column space, A is invertible (pseudoinverse).

4.2 Projections

  • The rank one projection matrix \(P = aa^T / a^Ta\) multiplies b to produce the projection p onto the line through a.
  • Projecting b onto a subspace leaves \(e = b - p\) perpendicular to that subspace.
    When \(P\) projects onto a subspace, \(I - P\) projects onto the perpendicular subspace.
  • To project any b onto the column space of any matrix A:
    The projection p is the combination \(A\hat{x}\) which is closest to b.
    The error vector \(b - A\hat{x}\) is perpendicular to the column space of A.
    Therefore \(b - A\hat{x}\) is in the nullspace of \(A^T\). This means \(A^T(b - A\hat{x}) = 0\).

    When A has full column rank n (independent columns), the equation \(A^TA\hat{x} = A^Tb\) leads to \(\hat{x}\) and \(p = A\hat{x}\).
  • The projection matrix \(P = A(A^TA)^{-1}A^T\) has \(P^T = P\) and \(P^2 = P\).
    The matrix A is typically rectangular, so it has no inverse.
    That is why we cannot split \((A^TA)^{-1}\) into \(A^{-1}\) times \((A^T)^{-1}\): there is no \(A^{-1}\) in the first place.
    When A has independent columns, \(A^TA\) is square, symmetric, and invertible (see the sketch after this list).
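A numerical sketch of the projection formulas (A and b are my own choice; A has independent columns):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations A^T A x_hat = A^T b, then p = A x_hat is the projection
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat
e = b - p

# The error is perpendicular to the column space: A^T e = 0
assert np.allclose(A.T @ e, np.zeros(2))

# Projection matrix P = A (A^T A)^-1 A^T with P^T = P and P^2 = P
P = A @ np.linalg.inv(A.T @ A) @ A.T
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)
assert np.allclose(P @ b, p)
```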

4.3 Least Squares Approximations

  • When \(Ax = b = p + e\) is impossible, \(A\hat{x} = p\) is solvable.
    b is outside the column space; the nearest point in the column space is its projection p, and the error \(e = b - p\) is orthogonal to the column space.
  • The best \(\hat{x}\) comes from the normal equations \(A^TA\hat{x} = A^Tb\).
    The best \(\hat{x}\) minimizes \(E = \|Ax - b\|^2\), the sum of the squares of the errors in the m equations (m > n).
  • If we try to fit m points by a combination of n < m functions, the m equations \(Ax = b\) are generally unsolvable.
    The n equations \(A^TA\hat{x} = A^Tb\) give the least squares solution: the combination with smallest MSE (mean square error). A worked example follows this list.
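A least-squares example: fitting a line C + Dt to three points (t, b) = (0, 6), (1, 0), (2, 0), chosen here (the same A and b as the projection sketch above), solved two equivalent ways:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations A^T A x_hat = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)                              # [5, -3]: the best line is 5 - 3t

# NumPy's built-in least squares gives the same answer
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_hat, x_lstsq)

# The minimized sum of squared errors E = ||A x_hat - b||^2
E = np.linalg.norm(A @ x_hat - b) ** 2
print(E)                                  # 6.0
```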

4.4 Orthogonal Bases and Gram-Schmidt

  • A matrix Q with orthonormal columns satisfies \(Q^TQ = I\)
    When Q is square (an orthogonal matrix), \(Q^TQ = I\) means that \(Q^T = Q^{-1}\): transpose = inverse.
  • Q preserves length: \(\|Qx\| = \|x\|\)
    Q preserves dot products: \((Qx)^T(Qy) = x^Ty\)
  • The projection onto the column space is \(P = QQ^T\)
    if Q is square then \(P = I\) and every \(b = q_1(q_1^Tb) + \dots + q_n(q_n^Tb)\).
  • Gram-Schmidt process: subtract from every new vector its projections in the directions already set
    In matrix form this is the factorization \(A = QR\).
    R is upper triangular because later q’s are not involved in the earlier columns of A (the key point of Gram-Schmidt).
  • The least squares solution of \(Qx = b\) is \(\hat{x} = Q^Tb\)
    The least squares solution of \(Ax = b\) is \(\hat{x} = R^{-1}Q^Tb\)
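A minimal classical Gram-Schmidt sketch that produces A = QR (it assumes the columns of A are independent; the test matrix is mine):

```python
import numpy as np

def gram_schmidt(A):
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float)
        for i in range(j):                  # subtract projections onto the directions already set
            R[i, j] = Q[:, i] @ A[:, j]
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = gram_schmidt(A)
assert np.allclose(Q.T @ Q, np.eye(2))      # orthonormal columns
assert np.allclose(Q @ R, A)                # A = QR
assert np.allclose(np.triu(R), R)           # R is upper triangular
```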

5 Determinants

5.1 The Properties of Determinants

  • The determinant is defined by det I = 1, sign reversal, and linearity in each row.
  • After elimination det A is \(\pm\) (product of the pivots).
  • The determinant is zero exactly when A is not invertible.
  • Two remarkable properties are det AB = (det A)(det B) and det \(A^T\) = det A.

5.2 Permutations and Cofactors

  • The first k pivots come from the k by k matrix \(A_k\) in the top left corner of A.
    The determinant of that corner submatrix \(A_k\) is \(d_1d_2 \dots d_k\).
  • BIG FORMULA: \(\det A = \sum (\det P)\, a_{1\alpha}a_{2\beta}\cdots a_{n\omega}\), summed over all n! column permutations \(P = (\alpha, \beta, \dots, \omega)\).
  • The cofactor \(C_{ij}\) is \((-1)^{i+j}\) times the smaller determinant that omits row i and column j (because \(a_{ij}\) uses that row and column).
  • The determinant is the dot product of any row of A with its row of cofactors.
    When a row of A has a lot of zeros, we only need a few cofactors.
    The idea behind cofactors is to reduce the order one step at a time.
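A tiny recursive cofactor-expansion determinant, purely to illustrate the idea (fine only for small n, since the cost grows like n!):

```python
import numpy as np

def det_by_cofactors(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Cofactor C_1j: (-1)^(1+j) times the smaller determinant that omits row 1 and column j
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += A[0, j] * (-1) ** j * det_by_cofactors(minor)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_by_cofactors(A))                   # 4.0
assert np.isclose(det_by_cofactors(A), np.linalg.det(A))
```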

5.3 Cramer’s Rule, Inverses, and Volumes

  • Cramer’s Rule solves \(Ax = b\) by ratios like \(x_1 = |B_1|/|A| = |b\; a_2 \dots a_n|/|A|\) (a short check follows this list).
  • When C is the cofactor matrix for A, the inverse is \(A^{-1} = C^T / \text{det A}\).
  • The volume of a box is |det A|, when the box edges going out from the origin are the rows of A.
  • The cross product is a vector with length \(\|u\|\,\|v\|\,|\sin\theta|\). Its direction is perpendicular to u and v. It points “up” or “down” by the right hand rule.
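A short check of Cramer's Rule on a 2 by 2 system (my own numbers): each \(x_i\) is \(\det B_i / \det A\), where \(B_i\) replaces column i of A by b:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

det_A = np.linalg.det(A)
x = np.empty(2)
for i in range(2):
    B_i = A.copy()
    B_i[:, i] = b                 # replace column i of A by b
    x[i] = np.linalg.det(B_i) / det_A

print(x)                          # [0.8, 1.4]
assert np.allclose(A @ x, b)
```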

6 Eigenvalues and Eigenvectors

6.1 Introduction to Eigenvalues

