We consider the problem of learning a vector-valued function $f:\mathbb{R}^d \to \mathbb{R}^p$ from input-output training data $(x_i, y_i)_{i=1}^n$, where each $x_i$ is a $d$-dimensional vector and each $y_i$ is a $p$-dimensional vector. We choose our hypothesis class to be the set of linear functions from $\mathbb{R}^d$ to $\mathbb{R}^p$, that is, functions satisfying $f(x) = W^\top x$ for some $d \times p$ regression matrix $W$, and we want to minimize the squared error loss $J(W) = \sum_{i=1}^n \| W^\top x_i - y_i \|_2^2$ over the training data.
Let $W^*$ be the minimizer of the empirical risk:
- Derive a closed-form solution for $W^*$ as a function of the data matrices $X \in \mathbb{R}^{n \times d}$ and $Y \in \mathbb{R}^{n \times p}$ (whose rows are $x_i^\top$ and $y_i^\top$, respectively).
- Show that solving the problem from the previous question is equivalent to independently solving $p$ classical linear regression problems (one for each component of the output vector), and give an example of a multivariate regression task where performing an independent regression for each output variable is not the best approach.
- The low rank regression algorithm addresses the issue described in the previous question by imposing a low rank constraint on the regression matrix $W$. Intuitively, the low rank constraint encourages the model to capture linear dependencies among the components of the output vector.
Propose an algorithm to minimize the squared error loss over the training data subject to a low rank constraint on the regression matrix W :
$$\min_{W \in \mathbb{R}^{d \times p}} J(W) \quad \text{subject to} \quad \operatorname{rank}(W) \le R$$
(hint: there are different ways to do this. Leverage the fact that $\operatorname{rank}(W) \le R$ if and only if there exist $A \in \mathbb{R}^{d \times R}$ and $B \in \mathbb{R}^{R \times p}$ such that $W = AB$.)
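As a numerical sanity check for the first two questions, the sketch below (with arbitrary illustrative dimensions) compares the closed-form solution $W^* = (X^\top X)^{-1} X^\top Y$, valid when $X$ has full column rank, against $p$ independent single-output least-squares fits, one per column of $Y$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 50, 4, 3  # illustrative sizes
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, p))

# Closed form: W* = (X^T X)^{-1} X^T Y, computed via a linear solve
W_star = np.linalg.solve(X.T @ X, X.T @ Y)

# Column-wise: p independent classical linear regressions, one per output
W_cols = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(p)]
)

print(np.allclose(W_star, W_cols))
```

The two computations agree because the Frobenius-norm objective decomposes as a sum of $p$ per-column squared errors, each depending on only one column of $W$.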
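Following the hint, one possible algorithm (a sketch, not the only answer) is alternating least squares on the factorization $W = AB$: with $B$ fixed, the optimal $A$ is $(X^\top X)^{-1} X^\top Y B^\top (B B^\top)^{-1}$, and with $A$ fixed, $B$ is an ordinary regression of $Y$ onto the reduced features $XA$. The function name and iteration count below are illustrative choices:

```python
import numpy as np

def low_rank_regression(X, Y, R, n_iters=200):
    """Alternating least squares for min_{A,B} ||X A B - Y||_F^2,
    which enforces rank(W) <= R through the factorization W = A B."""
    rng = np.random.default_rng(0)
    B = rng.standard_normal((R, Y.shape[1]))  # random init for B
    XtX, XtY = X.T @ X, X.T @ Y
    for _ in range(n_iters):
        # A-step: exact minimizer A = (X^T X)^{-1} X^T Y B^T (B B^T)^{-1}
        A = np.linalg.solve(XtX, XtY @ B.T) @ np.linalg.inv(B @ B.T)
        # B-step: regress Y onto the R-dimensional features Z = X A
        Z = X @ A
        B = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    return A @ B

# Demo on synthetic targets that are exactly rank-R
rng = np.random.default_rng(1)
n, d, p, R = 60, 6, 4, 2
X = rng.standard_normal((n, d))
Y = X @ rng.standard_normal((d, R)) @ rng.standard_normal((R, p))
W = low_rank_regression(X, Y, R)
```

Each subproblem is solved exactly, so the objective is non-increasing across iterations. An alternative approach worth mentioning in an answer is reduced-rank regression, which obtains the constrained minimizer in closed form from an SVD rather than by alternating updates.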