36-462/36-662 Data Mining Homework 5 Solution
TA: Cong Lu
Problem 3
Figure 1 shows the CV error curve, and the degrees of freedom is 19. Figure 2 shows the
curve computed from the right-hand side of fo
Data Mining 36-462/36-662
Homework 3 Solution
Cong Lu
March 20, 2013
Problem 1 (25 points)
We generate the data $\{(x, y) : x = \cos(\theta),\ y = \sin(\theta),\ \theta \in [0, 6\pi]\}$ in Figure 1.
First, apply the traditional
Data Mining: 36-462/36-662
Homework 4 Solutions
Jack Rae
April 15, 2013
Problem 1 [30]
(a) [10]
Recall the least squares criterion
$S(\beta) = \|y - X\beta\|_2^2 \qquad (1)$
If we let $\hat\beta$ be the minimizer of (1), and $\tilde\beta = \hat\beta + c v$
Data Mining: 36-462/36-662
Homework 1 Solutions
Cong Lu
February 6, 2013
Problem 1 (25 points)
(a) (5 points) See the attached code for computing dtm1 and dtm2. We compute dtm1
by normalizing each row
Principal Components: Mathematics, Example,
Interpretation
36-350: Data Mining
18 September 2009
Reading: Section 3.6 in the textbook.
Contents
1 Mathematics of Principal Components
1.1 Minimizing Pro
Data Mining: 36-462/36-662
Homework 2 Solutions
Jack Rae
February 19, 2013
Problem 1
(a) The following code implements the ch.index function.
ch.index = function(x,kmax,iter.max=100,nstart=10,algorith
Lecture 2: More Similarity Searching;
Multidimensional Scaling
36-350: Data Mining
28 August 2009
Reading: Principles of Data Mining, sections 14.1-14.4 (skipping 14.3.3 for
now) and 3.7.
Let's recap whe
Predicting Quantitative Features: Regression
36-350, Data Mining
6 October 2008
Reading: sections 6.1-6.3 and 11.1 in Principles of Data Mining.
Optional Reading: chapter 1 of Berk.
We've already looked
Additive Models
36-350, Data Mining, Fall 2009
2 November 2009
Readings: Principles of Data Mining, pp. 393-395; Berk, ch. 2.
Contents
1 Partial Residuals and Backfitting for Linear Models
2 Additive M
R Environment
-R is an integrated suite of software facilities for data manipulation, calculation and graphical
display.
-effective data handling and storage facility
-a suite of operators for calcula
Lecture 3 Page Rank
36-350, Data Mining
31 August 2009
The combination of the bag-of-words representation, cosine distance, and
inverse document frequency weighting forms the core of lots of informati
Similarity and Invariance; Searching for Similar
Images
36-350: Data Mining
2 September 2009
Reading: Section 14.5 in the textbook.
So far, we have seen how to search and categorize texts by represent
Making Better Features
36-350: Data Mining
16 September 2009
Reading: Sections 2.4, 3.4 and 3.5 in the textbook.
Contents
1 Standardizing and Transforming
2 Relationships Among Features and Low-Dime
Lecture 1: Similarity Searching and Information
Retrieval
36-350, Data Mining
26 August 2009
Readings: Principles of Data Mining, ch. 1, and sections 14.1
and 14.3.0-14.3.1.
One of the fundamental prob
Dimension reduction 1: Principal component
analysis
Ryan Tibshirani
Data Mining: 36-462/36-662
February 5, 2013
Optional reading: ISL 10.2, ESL 14.5
Clustering as dimension reduction
We've thought a