Problem 2: 10-fold Cross Validation

Consider a two-class classification problem with zero-one loss and training data set $\mathcal{X}^{\mathrm{train}} = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, with class labels $y_i \in \{0, 1\}$. Given a test data point $x$, recall that the $k$-nearest neighbor classifier calculates $\hat{y}$, the predicted class of $x$, as follows:

- Find the $k$ points in $\{x_1, \ldots, x_n\}$ that are closest to $x$ (in terms of Euclidean distance in $\mathbb{R}^d$).
- Predict $\hat{y}$ to be the majority class among those $k$ closest points.

We denote the above prediction by $\hat{y} = f(x; K, \mathcal{X}^{\mathrm{train}})$.

Given a data set $\mathcal{X} = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, describe a step-by-step 10-fold cross validation procedure to choose an optimal value for the parameter $K$ out of the values $\{1, 3, 5, 7, 9\}$. You should use the notation $f$ defined above. Address the following issues:

- What are the 10 folds?
- For each fold, what do you use as the training data and what do you use as the validation data?
- What quantity do you compare for $K \in \{1, 3, 5, 7, 9\}$?
- How do you determine which $K$ is optimal?

Nearest neighbor methods often work surprisingly well. Can you think of a reason why they may nonetheless be an inconvenient choice for an application running, for example, on a phone or a digital camera?
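
For concreteness, the procedure described above could be organized in R roughly as follows. This is only an illustrative sketch, not the written answer the problem asks for; the function name cv_choose_K, the use of knn() from the class package as $f$, and the random fold assignment are assumptions, and y is assumed to be a factor of class labels.

    # Illustrative sketch of 10-fold cross validation over K (assumptions noted above).
    library(class)

    cv_choose_K <- function(X, y, Ks = c(1, 3, 5, 7, 9), n_folds = 10) {
      n <- nrow(X)
      # The 10 folds: a random partition of the n indices into 10 roughly equal parts.
      fold_id <- sample(rep(1:n_folds, length.out = n))
      cv_error <- sapply(Ks, function(K) {
        fold_err <- sapply(1:n_folds, function(f) {
          val   <- which(fold_id == f)   # validation data: fold f
          train <- which(fold_id != f)   # training data: the remaining 9 folds
          pred  <- knn(X[train, , drop = FALSE], X[val, , drop = FALSE],
                       cl = y[train], k = K)
          mean(pred != y[val])           # empirical error on the held-out fold
        })
        mean(fold_err)                   # quantity compared across K: average CV error
      })
      Ks[which.min(cv_error)]            # the optimal K minimizes the average CV error
    }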

Problem 3: Cross validating a nearest neighbor classifier

A nearest neighbor classifier requires a parameter (the number $K$ of neighbors used to classify). We will use cross validation to select the value of $K$ for a specific type of data, handwritten digits.

Download the digit data set from Courseworks. The zip archive contains two text files. The file uspsdata.txt contains a matrix with one data point (a vector of length 256) per row; the 256-vector in each row represents a 16 x 16 image of a handwritten digit. The file uspscl.txt contains the corresponding class labels. The data contains two classes, the digits 5 and 6, so the class labels are stored as -1 and +1, respectively. The image on the right shows the first row, re-arranged as a 16 x 16 matrix and plotted as a gray scale image.

1. Read the data into R from the data set. It can be more convenient to model categorical class data with the factor data type in R; use the function as.factor to transform the class labels to the factor type.

2. Plot the first four images using the following function. Note that the input x should be a numerical vector of length 256.

       image.print <- function(x) {
         x.matrix <- matrix(x, 16, 16, byrow = FALSE)
         x.matrix.rotated <- t(apply(x.matrix, 1, rev))
         image(x.matrix.rotated, axes = FALSE, col = grey(seq(0, 1, length.out = 256)))
       }

   The original image was scanned bottom up from the right, so we first transform x into a 16 x 16 matrix, rotate, and transpose the data.

3. Randomly split the data into:
   - Training data (60% of the entire data set)
   - Test data (20%)
   - Validation data (20%)

4. The R function knn in library class implements a k-nearest neighbor classifier. We will only work with two classes and odd values of $K$, so you do not have to implement tie-breaking.
   - Train a 1-nearest neighbor classifier using the training data and predict the labels of the images in the test data. What is the test error (the empirical error rate on the test set)?
   - Plot the misclassified images.

5. Select $K$ as follows:
   - For $K \in \{1, 3, 5, 7, 9, 13\}$, train the k-nn classifier on the training data and classify the images in the test set. Compute the test error for each $K$.
   - Which value of $K$ should you choose? Why?
   - Finally, compute the error rate of the classifier for the optimal value of $K$ on the validation set.
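
The numbered steps above could be sketched in R along the following lines; these sketches are illustrative only and are not part of the assignment. For steps 1 and 2, assuming the two files sit in the working directory, and with read.table and scan as one possible (not prescribed) choice of reader:

    # Steps 1-2 (sketch): load the USPS data and plot the first four digits.
    usps_x <- as.matrix(read.table("uspsdata.txt"))   # one 256-pixel image per row
    usps_y <- as.factor(scan("uspscl.txt"))           # class labels -1 / +1 as a factor

    par(mfrow = c(2, 2))                              # 2 x 2 grid of plots
    for (i in 1:4) image.print(usps_x[i, ])           # uses image.print defined above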

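For step 3, one way to draw the random 60/20/20 split; the use of set.seed and sample, and the variable names, are assumptions.

    # Step 3 (sketch): random split into training (60%), test (20%), validation (20%).
    set.seed(1)                                   # arbitrary seed, fixed only for reproducibility
    n     <- nrow(usps_x)
    idx   <- sample(n)                            # random permutation of the row indices
    n_tr  <- floor(0.6 * n)
    n_te  <- floor(0.2 * n)
    train_idx <- idx[1:n_tr]
    test_idx  <- idx[(n_tr + 1):(n_tr + n_te)]
    val_idx   <- idx[(n_tr + n_te + 1):n]         # the remaining ~20%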

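For step 4, a sketch of the 1-nearest neighbor fit with knn from the class package, the test error, and the misclassified images; variable names continue from the sketches above.

    # Step 4 (sketch): 1-NN trained on the training data, evaluated on the test data.
    library(class)
    pred_1nn <- knn(usps_x[train_idx, ], usps_x[test_idx, ],
                    cl = usps_y[train_idx], k = 1)
    test_err_1nn <- mean(pred_1nn != usps_y[test_idx])  # empirical error rate on the test set
    wrong <- test_idx[pred_1nn != usps_y[test_idx]]     # rows misclassified by 1-NN
    for (i in wrong) image.print(usps_x[i, ])           # plot each misclassified image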

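For step 5, a sketch that computes the test error for each candidate K, picks the minimizer, and reports its error on the validation set; the candidate set is taken from the problem statement.

    # Step 5 (sketch): choose K by test error, then evaluate the chosen K on the validation set.
    Ks <- c(1, 3, 5, 7, 9, 13)                    # candidate values from the problem statement
    test_errs <- sapply(Ks, function(K) {
      pred <- knn(usps_x[train_idx, ], usps_x[test_idx, ],
                  cl = usps_y[train_idx], k = K)
      mean(pred != usps_y[test_idx])
    })
    K_best   <- Ks[which.min(test_errs)]          # K with the smallest test error
    pred_val <- knn(usps_x[train_idx, ], usps_x[val_idx, ],
                    cl = usps_y[train_idx], k = K_best)
    val_err  <- mean(pred_val != usps_y[val_idx]) # error rate on the validation set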