
```python
        assert keep.ndim == 1, \
            f"*** ERROR: Output should be a 1-D array, not {keep.ndim}-D. ***"
        assert len(keep) == n_out, \
            f"*** ERROR: Should have returned {n_out} elements, not {len(keep)}. ***"
        assert len(set(keep)) == n_out, \
            f"*** ERROR: Output is of the wrong length or contains duplicate elements. ***"
        assert ((0 <= keep) & (keep < n)).all(), \
            f"*** ERROR: Output contains invalid (out-of-bounds) values ***"
        assert isin(k_smaller, keep).all(), \
            f"*** ERROR: Elements {setdiff1d(k_smaller, keep)} are missing. ***"
    except:
        print("=== Inputs ===")
        print("- Input array, `y`:", y)
        print("- Smaller group positions (must be included):", k_smaller)
        print("- Larger group positions (choose an equal-sized subset):", k_larger)
        print("\n=== Your output ===")
        print("- Keep-set:", keep)
        raise

for trial in range(10):
    print(f"=== Trial #{trial} / 9 ===")
    ex1_check()

###
### AUTOGRADER TEST - DO NOT REMOVE
###

print("\n(Passed.)")
```

```
=== Trial #0 / 9 ===
=== Trial #1 / 9 ===
=== Trial #2 / 9 ===
=== Trial #3 / 9 ===
=== Trial #4 / 9 ===
=== Trial #5 / 9 ===
=== Trial #6 / 9 ===
=== Trial #7 / 9 ===
=== Trial #8 / 9 ===
=== Trial #9 / 9 ===

(Passed.)
```

1.5.2 Precomputed solution for Exercise 1

Here is some code to load a precomputed solution for this exercise. Regardless of whether your Exercise 1 works or not, please run this cell now so subsequent exercises can continue. It will create two variables named keep_train_ds and keep_test_ds, two 1-D Numpy array-like objects that indicate which samples to keep from the training and testing datasets, respectively.
You’ll need these two keep-sets later, so do not modify them!

[21]:
```python
with open(get_path('ex1_soln.pickle'), 'rb') as fp:
    keep_train_ds = pickle.load(fp)
    keep_test_ds = pickle.load(fp)

print(keep_train_ds.shape)
print(keep_test_ds.shape)
```

```
(1096978,)
(275268,)
```

1.5.3 Reassessing the baseline classifier

Suppose we down-sample the test set, so that we test on equal numbers of “0” and “1” examples. How does the accuracy change?

[22]:
```python
# Down-sample the testing data:
X_test_ds = X_test[keep_test_ds, :]
y_test_ds = y_test[keep_test_ds]

# Reevaluate the classifier:
test(baseline_classifier, X_test_ds, y_test_ds)
```

```
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-22-f797eee81c34> in <module>
      4
      5 # Reevaluate the classifier:
----> 6 test(baseline_classifier, X_test_ds, y_test_ds)

NameError: name 'baseline_classifier' is not defined
```

(The NameError means the cell defining baseline_classifier was not run in this preview; with the earlier cells run, this cell reports the down-sampled test accuracy.)

Observation. You should see the test accuracy drop to near 50%. Recall that this balanced, down-sampled test set has equal numbers of 0 and 1 examples, so this accuracy is no better than random guessing!

1.6 Up-sampling

Whereas down-sampling shrinks the larger groups so that they have the same number of samples as the smaller group, up-sampling does the opposite: it takes the smaller group and makes it bigger by
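To see concretely why a majority-class baseline falls to chance level on a balanced test set, here is a small self-contained sketch. The names below (always_zero, the dummy feature arrays) are illustrative stand-ins, not the notebook's actual test() or baseline_classifier:

```python
import numpy as np

# Hypothetical stand-in for a majority-class baseline: always predict "0".
def always_zero(X):
    return np.zeros(len(X), dtype=int)

rng = np.random.default_rng(6040)

# Imbalanced labels: roughly 80% zeros, 20% ones.
y_imbalanced = rng.choice([0, 1], size=10_000, p=[0.8, 0.2])

# Balanced labels: equal numbers of zeros and ones.
y_balanced = np.concatenate([np.zeros(5_000, dtype=int),
                             np.ones(5_000, dtype=int)])

# Features are irrelevant to a constant classifier; use placeholders.
X_dummy_imb = np.empty((len(y_imbalanced), 1))
X_dummy_bal = np.empty((len(y_balanced), 1))

acc_imb = (always_zero(X_dummy_imb) == y_imbalanced).mean()
acc_bal = (always_zero(X_dummy_bal) == y_balanced).mean()

print(f"Imbalanced accuracy: {acc_imb:.3f}")  # near 0.8
print(f"Balanced accuracy: {acc_bal:.3f}")    # exactly 0.5
```

On the imbalanced labels the constant classifier looks deceptively good (close to the majority fraction), but on the balanced labels it is exactly 50%, the same as guessing.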
randomly selecting elements with replacement. Replacement is necessary because the smaller group must repeat elements to match the size of a larger group. Conveniently, the choice() function mentioned before lets you sample with replacement via the parameter replace=True.

In contrast to down-sampling, up-sampling avoids throwing out data. However, that benefit comes at a price: training (or testing) on the now-larger number of up-sampled inputs costs more.

1.6.1 Exercise 2: Up-sampling (1 point)

Suppose you are given a 1-D Numpy vector, y, whose values are either 0 or 1. Implement a function, upsample(y), that implements an up-sampling strategy, summarized as follows.
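The exercise's strategy summary is cut off in this preview. As general background only, not the exercise's official solution, here is one hedged sketch of up-sampling with replacement using choice(replace=True); the function name upsample_sketch and its return convention (a keep-set of positions) are assumptions:

```python
import numpy as np

def upsample_sketch(y, seed=None):
    """One possible up-sampling keep-set: keep every position of the larger
    group, and draw (with replacement) enough positions from the smaller
    group to match the larger group's size."""
    rng = np.random.default_rng(seed)
    k_zero = np.where(y == 0)[0]
    k_one = np.where(y == 1)[0]
    if len(k_zero) <= len(k_one):
        k_small, k_large = k_zero, k_one
    else:
        k_small, k_large = k_one, k_zero
    # Sampling WITH replacement is required: the smaller group must repeat
    # elements to reach the larger group's size.
    extra = rng.choice(k_small, size=len(k_large), replace=True)
    return np.concatenate([k_large, extra])

# Six zeros and two ones; the ones get repeated up to six samples.
y_demo = np.array([0, 0, 0, 0, 0, 0, 1, 1])
keep = upsample_sketch(y_demo, seed=1)
print(len(keep))  # 12: two balanced groups of 6
print((y_demo[keep] == 0).sum(), (y_demo[keep] == 1).sum())  # 6 6
```

Note that, unlike down-sampling, every position in keep indexes back into the original y, so repeated positions simply duplicate rows when used to select samples.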
