CS195f Homework 1: Naive Bayes
Mark Johnson and Erik Sudderth
Homework due at 2pm, 24th September 2009
The Nursery database records a series of admission decisions to a nursery in Ljubljana,
Slovenia. We downloaded this data from http://archive.ics.uci.edu/ml/datasets/Nursery,
which you can see for more details if you’re interested.
The database contains one tuple for each admission decision. The features or attributes
include the financial status of the parents, the number of other children in the house, etc. The
first three tuples in the dataset are as follows:
usual,proper,complete,1,convenient,convenient,nonprob,recommended,recommend
usual,proper,complete,1,convenient,convenient,nonprob,priority,priority
usual,proper,complete,1,convenient,convenient,nonprob,not_recom,not_recom
where the first 8 values are features or attributes and the 9th value is the class assigned (i.e.,
the admission decision recommendation).
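To make this layout concrete, here is a minimal Matlab sketch that splits one of the tuples listed above into its eight attribute values and its class. The variable names (tuple, fields, features, label) are illustrative assumptions and are not part of the handout files.

```matlab
% One tuple copied from the listing above.
tuple = 'usual,proper,complete,1,convenient,convenient,nonprob,priority,priority';

% Split on commas; strsplit returns a 1-by-9 cell array of strings.
fields = strsplit(tuple, ',');

features = fields(1:8);   % the 8 attribute values
label    = fields{9};     % the class (admission recommendation)

fprintf('%d features, class = %s\n', numel(features), label);
```

Note that the 8th attribute and the class can take similar-looking values (here both are "priority"), so only the position in the tuple tells them apart.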
Your job is to build a Naive Bayes classifier that will make admission recommendations.
Luckily the really hard work of data preparation has been done for you by Deqing, our
fearless TA. The file
/course/cs195f/asgn/naive_bayes/handout/nursery/nursery.mat
contains this data in a matrix format that Matlab can directly read. All of the symbols have been
replaced with identifying integers. The first three rows of this matrix are:
>> load('/course/cs195f/asgn/naive_bayes/handout/nursery/nursery.mat');
>> data(1:3,:)
ans =
     1     1     1     1     1     1     1     1     2
     1     1     1     1     1     1     1     2     4
     1     1     1     1     1     1     1     3     1
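As a hedged illustration of working with this matrix, the sketch below separates the attribute columns from the class column and counts how often each class code occurs. It assumes the matrix keeps the same column order as the text tuples (8 attributes, then the class). To stay self-contained it uses only the three rows displayed above as a stand-in for the loaded data; the names X, y, classes, and counts are illustrative assumptions.

```matlab
% Stand-in for the loaded matrix: the three rows displayed above.
% With the real file, data would come from the load(...) call instead.
data = [1 1 1 1 1 1 1 1 2;
        1 1 1 1 1 1 1 2 4;
        1 1 1 1 1 1 1 3 1];

X = data(:, 1:8);   % attribute codes, one row per admission decision
y = data(:, 9);     % class codes (assumed to be the 9th column)

% Count the occurrences of each class code present in y.
classes = unique(y);
counts  = arrayfun(@(c) sum(y == c), classes);

disp([classes counts]);   % one row per class: [code, count]
```

Per-class counts like these are the kind of quantity from which class frequencies can be estimated when fitting a classifier.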
You should divide this data into equal-sized training and testing data sets as follows (the