Use the following learning schemes to analyze the attached files adult datasets (here is the description of the data). The files were converted to Weka file adult-Train.arff and adult-Test.arff.
• ZeroR (majority class)
• Naive Bayes Simple
Target field is CLASS (i.e., income). For test options, first choose "Use training set", and then choose "Percentage Split" using default 66% percentage split.
1. Report each classification model, percent error rate, and accuracy. Because each learning schemes will be modeled two different test options, in total you should include 8 different classification models in your reports. For example, for ZeroR learning schema, two models will be generated one with “Use training set” and the other with “Percentage Split”. Below are the steps to construct the classification model for ZeroR algorithm with test option – “Use training set”.
a. Select a Classify learning schema: ZeroR
b. Training adult-Train by specifying “Use training set” as the test option. Report what classifier you get and its accuracy
c. Next, specify adult-Test as the supplied test set. Report what classifier you get and how its accuracy compare to the one in the previous step
d. Now using the other test option – “Percentage Split”. Report what classifier you get and its accuracy.
e. Repeat step c. What do you observe? Does the accuracy on the test set improve and if so, why do you think it does?
f. Following the step a to e for other learning schemas: Naïve Bayes Simple, J4.8, and RandomForest.
2. Once you have built 8 models, which of these classifiers are you more likely to trust when determining whether the income is equal or grater than $50,000? Why?
3. What can you say about accuracy when using training set data and when using a separate percentage to train?
Recently Asked Questions
- Computer-controlled automated machines that work alone and that are NOT integrated to work with other machines or systems are called: POSSIBLE ANSWERS:
- Define fiscal policy and its key objectives. What government agencies are responsible for marking decision on fiscal policy actions and implementation?
- Define "Industry", "Business" and "Sector". How are these related?