Description

Use the attached mushrooms dataset to complete the following tasks.
a. Read the dataset into R. Don’t set stringsAsFactors=FALSE.
b. How many variables, rows, and columns are there in the dataset? What are the data types?
c. Which variable needs to be removed from the dataset? Why? (Enter the answer as a comment).
d. Remove the variable identified in part c and run the str() function again. What do you notice? (Enter the answer as a comment).
e. In this dataset, what percentage of the mushrooms are poisonous and what percentage are non-poisonous? Use the prop.table() function to find out.
g. Use the entire dataset as a training dataset, and apply the OneR model.
h. What are the first three rules that result? Enter your answer as a comment.
i. Test the model on the test dataset (use the entire dataset as the test set) and identify the overall error rate. Enter the overall error rate as a comment.
j. What is another algorithm that can be used to improve upon the overall accuracy? Enter your answer as a comment. You needn’t run the algorithm.
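
The workflow in parts a through i can be sketched in base R as follows. This is only an illustrative outline, not the expected solution: the data frame below is a tiny made-up stand-in for the attached file, the column names (type, odor, cap_color, veil_type) and the file name mentioned in the comment are assumptions, and one_r() is a hypothetical hand-written helper rather than a packaged OneR implementation. Note that since R 4.0.0, read.csv() defaults to stringsAsFactors = FALSE, so the argument may need to be set to TRUE explicitly to get factor columns.

```r
# Toy stand-in for the attached data (values are invented for illustration).
# With the real file one would use something like:
#   mushrooms <- read.csv("mushrooms.csv", stringsAsFactors = TRUE)
# where the file name is an assumption.
mushrooms <- data.frame(
  type      = c("edible", "poisonous", "edible", "poisonous", "edible", "poisonous"),
  odor      = c("none", "foul", "almond", "foul", "none", "none"),
  cap_color = c("brown", "brown", "white", "white", "brown", "white"),
  veil_type = rep("partial", 6),
  stringsAsFactors = TRUE
)

# (b) Dimensions and data types
dim(mushrooms)
str(mushrooms)

# (c)/(d) A factor with a single level carries no information for prediction;
# find such columns and drop them, then inspect the structure again.
n_levels <- sapply(mushrooms, nlevels)
constant_cols <- names(n_levels)[n_levels == 1]
mushrooms <- mushrooms[, !(names(mushrooms) %in% constant_cols), drop = FALSE]
str(mushrooms)

# (e) Class percentages
class_share <- round(prop.table(table(mushrooms$type)) * 100, 1)
print(class_share)

# (g) Minimal 1R sketch: for each predictor, predict the majority class
# within each level, then keep the predictor with the fewest training errors.
one_r <- function(data, target) {
  predictors <- setdiff(names(data), target)
  errors <- sapply(predictors, function(p) {
    tab <- table(data[[p]], data[[target]])
    sum(tab) - sum(apply(tab, 1, max))  # rows outside the majority class
  })
  best <- names(which.min(errors))
  tab <- table(data[[best]], data[[target]])
  rules <- colnames(tab)[apply(tab, 1, which.max)]
  names(rules) <- rownames(tab)
  list(feature = best, rules = rules)
}

model <- one_r(mushrooms, "type")

# (h) First three rules (level of the chosen feature -> predicted class)
print(model$feature)
print(head(model$rules, 3))

# (i) Predict on the same data and compute the overall error rate
predicted <- model$rules[as.character(mushrooms[[model$feature]])]
error_rate <- mean(predicted != as.character(mushrooms$type))
print(error_rate)
```

In practice, a packaged implementation such as OneR() from the RWeka package (which requires Java) would typically replace the hand-written helper; the helper is shown here only to make the rule-selection step visible.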
