You work for a construction firm that needs to accurately predict the compressive strength of concrete from variables including the concrete's composition and its age. You are aware that the relationship between these components and the concrete strength is complex; however, you have been asked to investigate how well a simple linear regression model works for prediction. Using the provided data (Concrete_Data.xls), develop models to predict:
Concrete strength from the single best indicator variable;
Concrete strength from all variables.
With the second model, determine whether any variables do not contribute significantly to the model, and what impact removing them has on prediction performance. Comment on the final model and its accuracy, and whether it would be appropriate to use this model in practice.
You should draw on the unit content concerning correlation and regression to answer this question. Note that you are not expected to use training/validation/testing data splits, although you are welcome to do so. No marks will be lost/gained for using/not using data splits.
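
One possible way to structure the analysis is sketched below. This is only an illustration under stated assumptions, not a model answer: it uses synthetic stand-in data rather than Concrete_Data.xls, the predictor names are hypothetical, the helper fit_ols is written for this example, and p-values use a normal approximation to the t distribution, which is reasonable only for large samples.

```python
import math
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept (illustrative helper, not a library API).

    Returns coefficients (intercept first), approximate two-sided p-values,
    R^2, and RMSE computed on the same data used for fitting.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = resid @ resid
    dof = n - A.shape[1]                           # residual degrees of freedom
    cov = (sse / dof) * np.linalg.inv(A.T @ A)     # coefficient covariance
    t = beta / np.sqrt(np.diag(cov))
    # Normal approximation to the t distribution (assumes large n).
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in t])
    r2 = 1.0 - sse / np.sum((y - y.mean()) ** 2)
    rmse = math.sqrt(sse / n)
    return beta, p, r2, rmse

# Synthetic stand-in data (assumption): two informative predictors, two noise ones.
rng = np.random.default_rng(0)
n = 500
names = ["x0", "x1", "x2", "x3"]                   # hypothetical variable names
X = rng.normal(size=(n, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)

# Model 1: the single predictor with the largest absolute correlation with y.
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
best = int(np.argmax(np.abs(corrs)))
_, _, r2_single, rmse_single = fit_ols(X[:, [best]], y)

# Model 2: all predictors.
beta_full, p_full, r2_full, rmse_full = fit_ols(X, y)

# Reduced model: keep predictors whose p-value is below 0.05 (skip the intercept).
keep = [j for j in range(X.shape[1]) if p_full[j + 1] < 0.05]
_, _, r2_reduced, rmse_reduced = fit_ols(X[:, keep], y)

print("best single predictor:", names[best])
print("single  R^2 / RMSE:", r2_single, rmse_single)
print("full    R^2 / RMSE:", r2_full, rmse_full)
print("kept variables:", [names[j] for j in keep])
print("reduced R^2 / RMSE:", r2_reduced, rmse_reduced)
```

To apply the same steps to the real data, X and y would be replaced with the predictor columns and the strength column loaded from Concrete_Data.xls (for example with pandas.read_excel, which needs an .xls-capable engine installed); which column holds the strength should be checked against the file. Because R^2 on the fitting data can never increase when variables are removed, the comparison of interest is how small the drop is, together with an inspection of the residuals.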
