Design, Build, and Evaluate an Ensemble Prediction (Classification) Model

Scenario
For this project, imagine you have been hired as a data scientist in the business analytics unit of a large German bank.

The bank wants to use its historical loan data to predict the credit rating (risk) of new loan prospects. Using these credit-rating predictions, the bank can make more informed loan approval or denial decisions that preserve the bank’s assets and reduce its loan defaults.

The bank keeps historical data on the loans it has extended to its customers, including the final classification of each loan. There are two classifications:
• If the loan has been fully paid off, the loan is classified as a good credit rating (low risk to the bank).
• If the loan has defaulted (was not paid off), the loan is classified as a bad credit rating (high risk to the bank).

Your first task at the bank is to use the bank’s historical data to design, build, and evaluate an ensemble prediction model to predict the credit rating of new loan prospects. The ensemble model should include at least two different classification algorithms. You are free to use any of the ensemble methods (voting, bagging, boosting, or random forest) for your model.
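
As a rough illustration of the design, build, and evaluate steps, the sketch below combines three simple classifiers by majority (hard) voting and scores the result on a held-out split. Everything in it is an assumption made for illustration: the data is synthetic rather than the bank’s loan data, the two numeric features are hypothetical, and the class names (DecisionStump, NearestCentroid, KNearestNeighbors) are small hand-written stand-ins. A real solution would more likely use a library such as scikit-learn, with its VotingClassifier or a bagging, boosting, or random forest estimator.

```python
import random
from collections import Counter

def make_synthetic_loans(n, seed=0):
    """Generate hypothetical loans: two scaled features, label 1 = good, 0 = bad."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumption: good and bad loans are centered at different feature values.
        rows.append((rng.gauss(1.0 if label else -1.0, 1.0),
                     rng.gauss(-1.0 if label else 1.0, 1.0)))
        labels.append(label)
    return rows, labels

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

class DecisionStump:
    """One-level decision tree: thresholds a single feature."""
    def fit(self, X, y):
        best = (-1.0, 0, 0.0, 1)  # (train accuracy, feature, threshold, label above)
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                for above in (0, 1):
                    preds = [above if row[j] > t else 1 - above for row in X]
                    acc = sum(p == truth for p, truth in zip(preds, y)) / len(y)
                    if acc > best[0]:
                        best = (acc, j, t, above)
        _, self.feature, self.threshold, self.above = best
        return self

    def predict(self, X):
        return [self.above if row[self.feature] > self.threshold else 1 - self.above
                for row in X]

class NearestCentroid:
    """Predicts the class whose training mean is closest."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            members = [row for row, truth in zip(X, y) if truth == label]
            self.centroids[label] = [sum(col) / len(members) for col in zip(*members)]
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: sq_dist(row, self.centroids[c]))
                for row in X]

class KNearestNeighbors:
    """Predicts the majority label among the k closest training rows."""
    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for row in X:
            nearest = sorted(zip(self.X, self.y),
                             key=lambda pair: sq_dist(row, pair[0]))[:self.k]
            preds.append(Counter(label for _, label in nearest).most_common(1)[0][0])
        return preds

def hard_vote(models, X):
    """Majority vote across models; three binary voters cannot tie."""
    all_preds = [model.predict(X) for model in models]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*all_preds)]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_counts(y_true, y_pred):
    """Counts with 'good' (1) treated as the positive class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[key] += 1
    return counts

# Build: train on the first 150 synthetic loans, hold out the last 50.
X, y = make_synthetic_loans(200, seed=42)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
models = [DecisionStump().fit(X_train, y_train),
          NearestCentroid().fit(X_train, y_train),
          KNearestNeighbors(k=5).fit(X_train, y_train)]

# Evaluate: compare each base model with the voting ensemble on held-out data.
for model in models:
    print(type(model).__name__, accuracy(y_test, model.predict(X_test)))
ensemble_preds = hard_vote(models, X_test)
ensemble_accuracy = accuracy(y_test, ensemble_preds)
ensemble_confusion = confusion_counts(y_test, ensemble_preds)
print("Ensemble", ensemble_accuracy, ensemble_confusion)
```

For a bank, the confusion counts often matter more than raw accuracy, because approving a loan that later defaults (a false positive here) usually costs more than declining a loan that would have been repaid.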
