Performing a Cluster Analysis

This business case uses a data set and an unsupervised method to derive information for a client heavily engaged in charitable contributions. The client wants to determine whether there are clear separation points in how often contributions are made over a period of time (for example, one time or every x months) and in the actual dollar amounts contributed. The client believes this information will help them find better and more efficient methods for their marketing campaigns. As the business analyst, you need to provide information from the PVA (contributions) dataset in SAS that will help your client understand this problem.

This Word document will be the basis for your responses to the client. Be sure all questions are answered completely and professionally. Do not use abbreviated or incomplete sentences when responding. This is a professional report back to the client. No handwritten responses will be accepted.

Please address each question or set of questions below in detail. Also, provide objective evidence (OE) for each as requested (the graphs that you create with the SAS tool).

  1. Perform a cluster analysis for all the variables that begin with “Gift” (as these variables provide the primary context for this analysis).

a. Provide a parallel coordinates plot.

b. How many clusters were generated? Why?

c. Which cluster ID had the largest variability? What does this tell the client?

d. Which two of the cluster centroids are closest? What does this tell the client?

  2. Change the model to four clusters. For the four-cluster solution:

e. Provide the parallel coordinates plot for the four-cluster model.

f. Derive a new Cluster ID variable.

g. How many distinct levels are in the new variable?
h. Create an auto chart of the new cluster variable. What does the -1 category represent? Provide the auto leveling chart.

  3. Change the model to three clusters. Examine the parallel coordinates plot for the three-cluster solution.

i. Provide the parallel coordinates plot for the three-cluster model.
j. Which cluster gave the most money per donation? What analysis can you provide the client relative to this result?
k. Which cluster donated most frequently? What analysis can you provide the client relative to this result?
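
Questions c and d ask about cluster variability and centroid distance. The following is a minimal Python sketch of what those two quantities mean, not a reproduction of how the SAS tool computes them. The rows, the two variables (average gift amount and gifts per year), and the helper names `summarize` and `closest_pair` are all hypothetical and invented for illustration; none of the values come from the PVA dataset. The sketch also assumes unscaled variables and plain Euclidean distance, whereas a real analysis would normally standardize the inputs first.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: (cluster_id, average gift amount, gifts per year).
# These values are invented for illustration and are NOT from the PVA dataset.
rows = [
    (1, 10.0, 4.0), (1, 12.0, 6.0),
    (2, 14.0, 5.0), (2, 16.0, 3.0),
    (3, 50.0, 1.0), (3, 90.0, 1.0),
]

def summarize(rows):
    """Return {cluster_id: (centroid, mean squared distance to centroid)}."""
    groups = {}
    for cid, *values in rows:
        groups.setdefault(cid, []).append(values)
    summary = {}
    for cid, points in groups.items():
        # The centroid is the per-variable mean of the cluster's members.
        centroid = [sum(col) / len(points) for col in zip(*points)]
        # Variability here is the average squared distance to that centroid.
        spread = sum(dist(p, centroid) ** 2 for p in points) / len(points)
        summary[cid] = (centroid, spread)
    return summary

def closest_pair(summary):
    """Return the two cluster IDs whose centroids are nearest (Euclidean)."""
    return min(
        combinations(sorted(summary), 2),
        key=lambda pair: dist(summary[pair[0]][0], summary[pair[1]][0]),
    )

summary = summarize(rows)
most_variable = max(summary, key=lambda cid: summary[cid][1])
nearest = closest_pair(summary)
print("Most variable cluster:", most_variable)
print("Closest centroids:", nearest)
```

In this toy data, a cluster with high variability contains donors whose giving patterns differ widely from one another, and two clusters with nearby centroids describe donor groups that behave similarly. Those are the kinds of interpretations the client-facing answers should spell out using the actual SAS output.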
