Data mining

  1. Present an example where data mining is crucial to the success of a business. What data mining functions does this business need? Could these functions instead be performed by data query processing or simple statistical analysis?
  2. Define each of the following data mining functionalities: characterization, discrimination, association and correlation analysis, classification, prediction, clustering, and evolution analysis. Give examples of each data mining functionality, using a real-life database that you are familiar with.
  3. Describe three challenges that data mining faces with respect to mining methodology and user interaction.
  4. Discuss issues to consider during data integration.
  5. Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
    (a) What is the mean of the data? What is the median?
    (b) What is the mode of the data? Comment on the data’s modality (e.g., bimodal, trimodal, etc.).
    (c) What is the midrange of the data?
    (d) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
    (e) Give the five-number summary of the data.
    (f) Show a boxplot of the data.
    (g) How is a quantile-quantile plot different from a quantile plot?
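
The summary statistics asked for in parts (a) to (e) of exercise 5 can be checked with a short script. The following is a minimal sketch using only Python's standard `statistics` module; the variable names are illustrative and not part of the exercise. Quartile conventions differ between textbooks and software, so the `method="exclusive"` setting here is an assumption, and another convention may give a slightly different Q1.

```python
import statistics

# Age values from exercise 5, already in increasing order.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

# (a) Mean and median.
mean = statistics.mean(ages)
median = statistics.median(ages)

# (b) All values that share the highest frequency.
modes = statistics.multimode(ages)

# (c) Midrange: average of the smallest and largest values.
midrange = (min(ages) + max(ages)) / 2

# (d) Quartiles. ASSUMPTION: the "exclusive" method is one of several
# common conventions; the exercise only asks for rough values.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="exclusive")

# (e) Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(ages), q1, median, q3, max(ages))

print("mean:", round(mean, 2))
print("median:", median)
print("modes:", modes)
print("midrange:", midrange)
print("Q1, Q3:", q1, q3)
print("five-number summary:", five_number)
```

The length of `modes` indicates the modality for part (b): two entries mean the data are bimodal. The five-number summary supplies the values needed to sketch the boxplot in part (f) by hand.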

Sample Solution