Data Analysis

Use the Excel spreadsheet “BRFSS” and the “BRFSS Dataset Codebook” on D2L to answer the following questions. Provide the output tables and code for this assignment along with this sheet. You may also combine them into this document (by copying and pasting figures and code).

Data Description

  1. How many total cases (observations) are there in this data set? How many males and females are there (variable “SEX”)? Remember that R is case-sensitive, so variable names must be typed exactly as they appear. (2 points)
  2. a. How many of the total cases reported “excellent” and “poor” general health (variable “GENHLTH”)?
    b. How many of the total cases performed physical activities in the previous month (variable “EXERANY2”)? (3 points)
  3. How many total cases were ever told that they had diabetes, including women who were told during pregnancy (variable “DIABETE3”)? (2 points)
    table(BRFSS$DIABETE3)
  4. What are the income level categories (variable “INCOME2”) by sex? NOTE: Only report categories 1 to 5: “<$15,000”; “$15,000- <$25,000”; “$25,000- <$35,000”; “$35,000- <$50,000”; “$50,000 or more”. Do not report income categories labeled “6”, “7”, “8”, “77”, or “99”. Which sex had a greater proportion making $35,000 – <$50,000 annually? (3 points)
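
The counting and cross-tabulation steps in this section can be sketched in base R as follows. This is only an illustration on a small made-up data frame called toy; the values, and the assumption that SEX is coded 1 = male and 2 = female, are hypothetical, so check the codebook and substitute your imported BRFSS data.

```r
# Toy data standing in for the BRFSS spreadsheet (all values are made up).
# Assumed coding, for illustration only: SEX 1 = male, 2 = female.
toy <- data.frame(
  SEX     = c(1, 2, 2, 1, 2, 1, 2, 2),
  INCOME2 = c(1, 4, 5, 4, 77, 2, 4, 99)
)

nrow(toy)        # total number of cases (rows)
table(toy$SEX)   # number of cases in each SEX category

# Keep only income categories 1 to 5 before cross-tabulating
kept <- subset(toy, INCOME2 %in% 1:5)
income_by_sex <- table(kept$INCOME2, kept$SEX, dnn = c("INCOME2", "SEX"))
income_by_sex

# Column proportions: within each sex, the share in each income category
income_prop <- prop.table(income_by_sex, margin = 2)
income_prop
```

Using margin = 2 makes each SEX column sum to 1, which is what a comparison of proportions between the sexes needs.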

Finding Descriptive Statistics

For the remainder of this assignment, use the Excel spreadsheet “NHANES” and the “NHANES 2013-2014 Data set codebook” on D2L to answer the following questions. Provide the output tables and code for this assignment along with this sheet. You may also combine them into this document (by copying and pasting figures and code).

  1. Without calculating significant differences, which educational level (variable “educ_adult”) has the highest average (mean) systolic (variable “AVGSBP”) and highest average (mean) diastolic blood pressure (variable “AVGDBP”)? (2 points)
  2. What are the 25% (1st Quartile), 50% (Median), and 75% (3rd Quartile) quantiles for calcium? (3 points)
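
Group means and quartiles like those asked for above can be sketched in base R as follows. The data frame toy and its values are made up, and the column name calcium is an assumption (the codebook gives the actual variable name); na.rm = TRUE is used because survey data often contain missing values.

```r
# Toy data standing in for the NHANES spreadsheet (all values are made up).
# The column name "calcium" is hypothetical; check the codebook.
toy <- data.frame(
  educ_adult = c(1, 1, 2, 2, 3, 3),
  AVGSBP     = c(130, 126, 122, 118, 116, NA),
  calcium    = c(400, 650, 800, 950, 1200, 1500)
)

# Mean systolic blood pressure within each education level, ignoring NAs
sbp_by_educ <- tapply(toy$AVGSBP, toy$educ_adult, mean, na.rm = TRUE)
sbp_by_educ

# Education level with the highest mean
names(which.max(sbp_by_educ))

# 1st quartile, median, and 3rd quartile
q <- quantile(toy$calcium, probs = c(0.25, 0.50, 0.75), na.rm = TRUE)
q
```

The same tapply call with AVGDBP in place of AVGSBP would give the diastolic means.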

Calculating t-tests

  1. Is mean systolic blood pressure (variable “AVGSBP”) different between males and females in the sample? To answer that question, you need to complete the following sub-questions. (10 points)
    a. First, what type of t-test would you use?
    b. What type of homogeneity test do you need to run before you can run your t-test (also provide the code for that test and corresponding output)?
    c. Are there equal or unequal variances? Based on results from the aforementioned test, would you run a standard (equal-variance) two-sample t-test or a Welch two-sample t-test?
    d. Now run your t-test and answer the question. Provide the code, output, and interpret the results.
  2. Is mean sodium (variable “sodium”) intake different between males and females in the sample? To answer that question, you need to complete the following sub-questions. (10 points)
    a. First, what type of t-test would you use?
    b. What type of homogeneity test do you need to run before you can run your t-test (also provide the code for that test and corresponding output)?
    c. Are there equal or unequal variances? Based on results from the aforementioned test, would you run a standard (equal-variance) two-sample t-test or a Welch two-sample t-test?
    d. Now run your t-test and answer the question. Provide the code, output, and interpret the results.
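
The workflow in the sub-questions above (check the variances, then choose the t-test) can be sketched in base R as follows. This is one possible approach on simulated data, not the required answer: it uses the F test from var.test, whereas your course may expect a different homogeneity test such as Levene's test. The data frame toy, the grouping column name sex, and the 0.05 cutoff are assumptions made for illustration.

```r
# Simulated data standing in for NHANES (values are random, not real).
# The grouping column name "sex" is hypothetical; check the codebook.
set.seed(1)
toy <- data.frame(
  sex    = rep(c("male", "female"), each = 30),
  AVGSBP = c(rnorm(30, mean = 124, sd = 12), rnorm(30, mean = 120, sd = 12))
)

# F test comparing the variances of the two groups
vt <- var.test(AVGSBP ~ sex, data = toy)
vt

# Treat the variances as equal unless the F test rejects at the 0.05 level
equal_var <- vt$p.value >= 0.05

# var.equal = TRUE gives the pooled two-sample t-test;
# var.equal = FALSE gives the Welch two-sample t-test
tt <- t.test(AVGSBP ~ sex, data = toy, var.equal = equal_var)
tt
```

The printed output of t.test names the test that was run, along with the t statistic, degrees of freedom, p-value, confidence interval, and group means used in the interpretation.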
