One-Sample T-test
Homework Assignment SPSS Exercises
The questionnaire used to collect the data for the survey is in our textbook – Avery Fitness Center. You will need it to define the labels in SPSS.
Questions:
- Get freq, mean, std and sem for each variable. Tell for which variables the mean, std and sem are “meaningful” and give the interpretation of each.
Variable | Mean | Std | SEM | Can the mean of this type of variable be interpreted? (Y/N) |
Weight | ||||
Classes | ||||
Circuit | ||||
Station | ||||
Pool | ||||
Visits | ||||
Daypart | ||||
Doctor | ||||
Enjoy | ||||
Age | ||||
Gender |
- Using only the standard deviation for each of the Importance variables (in survey, how important …), which variable had the greatest amount of agreement? List these four variables in table below in order of most to least agreement.
Importance Variables | Standard Deviation |
Create and present a Frequency table to present your answers to the following questions:
- What percentage of the respondents who answered the Gender question are male?
- What percentage of everyone who took the survey are female?
Create and present a single Frequency table using the variable Income and answer the following questions:
- What percentage of the respondents who answered the question make over $120,000 per year?
- What percentage of the respondents who answered the question reported making over $60,000 per year?
- What percentage of the respondents who answered the question reported making $30,000 or less per year?
- What percentage of the survey respondents reported making between $45,001 and $60,000 per year?
Create and present a Histogram with a normal curve (can use the SPSS graph) using the variable Age and answer the following questions:
- What are the mean, standard deviation, and count for age?
- What are the upper and lower boundaries (i.e., ages) of the normal distribution? How did you calculate these numbers?
- Identify (by specific ages) any outliers (if any)?
- If there are outliers, what do you recommend be done with them and why?
Create and present a Frequency table using the variable Gender:
- Based on the percentages of males, calculate the sampling error for the proportion using the formula from our book. Be sure to show and explain the numbers you used. Also be sure to show the resultant confidence interval.
Create and present a single table that lists the following:
- Percentages and counts for each category of the four continuous variables: General Health/Fitness, Social Aspects, Physical Enjoyment, and Specific Medical Concerns;
- The top two boxes for each of these variables;
Create a single table to compare the means between:
- The pairs of all of these four continuous Importance variables (General Fitness; Social Aspects; Physical Enjoyment; Specific Medical Concerns). Explain if there are/are not significant differences between each pair of variables.
- List all the variables in the table in the order of most important to least important (be sure to show why/how you determined the level of importance). Be sure to show the ranking numbers (e.g., 1,2,3,4).
Run a One-sample T-test and present a table to determine:
- If the average number of monthly visits (i.e., the variable Visits) is significantly different from the national average of eight. Interpret and explain your relevant results. Be sure to report the mean difference, t-value, degrees of freedom, and significance level.
Create and present a Cross-tabulation table of the variables Pool and Doctor.
- What percentage of the total sample utilized the therapy pool?
- What percentage of those who used the therapy pool were recommended by a doctor?
- What percentage of those recommended by a doctor utilized the therapy pool?
- Are the results significant?
- How strongly, if at all, are the variables associated with each other?
Show in a table:
- The comparison of the means between the number of Visits and whether people had Utilized the exercise circuit. Explain, and show, if the means are significantly different from each other.
Run and interpret a correlation analysis and create a single table that:
- Uses the four Importance variables (General Fitness; Social Aspects; Physical Enjoyment; Specific Medical Concerns) showing the correlations and which are significant.
- Replace the diagonal values with the respective means in the table.
- Interpret the table.
Recommendations
- Based on your analysis of ALL the data in this assignment, write an Executive Summary of your findings with clear managerial recommendations. 200 -300 words
Solution
Answers:
Variable | Mean | Std | SEM | Can the mean of this type of variable be interpreted? (Y/N) |
weight | 0.32 | 0.465 | 0.022 | No |
classes | 0.26 | 0.440 | 0.021 | No |
circuit | 0.22 | 0.415 | 0.020 | No |
station | 0.12 | 0.325 | 0.015 | No |
pool | 0.45 | 0.498 | 0.023 | No |
visits | 14.20 | 7.733 | 0.387 | Yes |
daypart | 1.30 | 0.549 | 0.027 | No |
doctor | 0.26 | 0.439 | 0.021 | No |
enjoy | 3.91 | 1.090 | 0.055 | Yes |
age | 62.56 | 19.630 | 0.937 | Yes |
gender | 1.79 | 0.405 | 0.019 | No |
The mean can be interpreted only for quantitative variables. In our case, that covers visits and age, plus enjoy if the importance rating is treated as an interval scale. The remaining variables are coded categories, so their means are not meaningful as averages.
- For the 400 persons who answered the question “Number of visits to AFC in previous 30 days”, the mean number of visits in the previous 30 days is 14.20 with a standard deviation of 7.733 and a standard error of 0.387.
- The mean score for physical enjoyment is 3.91 (standard deviation 1.090), meaning that physical enjoyment is an important reason for participating in AFC.
- For the 439 persons who answered the question related to age, the mean age is 62.56 years with a standard deviation of 19.630 and a standard error of 0.937.
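The SEM column can be checked from the reported standard deviations and valid counts, since the standard error of the mean is the standard deviation divided by the square root of n. The sketch below is only a cross-check; the count of 394 for enjoy is taken from the frequency table later in the solution, and the function name `sem` is a hypothetical helper.

```python
import math

# (standard deviation, valid n) as reported in the solution
reported = {
    "visits": (7.733, 400),
    "enjoy": (1.090, 394),   # n taken from the Enjoy frequency table
    "age": (19.630, 439),
}

def sem(std, n):
    """Standard error of the mean: std / sqrt(n)."""
    return std / math.sqrt(n)

for name, (std, n) in reported.items():
    print(f"{name}: SEM = {sem(std, n):.3f}")
```

The three values agree with the SEM column of the table to three decimals.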
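- Using only the standard deviation for each of the Importance variables, a smaller standard deviation indicates greater agreement among respondents. Fitness therefore shows the greatest agreement. The table below lists the variables in order of most to least agreement.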
Importance Variables | Standard Deviation |
Fitness | 0.745 |
Enjoy | 1.090 |
Medical | 1.206 |
Social | 1.272 |
Let us create and present, using SPSS, a Frequency table for the variable gender
Frequency | Percent | Valid Percent | Cumulative Percent | ||
Valid | male | 89 | 19.8 | 20.6 | 20.6 |
female | 344 | 76.4 | 79.4 | 100.0 | |
Total | 433 | 96.2 | 100.0 | ||
Missing | System | 17 | 3.8 | ||
Total | 450 | 100.0 |
- Of the respondents who answered the Gender question, 20.6% are male (89 males among 433 respondents).
- Of everyone who took the survey, 76.4% are female (344 females among 450 persons).
Using SPSS, let us create and present a single Frequency table using the variable Income
Frequency | Percent | Valid Percent | Cumulative Percent | ||
Valid | 0 – 15,000 | 14 | 3.1 | 4.0 | 4.0 |
15,001 – 30,000 | 43 | 9.6 | 12.3 | 16.3 | |
30,001 – 45,000 | 49 | 10.9 | 14.0 | 30.3 | |
45,001 – 60,000 | 83 | 18.4 | 23.7 | 54.0 | |
60,001 – 75,000 | 60 | 13.3 | 17.1 | 71.1 | |
75,001 – 90,000 | 35 | 7.8 | 10.0 | 81.1 | |
90,001 – 105,000 | 25 | 5.6 | 7.1 | 88.3 | |
105,001 – 120,000 | 23 | 5.1 | 6.6 | 94.9 | |
more than 120,000 | 18 | 4.0 | 5.1 | 100.0 | |
Total | 350 | 77.8 | 100.0 | ||
Missing | System | 100 | 22.2 | ||
Total | 450 | 100.0 |
- 5.1% (18 persons among 350) of the respondents who answered the question make over $120,000 per year.
- 46.0% (161 persons among 350) of the respondents who answered the question reported making over $60,000 per year.
- 16.3% (57 persons among 350) of the respondents who answered the question reported making $30,000 or less per year.
- 18.4% (83 persons among all 450 survey respondents) reported making between $45,001 and $60,000 per year. This is 23.7% of the 350 who answered the question.
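The income percentages follow directly from the frequency counts. The sketch below recomputes them from the table; the dictionary keys and the helper `pct` are illustrative names, and the only inputs are the counts shown in the Income table (350 valid answers, 100 missing).

```python
# Frequency counts copied from the Income table
income_counts = {
    "0-15,000": 14,
    "15,001-30,000": 43,
    "30,001-45,000": 49,
    "45,001-60,000": 83,
    "60,001-75,000": 60,
    "75,001-90,000": 35,
    "90,001-105,000": 25,
    "105,001-120,000": 23,
    "more than 120,000": 18,
}
valid_n = sum(income_counts.values())   # respondents who answered
total_n = valid_n + 100                 # plus 100 missing answers

def pct(count, base):
    """Percentage of `base`, rounded to one decimal."""
    return round(100 * count / base, 1)

brackets = list(income_counts)
over_120 = pct(income_counts["more than 120,000"], valid_n)
over_60 = pct(sum(income_counts[b] for b in brackets[4:]), valid_n)
up_to_30 = pct(sum(income_counts[b] for b in brackets[:2]), valid_n)
mid_valid = pct(income_counts["45,001-60,000"], valid_n)
mid_total = pct(income_counts["45,001-60,000"], total_n)

print(over_120, over_60, up_to_30, mid_valid, mid_total)
```

Using `valid_n` as the base reproduces the Valid Percent column, while `total_n` reproduces the Percent column.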
The following figure represents a histogram with a normal curve for the variable Age
- The mean age of the 439 respondents is 62.56 years with a standard deviation of 19.63.
- The upper and lower boundaries are calculated as the mean plus or minus three standard deviations:
Upper boundary = 62.56 + 3 × 19.63 = 121.45 years
Lower boundary = 62.56 − 3 × 19.63 = 3.67 years
- There are no outliers in the data.
- If there were outliers, each could be replaced with the maximum or minimum value, depending on whether it lies above the upper boundary or below the lower boundary. Alternatively, it could simply be replaced with the median.
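The boundary calculation can be sketched as follows, assuming the boundaries are the mean plus or minus three standard deviations (which reproduces the reported lower boundary of 3.67). The `ages` list is a hypothetical placeholder, since the raw survey data are not included here.

```python
mean_age = 62.56
sd_age = 19.63
k = 3  # assumed number of standard deviations for the boundaries

lower = mean_age - k * sd_age
upper = mean_age + k * sd_age
print(f"lower = {lower:.2f}, upper = {upper:.2f}")

def find_outliers(values, low, high):
    """Return the values that fall outside [low, high]."""
    return [v for v in values if v < low or v > high]

# Hypothetical ages for illustration only; not the survey data
ages = [18, 45, 62, 70, 88, 93]
print(find_outliers(ages, lower, upper))
```

With these boundaries, any realistic adult age falls inside the interval, which is consistent with the answer that no outliers were found.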
The following table presents a Frequency table using the variable Gender
Frequency | Percent | Valid Percent | Cumulative Percent | ||
Valid | male | 89 | 19.8 | 20.6 | 20.6 |
female | 344 | 76.4 | 79.4 | 100.0 | |
Total | 433 | 96.2 | 100.0 | ||
Missing | System | 17 | 3.8 | ||
Total | 450 | 100.0 |
Based on the percentages of males, the sampling error for the proportion is:
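The textbook formula is not reproduced in this solution, so the sketch below assumes the common form for the sampling error of a proportion, z × sqrt(p(1 − p)/n), with z = 1.96 for 95% confidence. Under that assumption, the inputs are the 89 males among the 433 respondents who answered the Gender question.

```python
import math

males = 89
valid_n = 433
z = 1.96  # assumed 95% confidence level

p = males / valid_n                                # sample proportion of males
standard_error = math.sqrt(p * (1 - p) / valid_n)  # SE of a proportion
sampling_error = z * standard_error                # margin of error
ci_low = p - sampling_error
ci_high = p + sampling_error

print(f"p = {p:.3f}")
print(f"sampling error = {sampling_error:.3f}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Under these assumptions the sampling error is about 3.8 percentage points, giving an interval of roughly 16.7% to 24.4% for the proportion of males. If the book uses a different z value or a finite-population correction, the numbers would change accordingly.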
- The following table presents the percentages and counts for each category of the four continuous variables: General Health/Fitness, Social Aspects, Physical Enjoyment, and Specific Medical Concerns
Variable | | Category | Count | Percent | Valid Percent | Cumulative Percent |
Fitness | Valid | 1 | 10 | 2.2 | 2.2 | 2.2 | ||||
2 | 4 | .9 | .9 | 3.1 | ||||||
3 | 8 | 1.8 | 1.8 | 4.9 | ||||||
4 | 50 | 11.1 | 11.2 | 16.1 | ||||||
5 | 374 | 83.1 | 83.9 | 100.0 | ||||||
Total | 446 | 99.1 | 100.0 | |||||||
Missing | System | 4 | .9 | |||||||
Total | 450 | 100.0 | ||||||||
Social | Valid | 1 | 53 | 11.8 | 13.5 | 13.5 | ||||
2 | 66 | 14.7 | 16.8 | 30.2 | ||||||
3 | 113 | 25.1 | 28.7 | 58.9 | ||||||
4 | 94 | 20.9 | 23.9 | 82.7 | ||||||
5 | 68 | 15.1 | 17.3 | 100.0 | ||||||
Total | 394 | 87.6 | 100.0 | |||||||
Missing | System | 56 | 12.4 | |||||||
Total | 450 | 100.0 | ||||||||
Enjoy | Valid | 1 | 18 | 4.0 | 4.6 | 4.6 | ||||
2 | 20 | 4.4 | 5.1 | 9.6 | ||||||
3 | 84 | 18.7 | 21.3 | 31.0 | ||||||
4 | 128 | 28.4 | 32.5 | 63.5 | ||||||
5 | 144 | 32.0 | 36.5 | 100.0 | ||||||
Total | 394 | 87.6 | 100.0 | |||||||
Missing | System | 56 | 12.4 | |||||||
Total | 450 | 100.0 | ||||||||
Medical | Valid | 1 | 33 | 7.3 | 8.1 | 8.1 | ||||
2 | 14 | 3.1 | 3.4 | 11.6 | ||||||
3 | 43 | 9.6 | 10.6 | 22.2 | ||||||
4 | 122 | 27.1 | 30.0 | 52.2 | ||||||
5 | 194 | 43.1 | 47.8 | 100.0 | ||||||
Total | 406 | 90.2 | 100.0 | |||||||
Missing | System | 44 | 9.8 | |||||||
Total | 450 | 100.0 | ||||||||
- The following table shows the top two boxes (ratings 4 and 5, the two highest importance categories) for each of these variables
Variable | Category | Count | Percent | Valid Percent |
Fitness | 4 | 50 | 11.1 | 11.2 |
Fitness | 5 | 374 | 83.1 | 83.9 |
Social | 4 | 94 | 20.9 | 23.9 |
Social | 5 | 68 | 15.1 | 17.3 |
Enjoy | 4 | 128 | 28.4 | 32.5 |
Enjoy | 5 | 144 | 32.0 | 36.5 |
Medical | 4 | 122 | 27.1 | 30.0 |
Medical | 5 | 194 | 43.1 | 47.8 |
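A top-two-box score is usually the share of respondents choosing one of the two highest rating categories. The sketch below assumes a 1 to 5 scale where 5 means most important, and uses the valid counts from the frequency tables; the names `rating_counts` and `top_two_box` are illustrative.

```python
# Counts for ratings 1..5, copied from the frequency tables
rating_counts = {
    "Fitness": [10, 4, 8, 50, 374],
    "Social": [53, 66, 113, 94, 68],
    "Enjoy": [18, 20, 84, 128, 144],
    "Medical": [33, 14, 43, 122, 194],
}

def top_two_box(counts):
    """Percent of valid answers in the two highest categories (4 and 5)."""
    return round(100 * sum(counts[-2:]) / sum(counts), 1)

scores = {name: top_two_box(c) for name, c in rating_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score}%")
```

The scores are based on valid answers only, so they correspond to sums of the Valid Percent column.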
- There are six possible pairs, the following table shows comparisons of means between the pairs of all of four continuous Importance variables (General Fitness; Social Aspects; Physical Enjoyment; Specific Medical Concerns)
Pair | Variables | Mean Difference | Std. Deviation | Std. Error Mean | 95% CI of the Difference (Lower) | 95% CI of the Difference (Upper) | t | df | Sig. (2-tailed) |
Pair 1 | fitness – social | 1.629 | 1.321 | .067 | 1.499 | 1.760 | 24.482 | 393 | .000 |
Pair 2 | fitness – enjoy | .853 | 1.069 | .054 | .747 | .959 | 15.829 | 393 | .000 |
Pair 3 | fitness – medical | .700 | 1.210 | .060 | .581 | .818 | 11.645 | 405 | .000 |
Pair 4 | social – enjoy | -.788 | 1.110 | .057 | -.899 | -.676 | -13.937 | 385 | .000 |
Pair 5 | social – medical | -.925 | 1.537 | .080 | -1.081 | -.768 | -11.608 | 371 | .000 |
Pair 6 | enjoy – medical | -.128 | 1.488 | .077 | -.279 | .023 | -1.664 | 375 | .097 |
From the table above, we can see that the differences between five of the six pairs are statistically significant (the p-values of the 2-tailed tests are smaller than the 5% significance level). The exception is the difference between enjoy and medical (Pair 6), which is not significant (t=-1.664, df=375, p-value=0.097>0.05). Hence the means of these two variables do not differ significantly.
- Using the mean, the following table shows the importance in the order of most important to least important
Rank | Importance Variables | Mean |
1 | Fitness | 4.74 |
2 | Medical | 4.06 |
3 | Enjoy | 3.91 |
4 | Social | 3.15 |
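The ranking can be reproduced from the frequency tables by computing each variable's mean rating as a count-weighted average. This is a sketch that assumes ratings are coded 1 to 5; `rating_counts` and `mean_rating` are illustrative names.

```python
# Counts for ratings 1..5, copied from the frequency tables
rating_counts = {
    "Fitness": [10, 4, 8, 50, 374],
    "Social": [53, 66, 113, 94, 68],
    "Enjoy": [18, 20, 84, 128, 144],
    "Medical": [33, 14, 43, 122, 194],
}

def mean_rating(counts):
    """Count-weighted mean of ratings 1..len(counts)."""
    weighted = sum(rating * c for rating, c in enumerate(counts, start=1))
    return weighted / sum(counts)

means = {name: round(mean_rating(c), 2) for name, c in rating_counts.items()}
ranking = sorted(means, key=means.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(rank, name, means[name])
```

Because the paired test found no significant difference between enjoy and medical, ranks 2 and 3 are best read as a statistical tie.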
- In this question, we want to examine whether the average number of monthly visits is significantly different from the national average of eight. To do so, we proceed to a One-sample T-test. The following table shows the results obtained via SPSS.
Variable | t | df | Sig. (2-tailed) | Mean Difference | 95% CI of the Difference (Lower) | 95% CI of the Difference (Upper) |
visits | 16.022 | 399 | .000 | 6.195 | 5.43 | 6.96 |
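From the table above, we can conclude, at the 5% significance level, that the average number of monthly visits is statistically different from the national average of eight (mean difference = 6.195, t = 16.022, df = 399, p-value < 0.001). The 95% confidence interval of the difference (5.43 to 6.96) lies entirely above zero, so AFC members visit more often than the national average.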
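The one-sample t statistic can be recomputed from the summary statistics as (mean − test value) / (std / sqrt(n)). The sketch below assumes an unrounded sample mean of 14.195 (the test value of 8 plus the reported mean difference of 6.195) and an approximate critical value of 1.966 for 399 degrees of freedom; both are assumptions for illustration, not SPSS output.

```python
import math

sample_mean = 14.195   # assumed: 8 + reported mean difference of 6.195
sample_sd = 7.733
n = 400
test_value = 8         # national average
t_crit = 1.966         # assumed approximate two-sided 5% critical value, df = 399

std_error = sample_sd / math.sqrt(n)
mean_diff = sample_mean - test_value
t_stat = mean_diff / std_error
df = n - 1

ci_low = mean_diff - t_crit * std_error
ci_high = mean_diff + t_crit * std_error

print(f"t = {t_stat:.3f}, df = {df}")
print(f"95% CI of the difference = ({ci_low:.2f}, {ci_high:.2f})")
```

The result matches the t value, degrees of freedom, and confidence interval in the SPSS table.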
The following table represents a cross-tabulation table of the variables Pool and Doctor
doctor | pool: No (count) | pool: Yes (count) |
No | 203 | 130 |
Yes | 44 | 73 |
Chi-square | 19.071 | |
df | 1 | |
Sig. | 0.000 | |
- The percentage of the total sample that utilized the therapy pool is 45.1% (203 of 450).
- The percentage of those who used the therapy pool who were recommended by a doctor is 36.0% (73 of 203).
- The percentage of those recommended by a doctor who utilized the therapy pool is 62.4% (73 of 117).
- From the cross-tabulation above, the chi-square statistic is 19.071 with 1 degree of freedom and a p-value below 0.001, meaning that, at the 5% significance level, there is a significant association between use of the therapy pool and a doctor's recommendation.
- The coefficient of correlation (phi) between pool use and doctor recommendation is 0.206, meaning that there is a positive but weak-to-moderate association between these two variables.
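The cross-tabulation answers can be recomputed from the four cell counts. The sketch below uses the shortcut formula for the Pearson chi-square of a 2×2 table (no continuity correction), the phi coefficient sqrt(chi²/n), and the fact that a chi-square with 1 degree of freedom has p-value erfc(sqrt(chi²/2)). Variable names are illustrative.

```python
import math

# Cell counts: rows = doctor (No, Yes), columns = pool (No, Yes)
a, b = 203, 130   # doctor = No
c, d = 44, 73     # doctor = Yes
n = a + b + c + d

pool_yes = b + d
doctor_yes = c + d

pct_pool = round(100 * pool_yes / n, 1)               # of total sample
pct_doctor_given_pool = round(100 * d / pool_yes, 1)  # of pool users
pct_pool_given_doctor = round(100 * d / doctor_yes, 1)  # of doctor-recommended

# Pearson chi-square for a 2x2 table, without continuity correction
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
phi = math.sqrt(chi2 / n)
p_value = math.erfc(math.sqrt(chi2 / 2))  # exact for 1 degree of freedom

print(pct_pool, pct_doctor_given_pool, pct_pool_given_doctor)
print(f"chi2 = {chi2:.3f}, phi = {phi:.3f}, p = {p_value:.6f}")
```

The chi-square and phi values agree with the reported 19.071 and 0.206.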
- The following table presents the comparison of the mean number of Visits between people who had utilized the exercise circuit and those who had not.
t-test for Equality of Means
Variable | Assumption | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference (Lower) | 95% CI of the Difference (Upper) |
visits | Equal variances assumed | -2.522 | 398 | .012 | -2.302 | .913 | -4.096 | -.508 |
visits | Equal variances not assumed | -2.849 | 184.863 | .005 | -2.302 | .808 | -3.896 | -.708 |
From the table above, we can conclude that the mean number of visits differs significantly between people who had utilized the exercise circuit and those who had not (mean difference = -2.302). The t-test is significant both when assuming equal variances (t=-2.522, df=398, p-value=0.012<0.05) and when not assuming equal variances (t=-2.849, df=184.863, p-value=0.005<0.05).
- The following table shows the correlation of the importance variables
Variable | Statistic | fitness | social | enjoy | medical |
fitness | Pearson Correlation (mean on diagonal) | 4.74 | .188** | .340** | .271** |
fitness | Sig. (2-tailed) | | .000 | .000 | .000 |
fitness | N | 446 | 394 | 394 | 406 |
social | Pearson Correlation (mean on diagonal) | .188** | 3.15 | .565** | .238** |
social | Sig. (2-tailed) | .000 | | .000 | .000 |
social | N | 394 | 394 | 386 | 372 |
enjoy | Pearson Correlation (mean on diagonal) | .340** | .565** | 3.91 | .188** |
enjoy | Sig. (2-tailed) | .000 | .000 | | .000 |
enjoy | N | 394 | 386 | 394 | 376 |
medical | Pearson Correlation (mean on diagonal) | .271** | .238** | .188** | 4.06 |
medical | Sig. (2-tailed) | .000 | .000 | .000 | |
medical | N | 406 | 372 | 376 | 406 |
**. Correlation is significant at the 0.01 level (2-tailed). |
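- The diagonal values in the table above have been replaced with the respective means.
- The table above shows significant, positive correlations between all pairs of the four importance variables. This means that as one variable increases in value, the other also tends to increase, and as one decreases, the other also tends to decrease. The strongest correlation is between social and enjoy (.565). The weakest correlations are fitness with social and enjoy with medical (both .188).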
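The significance of each correlation can be checked with the usual test statistic t = r × sqrt((n − 2) / (1 − r²)). In the sketch below the p-value uses a normal approximation to the t distribution, which is an assumption that is reasonable only because every n is in the hundreds; SPSS uses the exact t distribution.

```python
import math

# (r, pairwise n) copied from the correlation table
correlations = {
    ("fitness", "social"): (0.188, 394),
    ("fitness", "enjoy"): (0.340, 394),
    ("fitness", "medical"): (0.271, 406),
    ("social", "enjoy"): (0.565, 386),
    ("social", "medical"): (0.238, 372),
    ("enjoy", "medical"): (0.188, 376),
}

def t_for_r(r, n):
    """t statistic for testing that a Pearson correlation is zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def approx_two_sided_p(t):
    """Two-sided p-value using a normal approximation (large n only)."""
    return math.erfc(abs(t) / math.sqrt(2))

results = {}
for pair, (r, n) in correlations.items():
    t = t_for_r(r, n)
    results[pair] = (t, approx_two_sided_p(t))
    print(pair, f"t = {t:.2f}", f"p ~ {results[pair][1]:.5f}")
```

Even the weakest correlation (.188) yields a t value near 3.7, which is consistent with every pair being flagged as significant at the 0.01 level.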
- This summary draws together the results of the statistical analysis of the Avery Fitness Center survey. Starting with the personal characteristics of the respondents, the mean age of the 439 respondents is 62.56 years with a standard deviation of 19.63, and 79.4% of the respondents who reported their gender are female. The largest income group (23.7% of those who answered) reported making between $45,001 and $60,000 per year. We also analyzed the personal reasons for participating in AFC programs: fitness ranked first, followed by medical concerns, physical enjoyment, and social aspects. The average number of monthly visits (14.20) is significantly higher than the national average of eight. In addition, the mean number of visits differs significantly between people who had utilized the exercise circuit and those who had not. Furthermore, there is a significant association between use of the therapy pool and a doctor's recommendation.
This information would help the center focus its marketing strategy. It should target the female population aged between 30 and 70 years and making between $45,001 and $60,000 per year. The center should also strengthen the fitness program through good monitoring, and develop the medical programs and enjoyment-related offerings. Finally, the center should work with doctors to encourage recommendations.