Aspen Gulley

Data Scientist | Behavior Analyst in Training


Hypothesis Testing: ANOVA, Chi-Square Goodness of Fit, One Sample Z-Test, One Sample T-Test, One Sample Variance Test, Two Sample Z-Test, Two Sample T-Test, Paired T-Test, Two Sample Variance Test


Hypothesis Testing: ANOVA

The following is a compilation of a series of ANOVA tests conducted over the course of a few years.

Analysis of Variance (ANOVA) is a statistical technique used to compare the means of groups to determine if there are any statistically significant differences between them. It helps in understanding whether there are variations between group means that are larger than what would be expected due to random sampling variation. ANOVA tests the null hypothesis that all group means are equal, and if the null hypothesis is rejected, it indicates that at least one group mean is significantly different from the others.

ANOVA STUDY

In this study, I performed a series of analyses using ANOVA to investigate the impact of different factors (Diet, Exercise, Factor A, and Factor B) on Serum Drop, Yield 1, Yield 2, and Yield 3.

For Serum Drop, a Two-Way ANOVA was conducted on Diet and Exercise, revealing significant effects for both factors. Post hoc tests provided additional insights into specific group differences. The normality and equal variance assumptions were validated.

Yield 1 showed significant effects for both Factor A and Factor B, as indicated by the Two-Way ANOVA. Interaction plots confirmed that the means were parallel, supporting the use of the additive model.

However, for Yield 2 and Yield 3, interaction plots suggested a departure from parallel means, prompting further investigation. The additive Two-Way ANOVA found significant effects for both factors in Yield 2 (p = 0.0233 for each), whereas for Yield 3 it failed to find a significant effect for either factor (p = 0.958 for each).

For Serum Drop, the interaction ANOVA indicated that the additive model was appropriate. Yet, for Yield 2 and Yield 3, interaction ANOVAs favored the interaction models, suggesting a more complex relationship.

In summary, the analyses revealed that Diet and Exercise significantly affect Serum Drop, while Factor A and Factor B influence Yield 1 and Yield 2. Yield 3 shows no significant main effects in either model, but its interaction term is highly significant (p = 0.000203), so the two factors matter there only in combination. The results suggest that the appropriateness of additive or interaction models depends on the specific response variable under consideration.

#Additive model
#Ho: Factor A has no effect on output (lowering serum)
#Ha: Factor A has an effect on output
#Ho: Factor B has no effect on output
#Ha: Factor B has an effect on output
two.way<- aov(lab12.data$Serum.Drop~lab12.data$D + lab12.data$E, data=lab12.data)
summary(two.way)
plot(two.way)
#Reject the null, Diet (D) and Exercise (E) both have an effect on serum drop
#the smaller p-value for Exercise indicates stronger evidence for its effect on serum drop

# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$D 2 279.6 139.8 5.086 0.0123 *
#lab12.data$E 2 1160.2 580.1 21.101 1.64e-06 ***
# Residuals 31 852.3 27.5

#Test for normality
library(nortest)
two.way.residuals<- two.way$residuals
ad.test(two.way.residuals)
#Fail to reject null, no reason to doubt normality assumption
#Anderson-Darling normality test, data: two.way.residuals
#A = 0.38574, p-value = 0.3735

#Test for equal variances
library(car) #provides leveneTest()
leveneTest(lab12.data$Serum.Drop~lab12.data$D*lab12.data$E,data=lab12.data)
#Fail to reject null, no reason to doubt equal variance assumption
#Levene’s Test for Homogeneity of Variance (center = median)
# Df F value Pr(>F)
#group 8 1.0815 0.4051
# 27

bartlett.test(lab12.data$Serum.Drop ~ interaction(lab12.data$D, lab12.data$E), data = lab12.data)
#Fail to reject null, no reason to doubt equal variance assumption
#Bartlett test of homogeneity of variances, data: lab12.data$Serum.Drop by interaction(lab12.data$D, lab12.data$E)
#Bartlett’s K-squared = 6.9552, df = 8, p-value = 0.5415
#Because normality is not violated, we trust Bartlett’s test for equal variances

tukey.two.way<-TukeyHSD(two.way)
tukey.two.way
#Tukey multiple comparisons of means, 95% family-wise confidence level
#Fit: aov(formula = lab12.data$Serum.Drop ~ lab12.data$D + lab12.data$E, data = lab12.data)
#$`lab12.data$D`
#        diff       lwr       upr     p adj
#2-1 3.545736 -1.722607  8.814079 0.2378963
#3-1 6.825292  1.556949 12.093635 0.0088930
#3-2 3.279556 -1.988787  8.547899 0.2901658
#$`lab12.data$E`
#         diff      lwr      upr     p adj
#2-1  7.177637 1.909294 12.44598 0.0058384
#3-1 13.903312 8.634969 19.17166 0.0000009
#3-2  6.725675 1.457333 11.99402 0.0099994

serumdropplot<- interaction.plot(x.factor = lab12.data$D, #x-axis variable
trace.factor = lab12.data$E, #variable for lines
response = lab12.data$Serum.Drop, #y-axis variable
fun = median, #metric to plot
ylab = "Serum Drop",
xlab = "Diet",
col = c("orange", "blue", "green"),
lty = 1, #line type
lwd = 2, #line width
trace.label = "Exercise")

#Two way ANOVA on Factors A and B vs Yield 1
#Ho: The means are parallel, no reason to doubt the additive model
#Ha: The means do not look parallel, we have reason to doubt the additive model
yield1plot<- interaction.plot(x.factor = lab12.data$Factor.A, #x-axis variable
trace.factor = lab12.data$Factor.B, #variable for lines
response = lab12.data$Yield.1, #y-axis variable
fun = median, #metric to plot
ylab = "Mean",
xlab = "Factor A",
col = c("orange", "blue"),
lty = 1, #line type
lwd = 2, #line width
trace.label = "Factor B",
main="Data Means: Interaction Plot for Yield 1")
#Fail to reject null, the means look parallel, no reason to doubt the additive model

#Ho: Factor A has no effect on output for Yield 1
#Ha: Factor A has an effect on output for Yield 1
#Ho: Factor B has no effect on output for Yield 1
#Ha: Factor B has an effect on output for Yield 1
yield1<- aov(lab12.data$Yield.1~lab12.data$Factor.A + lab12.data$Factor.B, data=lab12.data)
summary(yield1)
#Reject null, both factors A and B have an effect on Yield 1
# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$Factor.A 1 58.9 58.9 11.56 0.0192 *
# lab12.data$Factor.B 1 826.2 826.2 162.28 5.3e-05 ***
# Residuals 5 25.5 5.1

#Two way ANOVA on Factors A and B vs Yield 2
#Ho: The means are parallel, no reason to doubt the additive model
#Ha: The means do not look parallel, we have reason to doubt the additive model
yield2plot<- interaction.plot(x.factor = lab12.data$Factor.A, #x-axis variable
trace.factor = lab12.data$Factor.B, #variable for lines
response = lab12.data$Yield.2, #y-axis variable
fun = median, #metric to plot
ylab = "Mean",
xlab = "Factor A",
col = c("orange", "blue"),
lty = 1, #line type
lwd = 2, #line width
trace.label = "Factor B",
main="Data Means: Interaction Plot for Yield 2")
#Reject the null, the means do not look parallel, we have reason to doubt the additive model

#Ho: Factor A has no effect on output for Yield 2
#Ha: Factor A has an effect on output for Yield 2
#Ho: Factor B has no effect on output for Yield 2
#Ha: Factor B has an effect on output for Yield 2
yield2<- aov(lab12.data$Yield.2~lab12.data$Factor.A + lab12.data$Factor.B, data=lab12.data)
summary(yield2)
#Reject null, both factors A and B have an effect on Yield 2
# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$Factor.A 1 1800.0 1800.0 10.4 0.0233 *
# lab12.data$Factor.B 1 1800.0 1800.0 10.4 0.0233 *
# Residuals 5 865.5 173.1

#Two way ANOVA on Factors A and B vs Yield 3
#Ho: The means are parallel, no reason to doubt the additive model
#Ha: The means do not look parallel, we have reason to doubt the additive model
yield3plot<- interaction.plot(x.factor = lab12.data$Factor.A, #x-axis variable
trace.factor = lab12.data$Factor.B, #variable for lines
response = lab12.data$Yield.3, #y-axis variable
fun = median, #metric to plot
ylab = "Mean",
xlab = "Factor A",
col = c("orange", "blue"),
lty = 1, #line type
lwd = 2, #line width
trace.label = "Factor B",
main="Data Means: Interaction Plot for Yield 3")
#Reject the null, the additive model is definitely wrong

#Ho: Factor A has no effect on output for Yield 3
#Ha: Factor A has an effect on output for Yield 3
#Ho: Factor B has no effect on output for Yield 3
#Ha: Factor B has an effect on output for Yield 3
yield3<- aov(lab12.data$Yield.3~lab12.data$Factor.A + lab12.data$Factor.B, data=lab12.data)
summary(yield3)
#Fail to reject null, neither factor has a significant main effect on output
# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$Factor.A 1 0.5 0.5 0.003 0.958
#lab12.data$Factor.B 1 0.5 0.5 0.003 0.958
#Residuals 5 819.0 163.8

#Ho: Factor A has no effect on output
#Ha: Factor A has an effect on output
#Ho: Factor B has no effect on output
#Ha: Factor B has an effect on output
#Ho: The additive model is fine
#Ha: The interaction model is better
interaction.serum<- aov(lab12.data$Serum.Drop~lab12.data$D*lab12.data$E,data=lab12.data)
summary(interaction.serum)
#Fail to reject null, additive model is fine because the interaction P value is large (.329)
# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$D 2 279.6 139.8 5.224 0.0121 *
# lab12.data$E 2 1160.2 580.1 21.674 2.43e-06 ***
# lab12.data$D:lab12.data$E 4 129.6 32.4 1.211 0.3293
#Residuals 27 722.7 26.8

#No reason to perform an analysis on Yield 1 because it was extremely parallel,
#this suggests the additive model is appropriate

#Ho: Factor A has no effect on output
#Ha: Factor A has an effect on output
#Ho: Factor B has no effect on output
#Ha: Factor B has an effect on output
#Ho: The additive model is fine
#Ha: The interaction model is better
interaction.yield2<- aov(lab12.data$Yield.2~lab12.data$Factor.A *lab12.data$Factor.B,data=lab12.data)
summary(interaction.yield2)
#Reject null, low p value, interaction model is a better fit
# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$Factor.A 1 1800.0 1800.0 288.0 7.07e-05 ***
# lab12.data$Factor.B 1 1800.0 1800.0 288.0 7.07e-05 ***
# lab12.data$Factor.A:lab12.data$Factor.B 1 840.5 840.5 134.5 0.000316 ***
# Residuals 4 25.0 6.3

#Ho: Factor A has no effect on output
#Ha: Factor A has an effect on output
#Ho: Factor B has no effect on output
#Ha: Factor B has an effect on output
#Ho: The additive model is fine
#Ha: The interaction model is better
interaction.yield3<- aov(lab12.data$Yield.3~lab12.data$Factor.A *lab12.data$Factor.B,data=lab12.data)
summary(interaction.yield3)
#Reject null, interaction model is a better fit
# Df Sum Sq Mean Sq F value Pr(>F)
#lab12.data$Factor.A 1 0.5 0.5 0.105 0.761861
#lab12.data$Factor.B 1 0.5 0.5 0.105 0.761861
#lab12.data$Factor.A:lab12.data$Factor.B 1 800.0 800.0 168.421 0.000203 ***
# Residuals 4 19.0 4.8
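
The additive-versus-interaction question above can also be asked as a single nested-model F test. The sketch below is only an illustration: it uses simulated data rather than lab12.data, and the names toy, additive.fit, interaction.fit and model.cmp are my own placeholders.

```r
# Hypothetical 2x2 design with 5 replicates per cell (simulated, not lab12.data)
set.seed(42)
toy <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"), rep = 1:5)
toy$A <- factor(toy$A)
toy$B <- factor(toy$B)

# Response with two main effects plus a deliberate A:B interaction of size 8
both.high <- toy$A == "a2" & toy$B == "b2"
toy$y <- 10 + 3 * (toy$A == "a2") + 5 * (toy$B == "b2") + 8 * both.high +
  rnorm(nrow(toy), sd = 1)

additive.fit <- aov(y ~ A + B, data = toy)
interaction.fit <- aov(y ~ A * B, data = toy)

# F test for the extra interaction term:
# a small p-value favors the interaction model over the additive one
model.cmp <- anova(additive.fit, interaction.fit)
model.cmp
interaction.p <- model.cmp[["Pr(>F)"]][2]
```

With a single interaction term, this comparison should give the same F test as the interaction row of summary(interaction.fit).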

ANOVA

I investigated whether the mean volume differs across the three machines. After conducting the ANOVA test, the obtained F-value of 22.27 led to a p-value of 3.24e-05, indicating a significant difference in the mean volume produced by the machines. Subsequently, I performed a Tukey HSD test to identify which machines exhibit a significant difference in mean volume. The results revealed that there was a significant difference in mean volume between machine 1 and machine 3, as well as between machine 2 and machine 3. However, no significant difference was found between machine 1 and machine 2. This suggests that machines 1 and 2 do not have significantly different mean volumes, while machine 3 stands out as having a distinct mean volume compared to the other machines.

mc1<-c(150,151,152,152,151,150)
mc2<- c(153,152,148,151,149,152)
mc3<-c(156,154,155,156,157,155)

#create data frame to prepare data
volume<- c(mc1,mc2,mc3)
#label every observation with its machine, one label per measurement
machine <- rep(c("machine1", "machine2", "machine3"),
times=c(length(mc1), length(mc2), length(mc3)))
vol.mc<- data.frame(volume,machine)
vol.mc
aov(data=vol.mc, formula = volume ~ machine)
#data is in vol.mc,
#formula = what we are measuring and what is the factor we are measuring
#does machine make a difference when it comes to mean volume?
#does not work unless you assign it to a variable, so feed result into object
mc.aov<-aov(data=vol.mc, formula = volume ~ machine)
mc.aov
summary(mc.aov)
#            Df Sum Sq Mean Sq F value   Pr(>F)
#machine      2  84.11   42.06   22.27 3.24e-05 ***
#Residuals   15  28.33    1.89
#reject Ho, there is evidence of a significant difference in a machine mean
#to find the machine difference use Tukey
#visualize
boxplot(mc1,mc2,mc3)
#critical F value, use vol.mc
#critical F value uses the ANOVA degrees of freedom: 2 (machine) and 15 (residuals)
qf(1-.05, df1=2, df2=15) #about 3.68
#TUKEY — to find the ANOVA difference
TukeyHSD(x=mc.aov)
#look for a low p value to identify which machines have a significant difference
#$machine
#                        diff       lwr      upr     p adj
#machine2-machine1 -0.1666667 -2.227739 1.894405 0.9760110
#machine3-machine1  4.5000000  2.438928 6.561072 0.0001241
#machine3-machine2  4.6666667  2.605595 6.727739 0.0000846
##machine3-machine1, machine3-machine2 have significant differences
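
To see where the F value comes from, here is a rough sketch that rebuilds the machine ANOVA by hand from the between-group and within-group sums of squares. The helper names (groups, ss.between, ss.within, f.stat, f.crit) are my own and are not part of the original lab code.

```r
mc1 <- c(150, 151, 152, 152, 151, 150)
mc2 <- c(153, 152, 148, 151, 149, 152)
mc3 <- c(156, 154, 155, 156, 157, 155)
groups <- list(mc1, mc2, mc3)

grand.mean <- mean(unlist(groups))

# between-group sum of squares: group size times the squared distance
# of each group mean from the grand mean
ss.between <- sum(sapply(groups, function(g) length(g) * (mean(g) - grand.mean)^2))

# within-group sum of squares: squared distances from each group's own mean
ss.within <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))

df.between <- length(groups) - 1                    # 3 machines - 1
df.within <- length(unlist(groups)) - length(groups) # 18 observations - 3 machines

f.stat <- (ss.between / df.between) / (ss.within / df.within)
f.crit <- qf(0.95, df1 = df.between, df2 = df.within)
p.value <- pf(f.stat, df.between, df.within, lower.tail = FALSE)

c(F = f.stat, critical = f.crit, p = p.value)
```

The sums of squares and F statistic should line up with the summary(mc.aov) table, and the decision is the same either way: the F statistic is well above the critical value.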

One Way ANOVA

In this section, I analyzed agricultural data comparing the effects of different fertilizers (Fertilizer1) on crop yield (Yield1). The ANOVA failed to reject the null hypothesis: the p-value for Fertilizer1 is 0.915, so the data provide no evidence that the mean yields differ.

#One Way ANOVA (normality=T, equal variance=T)
crop.data<-read.csv("/Users/aspengulley/Desktop/anova_lab_11_abc.csv",header=TRUE,colClasses = c("numeric","character","numeric","character"))
summary(crop.data)
crop.data
one.way1<-aov(Yield1~Fertilizer1,data=crop.data)
summary(one.way1)
#Results: Fail to Reject the Null
#The data does not provide evidence that the means differ
# Df Sum Sq Mean Sq F value Pr(>F)
#Fertilizer1 2 0.0217 0.01083 0.09 0.915
#Residuals 9 1.0875 0.12083

One Way ANOVA with Tukey’s Test

Here, I continued the analysis by examining the impact of a different set of fertilizers (Fertilizer2) on crop yield (Yield2). The ANOVA results suggest rejecting the null hypothesis, indicating significant differences in means. Furthermore, Tukey’s test identified that Fertilizers A and B, as well as Fertilizers A and C, are significantly different, with p-values of 0.0000007 and 0.0000005, respectively.

#One Way ANOVA (normality=T, equal variance=T)
one.way2<-aov(Yield2~Fertilizer2,data=crop.data)
summary(one.way2)
#Results: Reject the Null 
#The data supports the claim that not all means are equal
# Df Sum Sq Mean Sq F value Pr(>F) 
#Fertilizer2 2 26.48 13.240 129.5 2.33e-07 ***
#Residuals 9 0.92 0.102

tukey<-TukeyHSD(one.way2)
tukey
#Results: Fertilizers A and B are different; fertilizers A and C are different 
#Tukey multiple comparisons of means, 95% family-wise confidence level
#Fit: aov(formula = Yield2 ~ Fertilizer2, data = crop.data)
#$Fertilizer2
#diff lwr upr p adj
#B-A 3.1 2.4687899 3.7312101 0.0000007
#C-A 3.2 2.5687899 3.8312101 0.0000005
#C-B 0.1 -0.5312101 0.7312101 0.8989352

#view plot
plot(tukey, las=1 , col="brown")
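
Reading significant pairs off a Tukey table by eye gets tedious with more groups. Below is a small sketch of pulling them out programmatically. Since the fertilizer CSV is not reproduced here, it reuses the machine volumes from the earlier ANOVA section; the names fit, tukey.table and sig.pairs are illustrative only.

```r
volume <- c(150, 151, 152, 152, 151, 150,
            153, 152, 148, 151, 149, 152,
            156, 154, 155, 156, 157, 155)
machine <- factor(rep(c("machine1", "machine2", "machine3"), each = 6))

fit <- aov(volume ~ machine)

# TukeyHSD() returns a list with one matrix per factor;
# each row is a pair of levels, with columns diff, lwr, upr and "p adj"
tukey.table <- TukeyHSD(fit)$machine

# keep only the pairs whose adjusted p-value is below 0.05
sig.pairs <- rownames(tukey.table)[tukey.table[, "p adj"] < 0.05]
sig.pairs
```

The same two lines should work on the fertilizer result by swapping in TukeyHSD(one.way2)$Fertilizer2.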

Anderson Darling Test and Levene Test

I conducted the Anderson-Darling Test for Normality on the residuals from the previous ANOVA. The resulting p-value of 0.6886 gives us no reason to doubt the normality assumption. Additionally, the Levene Test for Equal Variances yielded a p-value of 0.6224, suggesting no reason to doubt the equal variance assumption.

#Anderson Darling Test for Normality
library(nortest)
#save residuals from one.way2 anova:
residuals2<- one.way2$residuals
ad.test(residuals2)
#Results: a p-value of 0.6886 gives us no reason to doubt normality of the residuals
#Anderson-Darling normality test, data: residuals2
#A = 0.24747, p-value = 0.6886

#Levene Test for Equal Variances 
library(car)
#leveneTest(response variable ~ group variable, data = data)
leveneTest(crop.data$Yield2 ~ crop.data$Fertilizer2, data=crop.data)
#Results: a p-value of 0.6224 gives us no reason to doubt the equal variance assumption
#Levene’s Test for Homogeneity of Variance (center = median)
# Df F value Pr(>F)
#group 2 0.5 0.6224

Welch’s ANOVA

Moving on to a different dataset, I performed Welch’s ANOVA to analyze the impact of different groups on a response variable. The results failed to reject the null hypothesis, indicating no evidence of unequal means. I also visually inspected the normality of the data for each group, ensuring that normality assumptions were not violated.

#Welch’s ANOVA (normality=T, equal variance=F)
DE.data<-read.csv("/Users/aspengulley/Desktop/anova_lab_11_de.csv",header=TRUE,colClasses = c("numeric","character","numeric","character"))
summary(DE.data)
DE.data
oneway.test(DE.data$Data.Welch ~ DE.data$Group.Welch, data = DE.data, var.equal = FALSE)
#Results: fail to reject the null, no evidence of unequal means
#One-way analysis of means (not assuming equal variances), data: DE.data$Data.Welch and DE.data$Group.Welch
#F = 2.7938, num df = 2.0000, denom df = 6.5364, p-value = 0.1328
#check for normality:
qqnorm(DE.data$Data.Welch)
qqline(DE.data$Data.Welch) 
#these results do not show normality — you have to break down the data by group

#visual normality
DE.data
wg1<- c(-0.6818746,1.3671693,0.4246373,-0.5070765,0.1961487,-1.7070845)
qqnorm(wg1)
qqline(wg1) 
#normality not violated
wg2<- c(0.4352388,0.9240730,0.7077601,0.6184544,0.3734570,-0.1955902)
qqnorm(wg2)
qqline(wg2) 
#normality not violated
wg3<- c(-6.4182477,3.2144466,-17.4511023,-8.4875037,-3.6738870)
qqnorm(wg3)
qqline(wg3) 
#normality not violated

#ad.test() for normality requires sample sizes 8 or larger

#other options
#shapiro.test() test for normality
shapiro.test(wg1) 
#Shapiro-Wilk normality test, data: wg1
#W = 0.98794, p-value = 0.9836
shapiro.test(wg2) 
#Shapiro-Wilk normality test,data: wg2
#W = 0.93333, p-value = 0.606
shapiro.test(wg3)
#Shapiro-Wilk normality test,data: wg3
#W = 0.98098, p-value = 0.9398
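
Copying each group's values into its own vector works, but the per-group check can also be done in one pass. A minimal sketch, assuming the three groups are stacked into one data frame (welch.df and the labels g1, g2, g3 are my own placeholders, not column names from the CSV):

```r
wg1 <- c(-0.6818746, 1.3671693, 0.4246373, -0.5070765, 0.1961487, -1.7070845)
wg2 <- c(0.4352388, 0.9240730, 0.7077601, 0.6184544, 0.3734570, -0.1955902)
wg3 <- c(-6.4182477, 3.2144466, -17.4511023, -8.4875037, -3.6738870)

welch.df <- data.frame(
  value = c(wg1, wg2, wg3),
  group = rep(c("g1", "g2", "g3"),
              times = c(length(wg1), length(wg2), length(wg3)))
)

# run shapiro.test() once per group and keep only the p-value
shapiro.p <- tapply(welch.df$value, welch.df$group,
                    function(v) shapiro.test(v)$p.value)
round(shapiro.p, 4)
```

Each p-value should agree with the individual shapiro.test() calls above.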

Kruskal-Wallis Test

In this section, I opted for the Kruskal-Wallis Test because neither the normality nor the equal-variance assumption held. The results suggest rejecting the null hypothesis, providing evidence that not all medians are equal. The p-value for the Kruskal-Wallis chi-squared test is 0.01531.

#Kruskal-Wallis Test (normality=F, equal variance=F)
kruskal.test(DE.data$Data.K.W ~ DE.data$Group.K.W, data = DE.data)
#Results: reject the null, evidence suggests not all medians are equal
#Kruskal-Wallis rank sum test, data: DE.data$Data.K.W by DE.data$Group.K.W
#Kruskal-Wallis chi-squared = 8.3585, df = 2, p-value = 0.01531
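
The last few sections follow a pattern: classic ANOVA when normality and equal variance both hold, Welch's ANOVA when only normality holds, and Kruskal-Wallis when normality fails. The function below is a rough sketch of that decision rule, not a definitive procedure. pick.test is a hypothetical helper of my own, it uses base R's shapiro.test() and bartlett.test() in place of ad.test() and leveneTest(), and the skewed example data are made up.

```r
# Hypothetical helper: suggests which test to run for one response y and one grouping g
pick.test <- function(y, g, alpha = 0.05) {
  g <- factor(g)
  fit <- aov(y ~ g)

  # normality is checked on the ANOVA residuals
  normal.ok <- shapiro.test(residuals(fit))$p.value > alpha
  if (!normal.ok) return("kruskal.test")

  # Bartlett's test is only consulted once normality looks acceptable
  equal.var.ok <- bartlett.test(x = y, g = g)$p.value > alpha
  if (equal.var.ok) "aov" else "oneway.test (Welch)"
}

# made-up, heavily right-skewed data: one large outlier in each group
skewed.y <- c(1, 1.1, 0.9, 1.2, 50,
              2, 2.1, 1.9, 2.2, 60,
              3, 3.1, 2.9, 3.2, 70)
skewed.g <- rep(c("a", "b", "c"), each = 5)
choice <- pick.test(skewed.y, skewed.g)
choice
```

A rule like this is only a starting point; the Q-Q plots and sample sizes still deserve a look before trusting any single p-value.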

ANOVA TEST 1:

I began by conducting an analysis of variance (ANOVA), comparing the drop values (P1.Drop) across different cholesterol medications (P1.Chol.Med). With a p-value of 0.0657, the ANOVA fails to reject the null hypothesis at the 5% level, so the data do not show that the group means differ. I also tested for equal variances using both Levene’s and Bartlett’s tests, with conflicting results: Levene’s test failed to reject equal variances (p = 0.1582), while Bartlett’s test rejected them (p = 0.00257). The Anderson-Darling test on the residuals rejected normality (p = 0.01028), and because Bartlett’s test depends on normality, its result is not reliable here. Visual and Shapiro-Wilk checks for normality by group were also conducted. Because normality is violated, the Kruskal-Wallis test is the more appropriate comparison, and its p-value of 0.00773 rejects the null hypothesis of equal medians.

q1<-aov(med.data$P1.Drop~med.data$P1.Chol.Med,data=med.data)
summary(q1)
#Fail to reject null, the data does not support the claim that all means are not equal
# Df Sum Sq Mean Sq F value Pr(>F) 
#med.data$P1.Chol.Med 3 15049 5016 2.835 0.0657 
#Residuals 19 33620 1769

#Test for equal variances
library(car)
leveneTest(med.data$P1.Drop~med.data$P1.Chol.Med,data=med.data)
#Fail to reject null, the data does not support the claim that all variances are not equal
#Levene’s Test for Homogeneity of Variance (center = median)
# Df F value Pr(>F)
#group 3 1.9351 0.1582
# 19 
bartlett.test(med.data$P1.Drop~med.data$P1.Chol.Med,data=med.data)
#Reject null, the data supports the claim all variances are not equal
#*Remember Bartlett requires normality
#Bartlett test of homogeneity of variances, data: med.data$P1.Drop by med.data$P1.Chol.Med
#Bartlett’s K-squared = 14.261, df = 3, p-value = 0.00257

#Test for normality
library(nortest)
#save residuals from the q1 anova:
q1residuals<- q1$residuals
ad.test(q1residuals)
#Reject null, the data supports the claim that the residuals are not normally distributed
#Anderson-Darling normality test, data: q1residuals
#A = 0.99517, p-value = 0.01028

#Because normality has been violated, Bartlett’s result for equal variance is not reliable

#Visual and Shapiro-Wilk normality by group
g1drop<- c(160.5,42.3,11.4,23.3,8.9,52.2,116.8)
qqnorm(g1drop)
qqline(g1drop) 
shapiro.test(g1drop) 
#Results: Fail to Reject Null
#Shapiro-Wilk normality test,data: g1drop
#W = 0.8505, p-value = 0.1242
g2drop<- c(1.7,6.8,8.2,11.5,51.5)
qqnorm(g2drop)
qqline(g2drop)
shapiro.test(g2drop)
#Results: Reject Null
#Shapiro-Wilk normality test,data: g2drop
#W = 0.71877, p-value = 0.0149
g3drop<- c(1.5,5.6,0.6,1.2,16.1)
qqnorm(g3drop)
qqline(g3drop)
shapiro.test(g3drop)
#Results: Reject Null
#Shapiro-Wilk normality test, data: g3drop
#W = 0.76134, p-value = 0.03777
g4drop<- c(42.5,12.5,101.1,32.5,52.1,142.6)
qqnorm(g4drop)
qqline(g4drop)
shapiro.test(g4drop)
#Results: Fail to Reject Null
#Shapiro-Wilk normality test, data: g4drop
#W = 0.91209, p-value = 0.4503

kruskal.test(med.data$P1.Drop~med.data$P1.Chol.Med,data=med.data)
#Reject null (p-value = 0.00773 < 0.05), the data supports the claim that not all medians are equal
#Kruskal-Wallis rank sum test, data: med.data$P1.Drop by med.data$P1.Chol.Med
#Kruskal-Wallis chi-squared = 11.901, df = 3, p-value = 0.00773

ANOVA TEST 2:

I performed ANOVA to compare the number of days (P2.Days) across different medicines (P2.Medicine). The ANOVA failed to reject the null hypothesis (p = 0.249), so the data do not show that the group means differ. Both Levene’s and Bartlett’s tests failed to reject the null hypothesis, giving no reason to doubt equal variances. However, the Anderson-Darling test on the residuals rejected normality (p = 0.03061). The Kruskal-Wallis test, which does not require normality, also failed to reject the null hypothesis (p = 0.1191).

med.data
q2<-aov(med.data$P2.Days ~ med.data$P2.Medicine,data=med.data)
summary(q2)
#Fail to reject null, the data does not support the claim that all means are not equal
# Df Sum Sq Mean Sq F value Pr(>F)
#med.data$P2.Medicine 3 125.1 41.71 1.483 0.249
#Residuals 20 562.5 28.12

#Test variances
leveneTest(med.data$P2.Days ~ med.data$P2.Medicine,data=med.data)
#Fail to reject null, the data does not support the claim that variances are not equal
#Levene’s Test for Homogeneity of Variance (center = median)
# Df F value Pr(>F)
#group 3 0.0665 0.9771
# 20

bartlett.test(med.data$P2.Days ~ med.data$P2.Medicine,data=med.data)
#Fail to reject null, the data does not support the claim that variances are not equal
#Bartlett test of homogeneity of variances, data: med.data$P2.Days by med.data$P2.Medicine
#Bartlett’s K-squared = 0.44481, df = 3, p-value = 0.9308

#Test normality
q2residuals<- q2$residuals
ad.test(q2residuals)
#Reject null, data supports the claim that the residuals are not normally distributed
#Anderson-Darling normality test, data: q2residuals
#A = 0.81109, p-value = 0.03061

kruskal.test(med.data$P2.Days ~ med.data$P2.Medicine,data=med.data)
#Fail to reject null, the data does not support the claim that all medians are not equal
#Kruskal-Wallis rank sum test, data: med.data$P2.Days by med.data$P2.Medicine
#Kruskal-Wallis chi-squared = 5.8499, df = 3, p-value = 0.1191

ANOVA TEST 3:

For the third question, an ANOVA was conducted to compare the number of days (Prob.3.Days) across different medicines (P3.Medicine). The ANOVA rejected the null hypothesis (p = 0.038), supporting the claim that not all means are equal. Both Levene’s and Bartlett’s tests for equal variances failed to reject the null hypothesis. The Anderson-Darling test on the residuals also failed to reject (p = 0.9505), giving no reason to doubt normality. The Kruskal-Wallis test (p = 0.06175) and Welch’s ANOVA (p = 0.0691) were also conducted, and neither provided enough evidence to reject the null hypothesis.

q3<-aov(med.data$Prob.3.Days ~ med.data$P3.Medicine,data=med.data)
summary(q3)
#Reject null, data supports the claim that all means are not equal
# Df Sum Sq Mean Sq F value Pr(>F) 
#med.data$P3.Medicine 3 48.13 16.042 3.395 0.038 *
#Residuals 20 94.50 4.725

#Test variances
leveneTest(med.data$Prob.3.Days ~ med.data$P3.Medicine,data=med.data)
#Fail to reject null, data does not support the claim that variances are unequal
#Levene’s Test for Homogeneity of Variance (center = median)
# Df F value Pr(>F)
#group 3 0.2069 0.8904
# 20

bartlett.test(med.data$Prob.3.Days ~ med.data$P3.Medicine,data=med.data)
#Fail to reject null, data does not support the claim that variances are unequal
#Bartlett test of homogeneity of variances, data: med.data$Prob.3.Days by med.data$P3.Medicine
#Bartlett’s K-squared = 0.79442, df = 3, p-value = 0.8508

#Test normality
q3residuals<- q3$residuals
ad.test(q3residuals)
#Fail to reject null, data does not support the claim that populations are not normally distributed
#Anderson-Darling normality test, data: q3residuals
#A = 0.15375, p-value = 0.9505

kruskal.test(med.data$Prob.3.Days ~ med.data$P3.Medicine,data=med.data)
#Fail to reject null, data does not support the claim that all medians are not equal
#Kruskal-Wallis rank sum test, data: med.data$Prob.3.Days by med.data$P3.Medicine
#Kruskal-Wallis chi-squared = 7.3423, df = 3, p-value = 0.06175

#Welch’s ANOVA
w3<- oneway.test(med.data$Prob.3.Days ~ med.data$P3.Medicine, data=med.data, var.equal = FALSE)
w3
#Fail to reject null, data does not support the claim that population means are not equal
#One-way analysis of means (not assuming equal variances), data: med.data$Prob.3.Days and med.data$P3.Medicine
#F = 3.1412, num df = 3.000, denom df = 11.001, p-value = 0.0691

posthoc.tgh(y=med.data$Prob.3.Days, x=med.data$P3.Medicine)
#Result: Fail to Reject Null, data does not support the claim of unequal means 
#n means variances
#1 6 18 6.7
#2 6 21 5.1
#3 6 22 4.2
#4 6 19 3.0
#t df p
#1:2 1.55 9.8 0.448
#1:3 2.72 9.5 0.088
#1:4 0.53 8.7 0.951
#2:3 1.21 9.9 0.635
#2:4 1.30 9.4 0.587
#3:4 2.75 9.7 0.083

ANOVA TEST 4:

ANOVA was performed to compare drop values (P4.Drop) based on different cholesterol medications (P4.Chol.Med). The results indicate rejecting the null hypothesis, supporting the claim that all population means are not equal. Tukey’s HSD test and Scheffe’s Test for post hoc analysis were conducted, both suggesting specific differences between groups. Boxplots and pairwise t-tests with Bonferroni and Holm adjustments were also performed to explore differences between groups.

q4<-aov(med.data$P4.Drop ~ med.data$P4.Chol.Med,data=med.data)
summary(q4)
#Reject null, data supports the claim that all populations means are not equal
# Df Sum Sq Mean Sq F value Pr(>F) 
#med.data$P4.Chol.Med 3 2600 866.8 10.05 3e-04 ***
#Residuals 20 1724 86.2

t4<-TukeyHSD(q4)
t4
#Significant pairs: 2-1, 3-2, 4-2
#Tukey multiple comparisons of means, 95% family-wise confidence level
#Fit: aov(formula = med.data$P4.Drop ~ med.data$P4.Chol.Med, data = med.data)
#$`med.data$P4.Chol.Med`
#         diff        lwr       upr     p adj
#2-1  29.33333  14.327885 44.338781 0.0001284
#3-1  13.16667  -1.838781 28.172115 0.0984426
#4-1  12.66667  -2.338781 27.672115 0.1172725
#3-2 -16.16667 -31.172115 -1.161219 0.0318695
#4-2 -16.66667 -31.672115 -1.661219 0.0261382
#4-3  -0.50000 -15.505448 14.505448 0.9996990

plot(t4, las=1 , col="brown")

#View data
boxplot(med.data$P4.Drop ~ med.data$P4.Chol.Med, data = med.data)

#Three alternatives to Fisher’s Test:
library(DescTools)
s4<- ScheffeTest(q4)
#Results: 2-1, 4-2, (3-2 is really close!)
#Posthoc multiple comparisons of means: Scheffe Test, 95% family-wise confidence level
#$`med.data$P4.Chol.Med`
#         diff     lwr.ci     upr.ci    pval
#2-1  29.33333  12.988341 45.6783259 0.00031 ***
#3-1  13.16667  -3.178326 29.5116592 0.14488
#4-1  12.66667  -3.678326 29.0116592 0.16879
#3-2 -16.16667 -32.511659  0.1783259 0.05327 .
#4-2 -16.66667 -33.011659 -0.3216741 0.04456 *
#4-3  -0.50000 -16.844993 15.8449926 0.99978

plot(s4, las=1 , col="brown")

pb4<- pairwise.t.test(med.data$P4.Drop, med.data$P4.Chol.Med, p.adj='bonferroni')
#Results: 2-1, 3-2, 4-2
#Pairwise comparisons using t tests with pooled SD
#data: med.data$P4.Drop and med.data$P4.Chol.Med
#  1       2       3
#2 0.00014 -       -
#3 0.13993 0.04100 -
#4 0.17033 0.03320 1.00000
#P value adjustment method: bonferroni

ph4<- pairwise.t.test(med.data$P4.Drop, med.data$P4.Chol.Med, p.adj='holm')
#Results: 2-1, 3-2, 4-2
#Pairwise comparisons using t tests with pooled SD, data: med.data$P4.Drop and med.data$P4.Chol.Med
#  1       2       3
#2 0.00014 -       -
#3 0.06996 0.02767 -
#4 0.06996 0.02767 0.92662
#P value adjustment method: holm
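
The Bonferroni and Holm tables differ only in how the same six raw p-values are adjusted. The sketch below shows that with p.adjust(). The raw p-values are not printed in the output above, so I back-calculated them from the adjusted tables; treat them as approximate, rounded assumptions rather than the exact values R used.

```r
# comparison order: 2-1, 3-1, 4-1, 3-2, 4-2, 4-3
bonf.reported <- c(0.00014, 0.13993, 0.17033, 0.04100, 0.03320)

# Assumption: undo the x6 Bonferroni scaling to approximate the raw p-values.
# The 4-3 comparison is capped at 1 under Bonferroni, so its raw value is taken
# from the Holm table, where the largest p-value is multiplied by 1.
raw.p <- c(bonf.reported / 6, 0.92662)
names(raw.p) <- c("2-1", "3-1", "4-1", "3-2", "4-2", "4-3")

# Bonferroni: multiply every p-value by the number of comparisons (capped at 1)
bonf <- p.adjust(raw.p, method = "bonferroni")

# Holm: multiply the smallest by 6, the next by 5, and so on,
# never letting an adjusted value drop below the one before it
holm <- p.adjust(raw.p, method = "holm")

round(cbind(raw.p, bonf, holm), 5)
```

Holm is never more conservative than Bonferroni, which is why its adjusted p-values are equal or smaller in every row.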


Hypothesis Testing: Chi-Square Goodness of Fit

The following is a compilation of a series of Chi-Square tests.

A Chi-Square Goodness of Fit Test is a statistical test used to determine if there is a significant difference between the observed and expected frequencies of categorical data. It is commonly employed to assess whether a sample of categorical data comes from a population with a specific distribution.

Coin Bias Test

In this example, a Chi-Square Goodness of Fit Test is conducted to determine if a coin is biased based on the observed outcomes (40 heads and 60 tails) compared to the expected probabilities (50% for each). The calculated chi-square statistic is 4, and the critical value at a 5% significance level with 1 degree of freedom is approximately 3.841. Since the calculated value exceeds the critical value, the null hypothesis is rejected, suggesting evidence of bias.

#Example 1: if we flip a coin 100 times and we get 
#40 heads and 60 tails, is this coin biased?

flip<- c(40, 60)
chisq.test(x=flip, p=c(0.5,0.5))

#calculate chi sq

ch.cal<- ((40-50)^2/50) + ((60-50)^2/50)
ch.cal

#critical value, q gives critical value
qchisq(p=0.05, df=1, lower.tail = F)

#visualize — does not give good visuals for df=1
install.packages("visualize") #only needed once
library(visualize)

visualize.chisq(stat = 4, df=1, section = "upper")
#visualization not great because df = 1 but with a larger df, 
#the visualization will be better
#reject null, p-value = 0.0455

Shirt Manufacturing Sales

This example involves testing whether the observed sales proportions of different shirt sizes (small, medium, large, extra-large) match the expected proportions. The Chi-Square Goodness of Fit Test yields a chi-square statistic of approximately 4.59 with 3 degrees of freedom. The critical value at a 5% significance level is approximately 7.815. As the calculated value is less than the critical value, the null hypothesis is not rejected, indicating no significant difference between the expected and actual sales proportions.

#shirt manufacturing company expects the proportion of sales as follows:
#small: 20%, medium: 40%, large:30%, extra large: 10%
#actual sale recorded:
#small: 211, medium: 402, large: 297, extra large: 80
#Is there a significant difference between the expected and actual?

sh.p<- c(0.2, 0.4, 0.3, 0.1) #probability
sh.a<- c(211, 402, 297, 80) #actual values
chisq.test(x=sh.a, p=sh.p) #p= for probability

#visualize
visualize.chisq(stat = 4.59, df=3, section = "upper")

#confirm data: sh.a, 
#X-squared = 4.5909, df = 3, p-value = 0.2043
#fail to reject null
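
Every goodness-of-fit test here follows the same arithmetic: expected counts, then the sum of (observed - expected)^2 / expected, then a comparison against the chi-square critical value. A minimal sketch that wraps those steps for the shirt data; gof.check is a hypothetical helper name of mine, not a function from any package.

```r
# Hypothetical helper: chi-square goodness of fit computed step by step
gof.check <- function(observed, probs, alpha = 0.05) {
  expected <- sum(observed) * probs                   # expected count per category
  stat <- sum((observed - expected)^2 / expected)     # chi-square statistic
  df <- length(observed) - 1                          # categories minus one
  list(expected = expected,
       statistic = stat,
       df = df,
       critical = qchisq(alpha, df = df, lower.tail = FALSE),
       p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

shirts <- gof.check(observed = c(211, 402, 297, 80),
                    probs = c(0.2, 0.4, 0.3, 0.1))
shirts$expected
shirts$statistic   # compare with the X-squared reported by chisq.test()
shirts$critical
shirts$statistic > shirts$critical   # TRUE would mean reject the null
```

The same call should reproduce the other tests in this section by swapping in their counts and probabilities.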

Chi-Square Tests 1–6:

Chi-Square Test 1

A similar test is performed for given probabilities and actual numbers. The calculated chi-square statistic is approximately 9.65 with 4 degrees of freedom. The critical value at a 5% significance level is around 9.488. Since the calculated value exceeds the critical value, the null hypothesis is rejected.

n.p<- c(.35, .20, .20, .10, .15) #probabilities
n.a<- c(750, 380, 390, 170, 310) #actual numbers 
chisq.test(x=n.a, p=n.p)
visualize.chisq(stat = 9.6548, df=4, section = "upper")
#X-squared = 9.6548, df = 4, p-value = 0.04666
#reject Ho

Chi-Square Test 2 and 3

Both tests are Chi-Square Goodness of Fit Tests comparing expected and actual values across six equally likely categories. For Test 2, the calculated chi-square statistic is approximately 19.88 with 5 degrees of freedom. For Test 3, the calculated chi-square statistic is 99.4 with 5 degrees of freedom. Both exceed the critical value of 11.0705 at a 5% significance level, so the null hypothesis is rejected in both cases.

#Test 2
d.p<- rep(1/6, 6) #six equally likely outcomes
#probabilities must sum to 1 (chisq.test checks this within a small tolerance)
d.a<- c(224, 214, 196, 216, 204, 146)
chisq.test(x=d.a, p=d.p)

#calculate chi sq
q2.cal<- ((224-200)^2/200) + ((214-200)^2/200) + ((196-200)^2/200) + ((216-200)^2/200) + ((204-200)^2/200) + ((146-200)^2/200)
q2.cal

#critical value, q gives critical value
qchisq(p=0.05, df=5, lower.tail = F)

visualize.chisq(stat = 19.88, df=5, section = "upper")

#reject null, p-value = 0.00132
#chi square = 19.88 (critical value = 11.0705)

#Test 3
q3.p<- rep(1/6, 6) #six equally likely outcomes
#probabilities must sum to 1 (chisq.test checks this within a small tolerance)

q3.a<- c(1120, 1070, 980, 1080, 1020, 730)
chisq.test(x=q3.a, p=q3.p)

#calculate chi sq
q3.cal<- ((1120-1000)^2/1000) + ((1070-1000)^2/1000) + ((980-1000)^2/1000) + ((1080-1000)^2/1000) + ((1020-1000)^2/1000) + ((730-1000)^2/1000)
q3.cal

#critical value, q gives critical value
qchisq(p=0.05, df=5, lower.tail = F)

visualize.chisq(stat = 99.4, df=5, section = "upper")

#reject null, p-value < 2.2e-16 (effectively 0)
#chi square = 99.4 (critical value = 11.0705)
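
One detail worth noting: the observed counts in Test 3 are exactly five times those in Test 2, so the observed proportions are identical and only the sample size differs. For fixed proportions the goodness-of-fit statistic grows in proportion to the sample size, which is why 19.88 becomes 99.4. The sketch below checks this; gof_stat is a hypothetical helper name introduced only for illustration.

```r
test2 <- c(224, 214, 196, 216, 204, 146)
test3 <- c(1120, 1070, 980, 1080, 1020, 730)

# Test 3 is Test 2 with every count multiplied by 5
all(test3 == 5 * test2)

# Hypothetical helper: goodness-of-fit statistic for equally likely categories
gof_stat <- function(obs) {
  expected <- rep(sum(obs) / length(obs), length(obs))
  sum((obs - expected)^2 / expected)
}

stat2 <- gof_stat(test2)
stat3 <- gof_stat(test3)

# The ratio of the two statistics equals the ratio of the sample sizes
c(stat2 = stat2, stat3 = stat3, ratio = stat3 / stat2)
```

The same departure from the expected proportions therefore looks far more significant with 6000 observations than with 1200.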

Chi-Square Test 4

A test is performed to compare expected and actual probabilities. The calculated chi-square statistic is approximately 7.83 with 2 degrees of freedom. The critical value at a 5% significance level is around 5.991. The null hypothesis is rejected.

#Test 4
gpig.p<- c(.40, .40, .20) #probabilities
gpig.a<- c(34, 20, 6) #actual numbers 
chisq.test(x=gpig.a, p=gpig.p)
qchisq(p=0.05, df=2, lower.tail = F) #critical value for df = 2: 5.991
visualize.chisq(stat = 7.8333, df=2, section = "upper")
#X-squared = 7.8333, df = 2, p-value = 0.01991
#reject Ho

Chi-Square Test 5

A similar test is performed, resulting in a chi-square statistic of approximately 1.79 with 2 degrees of freedom. The critical value is around 5.991, and the null hypothesis is not rejected.

#Test 5
q2gpig.p<- c(.40, .40, .20) #probabilities
q2gpig.a<- c(29, 20, 11) #actual numbers 
chisq.test(x=q2gpig.a, p=q2gpig.p)
qchisq(p=0.05, df=2, lower.tail = F) #critical value for df = 2: 5.991

visualize.chisq(stat = 1.7917, df=2, section = "upper")
#X-squared = 1.7917, df = 2, p-value = 0.4083
#fail to reject

Chi-Square Test 6

This test applies a Chi-Square Goodness of Fit Test to Mendel’s genetic experiment, where the expected ratio of the four categories is 9:3:3:1. The calculated chi-square statistic is approximately 4.53 with 3 degrees of freedom. The critical value at a 5% significance level is around 7.815. The null hypothesis is not rejected.

#Test 6
mendel.p <- c(0.5625, 0.1875, 0.1875, 0.0625) 
#16 total, 9/16 = 0.5625, 3/16 = 0.1875, 3/16 = 0.1875, 1/16 = 0.0625 <- all add up to 1

mendel.a <- c(545, 192, 185, 78)

#calculate the expected values:
#1000(0.5625)=562.5, 1000(0.1875)=187.5, 1000(0.0625)=62.5

chisq.test(x=mendel.a, p=c(mendel.p))

#calculate chi sq
#use expected values
mendel.cal<- ((545-562.5)^2/562.5) + ((192-187.5)^2/187.5) + ((185-187.5)^2/187.5) + ((78-62.5)^2/62.5) 
mendel.cal

#critical value, q gives critical value
qchisq(p=0.05, df=3, lower.tail = F)

visualize.chisq(stat = 4.529778, df=3, section = "upper")

#fail to reject null, p-value = 0.21
#chi square = 4.529778 (critical value = 7.815)
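
To see which category drives the Mendel statistic, the object returned by chisq.test() can be inspected: it carries the expected counts and the Pearson residuals, and the squared residuals sum to the statistic. This is a small sketch under the same 9:3:3:1 assumption; the names mendel_obs, mendel_p, fit, and contrib are illustrative.

```r
mendel_obs <- c(545, 192, 185, 78)
mendel_p <- c(9, 3, 3, 1) / 16

fit <- chisq.test(x = mendel_obs, p = mendel_p)

fit$expected   # expected counts under the 9:3:3:1 ratio
fit$residuals  # Pearson residuals: (observed - expected) / sqrt(expected)

# Contribution of each category to the chi-square statistic
contrib <- fit$residuals^2
round(contrib, 3)
sum(contrib)   # equals fit$statistic
```

Most of the statistic comes from the last category (78 observed against 62.5 expected), yet the total still falls short of the critical value.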


Hypothesis Testing: One Sample Z-Test, One Sample T-Test, One Sample Variance Test, Two Sample Z-Test, Two Sample T-Test, Paired T-Test, Two Sample Variance Test

The following is a compilation of a series of hypothesis tests.

ONE SAMPLE Z-TEST

I have a sample of 100 bottles with a mean volume of 152 cc. The population mean volume is 150 cc with a standard deviation of 2 cc. I use a one sample z-test to check whether the mean volume has increased, at a 95% confidence level. The calculated z-value is 10. The critical value of z from qnorm() is 1.64 for a one-tailed test at a 5% significance level. Since the calculated z-value is much larger than the critical value, I reject the null hypothesis. The z.test() function from the BSDA package also gives a z-value of 10 and a p-value less than 2.2e-16, leading to rejection of the null hypothesis. This suggests that the true mean volume is greater than 150 cc.

#ONE SAMPLE Z TEST
#example: bottles are being produced with mean as 150 cc and sd of 2 cc.
#sample of 100 bottles show the mean of 152.
#has the mean volume increased?
#check with 95% confidence level
#base R has no z-test function, so use theory:
#Ho: mean <= 150 cc, Ha: mean > 150 cc
#greater than, one tailed test, right tail 5%
#z = (sample mean - mu) / (sigma / sqrt(n))
#mu = population mean, sigma = population sd,
#n = 100, the number of samples drawn

library(readr)
getwd()
perfume_volumes<-read.csv("/Users/aspengulley/Desktop/Perfume+Volumes.csv")
perfume_volumes
perfume_volumes$Machine.1
mean(perfume_volumes$Machine.1)
zvalue<- (152-150)/(2/sqrt(100))
zvalue #10
#find the critical value of z by using qnorm()
qnorm(0.05) #5% alpha corresponds with a 95% confidence level
#qnorm() gives area on the LEFT, -1.644854,
#since normal distribution is symmetrical,
#the 5% value on the right will be +1.64
#reject Ho
#use BSDA (basic stats & data) package to calculate z-test
install.packages("devtools", dependencies = TRUE)
devtools::has_devel()
devtools::install_github('alanarnholt/BSDA')
library(BSDA)
z.test(x=perfume_volumes$Machine.1, alternative="greater", sigma.x=2, mu=150)
#output:one-sample z-Test, data: perfume_volumes$Machine.1
#z = 10, p-value < 2.2e-16
#alternative hypothesis: true mean is greater than 150
#95 percent confidence interval:
# 151.671 NA
#sample estimates: mean of x 152
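
The z.test() output can also be reproduced from the summary figures alone, which is useful when the CSV file is not at hand. This is a minimal base R sketch that assumes only the numbers stated in the example (sample mean 152, population mean 150, population sd 2, n = 100); the variable names are my own.

```r
x_bar <- 152  # sample mean
mu0 <- 150    # hypothesized population mean
sigma <- 2    # known population sd
n <- 100      # sample size

se <- sigma / sqrt(n)    # standard error of the mean
z <- (x_bar - mu0) / se  # z statistic

# Right-tailed p-value for Ha: mean > 150
p_val <- pnorm(z, lower.tail = FALSE)

# One-sided 95% lower confidence bound for the mean
lower_bound <- x_bar - qnorm(0.95) * se

c(z = z, p.value = p_val, lower.bound = lower_bound)
```

The lower bound matches the 151.671 shown in the confidence interval, and the upper limit is NA in the z.test() output because the interval is one-sided.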

ONE SAMPLE T-TEST

The one sample t-test was conducted to determine if the mean volume of bottles has changed from the hypothesized value of 150cc, with a 95% confidence level. The sample of 4 bottles yielded volumes of (151, 153, 152, 152). The null hypothesis (Ho) stated that the volume has not changed, while the alternative hypothesis (Ha) proposed that the volume has changed.

The one sample t-test gave a sample mean of 152cc, a t-value of 4.899 with 3 degrees of freedom, and a p-value of 0.01628. Because the p-value is below 0.05, the null hypothesis is rejected, suggesting that the true mean volume is different from 150cc. The 95% confidence interval for the mean volume is 150.7008 to 153.2992, which excludes 150cc and is consistent with that conclusion.

#ONE SAMPLE T-TEST
#if the sample size is large, the sample sd is a good estimate of the population sd,
#but when the population sd is unknown
#and the sample size is small, use the t-test
#draw t distributions for different df
q<-seq(-4.0,4.0,by=0.1) #create sequence called q
q
dt(q,3) #d vertical values, 3 degrees of freedom
#plot several values with different degrees of freedom
plot(q,dt(q,19), type="l", lty="solid", xlab="t", ylab="f(t)") #19 degrees of freedom
#once you make a plot, you can add layers, draw another line
lines(q,dt(q,9),type="l",lty="dashed") #9 degrees of freedom
lines(q, dt(q,4),type="l",lty="dotted") #4 degrees of freedom

#example: bottles are being produced with mean as 150cc and the population sd is unknown
#sample of 4 bottles show the volume as (151,153,152,152)
#has the mean volume changed? check with 95% confidence level
#this is a two tailed test
#Ho: volume has not changed, Ha: volume has changed
vol<- c(151,153,152,152)
t.test(x=vol, mu=150, conf.level=0.95)
#one sample t-test
#data: vol
#t = 4.899, df = 3, p-value = 0.01628
#alternative hypothesis: true mean is not equal to 150
#95 percent confidence interval: 150.7008 153.2992
#sample estimates: mean of x 152
#reject Ho
#visualize this
library(visualize)
visualize.t(stat=c(-4.899,4.899), df=3, section="tails")
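
For readers who want to see where the t.test() numbers come from, here is a hedged sketch that rebuilds the t statistic, the two-tailed p-value, and the confidence interval from the four bottle volumes using base R only. The names mu0, se, t_stat, dof, and ci are illustrative.

```r
vol <- c(151, 153, 152, 152)
mu0 <- 150  # hypothesized mean

n <- length(vol)
se <- sd(vol) / sqrt(n)           # estimated standard error
t_stat <- (mean(vol) - mu0) / se  # t statistic
dof <- n - 1                      # degrees of freedom

# Two-tailed p-value
p_val <- 2 * pt(abs(t_stat), df = dof, lower.tail = FALSE)

# 95% confidence interval for the mean
ci <- mean(vol) + c(-1, 1) * qt(0.975, df = dof) * se

round(c(t = t_stat, p.value = p_val), 4)
round(ci, 4)
```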

ONE SAMPLE VARIANCE TEST

I conducted a variance test on a sample of 25 bottles to determine if the variance has increased from 4 with a 95% confidence level. The variance of the sample was found to be 5. The variance test resulted in a chi-squared statistic of 30 with 24 degrees of freedom. The p-value obtained was 0.1847518, indicating a failure to reject the null hypothesis.

By calculating the chi-squared statistic manually and comparing it with the critical value obtained from the chi-squared distribution, I found the calculated value to be 30, while the critical value at a 5% significance level with 24 degrees of freedom was 36.41503. This further supports the result of failing to reject the null hypothesis.

#ONE SAMPLE VARIANCE TEST
#example: 25 bottles were selected and their variance was 5.
#has the variance increased from 4? check with 95% confidence level
install.packages('EnvStats')
library(EnvStats) #environmental stats
library(readr)
getwd()
volumevar<- read.csv("/Users/aspengulley/Desktop/VolumeVar.csv")
volumevar
var(volumevar$Volumes) #variance of volumes is 5
#test whether variance has increased
varTest(x=volumevar$Volumes, alternative="greater", sigma.squared = 4, conf.level = .95)
#$statistic Chi-Squared = 30, $parameters df = 24
#$p.value = [1] 0.1847518
#$estimate variance = 5, $null.value variance = 4, $alternative = [1] "greater"
#$method [1] "Chi-Squared Test on Variance", $data.name [1] "volumevar$Volumes"
#$conf.int LCL 3.295343 UCL Inf
#attr(,"conf.level") [1] 0.95, attr(,"class") [1] "htestEnvStats"
#get critical value
qchisq(p=.05, 24, lower.tail=FALSE)
#fail to reject Ho
#chi square distribution without any package, just using R functions
#find calculated chi-square: (n-1)s^2/sigma^2
calc <- (25-1) * var(volumevar$Volumes)/4
calc # 30
#remember rchisq, pchisq, qchisq, dchisq
#find the number that puts 5% on the right tail, the critical value, 36.41503
qchisq(p=.05, 24, lower.tail=FALSE) #lower.tail=TRUE gives area to the LEFT
x<- seq(1,50, by=1)
y<- dchisq(x, df=24)
plot(x, y, type="l", xlab="chi sq", ylab="f(chi sq)")
abline(v=30)
text(8, 0.05, "calculated:")
text(5,0.045, "30")
abline(v=36.41503)
text(40, 0.04, "critical:")
text(45, 0.035, "36, 0.95")
#fail to reject Ho
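
The p-value and the lower confidence limit reported by varTest() can be checked from the summary figures alone. The sketch below assumes only what the example states (n = 25, sample variance 5, hypothesized variance 4) and uses base R; the variable names are mine.

```r
n <- 25        # sample size
s2 <- 5        # sample variance
sigma2_0 <- 4  # hypothesized variance

# Chi-square statistic: (n - 1) * s^2 / sigma^2
chi_stat <- (n - 1) * s2 / sigma2_0

# Right-tailed p-value for Ha: variance > 4
p_val <- pchisq(chi_stat, df = n - 1, lower.tail = FALSE)

# One-sided 95% lower confidence limit for the variance
lcl <- (n - 1) * s2 / qchisq(0.95, df = n - 1)

c(statistic = chi_stat, p.value = p_val, LCL = lcl)
```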

TWO SAMPLE Z-TEST

The Z-test resulted in a test statistic (z) of approximately -3.5954 and a p-value of 0.0003238. With a significance level of 0.05, I reject the null hypothesis. This indicates that there is a statistically significant difference between the mean volumes produced by the two machines.

The 95 percent confidence interval for the difference in means is calculated to be between -1.5142221 and -0.4457779. The sample estimates show that the mean volume for Machine 1 is 150.19, while the mean volume for Machine 2 is 151.17.

Additionally, a visual representation through boxplots and overlapping histograms confirms the difference in means. The histograms show distinct distributions for Machine 1 and Machine 2, and the boxplot further illustrates the shift in central tendency.

To further investigate, a second Z-test was performed with a hypothesized difference in means of -1 (Machine 1 minus Machine 2). The null hypothesis assumed a difference equal to -1, while the alternative hypothesis stated that the true difference is not equal to -1. The resulting p-value of 0.9415 means we fail to reject the null hypothesis. The data are therefore consistent with Machine 2 producing 1 cc more than Machine 1, although failing to reject does not prove that the difference is exactly 1 cc.

In conclusion, the initial Z-test demonstrates a significant difference in mean perfume volumes between Machine 1 and Machine 2, and the second test finds no evidence that this difference departs from 1 cc.

#TWO SAMPLE Z-TEST
perfume_volumes_2<- read.csv("/Users/aspengulley/Desktop/Perfume+Volumes+2+Sample.csv")
getwd()
perfume_volumes_2 #machine 1 and machine 2
z.test(x=perfume_volumes_2$Machine.1, 
 y=perfume_volumes_2$Machine.2, 
 sigma.x = sd(perfume_volumes_2$Machine.1),
 sigma.y = sd(perfume_volumes_2$Machine.2))
# Two-sample z-Test data: perfume_volumes_2$Machine.1, perfume_volumes_2$Machine.2
#z = -3.5954, p-value = 0.0003238
#alternative hypothesis: true difference in means is not equal to 0
#95 percent confidence interval: -1.5142221 -0.4457779
#sample estimates: mean of x: 150.19, mean of y 151.17 
#reject Ho
#visualize two sample z-test
#boxplot
boxplot(x=perfume_volumes_2)
#overlapping histograms
hist(x=perfume_volumes_2$Machine.1, 
 col=rgb(1,0,0,0.5),#color = red 1 green 0 blue 0, transparency 0.5
 main="Volumes by Machine 1 and 2", #main title,
 xlim=c(140,160), xlab="Volume", ylab="Frequency")
hist(x=perfume_volumes_2$Machine.2,
 col=rgb(0,0,1,0.5), #red 0 green 0 blue 1, transparency 0.5
 add=TRUE #add to first histogram to overlap
 ) #you can see red for machine 1, blue for machine 2, and purple for overlapping
#two sample z test when the hypothesized difference mu1 - mu2 is not zero
#first machine is supposed to make 150 cc and second 151 cc
z.test(x=perfume_volumes_2$Machine.1, y=perfume_volumes_2$Machine.2,
 sigma.x = sd(perfume_volumes_2$Machine.1), sigma.y = sd(perfume_volumes_2$Machine.2), mu=-1.0)
# Two-sample z-Test, data: perfume_volumes_2$Machine.1 and perfume_volumes_2$Machine.2
#z = 0.073376, p-value = 0.9415
#null: difference equal to -1
#alternative hypothesis: true difference in means is not equal to -1
#95 percent confidence interval: -1.5142221 -0.4457779
#sample estimates: mean of x 150.19, mean of y 151.17 
#fail to reject Ho: data are consistent with machine 2 producing 1 cc more than machine 1
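
The role of the mu argument is easier to see when the statistic is written out by hand. The sketch below defines a hypothetical helper, two_sample_z (not part of BSDA), and runs it on simulated stand-in data, since the perfume CSV is not reproduced here. The simulated values are assumptions for illustration and will not match the output above.

```r
# Hypothetical helper: two-sample z statistic and two-sided p-value
# delta0 is the hypothesized difference mean(x) - mean(y) under Ho
two_sample_z <- function(x, y, sigma_x, sigma_y, delta0 = 0) {
  se <- sqrt(sigma_x^2 / length(x) + sigma_y^2 / length(y))
  z <- (mean(x) - mean(y) - delta0) / se
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)
  c(z = z, p.value = p)
}

# Simulated stand-in data (NOT the perfume volumes from the CSV)
set.seed(1)
m1 <- rnorm(100, mean = 150, sd = 2)
m2 <- rnorm(100, mean = 151, sd = 2)

two_sample_z(m1, m2, sd(m1), sd(m2))               # Ho: difference = 0
two_sample_z(m1, m2, sd(m1), sd(m2), delta0 = -1)  # Ho: difference = -1
```

Changing delta0 only shifts the numerator, so the standard error and the confidence interval stay the same. That is why both z.test() outputs above report the identical interval.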

TWO SAMPLE T-TEST

For the equal variance T-Test, an F-test comparing the variances of mc1 and mc2 gave a p-value of 0.764, so the null hypothesis of equal variances is not rejected; there is no significant evidence that the true ratio of variances differs from 1. The subsequent t-test assuming equal variances resulted in a t-statistic of approximately -3.8334 with 6 degrees of freedom and a p-value of 0.008625. As the p-value is below the significance level of 0.05, we reject the null hypothesis. This indicates there is evidence that the difference in means between mc1 and mc2 is not equal to zero. The 95 percent confidence interval for the difference in means is between -6.962837 and -1.537163.

For the unequal variance test, the F-test comparing variances produced a p-value of 0.000555, leading to the rejection of the null hypothesis. This suggests evidence that the variances are not equal between mc1 and mc2. However, the subsequent Welch Two Sample t-test, assuming unequal variances, resulted in a t-statistic of approximately -0.41464, with a p-value of 0.6993. As this p-value is above the significance level, we fail to reject the null hypothesis, indicating no significant evidence that the difference in means is not equal to zero. The 95 percent confidence interval for the difference in means is between -21.40824 and 15.80824.

#TWO SAMPLE T-TEST
mc1<- c(150, 152, 154, 151)
mc2<- c(156,155,158,155)
var.test(x=mc1, y=mc2)

#TWO OPTIONS WITH T-TEST: EQUAL AND UNEQUAL VARIANCES

#F- TEST TO COMPARE TWO VARIANCES
#data: mc1 and mc2
#F = 1.4583, num df = 3, denom df = 3, p-value = 0.764
#alternative hypothesis: true ratio of variances is not equal to 1
#95 percent confidence interval: 0.09445664 22.51547430
#sample estimates:ratio of variances 1.458333 
#fail to reject Ho: no evidence that the true ratio of variances differs from 1

#EQUAL VARIANCE: use pooled variance
boxplot(mc1,mc2)
t.test(x=mc1, y=mc2, alternative = "two.sided", var.equal=TRUE)
#Two Sample t-test, data: mc1 and mc2
#t = -3.8334, df = 6, p-value = 0.008625
#alternative hypothesis: true difference in means is not equal to 0
#95 percent confidence interval: -6.962837 -1.537163
#sample estimates: mean of x: 151.75, mean of y 156.00 
#reject Ho, evidence that the difference in means does not equal zero

#UNEQUAL VARIANCE 
mc1<- c(150, 152, 154, 152, 151)
mc2<- c(144,162,177,150, 140)
var.test(x=mc1, y=mc2) #F- TEST TO COMPARE TWO VARIANCES
#data: mc1 and mc2
#F = 0.0097431, num df = 4, denom df = 4, p-value = 0.000555
#alternative hypothesis: true ratio of variances is not equal to 1
#95 percent confidence interval: 0.001014431 0.093578236
#sample estimates: ratio of variances 0.009743136 
#reject Ho, evidence variances are not equal
t.test(x=mc1, y=mc2, var.equal=FALSE)
#Welch Two Sample t-test, data: mc1 and mc2
#t = -0.41464, df = 4.0779, p-value = 0.6993
#alternative hypothesis: true difference in means is not equal to 0
#95 percent confidence interval: -21.40824 15.80824
#sample estimates: mean of x:151.8, mean of y 154.6
#fail to reject null
summary(mc1)
summary(mc2)
#the sample means differ (151.8 vs 154.6),
#but we still fail to reject
boxplot(mc1, mc2)
#too much variation in mc2, but null stands until proven to be wrong
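
The fractional degrees of freedom (4.0779) in the Welch output come from the Welch-Satterthwaite approximation, which weights each sample by its variance of the mean. A minimal sketch of that calculation for the unequal variance data follows; the names v1, v2, t_stat, and dof are illustrative.

```r
mc1 <- c(150, 152, 154, 152, 151)
mc2 <- c(144, 162, 177, 150, 140)

# Variance of each sample mean
v1 <- var(mc1) / length(mc1)
v2 <- var(mc2) / length(mc2)

# Welch t statistic (no pooled variance)
t_stat <- (mean(mc1) - mean(mc2)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
dof <- (v1 + v2)^2 /
  (v1^2 / (length(mc1) - 1) + v2^2 / (length(mc2) - 1))

# Two-tailed p-value
p_val <- 2 * pt(abs(t_stat), df = dof, lower.tail = FALSE)

round(c(t = t_stat, df = dof, p.value = p_val), 4)
```

Because mc2 is so much more variable, the degrees of freedom end up close to those of mc2 alone (4) rather than the 8 a pooled test would use.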

PAIRED T-TEST

The paired t-test was conducted on blood pressure measurements (bp.before and bp.after). The test resulted in a t-statistic of approximately -0.68641, with a p-value of 0.5302. As the p-value is greater than the significance level of 0.05, I failed to reject the null hypothesis. This suggests that there is no significant evidence of a difference in means between blood pressure measurements before and after taking medicine. The 95 percent confidence interval for the difference in means is between -7.062859 and 4.262859.

A boxplot was created to visualize the differences in blood pressure before and after taking medicine. The boxplot shows the dispersion of the differences, with the median at zero, indicating that there is typically no substantial change in blood pressure.

In summary, there is no significant evidence in this data to suggest a difference in blood pressure before and after taking medicine.

#PAIRED T-TEST
bp.before<- c(120,122,143,100,109)
bp.after<- c(122,120,141,109,109)
t.test(x=bp.before, y=bp.after, paired=TRUE, conf.level = .95)
#Paired t-test, data: bp.before and bp.after
#t = -0.68641, df = 4, p-value = 0.5302
#alternative hypothesis: true difference in means is not equal to 0
#95 percent confidence interval: -7.062859 4.262859
#sample estimates: mean of the differences -1.4 
#fail to reject Ho
#visualize this paired t-test
bp.diff<- bp.after - bp.before
bp.diff
boxplot(bp.diff, main="Effect of Medicine on BP", ylab="Post Medicine BP Difference")
#zero is the median
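
A paired t-test is equivalent to a one sample t-test on the within-pair differences, tested against zero. The sketch below illustrates this with the same blood pressure data; the object names paired and one_sample are my own.

```r
bp.before <- c(120, 122, 143, 100, 109)
bp.after <- c(122, 120, 141, 109, 109)

# Paired t-test as in the example
paired <- t.test(x = bp.before, y = bp.after, paired = TRUE)

# Same test expressed as a one sample t-test on the differences
one_sample <- t.test(x = bp.before - bp.after, mu = 0)

# Both approaches give the same statistic and p-value
c(paired = unname(paired$statistic), one.sample = unname(one_sample$statistic))
c(paired = paired$p.value, one.sample = one_sample$p.value)
```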

TWO SAMPLE VARIANCE TEST

The two-sample variance test was conducted to compare the variances of two samples, denoted as mca and mcb. The F test resulted in an F-statistic of approximately 0.11526 with a p-value of 0.01516. The null hypothesis (H0: Var(mca)/Var(mcb) = 1) is rejected at a 90% confidence level. This indicates evidence of a difference in variances between the two samples.

To ensure consistency, the test was performed again, this time placing the sample with the larger variance (mcb) in the numerator. The F-statistic obtained was approximately 8.6761 with the same p-value of 0.01516. Again, the null hypothesis (H0: Var(mcb)/Var(mca) = 1) is rejected at a 90% confidence level.

The calculated F-value (8.7) was compared with the critical F-value (4.120312), which leaves 5% in the upper tail and so corresponds to the two-sided test at a 90% confidence level. Since the calculated F-value exceeds the critical value, the null hypothesis is rejected, further supporting the evidence of unequal variances.

A boxplot was created to visualize the two samples. The plot of the F distribution was also included, highlighting the critical region where the F-value surpasses the critical value.

In summary, both the statistical test and visualization provide evidence to reject the null hypothesis, suggesting a significant difference in variances between the two samples.

#TWO SAMPLE VARIANCE TEST: F DISTRIBUTION 
#example: we took 8 samples from machine a and the sd was 1.1
#5 samples from machine b, variance was 11
#is there a difference between machine a and b? 
#check with 90% confidence level
mca<- c(150,150,151,149,151,151,148,151)
sd(mca) #1.1
mean(mca) #150.125
mcb<- c(152,146,152,150,155)
var(mcb) #11
mean(mcb) #151
var.test(x=mca, y=mcb, ratio=1, conf.level=.9)
#F test to compare two variances, data: mca and mcb
#F = 0.11526, num df = 7, denom df = 4, p-value = 0.01516
#alternative hypothesis: true ratio of variances is not equal to 1
#90 percent confidence interval:0.01891299 0.47490606
#sample estimates: ratio of variances 0.1152597 
#reject Ho
#try again:
#larger variance should always go in the numerator 
#to force the test into a right-tailed test
var.test(x=mcb, y=mca, ratio=1, conf.level=.9)
#F test to compare two variances, data: mcb and mca
#F = 8.6761, num df = 4, denom df = 7, p-value = 0.01516
#alternative hypothesis: true ratio of variances is not equal to 1
#90 percent confidence interval: 2.10568 52.87372
#sample estimates: ratio of variances 8.676056 
#reject Ho
#now the F value is 8.7
#theory
fcal<- var(mcb)/var(mca)
fcal #8.7
#f critical value
fcrit<- qf(p=0.05, df1=4, df2=7,lower.tail=FALSE)
fcrit #4.120312
#reject Ho
#visualize two sample variance
#plot F distribution
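x<- seq(0, 10, by=0.1) #fine grid so the curve is smooth
df(x, df1=4, df2=7)
plot(x, df(x, df1=4, df2=7), type="l", xlab="F Value", ylab="Density", xlim=c(0,10))
abline(v=fcrit, lty="dashed") #critical value; the rejection region lies to the right
boxplot(mca, mcb)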
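
The p-value of 0.01516 is the same in both orderings because var.test() doubles the smaller tail area of the F distribution for a two-sided alternative. The following sketch reproduces that p-value and the critical value with base R; f_cal, upper, p_val, and f_crit are illustrative names.

```r
mca <- c(150, 150, 151, 149, 151, 151, 148, 151)
mcb <- c(152, 146, 152, 150, 155)

# Larger variance in the numerator
f_cal <- var(mcb) / var(mca)
df1 <- length(mcb) - 1
df2 <- length(mca) - 1

# Two-sided p-value: double the smaller tail area
upper <- pf(f_cal, df1, df2, lower.tail = FALSE)
p_val <- 2 * min(upper, 1 - upper)

# A two-sided test at alpha = 0.10 leaves 0.05 in each tail
f_crit <- qf(0.05, df1, df2, lower.tail = FALSE)

round(c(F = f_cal, p.value = p_val, critical = f_crit), 4)
```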

By Aspen Gulley




