Add HW4 and HW5

2025-12-25 21:51:34 -04:00
parent 7ae72064a4
commit ee139e4559
2 changed files with 313 additions and 0 deletions
--- a/HW4.Rmd
+++ b/HW4.Rmd
@@ -0,0 +1,186 @@
+---
+title: "Assignment 3"
+subtitle: "STAT3373"
+author: "Isaac Shoebottom"
+date: "Oct 16th, 2025"
+output:
+  pdf_document: default
+  html_document:
+    df_print: paged
+---
+
+```{r message=FALSE, warning=FALSE}
+library(tidyverse)
+library(knitr)
+```
+
+# Question 1
+
+## a)
+
+```{r}
+# Create the dataset
+data <- tibble(
+  Farm = factor(1:4),
+  Fert1 = c(48, 45, 52, 44),
+  Fert2 = c(55, 50, 58, 49),
+  Fert3 = c(52, 49, 55, 47)
+)
+
+# Convert to long format
+long_data <- data %>%
+  pivot_longer(
+    cols = starts_with("Fert"),
+    names_to = "Fertilizer",
+    values_to = "Yield"
+  ) %>%
+  mutate(Fertilizer = factor(Fertilizer))
+
+kable(long_data, caption = "Yield Data (Bushels per Acre)")
+```
+
+## b)
+
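+Model: $$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$$
+where $\mu$ is the overall mean, $\tau_i$ is the effect of fertilizer $i$ ($i = 1, 2, 3$), $\beta_j$ is the effect of farm (block) $j$ ($j = 1, \dots, 4$), and $\varepsilon_{ij}$ is the random error, assumed independent $N(0, \sigma^2)$.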
+
+```{r}
+anova_model <- aov(Yield ~ Fertilizer + Farm, data = long_data)
+
+anova_table <- summary(anova_model)
+anova_table
+```
+
+Conclusions:
+
+-   Fertilizer effect is significant (p \< 0.05)
+
+-   Farm (block) effect is also significant
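+
+As a sketch of what the ANOVA table tests (the symbols $a$, $b$, $MS$ and the hypotheses are standard randomized-block notation, not taken from the assignment text), with $a = 3$ fertilizers and $b = 4$ farms:
+
+$$
+H_0: \tau_1 = \tau_2 = \tau_3 = 0, \qquad
+F_{\text{Fertilizer}} = \frac{MS_{\text{Fertilizer}}}{MS_E} \sim F_{a-1,\,(a-1)(b-1)} = F_{2,\,6} \quad \text{under } H_0
+$$
+
+The farm row is read the same way, with $F_{\text{Farm}} = MS_{\text{Farm}} / MS_E$ on $b - 1 = 3$ and $6$ degrees of freedom. Each effect is declared significant when its p-value falls below $\alpha = 0.05$.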
+
+## c)
+
+```{r}
+tukey_results <- TukeyHSD(anova_model, "Fertilizer")
+tukey_results
+```
+
+Results:
+
+- Fertilizer 2 produces the highest yields
+
+- All fertilizer pairs differ significantly
+
+- Ordering of mean yields: Fert 2 \> Fert 3 \> Fert 1
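+
+The pairwise calls can be checked by hand. As a sketch in standard notation (the symbols $q$, $MS_E$ and $b$ are not from the assignment text), Tukey's HSD declares two fertilizers different when
+
+$$
+|\bar{y}_{i\cdot} - \bar{y}_{k\cdot}| > q_{\alpha;\,3,\,6}\,\sqrt{\frac{MS_E}{b}}, \qquad b = 4,
+$$
+
+where $MS_E$ is the residual mean square from the ANOVA table and $q_{\alpha;\,3,\,6}$ is the studentized range quantile for 3 means and 6 error degrees of freedom. The fertilizer means from the data are
+
+$$
+\bar{y}_{1\cdot} = \tfrac{189}{4} = 47.25, \qquad
+\bar{y}_{2\cdot} = \tfrac{212}{4} = 53.00, \qquad
+\bar{y}_{3\cdot} = \tfrac{203}{4} = 50.75,
+$$
+
+so the pairwise differences are $5.75$ (Fert2 vs Fert1), $3.50$ (Fert3 vs Fert1), and $2.25$ (Fert2 vs Fert3), matching the `diff` column of the Tukey output up to sign.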
+
+Final conclusion ($\alpha = 0.05$):
+
+- There is strong statistical evidence that fertilizer type affects yield.
+
+- Blocking by farm was appropriate and reduced error variability.
+
+- Fertilizer 2 is the most effective option based on yield.
+
+# Question 2
+
+## a)
+
+```{r}
+drug_data <- data.frame(
+  patient = factor(rep(1:5, each = 3)),
+  drug = factor(rep(c("A", "B", "C"), times = 5)),
+  response_time = c(
+    12, 10, 15,  # Patient 1
+    14, 11, 16,  # Patient 2
+    10, 8, 13,   # Patient 3
+    13, 10, 14,  # Patient 4
+    11, 9, 14    # Patient 5
+  )
+)
+
+kable(drug_data, caption = "Drug Trial Response Times (seconds)")
+```
+
+## b)
+
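+Model: $$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$$
+where $\mu$ is the overall mean, $\tau_i$ is the effect of drug $i$ ($i = 1, 2, 3$ for A, B, C), $\beta_j$ is the effect of patient (block) $j$ ($j = 1, \dots, 5$), and $\varepsilon_{ij}$ is the random error, assumed independent $N(0, \sigma^2)$.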
+
+```{r}
+anova_model <- aov(response_time ~ drug + patient, data = drug_data)
+summary(anova_model)
+```
+
+Decision ($\alpha = 0.05$):
+
+- Drug effect is significant
+
+- Patient (block) effect is significant
+
+## c)
+
+Residual Diagnostics
+
+```{r}
+par(mfrow = c(1, 2))
+plot(anova_model, which = 1)  # Residuals vs Fitted
+plot(anova_model, which = 2)  # Normal Q-Q
+par(mfrow = c(1, 1))
+```
+
+Formal Tests
+
+```{r}
+# Normality of residuals
+shapiro.test(residuals(anova_model))
+
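+# Homogeneity of variance across drug groups
+# (run on the raw response times, so patient-to-patient differences are not removed)
+bartlett.test(response_time ~ drug, data = drug_data)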
+```
+
+Results:
+
+- Residuals are approximately normally distributed
+
+- Variances across drug groups are homogeneous
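+
+For reference, a sketch of what `residuals(anova_model)` contains under the additive block model with balanced data (standard randomized-block algebra, not stated in the assignment):
+
+$$
+e_{ij} = y_{ij} - \hat{y}_{ij}, \qquad \hat{y}_{ij} = \bar{y}_{i\cdot} + \bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot},
+$$
+
+where $\bar{y}_{i\cdot}$ is the mean for drug $i$, $\bar{y}_{\cdot j}$ the mean for patient $j$, and $\bar{y}_{\cdot\cdot}$ the grand mean. The Shapiro-Wilk test is applied to these $e_{ij}$, whereas the Bartlett test is run on the raw response times grouped by drug.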
+
+## d)
+
+Multiple Comparisons
+
+```{r}
+tukey_results <- TukeyHSD(anova_model, "drug")
+tukey_results
+```
+
+Results:
+
+- All drug pairs differ significantly
+
+- Ordering of mean response times: Drug B < Drug A < Drug C
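+
+A by-hand check of this ordering (a sketch; the HSD threshold uses standard notation that is not in the assignment text): the drug means from the data are
+
+$$
+\bar{y}_{A\cdot} = \tfrac{60}{5} = 12.0, \qquad
+\bar{y}_{B\cdot} = \tfrac{48}{5} = 9.6, \qquad
+\bar{y}_{C\cdot} = \tfrac{72}{5} = 14.4,
+$$
+
+giving pairwise differences of $2.4$ (A vs B), $2.4$ (C vs A), and $4.8$ (C vs B). Tukey's procedure compares each of these with $q_{\alpha;\,3,\,8}\sqrt{MS_E / 5}$, where 8 is the residual degrees of freedom $(3 - 1)(5 - 1)$ and $MS_E$ comes from the ANOVA table.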
+
+## e)
+
+Mean Response Times by Drug
+
+```{r}
+drug_data %>%
+  group_by(drug) %>%
+  summarise(mean_time = mean(response_time)) %>%
+  ggplot(aes(x = drug, y = mean_time)) +
+  geom_col(fill = "steelblue") +
+  labs(
+    title = "Mean Response Time by Drug",
+    x = "Drug",
+    y = "Mean Response Time (seconds)"
+  ) +
+  theme_minimal()
+```
+
+Boxplot by Drug
+
+```{r}
+ggplot(drug_data, aes(x = drug, y = response_time)) +
+  geom_boxplot(fill = "lightgray") +
+  labs(
+    title = "Response Time Distribution by Drug",
+    x = "Drug",
+    y = "Response Time (seconds)"
+  ) +
+  theme_minimal()
+```
+
+## f)
+
+Conclusion:
+
+At the 5% significance level, there is strong evidence that drug formulation affects patient response time. Blocking by patient was effective and significantly reduced unexplained variability. Post-hoc analysis using Tukey’s HSD showed that all three drugs differ significantly, with Drug B producing the fastest (best) response times, followed by Drug A, and then Drug C.