Hypothesis Testing with Spielberg and Burton

Aim: Exploring whether the mean IMDB rating for Steven Spielberg and Tim Burton is different via Hypothesis Testing. (Language: R)

H0 = there is no difference between the mean IMBD ratings of Steven Spielberg and Tim Burton
H1 = there is a difference between the mean IMBD ratings of Steven Spielberg and Tim Burton

First we need to load the data and take a look at its contents.

movies <- read_csv(here::here("data", "movies.csv"))

## Rows: 2961 Columns: 11

## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): title, genre, director
## dbl (8): year, duration, gross, budget, cast_facebook_likes, votes, reviews,...

## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

glimpse(movies)

## Rows: 2,961
## Columns: 11
## $ title               <chr> "Avatar", "Titanic", "Jurassic World", "The Avenge…
## $ genre               <chr> "Action", "Drama", "Action", "Action", "Action", "…
## $ director            <chr> "James Cameron", "James Cameron", "Colin Trevorrow…
## $ year                <dbl> 2009, 1997, 2015, 2012, 2008, 1999, 1977, 2015, 20…
## $ duration            <dbl> 178, 194, 124, 173, 152, 136, 125, 141, 164, 93, 1…
## $ gross               <dbl> 760505847, 658672302, 652177271, 623279547, 533316…
## $ budget              <dbl> 2.37e+08, 2.00e+08, 1.50e+08, 2.20e+08, 1.85e+08, …
## $ cast_facebook_likes <dbl> 4834, 45223, 8458, 87697, 57802, 37723, 13485, 920…
## $ votes               <dbl> 886204, 793059, 418214, 995415, 1676169, 534658, 9…
## $ reviews             <dbl> 3777, 2843, 1934, 2425, 5312, 3917, 1752, 1752, 35…
## $ rating              <dbl> 7.9, 7.7, 7.0, 8.1, 9.0, 6.5, 8.7, 7.5, 8.5, 7.2, …

Basic initial analysis to see how the mean ratings compare:

#initial observation table:
 movies %>% 
  filter(director %in% c("Steven Spielberg", "Tim Burton")) %>% 
  group_by(director) %>% 
  summarise(n = n(),
            mean = mean(rating),
            sd = sd(rating))

## # A tibble: 2 × 4
##   director             n  mean    sd
##   <chr>            <int> <dbl> <dbl>
## 1 Steven Spielberg    23  7.57 0.695
## 2 Tim Burton          16  6.93 0.749

Comparing confidence intervals:

#creating a confidence interval
compare <- movies %>% 
  filter(director %in% c("Tim Burton", "Steven Spielberg"))   %>% 
  group_by(director) %>% 
   summarise(mean = round(mean(rating), 2),
              n = count(director),
              sd = sd(rating),
              t_critical = qt(0.975, n - 1),  #calculates the t-critical
              se = sd / sqrt(n),
              margin_of_error = t_critical * se,
              low_CI= round(mean - margin_of_error, 2) ,
              high_CI= round(mean + margin_of_error, 2)
              )%>%
  mutate(director = factor(director, levels = c("Tim Burton","Steven Spielberg")))

#graphing the confidence intervals
graph <- ggplot(compare, aes(colour=director)) +
  geom_errorbar(aes(xmin = low_CI, xmax = high_CI, y= director), width = 0.1, size = 1.5)  +
  scale_color_manual(values = c("skyblue","tomato"))+
  geom_point(aes(x=mean, y=director), size = 3 ) +
  labs(title="Do Spielberg and Burton have the same IMDB ratings?",
       subtitle="95% confidence intervals overlap",
       x="Mean IMDB Rating",
       y =" ") +
  geom_text(aes(label = low_CI, x=low_CI, y=director), size = 4, color="black", hjust = 1, vjust = 0, nudge_x = 0.05, nudge_y = 0.08) +
  geom_text(aes(label = high_CI, x=high_CI, y=director),size = 4, color="black", hjust = 1, vjust = 0, nudge_x = 0.05, nudge_y = 0.08) +
  geom_text(aes(label = mean, mean, y=director), size = 6, color="black", hjust = 1, vjust = 0, nudge_x = 0.05, nudge_y = 0.08)+
geom_rect( mapping=aes(xmin= 7.27, xmax= 7.33, ymin=0, ymax=3), color="lightgrey", alpha=0.2) +
  theme_bw()
  
  
graph +theme(legend.position = "none")

T-test

compare2 <- movies %>% 
  filter(director %in% c("Steven Spielberg", "Tim Burton")) 

t.test(rating ~ director, data = compare2)

## 
##  Welch Two Sample t-test
## 
## data:  rating by director
## t = 2.7144, df = 30.812, p-value = 0.01078
## alternative hypothesis: true difference in means between group Steven Spielberg and group Tim Burton is not equal to 0
## 95 percent confidence interval:
##  0.1596624 1.1256637
## sample estimates:
## mean in group Steven Spielberg       mean in group Tim Burton 
##                       7.573913                       6.931250

Hypothesis testing

set.seed(1234)

obs_diff <- compare2 %>%
  specify(rating ~ director) %>%
  calculate(stat = "diff in means", order = c("Steven Spielberg", "Tim Burton"))

obs_diff

## Response: rating (numeric)
## Explanatory: director (factor)
## # A tibble: 1 × 1
##    stat
##   <dbl>
## 1 0.643

#hypothesis testing with infer:

  null_dist_movies <- compare2 %>%
  # specify variables
  specify(rating ~ director) %>%
  
  # assume independence, i.e, there is no difference
  hypothesize(null = "independence") %>%
  
  # generate 1000 reps, of type "permute"
  generate(reps = 1000, type = "permute") %>%
  
  # calculate statistic of difference, namely "diff in means"
  calculate(stat = "diff in means", order = c("Steven Spielberg", "Tim Burton"))
  
#visualise the null distribution 
null_dist_movies %>% visualize() +
  shade_p_value(obs_stat = obs_diff, direction = "two-sided")

null_dist_movies %>%
  get_p_value(obs_stat = obs_diff, direction = "two_sided")

## # A tibble: 1 × 1
##   p_value
##     <dbl>
## 1   0.008

null_dist_movies %>%
  get_p_value(obs_stat = obs_diff, direction = "two_sided")

## # A tibble: 1 × 1
##   p_value
##     <dbl>
## 1   0.008

Conclusion

According to the hypothesis testing, the p-value is smaller than 0.05 which means that it is safe to reject the null hypothesis that claims there is no difference in the mean ratings of Spielberg and Burton, which means that it is 95% likely that there is a difference between the mean ratings of these famous directors, and Spielberg is ahead of Burton in the IMDB race!