Marginal Distributions

By Xiaochi Liu in R programming

Marginal distribution (density) plots extend a plot of numeric data with side panels that show how each variable is distributed. Density curves are used here, but histograms and box plots work too.

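# Suppress warnings and messages in the rendered output
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
library(ggside)     # side-panel geoms such as geom_xsidedensity()
library(tidyverse)  # ggplot2, dplyr, and the %>% pipe
library(tidyquant)  # theme_tq(), scale_color_tq(), scale_fill_tq()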
mpg
## # A tibble: 234 × 11
##    manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
##    <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
##  1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
##  2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
##  3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
##  4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
##  5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
##  6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
##  7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
##  8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
##  9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
## 10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
## # … with 224 more rows

Scatter Plot with Smoothed Trend and Marginal Distribution (Density) Side-Plots (Top and Right):

mpg %>% 
  ggplot(aes(hwy, cty, color = class)) +
  geom_point(size = 2, alpha = 0.3) + 
  # One trend line for all classes: color = NULL removes the per-class grouping
  geom_smooth(aes(color = NULL), se = TRUE) +
  # Density of hwy in the x side panel, stacked by class
  geom_xsidedensity(aes(y = after_stat(density), fill = class),
                    alpha = 0.5,
                    size = 1,
                    position = "stack"
                    ) +
  # Density of cty in the y side panel, stacked by class
  geom_ysidedensity(aes(x = after_stat(density), fill = class),
                    alpha = 0.5,
                    size = 1,
                    position = "stack"
                    ) + 
  scale_color_tq() +
  scale_fill_tq() +
  theme_tq() +
  labs(title = "Fuel Economy by Vehicle Type",
       subtitle = "ggside density",
       x = "Highway MPG", y = "City MPG") +
  # Size of each side panel relative to the main panel
  theme(ggside.panel.scale.x = 0.4,
        ggside.panel.scale.y = 0.4)
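
The introduction notes that histograms can stand in for density curves. A minimal sketch of that variant, assuming the same mpg data and ggside's geom_xsidehistogram() and geom_ysidehistogram(); the binwidth of 1 and the panel scale of 0.3 are illustrative choices, not values from the original post:

```r
library(ggplot2)
library(ggside)

p_hist <- ggplot(mpg, aes(hwy, cty, fill = class)) +
  geom_point(aes(color = class), size = 2, alpha = 0.3) +
  # Counts of hwy in the x side panel, stacked by class
  geom_xsidehistogram(aes(y = after_stat(count)), binwidth = 1) +
  # Counts of cty in the y side panel, stacked by class
  geom_ysidehistogram(aes(x = after_stat(count)), binwidth = 1) +
  labs(title = "Fuel Economy by Vehicle Type",
       subtitle = "ggside histogram",
       x = "Highway MPG", y = "City MPG") +
  # One setting that scales both side panels (assumed value)
  theme(ggside.panel.scale = 0.3)

p_hist
```

Counts are easier to read off than densities when the sample is small, as it is here with 234 rows.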

Facet-Plot with Marginal Box Plots (Top):

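mpg %>% 
  ggplot(aes(x = cty, y = hwy, color = class)) +
  geom_point() +
  # One trend line per facet: color = NULL removes the per-class grouping
  geom_smooth(aes(color = NULL)) +
  # Marginal box plots in the x (top) side panel
  geom_xsideboxplot(alpha = 0.5, size = 1) +
  scale_color_tq() +
  scale_fill_tq() +
  theme_tq() +
  # One facet column per cylinder count, each with its own x range
  facet_grid(cols = vars(cyl), scales = "free_x") +
  labs(title = "Fuel Economy by Engine Size (Cylinders)") +
  # Size of the x side panel relative to the main panel
  theme(ggside.panel.scale.x = 0.4)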
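
By default ggside draws x side panels on top and y side panels on the right. A small sketch of moving them, assuming ggside's ggside() helper with its x.pos and y.pos arguments; the plot itself is a stripped-down illustration rather than one from the original post:

```r
library(ggplot2)
library(ggside)

p_pos <- ggplot(mpg, aes(cty, hwy, color = class)) +
  geom_point(alpha = 0.3) +
  geom_xsidedensity(aes(y = after_stat(density))) +
  geom_ysidedensity(aes(x = after_stat(density))) +
  # Move the x side panel below and the y side panel to the left
  ggside(x.pos = "bottom", y.pos = "left")

p_pos
```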