R – Stacked bar plot with 4 categorical variables in R

bar-chart ggplot2 plot r

My problem is displaying 4 categorical variables in a bar graph in R.

The 4 categorical variables each have 2 or more levels. My thought was to use ggplot2 to create separate bar plots with geom_bar for each of 3 categories, with the counts of each level stacked. I would then use facet_wrap to split the plots by the 4th category.

The data looks like this:

Species     Crown_Class     Life_class      Stem_Category
E. obliqua  Suppressed      Standing live   Large stems
E. rubida   Intermediate    Standing live   Large stems
E. obliqua  Suppressed      Standing live   Small stems
E. obliqua  Suppressed      Standing live   Small stems
E. rubida   Suppressed      Standing live   Large stems
E. radiata  Suppressed      Standing live   Small stems
E. obliqua  Dominant        Standing live   Small stems
E. obliqua  Suppressed      Standing live   Small stems
E. radiata  Suppressed      Standing live   Large stems
E. rubida   NA              Standing dead   Large stems
E. rubida   Intermediate    Standing live   Large stems

The graph I have in mind shows a stacked bar for each of three categories, grouped by the fourth. For the data given, separate bars for Crown_Class, Life_class and Stem_Category would be displayed for each species.

I have been trying for hours and can do separate plots using this code (though I separated the data into 3 separate data frames to do it):

ggplot(data= cc, aes(x= Species, fill = Crown_Class))+
geom_bar(position='stack')

ggplot(data=lc, aes(x = Species, fill = Life_class))+
geom_bar(position ='stack')

ggplot(data=sc, aes(x = Species, fill = Stem_Category))+
geom_bar(position ='stack')

The idea was to do something like this:

ggplot()+
  geom_bar(data= cc, aes(x = Species, fill = Crown_Class), 
      position='stack') +
  geom_bar(data=lc, aes(x = Species, fill = Life_class), 
      position ='dodge')+
  facet_wrap(~Species)

But the result is not what I have in mind. The second plot effectively overwrites the first.


I would be grateful for any help.

Best Answer

Here's an example of how you could use facet_grid to include all 4 variables on the same plot.

Note that I generate some dummy data, since I had trouble importing your dataset into R.

generate data

library(ggplot2)
theme_set(theme_bw())
set.seed(123)
df1 <- data.frame(s1 = sample(letters[1:3], 11, replace = TRUE),
                  s2 = sample(letters[4:6], 11, replace = TRUE),
                  s3 = sample(letters[7:9], 11, replace = TRUE),
                  s4 = sample(letters[10:12], 11, replace = TRUE),
                  stringsAsFactors = FALSE)
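If you would rather work with the rows from the question than with dummy data, they can be typed in by hand. This is only a sketch: the name `trees` is my own choice, and it holds just the 11 sample rows shown in the question, not the full dataset.

```r
# The 11 sample rows copied from the question.
# `trees` is a hypothetical name picked for this sketch.
trees <- data.frame(
  Species = c("E. obliqua", "E. rubida", "E. obliqua", "E. obliqua",
              "E. rubida", "E. radiata", "E. obliqua", "E. obliqua",
              "E. radiata", "E. rubida", "E. rubida"),
  Crown_Class = c("Suppressed", "Intermediate", "Suppressed", "Suppressed",
                  "Suppressed", "Suppressed", "Dominant", "Suppressed",
                  "Suppressed", NA, "Intermediate"),
  Life_class = c(rep("Standing live", 9), "Standing dead", "Standing live"),
  Stem_Category = c("Large stems", "Large stems", "Small stems", "Small stems",
                    "Large stems", "Small stems", "Small stems", "Small stems",
                    "Large stems", "Large stems", "Large stems"),
  stringsAsFactors = FALSE
)

# Quick look at the structure: 11 observations of 4 character columns.
str(trees)
```

The plots below should work on `trees` as well if you swap s1 to s4 for Species, Crown_Class, Life_class and Stem_Category.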

edit:

Maybe this is closer to what you're after:

ggplot(df1)+
    geom_bar(aes(x = s1), position = 'stack')+
    geom_bar(aes(x = s2), position = 'stack')+
    geom_bar(aes(x = s3), position = 'stack')+
    facet_wrap(~ s4)


If you proceed in this manner, you should definitely note that the values on the x-axis come from three different variables.

IMHO: While I'm no expert on the subject, I do think it's a bit dubious to create a visualization with three different variables on the same axis, and ggplot2 gives you plenty of options to avoid proceeding in such a manner.
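If you do want one stacked bar per variable, one way to make the mixed axis explicit is to reshape the data to long format first, so that the x-axis shows the variable name and the fill shows its level. This is a sketch rather than the definitive way to do it: it assumes the tidyr package is installed, and the column names `variable` and `level` are names I made up for the example.

```r
library(ggplot2)
library(tidyr)

# Same dummy data as above, repeated so this block runs on its own.
set.seed(123)
df1 <- data.frame(s1 = sample(letters[1:3], 11, replace = TRUE),
                  s2 = sample(letters[4:6], 11, replace = TRUE),
                  s3 = sample(letters[7:9], 11, replace = TRUE),
                  s4 = sample(letters[10:12], 11, replace = TRUE),
                  stringsAsFactors = FALSE)

# Reshape to long format: one row per (observation, variable) pair.
# `variable` and `level` are hypothetical names chosen for this sketch.
df_long <- pivot_longer(df1, cols = c(s1, s2, s3),
                        names_to = "variable", values_to = "level")

# One stacked bar per variable, filled by level, one panel per value of s4.
p_long <- ggplot(df_long, aes(x = variable, fill = level)) +
    geom_bar(position = "stack") +
    facet_wrap(~ s4)
print(p_long)
```

Within each panel the three bars have the same height, because every observation contributes once to each variable. Only the split between levels differs.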

make plot using facet_grid

ggplot(df1, aes(x = s1, fill = s2))+
    geom_bar(position = 'stack')+
    facet_grid(s3~s4)


make plot using interaction and facet_wrap

Now suppose you'd rather facet by a single variable instead of two. Then we can combine s2 and s3 into one fill variable with the interaction function.

ggplot(df1, aes(x = s1, fill = interaction(s2,s3)))+
    geom_bar(position = 'stack')+
    facet_wrap(~s4)
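To see what interaction does to the fill variable, here is a small base R sketch on toy vectors. The vectors and the name `combo` are made up for illustration.

```r
# Toy vectors standing in for two categorical columns.
s2 <- c("d", "e", "d")
s3 <- c("g", "g", "h")

# interaction() builds one factor whose values pair up the inputs,
# joined with "." by default.
combo <- interaction(s2, s3)

as.character(combo)  # "d.g" "e.g" "d.h"
nlevels(combo)       # 4: every s2 x s3 combination, including unobserved ones
```

Since the number of combinations is the product of the level counts, the legend can get long quickly when both variables have many levels.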


use Rmisc::multiplot

Finally, we can create three separate plots, and then use Rmisc::multiplot to plot on the same page.

library(Rmisc)
p1 <- ggplot(df1, aes(x = s1, fill = s2))+
    geom_bar(position = 'stack')
p2 <- ggplot(df1, aes(x = s1, fill = s3))+
    geom_bar(position = 'stack')
p3 <- ggplot(df1, aes(x = s1, fill = s4))+
    geom_bar(position = 'stack')

multiplot(p1,p2,p3, cols = 3)
