R – ggplot2: making symbols in legend match symbols in plot

ggplot2, legend, r

I'm trying to make a plot in which most data points are drawn normally, but one set of data points uses a differently sized symbol. I want the legend to show the same: most points shown normally, but the exception drawn with a differently sized symbol. Here is a short bit of code:

library(ggplot2)
x = c(1,2,1,2,3)
y = c(1,2,3,4,3)
vendor = c("x", "x", "y", "y", "z")
df = data.frame(x,y,vendor)

p <- ggplot(df) +
     aes_string(x="x", y="y", color="vendor") +
     geom_point(size=3, data=subset(df, vendor!="z")) +
     geom_point(size=5, data=subset(df, vendor=="z"))
ggsave("foo.pdf")

The problem is that in the resulting legend, all points are drawn with the larger (size=5) symbol, not just those with vendor z. I want vendor z drawn with the larger point in the legend, and the others drawn with size=3.

(Bonus question: What I really want is a larger, thickly outlined symbol: instead of a circle, I want a donut. I realize that shape=1 will draw an outlined circle, but its outline is very thin. I'd rather have a thicker outlined circle. I want to do the same with a triangle. Is there an easy way to do this?)

Maybe I applied it wrong, but following this advice:

ggplot2: Making changes to symbols in the legend

with the addition of the "guides" line did not help:

guides(size = guide_legend(override.aes = list(shape = 1)))

i.e. same output, with size=5 symbols for all three vendors in the legend.

EDITED: Fantastic answer, which I quickly implemented. Now I've added lines:

library(ggplot2)
x = c(1,2,1,2,3)
y = c(1,2,3,4,3)
vendor = c("x", "x", "y", "y", "z")
df = data.frame(x,y,vendor)

df$vendor_z <- df$vendor=="z"     # create a new column 

ggplot(df) +
  aes_string(x = "x", y = "y", color = "vendor", size = "vendor_z") +
  geom_point() +
  geom_line(size=1.5) +   # this is the only difference
  scale_size_manual(values = c(3, 5), guide = FALSE) +
  guides(colour = guide_legend(override.aes = list(size = c(3, 3, 5))))

ggsave("foo.pdf")

and now the size of the legend is back down to 3 again for all dots, including the ones with vendor z. Any ideas on how to fix this?

Best Answer

The size is not applied to the legend because size is set outside aes_string, as a constant rather than as a mapped aesthetic. Furthermore, working with ggplot is much easier if you create an additional column indicating whether vendor == "z".

Here's a solution for part 1:

df$vendor_z <- df$vendor=="z"     # create a new column 

ggplot(df) +
  aes_string(x = "x", y = "y", color = "vendor", size = "vendor_z") +
  geom_point() +
  scale_size_manual(values = c(3, 5), guide = FALSE) + 
  guides(colour = guide_legend(override.aes = list(size = c(3, 3, 5))))

Note that vendor_z is an argument of aes_string. This tells ggplot to map the size aesthetic to that column and to create a legend for it. The size values themselves are set in scale_size_manual. Furthermore, guide = FALSE suppresses a second legend for size alone. Finally, override.aes in guides applies the size values to the colour legend.
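For readers on a newer ggplot2, where aes_string and guide = FALSE are deprecated, the same idea can be sketched with aes() and guide = "none". This is a sketch of the same technique rather than part of the original answer. The named size values are my own addition to make the FALSE/TRUE order explicit.

```r
library(ggplot2)

df <- data.frame(
  x = c(1, 2, 1, 2, 3),
  y = c(1, 2, 3, 4, 3),
  vendor = c("x", "x", "y", "y", "z")
)
df$vendor_z <- df$vendor == "z"   # TRUE only for the vendor to emphasise

p <- ggplot(df, aes(x = x, y = y, colour = vendor, size = vendor_z)) +
  geom_point() +
  # names match the logical levels of vendor_z; the size legend is hidden
  scale_size_manual(values = c("FALSE" = 3, "TRUE" = 5), guide = "none") +
  # the colour legend has three keys (x, y, z), so three sizes are supplied
  guides(colour = guide_legend(override.aes = list(size = c(3, 3, 5))))
```

The override.aes vector has to follow the order of the legend keys, so it would need updating if vendors were added or reordered.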


Part 2: a "donut" symbol

The line width of a circle's outline could not be modified in the ggplot2 version this answer was written for. Here is a workaround:

ggplot(df) +
  aes_string(x = "x", y = "y", color = "vendor", size = "vendor_z") +
  geom_point() +
  geom_point(data = df[df$vendor_z, ], aes(x = x, y = y),
             size = 3, shape = 21, fill = "white", show_guide = FALSE) +
  scale_size_manual(values = c(3, 5), guide = FALSE) + 
  guides(colour = guide_legend(override.aes = list(size = c(3, 3, 5))))

Here, a single point is drawn on top of the large one using geom_point and a subset of the data (df[df$vendor_z, ]). I chose a size of 3 since this is the size of the smaller circles, so the white-filled circle covers the centre of the size-5 point and leaves a coloured ring. Shape 21 is a circle for which a fill colour can be specified. Finally, show_guide = FALSE prevents the new shape from overwriting the legend keys.
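As an alternative sketch for newer ggplot2 releases, geom_point has a stroke aesthetic that controls the outline width, so an open shape can be thickened directly. The approach below is my own variation, not the original answer's: it maps colour, shape and size to vendor, on the assumption that ggplot2 merges the three legends into one because they share a variable and title.

```r
library(ggplot2)

df <- data.frame(
  x = c(1, 2, 1, 2, 3),
  y = c(1, 2, 3, 4, 3),
  vendor = c("x", "x", "y", "y", "z")
)

# colour, shape and size all follow vendor; stroke is a constant outline width
p_donut <- ggplot(df, aes(x = x, y = y, colour = vendor,
                          shape = vendor, size = vendor)) +
  geom_point(stroke = 1.5) +
  scale_shape_manual(values = c(x = 16, y = 16, z = 1)) +  # 1 = open circle
  scale_size_manual(values = c(x = 3, y = 3, z = 5))
```

Swapping shape 1 for shape 2 should give a thick open triangle instead. As far as I can tell, stroke also adds slightly to the drawn size of every point, so the sizes may need a little tuning.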


Edit: Part 3: add lines

You could suppress the legend for geom_line with the argument show_guide = FALSE:

ggplot(df) +
  aes_string(x = "x", y = "y", color = "vendor", size = "vendor_z") +
  geom_point() +
  geom_line(size=1.5, show_guide = FALSE) +   # this is the only difference
  scale_size_manual(values = c(3, 5), guide = FALSE) +
  guides(colour = guide_legend(override.aes = list(size = c(3, 3, 5))))
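On current ggplot2, show_guide has been superseded by show.legend, and line thickness is set with linewidth rather than size (since 3.4.0). A hedged sketch of the same plot under those assumptions is below. Moving the size mapping into geom_point is my own choice, so that the line layer never sees it.

```r
library(ggplot2)

df <- data.frame(
  x = c(1, 2, 1, 2, 3),
  y = c(1, 2, 3, 4, 3),
  vendor = c("x", "x", "y", "y", "z")
)
df$vendor_z <- df$vendor == "z"

p_lines <- ggplot(df, aes(x = x, y = y, colour = vendor)) +
  geom_point(aes(size = vendor_z)) +
  # keep the line layer out of the legend so only the points define the keys
  geom_line(linewidth = 1.5, show.legend = FALSE) +
  scale_size_manual(values = c("FALSE" = 3, "TRUE" = 5), guide = "none") +
  guides(colour = guide_legend(override.aes = list(size = c(3, 3, 5))))
```

Vendor z has a single observation, so no line is drawn for it, and ggplot2 may print a message saying so when the plot is rendered.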
