R – Filter data frame by character column name (in dplyr)

dplyr

I have a data frame and want to filter it in one of two ways, by either column "this" or column "that". I would like to be able to refer to the column name as a variable. How (in dplyr, if that makes a difference) do I refer to a column name by a variable?

library(dplyr)
df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
df
#   this that
# 1    1    1
# 2    2    1
# 3    2    2
df %>% filter(this == 1)
#   this that
# 1    1    1

But say I want to use the variable column to hold either "this" or "that", and filter on whatever the value of column is. Both as.symbol and get work in other contexts, but not this:

column <- "this"
df %>% filter(as.symbol(column) == 1)
# [1] this that
# <0 rows> (or 0-length row.names)
df %>% filter(get(column) == 1)
# Error in get("this") : object 'this' not found

How can I turn the value of column into a column name?

Best Answer

From the current dplyr help file:

dplyr used to offer twin versions of each verb suffixed with an underscore. These versions had standard evaluation (SE) semantics: rather than taking arguments by code, like NSE verbs, they took arguments by value. Their purpose was to make it possible to program with dplyr. However, dplyr now uses tidy evaluation semantics. NSE verbs still capture their arguments, but you can now unquote parts of these arguments. This offers full programmability with NSE verbs. Thus, the underscored versions are now superfluous.

So we need to do two things to be able to refer to the value "this" of the variable column inside dplyr::filter():

  1. We need to turn the variable column, which is of type character, into a symbol.

    Using base R, this can be achieved with the function as.symbol(), which is an alias for as.name(). The former is preferred by the tidyverse developers because it follows a more modern terminology (R types instead of S modes).

    Alternatively, the same can be achieved with rlang::sym() from the tidyverse.

  2. We need to unquote the symbol from 1).

    What exactly unquoting means is explained in the vignette Programming with dplyr. It is done with the syntactic sugar !!.

    (In earlier versions of dplyr, or rather of the underlying rlang, there were situations, including yours, where !! would collide with the single !. This is no longer an issue since !! gained the right operator precedence.)
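To see what the two steps do before involving filter(), one can capture the expression with rlang::expr() and look at it. This is only a small sketch, assuming the rlang package is installed (it comes along with dplyr); expr() is used here purely for inspection.

```r
library(rlang)

column <- "this"

# Without unquoting, the call to as.symbol() is captured as code, not evaluated.
captured_plain <- expr(as.symbol(column) == 1)
captured_plain
# as.symbol(column) == 1

# With !!, as.symbol(column) is evaluated first and the resulting symbol
# is inserted into the captured expression.
captured_unquoted <- expr(!!as.symbol(column) == 1)
captured_unquoted
# this == 1
```

The second expression is what filter() ends up evaluating against the columns of the data frame.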

Applied to your example:

library(dplyr)
df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

df %>% filter(!!as.symbol(column) == 1)
#   this that
# 1    1    1
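
If the column name has to be passed around as a string, the same two steps can be wrapped in a small function. The following is a minimal sketch using rlang::sym() instead of as.symbol(); the helper name filter_by_column is hypothetical and not part of dplyr, and it assumes the data frame has no column that is itself called value.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): keep the rows in which the
# column named by the string `column` equals `value`.
# Assumes `data` has no column called "value".
filter_by_column <- function(data, column, value) {
  col_sym <- rlang::sym(column)         # step 1: character -> symbol
  data %>% filter(!!col_sym == value)   # step 2: unquote the symbol with !!
}

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))

filter_by_column(df, "this", 1)
#   this that
# 1    1    1
filter_by_column(df, "that", 1)
#   this that
# 1    1    1
# 2    2    1
```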