R – dplyr::select – Including All Other Columns at End of New Data Frame (or Beginning or Middle)

When working with data, I find dplyr's select() function a great way to organize my data frame columns.

One common case: when working with a df that has many columns, I often want to put two variables next to each other for easy comparison. When doing this, I then need to attach all the other columns either before or after. I found matches(".") (a regular expression that matches every column name) a super convenient way to do this.

For example:

library(nycflights13)
library(dplyr)

# just have the five columns:
select(flights, carrier, tailnum, year, month, day) 

# new order for all columns:
select(flights, carrier, tailnum, year, month, day, matches("."))
# matches(".") attaches all other columns to the end of the new data frame

The Question
– Is there a better way to do this? Better in the sense of being more flexible.

One example issue: is there some way to include "all other" columns at the beginning or in the middle of the new data.frame? (Note that select(flights, matches("."), year, month, day) doesn't produce the desired result, since matches(".") attaches all columns, and year, month, day are then ignored because they repeat existing column names.)

Best Answer

Update: using dplyr::relocate() (available since dplyr 1.0.0)

  • Selected columns **at the beginning**:
    flights %>%
      relocate(carrier, tailnum, year, month, day)
    
  • Selected columns **at the end**:
    flights %>%
      relocate(carrier, tailnum, year, month, day, .after = last_col()) 
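
For the "middle" case from the question, relocate() also accepts .before and .after arguments that name a destination column. A minimal sketch, using a small made-up tibble (df and its values are illustrative stand-ins for flights, not real data):

```r
library(dplyr)

# Toy stand-in for flights; the values are made up for illustration.
df <- tibble(
  year = 2013L, month = 1L, day = 1L,
  dep_time = 517L, carrier = "UA", tailnum = "N14228"
)

# Move carrier and tailnum so they sit directly after `day`,
# leaving every other column in its original relative order.
moved <- df %>%
  relocate(carrier, tailnum, .after = day)

names(moved)
```

Here the columns should come out as year, month, day, carrier, tailnum, dep_time. Using .before = day instead would place the pair just ahead of day.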
    

    Old answer

    If you want to **reorder the columns**:

  • All other columns **at the end**:
    select(flights, carrier, tailnum, year, month, day, everything())
    

    Or in two steps, using one_of() to select the variables named in a character vector:

    col <- c("carrier", "tailnum", "year", "month", "day")
    select(flights, one_of(col), everything()) 
    
  • All other columns **at the beginning**:
    select(flights, -one_of(col), one_of(col))
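
In current tidyselect, one_of() is superseded by all_of() (which errors if a name is missing) and any_of() (which silently skips missing names). The same two reorderings might look like this with all_of(); df is a made-up toy tibble standing in for flights, and the `!` complement syntax assumes dplyr 1.0.0 or later:

```r
library(dplyr)

# Toy stand-in for flights; the values are made up for illustration.
df <- tibble(
  year = 2013L, month = 1L, day = 1L,
  dep_time = 517L, carrier = "UA", tailnum = "N14228"
)

col <- c("carrier", "tailnum")

# Selected columns first, all other columns at the end.
selected_first <- select(df, all_of(col), everything())

# All other columns first, selected columns at the end.
# `!all_of(col)` takes the complement of the named columns.
selected_last <- select(df, !all_of(col), all_of(col))

names(selected_first)
names(selected_last)
```

Swapping all_of() for any_of() would make both calls tolerate names in col that do not exist in the data frame.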
    

    If you want to append the whole data frame again using dplyr:

  • Whole data frame at the end:
    bind_cols(select(flights, one_of(col)), flights)
    
  • Whole data frame at the beginning:
    bind_cols(flights, select(flights, one_of(col)))
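
Note that binding the data frame to itself repeats the selected columns, so their names collide. As far as I know, recent dplyr versions repair the duplicates automatically (typically by appending a position suffix) and print a message listing the new names. A small sketch with a made-up tibble (df is illustrative, not the real flights table) to check the shape of the result:

```r
library(dplyr)

# Toy stand-in for flights; the values are made up for illustration.
df <- tibble(year = 2013L, carrier = "UA", tailnum = "N14228")

col <- c("carrier", "tailnum")

# Selected columns first, then the whole data frame again.
both <- bind_cols(select(df, all_of(col)), df)

ncol(both)                  # 2 selected + 3 original columns
anyDuplicated(names(both))  # 0 means every name is unique after repair
names(both)                 # inspect the repaired names
```

If the repaired names are awkward to work with, renaming the selected copies before binding (for example with rename_with()) may be cleaner.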
    