When working with data, I find the dplyr library's select() function a great way to organize my data frame columns.
One great use: when I am working with a df that has many columns, I often put two variables next to each other for easy comparison. When doing this, I then need to attach all the other columns either before or after. I have found the matches(".") helper a super convenient way to do this.
For example:
library(nycflights13)
library(dplyr)
# keep just the five columns:
select(flights, carrier, tailnum, year, month, day)
# new order for all columns:
select(flights, carrier, tailnum, year, month, day, matches("."))
# matches(".") attaches all other columns to the end of the new data frame
The Question
I am curious whether there is a better way to do this, better in the sense of being more flexible.
For example, one issue: is there some way to include "all other" columns at the beginning or in the middle of the new data.frame? (Note that select(flights, matches("."), year, month, day) doesn't produce the desired result, since matches(".") attaches all columns, and year, month, day are then ignored because they repeat existing column names.)
Best Answer
Update: using dplyr::relocate()
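As a minimal sketch of how relocate() can cover the "beginning, middle, or end" cases from the question: the small data frame `df` below is a hypothetical stand-in that borrows a few column names from flights, so the example runs without nycflights13. It assumes dplyr 1.0.0 or later, where relocate() was introduced.

```r
library(dplyr)

# Hypothetical stand-in for flights, with only a handful of its columns
df <- data.frame(
  year = 2013L, month = 1L, day = 1L,
  dep_time = 517L, carrier = "UA", tailnum = "N14228"
)

# Chosen columns at the beginning (relocate()'s default position)
front <- relocate(df, carrier, tailnum)

# Chosen columns at the end
back <- relocate(df, carrier, tailnum, .after = last_col())

# Chosen columns in the middle, right after `day`
middle <- relocate(df, carrier, tailnum, .after = day)

names(front)
names(back)
names(middle)
```

The same calls should work on flights itself, for example relocate(flights, carrier, tailnum, .after = day). All columns are kept; only their positions change.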
Old answer
If you want to reorder the columns:
Or, in two steps, to select variables provided in a character vector, one_of("x", "y", "z"):
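The select()-based approach can be sketched roughly as follows. This is an illustration rather than the original answer's code: `df` and `cols` are hypothetical names, and `df` is a small stand-in that borrows a few column names from flights. Note that one_of() is superseded by all_of() and any_of() in recent tidyselect versions, although it still works.

```r
library(dplyr)

# Hypothetical stand-in for flights, with only a handful of its columns
df <- data.frame(
  year = 2013L, month = 1L, day = 1L,
  dep_time = 517L, carrier = "UA", tailnum = "N14228"
)

# Step 1: keep the column names of interest in a character vector
cols <- c("carrier", "tailnum")

# Step 2a: chosen columns first, every other column after them
chosen_first <- select(df, one_of(cols), everything())

# Step 2b: every other column first, chosen columns last
chosen_last <- select(df, -one_of(cols), one_of(cols))

names(chosen_first)
names(chosen_last)
```

Unlike matches("."), everything() states the intent directly, and the negative selection -one_of(cols) is what makes "all other columns first" possible.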