I have a large data set and I would like to read specific columns or drop all the others.
library(foreign)  # read.dta() comes from the foreign package
data <- read.dta("file.dta")
I select the columns that I'm not interested in:
var.out <- names(data)[!names(data) %in% c("iden", "name", "x_serv", "m_serv")]
and then I'd like to do something like:
for(i in 1:length(var.out)) {
paste("data$", var.out[i], sep="") <- NULL
}
to drop all the unwanted columns. Is this the optimal solution?
Best Answer
You should use either indexing or the `subset` function.

With indexing, you can use the `which` function and the `-` operator on the column index to drop columns by position.

Or, much simpler, use the `select` argument of the `subset` function: you can then use the `-` operator directly on a vector of column names, and you can even omit the quotes around the names.

Note that you can also select the columns you want instead of dropping the others.