Tagged with exists

Check if a Variable Exists in R

If you use attach, it is easy to tell whether a variable exists: call exists with the variable's name as a string.

> attach(df)
> exists("varName")
[1] TRUE

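However, if you don't use attach (and I find you generally don't want to), this simple solution doesn't work. exists looks for an object whose name is the entire string, and there is no object named "df$varName":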

> detach(df)
> exists("df$varName")
[1] FALSE
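
The FALSE above comes from how exists treats its argument: the string is taken as one literal object name, not parsed as an expression. A minimal sketch of that behavior, assuming a toy data frame (the column names and the value 42 are made up for illustration):

```r
# Toy data frame; the column names are made up for illustration.
df <- data.frame(varName = 1:3, other = c("a", "b", "c"))

# exists() searches for an object whose name is the whole string,
# so "$" is just another character in that name.
found_before <- exists("df$varName")   # FALSE: no object has this name

# Create an object that really is called `df$varName`
# (hypothetical, for demonstration only).
assign("df$varName", 42)
found_after <- exists("df$varName")    # TRUE: now such an object exists

# The data frame column is untouched by that assignment.
column_unchanged <- identical(df$varName, 1:3)
```

In other words, exists answers "is there an object with this name?", which is a different question from "does this data frame have this column?".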

Instead of using exists, you can use %in% or any from the base package to determine whether a variable is defined in a data frame:

> "varName" %in% names(df)
[1] TRUE
> any(names(df) == "varName")
[1] TRUE

Or, to determine whether a named column is defined in a matrix, use colnames instead of names (here df holds a matrix rather than a data frame):

> "varName" %in% colnames(df)
[1] TRUE
> any(colnames(df) == "varName")
[1] TRUE
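
Since colnames also works on data frames, the two checks above can be folded into one small helper. This is only a sketch: has_column, m, and missingVar are hypothetical names introduced here, not part of base R.

```r
# Hypothetical helper: TRUE for each name that is a column of x.
# Works for data frames and for matrices with column names.
has_column <- function(x, name) {
  name %in% colnames(x)
}

df <- data.frame(varName = 1:3, other = c("a", "b", "c"))
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("varName", "other")))

has_column(df, "varName")                  # TRUE
has_column(m, "varName")                   # TRUE
has_column(df, "missingVar")               # FALSE
has_column(m, c("varName", "missingVar"))  # TRUE FALSE

# List the requested names that are not present.
missing_vars <- setdiff(c("varName", "missingVar"), colnames(df))
```

Because %in% is vectorized, the same helper can check several names at once, and setdiff reports which of them are absent.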


Only Load Data If Not Already Open in R

I often find it useful to check, at the beginning of a file, whether a dataset is already loaded into R. This is particularly helpful when I'm dealing with a large file that I don't want to load repeatedly, when several R scripts use the same dataset, or when I'm re-running the same script while making changes to the code.

To check whether an object with that name already exists, we can use the exists function from the base package. Wrapping the read.csv call in an if statement then loads the file only when no object with that name is present.

if(!exists("largeData")) {
    largeData <- read.csv("huge-file.csv",
        header = TRUE)
}
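
To see the guard in action without a large file, here is a small sketch that writes a throwaway CSV and runs the same check twice. The names csv_path, demoData, and load_count are made up for the demonstration, and inherits = FALSE is an optional extra that limits the lookup to the current environment.

```r
# Write a tiny throwaway CSV so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, label = c("a", "b", "c")),
    csv_path, row.names = FALSE)

load_count <- 0
for (i in 1:2) {
    # Only read the file if no object called demoData exists yet.
    if (!exists("demoData", inherits = FALSE)) {
        demoData <- read.csv(csv_path, header = TRUE)
        load_count <- load_count + 1
    }
}

load_count   # 1: the second pass skipped the read
```

Note that exists only checks the name. If the file on disk changes, remove the object with rm(demoData) to force a fresh load.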

You will probably also find it useful to pass the colClasses argument to read.csv or read.table, which helps the file load faster and makes sure each column is read as the type you expect. For example:

if(!exists("largeData")) {
    largeData <- read.csv("huge-file.csv",
        header = TRUE,
        colClasses = c("factor", "integer", "character", "integer", 
            "integer", "character"))
}
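
The effect of colClasses can be checked on a small file. The sketch below uses a made-up three-column CSV (group, count, and zip are hypothetical column names) and then inspects the resulting column types:

```r
# Small throwaway CSV; the columns are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(group = c("a", "b"),
        count = c(10L, 20L),
        zip = c("01234", "02138")),
    csv_path, row.names = FALSE)

# One class per column, in file order.
typed <- read.csv(csv_path,
    header = TRUE,
    colClasses = c("factor", "integer", "character"))

sapply(typed, class)   # column types as read
typed$zip              # read as character, so leading zeros are kept
```

Declaring a column as "character" is a simple way to protect identifiers such as zip codes from being converted to numbers.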

This post is one part of my series on dealing with large datasets.
