8 File handling operations in R

In chapter 9 we have already learned about reading and writing data from/to files. In this section, we will learn about some other functions that are useful while reading and writing data, such as - changing directory, creating a file, renaming a file, check the existence of the file, listing all files in the working directory, copying files and creating directories.

8.1 Handling files

8.1.1 Creating a file within R, using file.create()

Using file.create() function, we can create a new file from console. If the file already exists it truncates. The function returns a TRUE logical value if file is created otherwise, returns FALSE.

Example - The following command will create a blank text file in the current working directory.

file.create("my_new_text_file.txt")
## [1] TRUE

8.1.2 Checking whether a file exists, using file.exists()

Similar to above, we can check whether a file with given name exists, using function file.exists(). Example-

file.exists("my_new_text_file.txt")
## [1] TRUE

8.1.3 Renaming file with file.rename()

The file name can be changed within the R console using, function file.rename(). Basic syntax is file.rename(from = "old_name", to = "new_name"). The function will return TRUE or FALSE depending upon the successful execution. See example

file.rename(from = "my_new_text_file.txt", to = "my_renamed_file.csv")
## [1] TRUE
# Check whether old file exists
file.exists("my_new_text_file.txt")
## [1] FALSE

8.1.4 Copying file with file.copy() function

Using file.copy(from = "old_path", to = "new_path") syntax files can be copied from one directory to another.

8.1.5 Deleting file with file.remove()

The syntax for function, that removes a file with given name, is also very simple. Example-

file.remove("my_renamed_file.csv")
## [1] TRUE

Check whether the file has been really deleted.

file.exists("my_renamed_file.csv")
## [1] FALSE

8.2 Handling directories

8.2.1 Get/Set path of current working directory using getwd()/ setwd()

We can check/get the path of current working directory (wd in short) as a character vector, using getwd() function.

## [1] "G:/OneDrive - A c GeM CAG of INDIA (1)/new book"

Similarly, using setwd("given\\path\\here") we can change the current working directory.

Two things to be noted here - Either the path is to be given using forward slash / or if backslash \ is used these need to be escaped, using an extra \ as \ is itself an escape character in R.

8.2.2 Create new directory using dir.create() and other operations

A new directory can be created using function dir.create(). Example- the command below will create a new directory named ‘new_dir’ in the current working directory. If TRUE is returned, directory with given name is created.

dir.create("new_dir")

We can check whether any directory named ‘new_dir’ exists in current working directory, using function dir.exists() function. Function will return either TRUE or FALSE.

dir.exists("new_dir")
## [1] TRUE

We can also check all files that exists in current working directory/any other directory using list.files() function.

any(list.files() == 'new_dir')
## [1] TRUE

A given directory can be removed using unlink() function be specifically setting argument recursive to TRUE. Example

unlink("new_dir", recursive = TRUE)
dir.exists("new_dir")
## [1] FALSE

8.3 An important function for opening a dialog box for selecting files and folder

We may use either of choose.dir() or file.choose(), to let the user select directory or file of her/his choice respectively.

Try these in your console

list.files(choose.dir())
file.copy(from = file.choose(), to = "new_name")

8.4 Other useful functions for listing/removing variables

We can list all the variables available in current environment using function ls(). Another function rm() will remove the given variables. So a command like rm(lm()) will remove all the available variables from the environment(Use this will caution as it will erase all the saved variables/data).

8.5 Using save() to save objects/collection of objects

We can save objects using function save() which saves the objects on disk for later usage. The saved objects can be retrieved using load() function. See this example-

h <- hist(Nile)
save(h, file="nile_hist")
rm(h)
any(ls() == 'h')
## [1] FALSE
load("nile_hist")
any(ls() == 'h')
## [1] TRUE

8.6 Working with Projects in RStudio

While working on a project, often a requirement is to keep all scripts, data, results, charts, figures, etc. at a single place. R studio has thus, a concept of working in projects, which associates a specific directory with a specific project and creates a specific file with extension .Rproj, which can reopen the complete scripts/ data/ etc. associated with that project.

To open a new project in Rstudio, click file menu then New Project. Check the screenshot in Figure 8.1.

Creating Projects in R StudioCreating Projects in R Studio

Figure 8.1: Creating Projects in R Studio

After Creating the projects you will notice that a file with an extension .Rproj has been created by R Studio in the selected directory/location of project.

To resume working in the same project directory, either double click the file, or open the project using file menu i.e. file \(->\) Open Project.

In R Studio, we can create a project -

  • In a new directory (first option in Figure 8.1)
  • In an Existing Directory (second option in Figure 8.1)
  • By cloning a Version Control (like Git, etc.) as shown in third option in Figure 8.1.

Working in projects, one for each data analytics projects, has many benefits apart from streamlining the workflow, leading to improved productivity and focus on the analysis tasks, as against working directly using r scripts in RStudio-

  • Can get rid of hassles of setting/finding the working directory after starting any session of RStudio.
  • All the paths are now relative to that project directory. This again saves us from a lot of hassles of finding path to the directory where our scripts and data as well as outputs (data, plots, etc.) are saved.
  • All our code and related outputs are now having a better reproducibility.
  • Each project can have its own R environment, preventing conflicts between packages and dependencies.
  • Working in projects also ensures a consistent setup across different machines or for different team members.
  • This also allows for project-specific settings and configurations, enhancing the user experience.