Chapter 20

Ten Tips on Working with Packages

In This Chapter

• Finding packages

• Installing and updating packages

• Loading and unloading packages

One of the most attractive features of R is the large collection of third-party packages (collections of functions in a well-defined format) available for it. To get the most out of R, you need to understand where to find additional packages, how to download and install them, and how to use them.

In this chapter, we consolidate some of the things we cover earlier in the book and give you ten tips on working with packages.

Tip: Many other programming languages have concepts that are similar to R packages. Sometimes these are referred to as “libraries.” In R, however, a library is the folder on your hard disk (or USB stick, network, DVD, or whatever you use for permanent storage) where your packages are stored.
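To see where your own library (or libraries) live, you can ask R directly. The following is a minimal sketch that uses only base R functions; the variable names are ours, and the paths you see depend entirely on your installation:

```r
# .libPaths() returns the library folders that R searches, in order
libs <- .libPaths()
print(libs)

# installed.packages() returns a matrix with one row per installed package;
# its "LibPath" column records which library each package lives in
pkgs <- installed.packages()
print(table(pkgs[, "LibPath"]))
```

If you see more than one folder, R typically installs new packages into the first one.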

Poking Around the Nooks and Crannies of CRAN

The Comprehensive R Archive Network (CRAN; http://cran.r-project.org) is a network of web servers around the world where you can find the R source code, R manuals and documentation, and contributed packages.

CRAN isn’t a single website; it’s a collection of web servers, each with an identical copy of all the information on CRAN. Thus, each web server is called a mirror. The idea is that you choose the mirror that is located nearest to where you are, which reduces international or long-distance Internet traffic. You can find a list of CRAN mirrors at http://cran.r-project.org/mirrors.html.

Tip: RGui and RStudio allow you to set the location of your nearest CRAN mirror directly in the application. For example, in the Windows RGui, you can find this option by choosing Packages⇒Set CRAN mirror. In RStudio, you can find this option by choosing Tools⇒Options⇒R⇒CRAN mirror.

Technical Stuff: Regardless of which R editor you use, you can permanently save your preferred CRAN mirror (and other settings) in a special file called .Rprofile, located in the user’s home directory or the R startup directory. For example, to set the Imperial College, UK mirror as your default CRAN mirror, include this line in your .Rprofile:

options(repos = c(CRAN = "http://cran.ma.imperial.ac.uk/"))

For more information, see the appendix of this book.
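If you want to try a mirror before committing it to your startup file, you can set the same option for the current session only. This is just a sketch; the variable names old_repos and current_mirror are ours:

```r
# Remember the current setting so that we can restore it afterward
old_repos <- getOption("repos")

# Same setting as in the startup file, but only for the current session
options(repos = c(CRAN = "http://cran.ma.imperial.ac.uk/"))

# Check which CRAN mirror R will now use
current_mirror <- getOption("repos")[["CRAN"]]
print(current_mirror)

# Put the original setting back
options(repos = old_repos)
```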

Finding Interesting Packages

As of this writing, there are more than 3,000 packages on CRAN. So, finding a package that does something that you want to do may seem difficult.

Fortunately, a handful of volunteer experts have collated some of the most widely used packages into curated lists. These lists are called CRAN task views, and you can view them at http://cran.r-project.org/web/views/. You’ll find task views for empirical finance, statistical genetics, machine learning and statistical learning, and many other fascinating topics.

Each package has its own web page on CRAN. Say, for example, you want to find a package to do high-quality graphics. If you follow the link to the graphics task view, http://cran.r-project.org/web/views/Graphics.html, you may notice a link to the ggplot2 package, http://cran.r-project.org/web/packages/ggplot2/index.html. On the web page for a package, you’ll find a brief summary, a list of the other packages that it depends on, a link to the package website (if such a site exists), and other useful information.
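Much of the information on a package’s CRAN page comes from the package’s DESCRIPTION file, which you can also read from within R once the package is installed. As a small sketch, here we look at the stats package, chosen only because it ships with every R installation:

```r
# packageDescription() reads the DESCRIPTION file of an installed package
desc <- packageDescription("stats")

# A few of the fields that also appear on a package's CRAN page
print(desc$Title)
print(desc$Version)
print(desc$Imports)   # NULL if the package declares no imports
```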

Installing Packages

To install a package, use the install.packages() function. This simple command downloads the package from a specified repository (by default, CRAN) and installs it on your machine:

> install.packages("fortunes")

Note that the argument to install.packages() is a character string. In other words, remember the quotes around the package name!
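In scripts that you run repeatedly, you may not want to download a package every time. One possible approach, sketched here with a hypothetical helper function of our own called install_if_missing(), is to check the list of installed packages first:

```r
# Hypothetical helper: install only the packages that aren't installed yet
install_if_missing <- function(pkgs) {
  installed <- rownames(installed.packages())
  missing_pkgs <- setdiff(pkgs, installed)
  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs)  # needs an Internet connection
  }
  # Return the names of the packages that had to be installed
  invisible(missing_pkgs)
}

# "stats" ships with R, so nothing is downloaded here
to_install <- install_if_missing("stats")
print(length(to_install))
```

To use it with the package from this section, you would call install_if_missing("fortunes").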

Tip: In RGui, as well as in RStudio, you find a menu command to do the same thing:

• In RGui, choose Packages⇒Install package(s).

• In RStudio, choose Tools⇒Install packages.

Loading Packages

To load a package, you use the library() or require() function. Both functions load and attach the package, but they differ in what they return and in how they react when the package isn’t installed:

• library(): Invisibly returns a list of packages that are attached, or stops with an error if the package is not on your machine.

• require(): Returns TRUE if the package was successfully attached; if not, it gives a warning and returns FALSE.

The R documentation notes that require() is designed for use inside other functions, where you can test the return value. In scripts, library() is generally the safer choice, because a missing package stops the script immediately.

So, after you’ve installed the package fortunes, you load it like this:

> library("fortunes")

Note that you don’t need to quote the name of the package in the argument of library(), but it’s good practice to quote it anyway.
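You can see how library() and require() behave with a missing package by asking for one that doesn’t exist. In this sketch, the package name notARealPackage123 is made up for the purpose, and character.only = TRUE tells both functions that the name is stored in a variable:

```r
# A made-up package name that should not exist on your machine
fake_pkg <- "notARealPackage123"

# library() signals an error, which we catch here with tryCatch()
library_result <- tryCatch(
  library(fake_pkg, character.only = TRUE),
  error = function(e) "error"
)

# require() gives a warning (suppressed here) and returns FALSE
require_result <- suppressWarnings(require(fake_pkg, character.only = TRUE))

print(library_result)
print(require_result)
```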

Reading the Package Manual and Vignette

After you’ve installed and loaded a new package, a good starting point is to read the package manual. The package manual is a collection of the documentation for all functions and other objects in the package. You can access this documentation in two ways. The first way is to use the help argument to the library() function, which displays the package description and an index of its help topics:

> library(help=fortunes)

The second way is to find the manual on the package website. If you point your browser window to the CRAN page for the fortunes package (http://cran.r-project.org/web/packages/fortunes/), you’ll notice a link to the manual toward the bottom of the page (http://cran.r-project.org/web/packages/fortunes/fortunes.pdf).

The second approach gives you a PDF document containing the complete package manual.

Some package authors also write a vignette, a document that illustrates how to use the package. A vignette typically shows some examples of how to use the functions and how to get started. The key thing is that a vignette illustrates how to use the package with R code and output, just like this book.
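Not every package has a vignette, so it can help to check what is available first. As a small sketch, the following lists the vignettes of the grid package, which we picked only because it ships with R; on a minimal installation the list may be empty:

```r
# List the vignettes that come with one package
grid_vignettes <- vignette(package = "grid")

# The result contains a matrix with one row per vignette
print(grid_vignettes$results[, c("Item", "Title"), drop = FALSE])
```

The names in the Item column are what you pass to vignette() to open a particular document.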

To read the vignette for the fortunes package, try the following:

> vignette("fortunes")

Updating Packages

Most package authors release improvements to their packages from time to time. To ensure you have the latest version, use update.packages():

> update.packages()

This function connects to CRAN (by default) and checks whether there are updates for all the packages that you’ve installed on your machine. If there are, it asks you whether you want to update each package and then downloads the code and installs the new version.
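If you just want to know which version of a package you currently have, you don’t need an Internet connection. A minimal sketch, again using the stats package because it is always installed:

```r
# packageVersion() reports the version of a package installed on your machine
installed_version <- packageVersion("stats")
print(installed_version)

# Versions can be compared with the ordinary comparison operators
print(installed_version >= "2.0.0")
```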

Tip: If you call update.packages(ask = FALSE), R updates all out-of-date packages in the current applicable library locations without prompting you. You can also tell update.packages() to look at a repository other than CRAN by changing the repos argument. If the repos argument points to a repository on your machine or network (for example, by using a file:// URL), R installs the packages from that location.

Both RGui and RStudio have menu options that allow you to update the packages:

• In RGui, choose Packages⇒Update package(s).

• In RStudio, choose Tools⇒Check for Package Updates.

Both applications allow you to graphically select packages to update.

Unloading Packages

When you load a package with library(), R first loads the package into memory and then attaches it to your search path, which you can think of as an internal database that tells R where to find functions and objects. Whenever R evaluates a variable (or function), it tries to find that variable (or function) in the search path. To list the packages that are attached to the search path, use the search() function:

> search()

To remove a package from the search path, use detach(). The argument to detach() is the name of the package, preceded by package:, like this:

> detach(package:fortunes, unload=TRUE)

Note that you need to specify the argument unload=TRUE; otherwise, R removes the package from the search path but doesn’t unload it.

Technical Stuff: When you specify the argument unload=TRUE, R attempts to unload the package from memory. This is only an attempt, and it can fail for many reasons. For example, if another loaded package depends on the one you’re unloading, the unload fails. If you really want to be sure that a package is no longer loaded, your best option is to simply start a new R session.
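You can watch the search path change as you attach and detach a package. This sketch uses the splines package, an arbitrary choice that ships with R but isn’t attached by default; the variable names are ours:

```r
# Attach a package that ships with R but isn't attached by default
library("splines")
attached_before <- "package:splines" %in% search()

# Detach it and ask R to unload it as well
detach("package:splines", unload = TRUE)
attached_after <- "package:splines" %in% search()

print(attached_before)
print(attached_after)

# loadedNamespaces() shows what is still in memory, attached or not
print("splines" %in% loadedNamespaces())
```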

Technical Stuff: Because the authors of R packages work independently, it’s entirely possible for different authors to use the same function names in different packages. If this happens, the package that was loaded last masks functions with the same name in packages that were loaded earlier. R gives you a message saying which objects were masked from other packages the moment this happens.
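When a function is masked, you can still reach a specific version by prefixing it with the package name and the :: operator. The sketch below imitates masking without installing anything: instead of loading a second package, we define our own sd() in the workspace, which hides the sd() function of the stats package in the same way:

```r
# Define our own sd() in the workspace; it now masks stats::sd()
sd <- function(x) "this is not the standard deviation"
masked_result <- sd(c(1, 2, 3))

# The :: operator reaches the original function inside the stats package
original_result <- stats::sd(c(1, 2, 3))

print(masked_result)
print(original_result)

# Remove our version so that sd() refers to stats::sd() again
rm(sd)
```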

Forging Ahead with R-Forge

Although not universally true, packages on CRAN tend to have some minimum level of maturity. For example, to be accepted by CRAN, a package needs to meet a minimum set of requirements.

So, where do packages live that are in the development cycle? Quite often, they live at R-Forge (http://r-forge.r-project.org/). R-Forge gives developers a platform to develop and test their R packages. For example, R-Forge offers

• A build and check system on the three main operating systems (Windows, Linux, and Mac)

• Version control

• Bug-report systems

• Backup and administration

To install a package from R-Forge, you also use the install.packages() function, but you have to specify the repos argument. For example, to install the development version of the package data.table, try the following:

> install.packages("data.table", repos="http://R-Forge.R-project.org")

R-Forge is not the only development repository. Other repositories include

• rforge.net: Available at www.rforge.net. This site is not related (other than in spirit) to R-Forge.

• omegahat: Available at www.omegahat.org.

Conducting Installations from BioConductor

BioConductor is a repository of R packages and software that specializes in the analysis of genomic and related data.

BioConductor has its own set of rules for developers and its own installation procedure. To install a package from BioConductor, you first have to source a script from its server:

> source("http://bioconductor.org/biocLite.R")

Then you can use the biocLite() function to install packages from BioConductor. If you don’t give an argument, you just install the necessary base packages from the BioConductor project. You can find all the information you need at www.bioconductor.org.

Technical Stuff: BioConductor makes extensive use of object-oriented programming with S4 classes. Object orientation and its implementation as S4 classes is an advanced R topic, one we don’t discuss in this book.

Reading the R Manual

The “R Installation and Administration” manual (http://cran.r-project.org/doc/manuals/R-admin.html) is a comprehensive guide to the installation and administration of R. Chapter 6 of this manual contains all the information you need about working with packages. You can find it at http://cran.r-project.org/doc/manuals/R-admin.html#Add_002don-packages.
