Hour 19. Package Building

What You’ll Learn in This Hour:

Why you should build R packages

What an R package contains

What you need to include in all the directories and files

Things to consider for maintaining good quality code

How to easily create documentation with roxygen2

How to build a package with devtools

In this hour, we will look at one of the key aspects for professionalizing your code: package building. When you put your code into a package, it helps you to ensure that your code is of a high standard and you are adhering to good practices such as documenting your code. In the next hour, we will look at some further components such as incorporating unit tests, but we will focus here on making sure our code is well written and documented. This is the starting point for high quality, professional code that is easy to share and reuse.

Why Build an R Package?

Most of us don’t think about writing our own packages when we work with R despite the fact that we use other packages on a regular basis, as you have done in the previous hours in this book. We typically start out by writing code in one or more R scripts that contain lots of library/require calls or calls to source at the top of the script. This type of coding can cause us problems for many reasons.

Code written in this way is difficult to share. We have to determine all of the files that we need to run the code and all of the package dependencies. We also have to spend time explaining to our colleagues what the code does and how to use it if we do not document it. It can be difficult to know which version is the latest because we might have slightly different versions stored in different places. What’s more, it can often be difficult to be certain that the code has not been affected by a change we have made.

However, as we know from using other R packages, we can solve many of these challenges. An R package allows us to keep all of our code and documentation in a single place and implement a more formal approach to testing. Building an R package allows us to do the following:

Keep track of versions of our code and easily know whether we are using the same or different versions.

Keep documentation with the code and save time in having to explain how to use functions and the workflow of the code.

Easily provide demo code and examples.

Easily use test frameworks to ensure that any changes to the code do not change the output of the function.

Easily incorporate and call functions written in other languages such as C++.

Overall the advantages of converting our code to be structured as an R package are huge and well worth considering, and as you will see in this hour, it is very simple to do using tools such as devtools and roxygen2.

The Structure of an R Package

As you know, R packages contain various components and objects, including functions and documentation. You will see the basic structure and components in this hour, and in Hour 20, “Advanced Package Building,” we will look at some of the additional components such as unit tests.

The basic structure of a package contains four components:

A DESCRIPTION file

A NAMESPACE file

An R directory

A man directory
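To make the layout concrete, the following minimal sketch builds the four components by hand in a temporary directory using only base R. The package name demoPkg and the DESCRIPTION values are hypothetical and chosen purely for illustration; in practice you would let a tool create this structure for you.

```r
# Build a bare package skeleton by hand in a temporary directory.
# "demoPkg" is a hypothetical package name used only for illustration.
pkgDir <- file.path(tempdir(), "demoPkg")
dir.create(file.path(pkgDir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkgDir, "man"), showWarnings = FALSE)

# A minimal DESCRIPTION file and an empty NAMESPACE file
writeLines(c("Package: demoPkg",
             "Title: A Demonstration Package",
             "Version: 0.1",
             "Description: Shows the basic package layout.",
             "License: GPL-2"),
           file.path(pkgDir, "DESCRIPTION"))
file.create(file.path(pkgDir, "NAMESPACE"))

# List the components that now exist at the top level of the package
components <- list.files(pkgDir)
print(components)
```

Listing the directory should show the DESCRIPTION and NAMESPACE files alongside the R and man directories.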

We will look at all these components in turn, but before we do we will cover how to create the correct package structure—in particular, how to set up a package for working with RStudio.

Creating the Package Structure

Traditionally, we created the package structure by using a function called package.skeleton. Although we can still use this function, it is much better to use the create function in the package devtools. The devtools package has been created to simplify the package-building process by wrapping up functionality such as creating and building packages.

Tip: Creating a Package Project

In RStudio, you may also create an R package from the project menu in the top-right corner. By selecting New Project > New Directory > R Package, you will be given a menu that allows you to give the package name as well as the location for the package on your file system, and you can optionally select existing R files that will be included in the package.

The purpose of the create function is to set up the basic structure of an R package. As you will see later in this hour, it has been designed around a workflow whereby we add our own R code separately and document packages using roxygen2. As an example, as stated earlier, an R package requires a man directory. This will not be created when we run create but will instead be created when we generate our documentation.

To create the package structure, we simply give the name of the package by defining the file path to where the package directory should be created. Here’s an example:

> create("../simTools", rstudio = TRUE)
Creating package simTools in .
No DESCRIPTION found. Creating with values:

Package: simTools
Title: What the package does (one line)
Version: 0.1
Authors@R: "First Last <[email protected]> [aut, cre]"
Description: What the package does (one paragraph)
Depends: R (>= 3.1.2)
License: What license is it under?
LazyData: true
Adding RStudio project file to simTools

You will see here that we have specified that the package structure should be created in a directory called simTools. Although it is not strictly necessary, it is good practice to give the directory the same name as your final package. You will also see in this code that a default DESCRIPTION file has been created that includes this package name. We will return to this shortly, but for now it is sufficient to note that a default set of values has been provided to this file.

You may also notice in the preceding code we have set an option called rstudio. If you are working in RStudio, you may find that this is a handy feature because it creates an RStudio package project. You can then open this from the projects menu by selecting Open Project and then navigating to and selecting the .Rproj file created. This is in fact the default behavior of this function. If you don’t want to create an RStudio project you will need to set this option to FALSE.

Having run create, or using the project menu, you will now have a directory at the specified location that contains the directories and files listed (with the exception of the man directory). We will look at each of these in turn in the following sections.

Tip: Additional Package Files

Having used create or the project menu system, you may notice that some hidden files have been created. You will need to have your explorer window set up to show hidden files, which include .gitignore and .Rbuildignore. These files allow us to include files within our package locally but stop git and/or the R build process from using these files. By default, the .Rproj files will be listed in these files.
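Each line of .Rbuildignore is a regular expression that is matched against file paths relative to the top of the package. The sketch below mimics that matching in base R. The two patterns are an assumption about what such a file might contain for an RStudio project, and isIgnored is a hypothetical helper, not part of R or devtools.

```r
# Patterns of the kind that might appear in .Rbuildignore (assumed here)
ignorePatterns <- c("^.*\\.Rproj$", "^\\.Rproj\\.user$")

# Hypothetical helper: TRUE if any pattern matches the given path
isIgnored <- function(path, patterns) {
  any(vapply(patterns, grepl, logical(1), x = path,
             perl = TRUE, ignore.case = TRUE))
}

# Paths relative to the top-level package directory
files <- c("simTools.Rproj", ".Rproj.user", "DESCRIPTION", "R/sampleFromData.R")
ignored <- vapply(files, isIgnored, logical(1), patterns = ignorePatterns)
print(ignored)
```

Only the two RStudio project entries match, so the DESCRIPTION file and the R script would still be included in the built package.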

The DESCRIPTION File

The first file in an R package is the DESCRIPTION file. This file is used to list important package information, including the authors and the current maintainer of the package, the version number, and the license for the package. It is in this file that we also specify any package dependencies.

You will have noticed when we ran the create function that a DESCRIPTION file was being created with certain default values. We can actually specify options for devtools to automatically populate some of these fields for us, but for occasional packages it is simple enough to update the file. An example of a DESCRIPTION file for the simTools package for which we created the structure is given in Listing 19.1.

LISTING 19.1 Example of a DESCRIPTION File

Package: simTools
Title: Simulation Analysis Tools
Version: 1.0-0
Authors@R: c(
   person("Aimee", "Gott", email = "[email protected]", role = c("aut", "cre")),
   person("Andy", "Nicholls", email = "[email protected]", role = "aut"),
   person("Rich", "Pugh", email = "[email protected]", role = "ctb")
  )
Description: A series of tools for simulation analysis used for learning about
  distributions.
Depends:
    R (>= 3.1.2)
Imports:
    ggplot2 (>= 1.0.0)
License: GPL-2
LazyData: true
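Because the DESCRIPTION file is a plain text file of field and value pairs, it can be read back into R. The following sketch writes a cut-down version of Listing 19.1 to a temporary file and reads it with the base function read.dcf. The temporary file and the object names are assumptions made for illustration.

```r
# Write a cut-down DESCRIPTION to a temporary file (values from Listing 19.1)
descFile <- tempfile("DESCRIPTION")
writeLines(c("Package: simTools",
             "Title: Simulation Analysis Tools",
             "Version: 1.0-0",
             "Imports: ggplot2 (>= 1.0.0)",
             "License: GPL-2"),
           descFile)

# read.dcf returns a character matrix with one column per requested field
desc <- read.dcf(descFile, fields = c("Package", "Version", "Imports"))
print(desc)

# Version numbers are compared component by component, not as plain text
pkgVersion <- package_version(desc[1, "Version"])
isNewer <- pkgVersion > package_version("0.1")
print(isNewer)
```

Comparing versions as package_version objects rather than as strings is what makes it easy to tell whether two copies of a package are the same release.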

Tip: Package License

Note that the default License is the relatively open GPL-2, the same license as R itself. There are several standard licenses for R packages that are listed on the R-Project website, https://www.r-project.org/Licenses/, although it is not necessary to apply one of these licenses. Licenses should be chosen carefully as they describe what others can do with your code.

The NAMESPACE File

The NAMESPACE file is now a compulsory file when you develop a package. It allows us to specify which functions in our package will be “exported” so that the end user can see them. This is useful if we want to have some utility functions that we want to use in our code but we don’t want the end user to see them. It also allows us to import namespaces from other packages (that is, make the user-visible functions in another package available to our package). We will return to this topic later in this hour because it is possible to allow the “roxygen” headers we add to our functions to handle this for us.
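As a rough illustration, the sketch below writes a two-line NAMESPACE file of the kind that might be generated for our package and reads it back with the base function parseNamespaceFile. The directory name demoPkg and the importFrom directive are assumptions for this example only; the real file for simTools would depend on the headers we write.

```r
# Create a throwaway package directory that contains only a NAMESPACE file.
# "demoPkg" is a hypothetical name used for illustration.
libDir <- tempfile("lib")
dir.create(file.path(libDir, "demoPkg"), recursive = TRUE)
writeLines(c("export(sampleFromData)",      # function visible to the end user
             "importFrom(stats, rnorm)"),   # one function taken from stats
           file.path(libDir, "demoPkg", "NAMESPACE"))

# parseNamespaceFile reads the directives into a list
ns <- parseNamespaceFile("demoPkg", package.lib = libDir)
print(ns$exports)
```

Any function that is defined in the package but not named in an export directive remains internal to the package.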

The R Directory

The R directory is where all our R functions will be stored. When we have simply used the create function, this directory will be empty and we can start to add R script files (that is, files ending in “.R”). You could add all your functions in a single file, though it is good practice to include multiple R scripts for individual groups of functionality. It is worth noting, however, that you will often see a file called utils.R. This is typically where short utility functions (of just a couple lines) that are not intended to be used by the end user are stored.

For our sample package, we will create a function called sampleFromData. The code for this function can be seen in Listing 19.2. This code should be contained in an R script in the R directory.

LISTING 19.2 R Function for the simTools Package

sampleFromData <- function(data, size, replace = TRUE, ...){
  # Validate the requested sample size
  if (!is.numeric(size)) {
    stop("Size must be a numeric integer value")
  }

  lengthData <- nrow(data)

  # Use the scalar operator && inside an if condition
  if (!replace && size > lengthData){
    stop("Cannot sample greater than the data size without replacement")
  }

  # Sample a number of rows from the given dataset
  samples <- sample(seq_len(lengthData), size = size, replace = replace, ...)
  # drop = FALSE keeps the result the same type as the input
  invisible(data[samples, , drop = FALSE])

}

The man Directory

The man directory is where we store all the files that contain the user documentation for the functions in our package. We can, and should, create help files for all functions in a package. We must document any exported functions, i.e. functions that an end user will see.

Although you will be familiar with the HTML format of help files from running ?mean, for instance, this is not the way in which we write help files. They are written in a TeX-like format and saved in files ending with an .Rd extension. We need to generate one file for each of the functions and the package itself. Generating these files can be quite time consuming, and it is easy to forget to update the files if you make changes to the function itself. For these reasons we will instead generate the documentation using a package called roxygen2. We will return to this topic later in the hour.

Code Quality

When it comes to putting our code into a package, the quality of the code is of huge importance. Typically code in a package will be shared, will be returned to later, or is collecting together a large amount of functionality—or all these things. As such, it is vital that we think about the quality of our code.

Code quality doesn’t just refer to whether the code works or not, but relates to the styling, documentation, and usability of the code. All these can be taken into account by following some guidelines for writing code. At Mango, code quality is vital, and since there is typically more than one developer working on the code at a time, using a consistent style makes it much easier to work on the code in a collaborative manner. We have introduced many good coding practices throughout this book, and if you follow these practices you will be well on your way to high-quality, well-written code. Although we suggest some guidelines for styling in this section, you do not need to follow these guidelines specifically. However, we recommend that you decide on a consistent way to style your code and stick with it.

As mentioned, all of the R code for our packages is stored in the R directory in a series of files. These files should have descriptive names that help you to identify the contents when you return to the code. Also, they should all take the file extension “.R” (note the capitalization). The functions and objects referenced in these files should be named in a way that helps to inform the user of their purpose. A consistent means of naming the objects should be used. A popular convention, and one that is used at Mango, is lowerCamelCase, where each new word is capitalized.

In terms of the documentation, all functions should have a “roxygen” header, which will be discussed further in the next section. The code itself should be well commented to clarify its purpose, with comments for roughly every 10 lines of code.

When it comes to the layout of the code, it is considered a good practice to indent and space the code in a consistent manner. It is typical to include spaces after operators such as + or * as well as after a comma. It is convention to indent code inside a function call as well as inside for loops and if/else structures. We recommend two spaces for each indentation.
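The short sketch below applies these conventions: lowerCamelCase names, spaces around operators and after commas, and two-space indentation. The function calculateMeanDifference is hypothetical and exists only to show the layout.

```r
# Difference between the means of two numeric samples
calculateMeanDifference <- function(firstSample, secondSample) {
  # Guard against empty inputs before doing any arithmetic
  if (length(firstSample) == 0 || length(secondSample) == 0) {
    stop("Both samples must contain at least one value")
  }

  mean(firstSample) - mean(secondSample)
}

meanDifference <- calculateMeanDifference(c(2, 4), c(1, 1))
print(meanDifference)
```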

In addition to the styling of our code and the coding practices we have discussed, such as not appending in a for loop, we should also consider what our code does to the R session. It is considered bad practice to do anything inside a function that changes the environment in any way, including the assignment of objects and changing options or settings. If there is a need to make a change (for instance, if you need to change the working directory), your function should set it back to the original value before exiting.
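One common way to restore a setting is to register the clean-up with on.exit as soon as the change is made, so that it runs even if the function stops with an error. The following is a minimal sketch of that idea; listFilesIn is a hypothetical function name.

```r
# List the files in a directory by temporarily changing the working directory
listFilesIn <- function(dir) {
  oldDir <- setwd(dir)                 # setwd returns the previous directory
  on.exit(setwd(oldDir), add = TRUE)   # restore it however the function exits
  list.files()
}

before <- getwd()
tempFiles <- listFilesIn(tempdir())
after <- getwd()

# The working directory is unchanged once the function has returned
print(identical(before, after))
```

The same pattern can be used for options, for example by saving the value returned by options() and passing it back to options() inside on.exit.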

Automated Documentation with roxygen2

To the end user, the most important part of your package is the documentation. A package that is well documented is much easier for someone to pick up and work with, and it’s much easier to return to when you need to update or change the functionality in the future.

Package documentation can take many forms, though the most widely used, and the aspect we will focus on here, is the function help files. We can also write user guides, known as vignettes, which we will look at in Hour 20.

From reading help files for other functions, you will be familiar with the format of this documentation. Function help files list all the arguments and they detail the purpose and usage of each. We can also add information about the output of each function, additional details about the function, who wrote the function, and so on.

We are going to generate the documentation using roxygen headers. These headers go above the function to which they refer. This makes it much simpler to produce the documentation because we can do it alongside the function development. It is also easier to update if we make a change to the function because the header is there while we are working on the function.

Tip: Document as You Write

As you will see, it is very simple to create the roxygen headers for functions. As such, it is a good habit to write them even if you are not thinking of putting your functions into a package. This means that the code is well documented and easy for you or others to work with. It also means that if you do decide to turn the code into a package, it is already documented, so you don’t have to go back and do so.

Function Headers

We include a roxygen header above the function definition. Each line of the roxygen header starts with the symbols #'. This allows R to treat the lines as comments, but they will be recognized by roxygen as function headers. Following this we use special tags to indicate a particular component of the help file. Some tags and their uses are shown in Table 19.1.

TABLE 19.1 roxygen2 Header Tags

Some components do not need their tags explicitly written out because the first three paragraphs without tags are treated in a special way. The first three paragraphs are as follows:

1. The title of the help page (short, one sentence)

2. The description for the help page (brief description of the function)

3. The details section, which can provide much more information about the function, what it implements, and so on

For including special formatting we can use LaTeX formatting components. If you are not familiar with LaTeX, this won’t impact your ability to write documentation unless you need to include mathematical formulas. The main thing to point out is usage of %. In LaTeX the % symbol indicates a comment, so we actually need to use \% if we don’t want everything after it to be treated as a comment.
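As a small illustration of these points, the sketch below documents a hypothetical function, toPercent, that is not part of simTools. The first three untagged paragraphs supply the title, description, and details, and the details paragraph uses \code{} for code formatting and \% for a literal percent sign.

```r
#' Convert proportions to percentages
#'
#' Multiplies a vector of proportions by 100.
#'
#' For example, a value of \code{0.25} is returned as 25, which would be
#' read as 25\%. No rounding is applied.
#'
#' @param x A numeric vector of proportions.
#'
#' @return A numeric vector of the same length as \code{x}.
#'
#' @export
toPercent <- function(x) {
  x * 100
}

percentValue <- toPercent(0.25)
print(percentValue)
```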

Listing 19.3 shows how this might look for a sample function in the simTools package we created earlier. Notice that, although we have not included the complete function definition again, this header goes directly above the function definition, in this case the one given in Listing 19.2.

LISTING 19.3 Roxygen Header for the sampleFromData Function

1: #' Sample from a dataset
2: #'
3: #' This function has been designed to sample from the rows of a two
4: #' dimensional data set returning all columns of the sampled rows.
5: #'
6: #' @param data The matrix or data.frame from which rows are to be
7: #' sampled.
8: #' @param size The number of samples to take.
9: #' @param replace Should values be replaced? By default takes the
10: #' value TRUE.
11: #' @param ... Any other parameters to be passed to the sample
12: #' function.
13: #'
14: #' @return Returns a dataset of the same type as the input data with
15: #' \code{size} rows.
16: #'
17: #' @author Aimee Gott <agott@@mango-solutions.com>
18: #'
19: #' @export
20: #' @examples
21: #' sampleFromData(airquality, 100)
22: #'
23: sampleFromData <- function(data, size, replace = TRUE, ...){

One of the key tags, which you can see here on line 19, is @export. This tag is what makes this function visible to the end user. When we generate the documentation, the NAMESPACE file will be automatically updated to indicate that it will be exported, meaning that we do not need to manually generate the NAMESPACE file. There are similar tags, @import and @importFrom, that allow us to specify functions or packages that we need to make available to run our functions.

Other tags to note include @param, which can be seen on lines 6, 8, 9, and 11. This tag is used to identify the arguments of the function. Notice that following the tag we give the name of the argument, and after a space the text that describes that particular argument. As you can see, the text can span multiple lines, and text is treated as belonging to the last tag until another, new tag is encountered.

You may also notice that in giving an email address in line 17 we have used @@. This is due to the fact that the @ symbol is used before a tag, so we need to indicate that we really want an @ symbol by duplication of the symbol.
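The @importFrom tag mentioned above takes the name of a package followed by the functions we need from it. The sketch below shows how this might look for a hypothetical function, simulateMeans, that relies on rnorm from the stats package; neither the function nor its header is taken from simTools.

```r
#' Simulate sample means from a standard normal distribution
#'
#' Draws \code{nSim} samples of a given size and returns the mean of each.
#'
#' @param nSim The number of simulated samples.
#' @param size The number of values in each sample.
#'
#' @return A numeric vector of length \code{nSim}.
#'
#' @importFrom stats rnorm
#' @export
simulateMeans <- function(nSim, size) {
  vapply(seq_len(nSim), function(i) mean(rnorm(size)), numeric(1))
}

simulatedMeans <- simulateMeans(5, 10)
print(length(simulatedMeans))
```

When the documentation is generated, a header like this would be expected to add an importFrom directive to the NAMESPACE file in addition to the export.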

Documenting the Package

In addition to documenting our functions using roxygen2, we can also document the package itself. Obviously in this case we do not have a function to put the header above. The typical approach to this documentation is to create a single file named with the package name. In the example we have used in this hour, that would be a file named simTools.R. The header itself is then contained above the statement NULL or NA.

An example of package documentation for the example we have used in this hour is given in Listing 19.4. Just like with the function documentation, the first line is the title of the help page, and the second is the description text. We can also include tags such as @author, @examples, and even @references, as we would in function headers.

LISTING 19.4 Roxygen Header for the simTools Package

1: #' A package for performing common simulation tasks
2: #'
3: #' This package provides a series of tools for common simulation tasks such as
4: #' sampling from a data frame and generating plots of simulation experiments.
5: #'
6: #' @author Aimee Gott \email{agott@@mango-solutions.com}
7: #' @docType package
8: #' @name simTools
9: NULL

The main difference is that we need to include the tags @docType and @name. For the first of these tags, we identify that the specific documentation is for a package. You can see this in line 7 of the example in Listing 19.4. As you will see in Hour 20, we will also use this tag when documenting other package components such as data. The tag @name is used to label the help document. This is what the user will call to see the help document for the package, and it takes the name of the package itself, as you can see in line 8 of Listing 19.4.

Creating and Updating the Help Pages

Once we have created the headers for all of the functions and for the package, we can generate the Rd files. The function roxygenize, in the roxygen2 package, can be used to do this, but there is also a function available in devtools called document. Both functions work in the same way, but here we will demonstrate the use of document.

As you saw with the function create earlier in this hour, we need only point to the top level of the package directory to generate, or update where it already exists, the package documentation.

> document("../simTools")
Updating simTools documentation
Loading simTools
Writing NAMESPACE
Writing sampleFromData.Rd
Writing simTools.Rd

You can see from the output messages that this updates the NAMESPACE file along with the Rd files for the functions and the package itself. When we’re working with RStudio, it is actually possible to open the Rd files and preview them. After opening an Rd file in RStudio, simply click the Preview button to see the HTML preview in the Help tab. Figure 19.1 shows the preview of the help file defined in Listing 19.3.

FIGURE 19.1 HTML preview of the sampleFromData help page

As part of the package building workflow, this stage should be completed before the build and check stages we will see in the next section. In practice, it is common to cycle around all of these stages multiple times in the process of creating and testing a package.
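One pass through this cycle could be scripted roughly as follows, using the devtools functions document, check, and build covered in this hour. This is only a sketch: the path is the one used in our examples, and the guard is an assumption added so that nothing runs unless devtools is installed and the package directory exists.

```r
# Path to the top level of the package, as used in the examples in this hour
pkgPath <- "../simTools"

# Only run the workflow when devtools and the package directory are available
if (requireNamespace("devtools", quietly = TRUE) && dir.exists(pkgPath)) {
  devtools::document(pkgPath)              # regenerate NAMESPACE and Rd files
  devtools::check(pkgPath)                 # run the package checks
  devtools::build(pkgPath, binary = TRUE)  # build a binary package
}
```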

Tip: Documenting with Projects

As mentioned previously, if we are developing a package as a project in RStudio, we have quick access to a number of build features through the Build tab, which is made available in a package project. This includes the option to generate package documentation. This can be done by either selecting the Document option, typically in the More drop-down menu of the Build tab, or using the keyboard shortcut Ctrl+Shift+D.

Building a Package with devtools

Once we have put together all of the components of our package, whether that is simply R code and help files, as we have seen here, or additional components as we will see in Hour 20, we need to go through the process of preparing the package to be shared and then building it. Traditionally this was entirely done by using a series of command-line tools. We now have an easier way to handle this in the form of the package devtools. The package itself still uses the command-line tools but provides us with a simple, familiar interface to them.

Caution: Building a Package in Windows

In order to build packages in Windows, you will need to have installed RTools. This is an additional component available on CRAN that provides the command-line tools needed for R package development. It’s important to make sure that the version of RTools installed matches your version of R and that the system path has been set up correctly. For details of how to install RTools, see the Appendix, “Installation,” of this book.

Checking

The first thing we should do before building our package to share is to run a series of checks. Before a package can be made available on CRAN, it must pass a series of checks relating to the structure of the package, aspects of the code, the documentation, and even whether the examples run without error. Even if we don’t intend to make a package available on CRAN, it is good practice to run these checks and ensure that our own package passes all of them. We can run these checks in devtools with the function check.

You can see an example of running check and partial output in Listing 19.5. As you can see from the output in line 2, the first thing that check does is run the document function. This ensures that the documentation is up to date because there are a number of documentation-related checks. The package is then built into a source version. This is to ensure that there are no files included in the check that would not be present in the final version of the package. The checks themselves then start from line 20. In the lines shown in Listing 19.5, checks are being run against the DESCRIPTION and NAMESPACE files. In these cases, they pass the checks, which you can see from the OK line ending.

LISTING 19.5 Running the check Function

1: > check("../simTools")
2: Updating simTools documentation
3: Loading simTools
4: Writing NAMESPACE
5: Writing sampleFromData.Rd
6: Writing simTools.Rd
7: "C:/PROGRA~1/R/R-31~1.2/bin/i386/R" --vanilla CMD build
8: "C:\Users\agott\Documents\simTools" --no-manual --no-resave-data
9:
10: * checking for file 'C:\Users\agott\Documents\simTools/DESCRIPTION' ... OK
11: * preparing 'simTools':
12: * checking DESCRIPTION meta-information ... OK
13: * checking for LF line-endings in source and make files
14: * checking for empty or unneeded directories
15: * building 'simTools_1.0-0.tar.gz'
16:
17: "C:/PROGRA~1/R/R-31~1.2/bin/i386/R" --vanilla CMD check
18: "C:\Users\agott\AppData\Local\Temp\RtmpwNk65n/simTools_1.0-0.tar.gz" --timings
19:
20: * using log directory 'C:/Users/agott/AppData/Local/Temp/RtmpwNk65n/simTools.Rcheck'
21: * using R version 3.1.2 (2014-10-31)
22: * using platform: i386-w64-mingw32 (32-bit)
23: * using session charset: ISO8859-1
24: * checking for file 'simTools/DESCRIPTION' ... OK
25: * this is package 'simTools' version '1.0-0'
26: * checking package namespace information ... OK
27: * checking package dependencies ... OK
28: ...

Where there are any issues, they will be raised with an ERROR, WARNING, or NOTE, depending on the severity. You should try to solve all issues that are raised; many can be solved easily, particularly those that relate to inaccurate documentation. However, although it is very important to resolve any ERRORs that are raised, it is less important for WARNINGs and NOTEs if you are not going to share your code, or at least not going to make it widely available or available on CRAN. For packages to be used in production code, we would recommend that you strive to resolve, or at least understand, all issues that are raised by the checks.

This check function can be repeatedly re-run until you are satisfied with the output and ready to build the package.

Building

We are now at a point where we can build the package. We do this using the build function in devtools. When building the package, we need to consider the type of package we want or need to create. We can either generate a source package or a binary package. A source package contains the source files for the code, whereas the binary versions have been compiled for either the Windows or OS X operating system. If you plan to share your code with other Windows (or OS X) users, you will typically want to create the binary package.

The only difference if we want to create the binary version of the package is that we set the value of the argument binary to be TRUE. An example of running the build function, along with the output generated, is shown in Listing 19.6.

LISTING 19.6 Building the Package

1: > build("../simTools", binary = TRUE)
2: "C:/PROGRA~1/R/R-31~1.2/bin/i386/R" --vanilla CMD INSTALL
3: "C:\Users\agott\Documents\simTools" --build
4: * installing to library 'C:/Users/agott/AppData/Local/Temp/RtmpwNk65n/file105078613584'
5: * installing *source* package 'simTools' ...
6: ** R
7: ** preparing package for lazy loading
8: ** help
9: *** installing help indices
10: ** building package indices
11: ** testing if installed package can be loaded
12: *** arch - i386
13: *** arch - x64
14: * MD5 sums
15: packaged installation of 'simTools' as simTools_1.0-0.zip
16: * DONE (simTools)
17: [1] "C:/Users/agott/Documents/simTools_1.0-0.zip"

You can see from this example that when we generate the binary version of the package, it is first installed and then packaged up in the installed format. The package name and version number are taken from the DESCRIPTION file values that we set previously, so we do not need to separately inform the build function of these values. Because we have built a Windows binary package, you will notice on lines 15 and 17 that the package has the file extension .zip. If we had instead built a source package, it would have had the extension .tar.gz.
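The naming convention can be sketched in a couple of lines of base R. The values below are copied from the DESCRIPTION file in Listing 19.1, and the object names are assumptions for illustration.

```r
# Package name and version as given in the DESCRIPTION file
pkgName <- "simTools"
pkgVersion <- "1.0-0"

# A built package is named <Package>_<Version> plus a type-specific extension
sourceFile <- paste0(pkgName, "_", pkgVersion, ".tar.gz")   # source package
windowsBinary <- paste0(pkgName, "_", pkgVersion, ".zip")   # Windows binary

print(c(sourceFile, windowsBinary))
```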

Installing

After we have built our package, whether that is in the form of a binary package or a source package, we are then ready to install it. The package that you have built is in the same format as any other package you would install, and as such can be installed, loaded, and used in the same way, as you can see below:

> install.packages("../simTools_1.0-0.zip", repos = NULL)
Installing package into 'C:/Users/agott/Documents/R/win-library/3.1'
(as 'lib' is unspecified)
package 'simTools' successfully unpacked and MD5 sums checked
> library(simTools)
> simDat <- sampleFromData(airquality, 2)
> simDat
   Ozone Solar.R Wind Temp Month Day
58    NA      47 10.3   73     6  27
36    NA     220  8.6   85     6   5
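If we had built a source package instead, the installation would look much the same. The sketch below assumes that a file called simTools_1.0-0.tar.gz sits one level above the working directory, and it only attempts the installation when that file is present.

```r
# Assumed location of the source package built from simTools
sourceFile <- "../simTools_1.0-0.tar.gz"

if (file.exists(sourceFile)) {
  # repos = NULL tells R to install from the local file, not a repository
  install.packages(sourceFile, repos = NULL, type = "source")
  library(simTools)

  # Open the package index to browse the generated help pages
  help(package = "simTools")
}
```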

Summary

In this hour, we have looked at the basic components required to create a simple R package. We have introduced some of the good practices for package development, including considerations around the code itself as well as how we can provide useful documentation components. We have looked at what is required to build a package and how to build one. In the next hour, we will discuss how to add further components to our packages to make them more production ready, including unit tests and user guides.

Q&A

Q. I use another package in my code. What do I need to do to make sure it is available for my package?

A. When it comes to dependencies of your code, you can list them in one of a number of ways. A package is typically listed under Depends, Imports, Suggests, or LinkingTo. You use LinkingTo to specify that your package requires the C code of another package. A package listed as Suggests is one that is needed to run unit tests or examples, or for only very specific functionality as an option in maybe only one function in your package. Any package that contains functions required for the running of your package should be listed in either Depends or Imports. It is now best practice to use only the Imports field, although there are some occasions when Depends is still needed; hence, it is still available.
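For a package listed under Suggests, a common pattern is to check that it is installed at the point of use and to call its functions with the :: operator. The following is a sketch under the assumption that ggplot2 is only a suggested package; plotSample is a hypothetical function and is not part of simTools.

```r
# Histogram of a column named "value", using ggplot2 only if it is installed
plotSample <- function(data) {
  if (!requireNamespace("ggplot2", quietly = TRUE)) {
    stop("Package 'ggplot2' is required for plotSample; please install it")
  }

  # Functions from the suggested package are called with the :: operator
  ggplot2::ggplot(data, ggplot2::aes(x = value)) +
    ggplot2::geom_histogram(bins = 10)
}

# Returns a plot object if ggplot2 is available, otherwise a clear error
result <- tryCatch(plotSample(data.frame(value = rnorm(20))),
                   error = function(e) e)
print(class(result))
```

With this approach a user who never calls plotSample does not need ggplot2 installed at all.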

Q. Who should be listed as an author of a package?

A. This is entirely up to you. Typically an author has made substantial contributions to a package, whereas a contributor has made only a small contribution, such as a bug fix. The one role to consider with care is who is listed as the creator or maintainer (cre) of the package. This is the person who can be contacted by the R Core team or by users of the package. It is important that a single person is named in this role and that an email address is provided that can be used to contact the maintainer.

Q. I am just writing a couple of functions. Should I create a package from them?

A. When you are getting started with package building, creating a small package first is a good way to learn. More generally, even if you never build the package or share it further, following the practices in this chapter and organizing your code in this way makes it much easier to work with, and it’s easy to create a package from it if you need to later.

Q. Can I use roxygen headers even if I am not creating a package?

A. Yes, and we would strongly recommend that you do. Documenting functions you write in this way makes them much easier to work with and return to, as well as to convert into a package at a later date.
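
As a minimal sketch of what this can look like in an ordinary script, the example below documents a small function with a roxygen header. The function scaleToUnit is hypothetical and exists only to show the header; R treats the header lines as comments, so the script runs as normal whether or not roxygen2 is installed.

```r
#' Rescale a numeric vector to the range 0 to 1
#'
#' Subtracts the minimum of \code{x} and divides by its range.
#'
#' @param x A numeric vector.
#' @param na.rm Should missing values be ignored when finding the range?
#' @return A numeric vector the same length as \code{x}.
#' @examples
#' scaleToUnit(c(2, 4, 6))
scaleToUnit <- function(x, na.rm = TRUE) {
  rng <- range(x, na.rm = na.rm)
  (x - rng[1]) / (rng[2] - rng[1])
}

scaleToUnit(c(2, 4, 6))
```

If the function later moves into a package, the same header can be used to generate its help page.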

Workshop

The workshop contains quiz questions and exercises to help you solidify your understanding of the material covered. Try to answer all questions before looking at the “Answers” section that follows.

Quiz

1. What are the minimum required components for an R package?

2. How can you generate help documentation for functions?

3. What extra tags do you need to document a package?

4. What is the difference between a source package and a binary package?

5. If you don’t plan to make a package available on CRAN, do you need to ensure that all of the checks pass?

6. How do you install a package that you have developed?

Answers

1. At a minimum, you require the directories man and R and the files NAMESPACE and DESCRIPTION.

2. You can generate documentation for functions by including roxygen headers in the function R scripts. You use special tags that start with the @ symbol to document components of the function.

3. For the overall package documentation, you need to include the additional tags @docType and @name. The @name tag should give the package name, which is what the user will call to access the help file. The @docType tag simply needs to be set to package.

4. A source package contains all the source code for the package but excludes the additional files that may be included in the package as you develop, such as RStudio project files. The binary package is the packaged-up version for a specific operating system such as Windows or OS X.

5. Although it is not a requirement to run the checks if you are not submitting to CRAN, it is good practice to do so. It is particularly recommended if you will be sharing your code with others or if it is intended to be used in production code. A package that passes the checks is generally considered to be of a higher quality than a package that does not.

6. You install a package that you have developed in the same way that you would install any other package you have been provided in source or binary format. Take a look back at Hour 2, “The R Environment,” for a reminder on how to do this.
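
To illustrate answer 3, a package-level documentation block using the @docType and @name tags might look something like the sketch below. The package name summaryTools, the title, and the description text are placeholders. Because the header does not document a function, it is attached to NULL. Newer versions of roxygen2 may recommend a different convention for package-level documentation, so check the documentation for the version you have installed.

```r
#' summaryTools: Tools for summarising numeric data
#'
#' A short description of what the package provides and where to start.
#' (Placeholder text for a hypothetical package.)
#'
#' @docType package
#' @name summaryTools
NULL
```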

Activities

1. Use devtools to create a skeleton package for a package called summaryTools.

2. Add an R function called numericSummary in the appropriate location. This function should take two arguments: a numeric vector and the argument na.rm. The function should call a helper function that generates numeric summaries, including the mean and standard deviation. It should also call a helper function that returns the number and proportion of missing values. The numericSummary function should return all this information in a suitable format.

3. Use roxygen2 to create headers to document all three of the functions you have just written. Choose carefully which of these functions need to be visible to the end user.

4. Update the DESCRIPTION file and all other package documentation.

5. Build and check your package. Once you have resolved any issues raised by the check and have rebuilt the package, install it and then try calling your function.
