Hour 20. Advanced Package Building


What You’ll Learn in This Hour:

What you can do to extend an R package

Why testing is important and how to use testthat

How to include datasets in a package

How to include a user guide in a package

What you need to do to use C++ code in a package

In the last hour, you saw how to put all of your code together in the form of a package to simplify the sharing and maintenance of code, as well as to aid in the development of high-quality, production-ready code. There are, however, a number of ways you can extend a package to make it more robust to changes and easier for users to get started with. You will see the most common of these extra components in this hour.

Extending R Packages

We have now managed to create a package that contains all the functions we need and even contains the help files for those functions—so why do we need to add more? Surely this is sufficient. In many respects, this is true. We can simply share our package as it is with no need to do anything more, but there are many advantages to the extensions you will see in this hour.

The first additional component we will cover is a test framework. As you have seen throughout this book, once we have code we may want to update it, whether to make it more efficient, to fix bugs as we find them, or to add new features. A test framework is vital here for ensuring that we do not introduce new errors into our code or reintroduce issues we have already resolved.

There are many instances when we may need to share data with our end users. The data may be simply for examples, it may be data relevant to the field that we want to share, or it may be reference data required by functions in the code. This last case is particularly common in the development of code for analytics. Whatever the reason for sharing the data, we can incorporate it all in our package so there is no need to send out data separately from the package we have developed.

The next component we’ll implement is the user guide. Whether you are just sharing code with colleagues or you plan to share widely with the R community, the end users of your package are going to need to know how to use it. The individual function help files will help users with questions of “How do I use this function?” and “What are all the options for this function?” However, they will not typically help with the overall workflow of your package. A user guide is aimed at helping users get started with a general workflow for your package. As with data, we have often written such a guide anyway and intended simply to email it to the people who need it, but incorporating it in the package ensures that it is up to date and always available to end users.

The final additional component we cover in this hour is C++ code, or more specifically, code we have written with Rcpp. This is not going to be a component that you will include in every package you write, but as you saw in Hour 18, “Code Efficiency,” you may have chosen to incorporate such code into a function for efficiency, so you need to know how to include such code in an R package.

As you can probably see, inclusion of these two components, data and C++ code, will be dependent upon the package itself and its requirements and implementation. When it comes to the user guide and unit tests, they are again optional. However, it is considered to be a best practice to include these components, and we would recommend that you get into the habit of including them as standard in any package you write. As you will see in this hour, they are very simple to add, with devtools functionality available to help you with the package structure, and they don’t take much additional effort once you are familiar with them.

Developing a Test Framework

Whenever we develop code, we test it in some way. When we start out, this might just be an ad hoc run of a function to ensure it does what we expect. Usually this is with small amounts of data, and typically we test only the main functionality we have implemented. As we write more code and begin to change it to handle any issues that arise, we might write a script that can be run regularly and that checks for known expected outputs. This is the beginning of a test framework. For all development, but especially production development, it is recommended that these informal tests be formalized so that they can easily be re-run with specific cases at any point. We can then include these tests within a package so that they are always kept together with the code, and even the end user can run them to ensure the package is still working as it should.

An Introduction to testthat

There are a number of options for providing a test framework in R, but the one introduced here, testthat, is both widely used and easy to get started with. Before we consider how to include tests in an R package, we will simply look at how to write what are known as “unit tests” using testthat.
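As a first taste of the syntax, the following is a minimal sketch of two unit tests, assuming the testthat package is installed. The function under test, `addOne`, is a hypothetical helper invented for this illustration and does not come from the book.

```r
library(testthat)

# Hypothetical helper, used only to have something to test
addOne <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  x + 1
}

# Each test_that() call groups related expectations under a description
test_that("addOne increments numeric input", {
  expect_equal(addOne(1), 2)
  expect_equal(addOne(c(1, 2, 3)), c(2, 3, 4))
})

# expect_error() passes only if the call fails with a matching message
test_that("addOne rejects non-numeric input", {
  expect_error(addOne("a"), "must be numeric")
})
```

Each expectation compares an actual result with a known expected value, so a later change that breaks the behavior is reported as a failed test rather than going unnoticed.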

As an example in this hour, we will implement tests for the function we included in the R package that we developed in the previous hour, sampleFromData. This function is defined in Listing 19.2 and simply randomly samples rows from a dataset we provide. You will also notice that this function includes some error handling by checking that sensible arguments have been provided.

As we write the tests, we will need to consider what we might test. We will return to this topic shortly, but for now we will simply write some tests to ensure that data is returned as expected. If we were to ask you to check that this function worked correctly, you would most likely pick a simple dataset and test the function with argument values whose output is easy to check. For example, you might try the following:
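A minimal sketch of such a check is given here. Because Listing 19.2 is not repeated in this hour, the definition of `sampleFromData` below is a hypothetical stand-in (the argument names `data`, `size`, and `replace` are assumptions and may differ from the book's version), and the built-in `airquality` data is used purely for illustration.

```r
# Hypothetical stand-in for sampleFromData; Listing 19.2 may differ
sampleFromData <- function(data, size, replace = TRUE) {
  if (!is.data.frame(data)) {
    stop("data must be a data frame")
  }
  if (!is.numeric(size) || length(size) != 1 || size < 1) {
    stop("size must be a single positive number")
  }
  # Draw row indices at random, then return the matching rows
  rows <- sample(nrow(data), size = size, replace = replace)
  data[rows, , drop = FALSE]
}

set.seed(123)  # fix the seed so the check is repeatable
result <- sampleFromData(airquality, size = 10)

# Ad hoc checks: ten rows returned, and the columns are unchanged
nrow(result) == 10
all(names(result) == names(airquality))
```

Checks like these are easy to run by eye, but they have to be repeated manually after every change to the function, which is the motivation for recording them as formal tests.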
