Hour 21. Writing R Classes

What You’ll Learn in This Hour:

What a class is

How to create an S3 class

Generic functions and methods

Inheritance in S3

Documenting in S3

Limitations of S3

Now that you have seen how to build an R package, we will take a closer look at the class structures available in R and the benefits of implementing such structures in an R package. Classes and object orientation are concepts that will be more than familiar to anyone who has majored in computer science. Any readers familiar with these concepts will also be aware that despite many common themes between languages, there is no standard cross-language approach to object orientation.

It may come as no surprise to learn that R has several takes on what constitutes object-oriented programming. In this hour, we take a general look at some key features of object-oriented programming before focusing in on R’s S3 implementation. In Hour 22, “Formal Class Systems,” we will look more closely at some of the other options available to us in R.

What Is a Class?

In Hour 16, “Introduction to R Models and Object Orientation,” and Hour 17, “Common R Models,” you saw how to build and compare various types of models in R. To do so, we took advantage of R’s S3 class structure. Our model objects had classes such as lm and survreg. We used the print, plot, and summary functions to analyze the models. For each class of object, the print, plot, and summary functions behaved in different ways, producing output appropriate to the class of model. Functions that behave differently depending on the class of input are known as “generic functions,” and the class-specific functions that do the actual work are known as “methods.”
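As a brief illustration, assuming the built-in cars data set and a simple regression chosen purely for demonstration, the following sketch shows how the class of the input changes what summary returns:

```r
# Fit a simple linear model on the built-in cars data (illustrative choice)
fit <- lm(dist ~ speed, data = cars)

# The model object carries its class with it
class(fit)

# summary() returns a different kind of result for each class of input
class(summary(fit))   # summary of a model
class(summary(cars))  # summary of a data frame
```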

The class and method concepts are fundamental to object-oriented programming. When we refer to a “class system” in R, we are talking about an object-oriented system, of which R has several.

Object Orientation in R

Back in Hour 1, “The R Community,” we discussed the history of S and its impact on R today. Nowhere is this impact felt more strongly than in R’s class system, particularly when it comes to modeling. Another claim we made in Hour 2, “The R Environment,” was that R is “loosely” object-oriented. In R, everything is an object and has a name and a class. There is also a clear distinction between data objects and function objects. The distinction between objects and functions that act on objects is the basis of an object-oriented programming environment. However, the functions that we write do not have to be associated with a particular class of object. We must therefore choose to use the object-oriented features available in R. In R today, there are actually four common class implementations: S3, S4, reference classes (a.k.a. R5), and R6. The “S” in S3 and S4 refers directly to S, whereas the numbers refer to the S versions within which the classes were unveiled. Those who use the term “R5” for reference classes, or the name R6, are simply continuing the number sequence. The terms have absolutely nothing to do with R versions.

Despite the sequential release of new class structures in R, the vast majority of R packages on CRAN today either implement an S3 system or no system at all. The S3 system is particularly appealing for package developers with an analytical background due to its relative simplicity and less rigid rules. This makes it more accessible when sharing code with other analysts. As you will see in Hour 22, the more rigid structures of the other class systems lend themselves more toward application development in R. However, even these implementations could be considered relaxed when compared with traditional object-oriented development languages such as Java.

Why Bother with Object Orientation?

In order to write professional-level code, we need to ensure that we are following good programming practice. Everyone tends to have their own definition of precisely what this means, but the central concepts are based around

Readability

Maintainability

Efficiency

In Hour 18, “Code Efficiency,” we looked closely at code efficiency. In Hour 19, “Package Building,” we then discussed code quality and talked about how adherence to a naming convention, regular commenting, and consistent layout and spacing can improve readability. In Hour 20, “Advanced Package Building,” we looked at building a test framework to help improve the maintainability of our code. Object orientation builds upon the theme of maintainability.

It is much easier to develop, test, and hence maintain modular code. We write modular code by ensuring that functions remain small and, where possible, have a single purpose. The modular approach facilitates the development of unit tests. In many cases, just writing modular code is sufficient to ensure that our code base is maintainable. The concept of object-oriented programming extends the idea of modular code and introduces other useful concepts such as type checking and inheritance.

Fundamentally, a class structure lets us define a consistent behavior for objects of that class. Once we can be sure that an object is of a particular structure, we can construct methods (functions) that understand this structure and react accordingly.

Class Example

Let’s imagine for a second that the data.frame class did not exist. Hopefully you would agree that with only vectors, matrices, arrays, and lists to store information, analyzing data would be pretty tough! We are used to thinking of data as a rectangular structure with a number of rows and columns. Each column contains a different type of information in which we are interested (dates, times, numeric values, character, and so on). Given that vectors, matrices, and arrays are all single-mode objects and can only store data of a single type, the only option available to us would be to store our data as a list. However, a list can store any object, whereas we only want to store columns of data. We therefore need to impose some rules on our list:

Every element must be a vector (to ensure we have “columns” containing a single type of data)

Each vector must have the same length (to ensure that we have a fixed dimension)

Each “column” should have a name attribute (for easy referencing of columns)

These rules ensure that our list functionally behaves like a rectangular data structure, but we also need it to look like one. We therefore impose the further rule:

The list looks like a rectangular data structure

To see what an object looks like, we usually just type its name and press Enter. In R, typing an object’s name is a shortcut for calling the print function on the object. When we say, “the list looks like a rectangular data structure” what we really mean is, “when we call print on the object, it looks like a rectangular data structure.” In summary, we have defined three rules that specify the structure of a data frame object and one rule that defines how the print function should behave when we pass it a data frame object. In other words, we have defined a “data frame” class and a print method for this class.

We don’t just want to print data frames, however. Once we have defined the structure, we can also define what happens when we call subset on the structure. We can write additional methods such as head and tail, which return the first and last few rows of data, respectively. We can write nrow, ncol, and dim methods. We can also define what happens when we call plot or aggregate. What we get from defining classes is structure and control. So long as we create an object of the right structure, we know that our methods will function as expected.
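To make the four rules concrete, here is a minimal sketch of such a class. The name simpleFrame and its checks are hypothetical and exist only for illustration; they are not how R's real data.frame class is implemented.

```r
# Hypothetical constructor: a list that obeys the three structural rules
simpleFrame <- function(...) {
  columns <- list(...)
  # Rule 1: every element must be a vector
  stopifnot(all(vapply(columns, is.atomic, logical(1))))
  # Rule 2: every vector must have the same length
  stopifnot(length(unique(lengths(columns))) == 1)
  # Rule 3: every column must have a name
  stopifnot(!is.null(names(columns)), all(names(columns) != ""))
  class(columns) <- "simpleFrame"
  columns
}

# Rule 4: printing shows a rectangular layout
print.simpleFrame <- function(x, ...) {
  print(do.call(cbind, lapply(unclass(x), format)), quote = FALSE)
  invisible(x)
}

sf <- simpleFrame(id = 1:3, name = c("a", "b", "c"))
sf
```

Because the constructor enforces the rules, any method written for simpleFrame objects can rely on the structure being present.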

Inheritance

In object-oriented programming, inheritance is extremely useful to us because it keeps our code modular and saves us from duplicating code. When programmers talk about the benefits of inheritance, they typically talk about defining animals. Let’s imagine we want to define a cat object and a dog object. Cats and dogs have a lot in common. Among the many things they do, they eat and they sleep. However, a cat meows and a dog barks. Defining cats and dogs separately results in duplication; for each animal we must define what it means to eat and what it means to sleep. The idea of inheritance allows us to define an object hierarchy. First, we define what it means to be an “animal” object. An animal eats and an animal sleeps. We say that “cat” and “dog” objects inherit these properties from the “animal” object. We can then define the additional “meows” property for cats and “barks” property for dogs. Should we ever need to change what it means to eat or sleep, we need only make a single change to the “animal” object.
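The animal example can be sketched in S3 as a preview of the techniques covered later in the hour. The generics eat and speak, and the class names used here, are hypothetical and chosen only to mirror the description above.

```r
# Hypothetical generics for the animal example
eat   <- function(x, ...) UseMethod("eat")
speak <- function(x, ...) UseMethod("speak")

# Behavior shared by all animals is defined once
eat.animal <- function(x, ...) "eats"

# Behavior specific to each kind of animal
speak.cat <- function(x, ...) "meows"
speak.dog <- function(x, ...) "barks"

# Each object lists its own class first, then the class it inherits from
cat1 <- structure(list(name = "Tom"), class = c("cat", "animal"))
dog1 <- structure(list(name = "Rex"), class = c("dog", "animal"))

eat(cat1)    # uses eat.animal
eat(dog1)    # uses eat.animal
speak(cat1)  # uses speak.cat
speak(dog1)  # uses speak.dog
```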

Each of the object-oriented systems in R benefits from inheritance. Consider the data.table class from the data.table package you saw in Hour 12, “Efficient Data Handling in R.” We can think of a data.table object as a data frame that, among other things, prints nicely when there are many rows. There are actually only a handful of methods that respond specifically to data.table objects. The rest of the functionality is inherited primarily from the data.frame class. Where a method has not been defined for the data.table class, R falls back to the method for the data.frame class. Where no data.frame method exists either, R falls back to the default method for the generic, if one has been defined. For example, calling summary on a data.table object still returns a statistical summary of each column as it would for a data.frame object, even though no summary method has been specifically written for the data.table class. Inheritance is a powerful idea that enables us to easily build upon the work of others.

Note: Multi-Level Hierarchy

The tbl_df class used by the dplyr and tibble packages inherits from a tbl class, which in turn inherits from the data.frame class. This is an example of a multi-level hierarchy. We can use this property to build hierarchical class structures.
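Inheritance can also be seen without installing any packages. As a small sketch, assuming the built-in cars data and a model chosen only for illustration, a generalized linear model object inherits from the lm class:

```r
# Fit a generalized linear model (default gaussian family) on built-in data
fit <- glm(dist ~ speed, data = cars)

# The class attribute is a vector: most specific class first
class(fit)

# The object is treated as an "lm" wherever no "glm" method exists
inherits(fit, "lm")
```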

Why Use S3?

We begin our tour of R classes by looking at R’s most common class implementation, S3. Each of the basic data structures we have looked at throughout the book uses an S3 structure. Standard linear models, generalized linear models, survival models, and mixed effects models all use an S3 class structure. We therefore know that we can print, plot, or summarize these objects in a consistent manner. By developing our own packages with S3, we can take advantage of this consistency by defining our own print, plot, and summary methods for a new class of object. We can also use S3 to create new methods specific to our new class of object.

The S3 class implementation is a form of generic function object-oriented programming. In generic function object-oriented programming, we call generic functions that then determine which function is appropriate to use with our object. For example, when we pass an object of class lm to the generic plot function, the generic determines that the plot.lm function should be used. This type of implementation is rare among programming languages and is often frowned upon by experienced software developers. However, like R itself, the S3 class system is relatively straightforward to learn and is extremely popular among data scientists and statisticians alike. The implementation strikes a nice balance between the full flexibility of the R language and the more controlled rigor of other object-oriented programming languages.
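The following sketch, again assuming the built-in cars data, uses getS3method to look up the method a generic would select and confirms that calling the generic gives the same result as calling the method directly:

```r
fit <- lm(dist ~ speed, data = cars)

# Look up the method that the plot generic selects for class "lm"
plotMethod <- getS3method("plot", "lm")
is.function(plotMethod)

# Calling the summary generic dispatches to summary.lm for this object
viaGeneric <- coef(summary(fit))
viaMethod  <- coef(summary.lm(fit))
identical(viaGeneric, viaMethod)
```

In everyday use we call the generic and let R choose the method; the direct call appears here only to show what dispatch is doing.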

Creating a New S3 Class

In most object-oriented programming environments, we begin by formally defining the structure of the class. We also place restrictions on each element of the class. However, S3 implements a lazy form of object-oriented programming that allows us to instantiate (create instances of) a new class without formally defining the class.

Instantiating S3 objects is incredibly straightforward. Remember that every object in R has a class. We can query the class of an object using the class function. Here’s an example:

> x <- 5
> class(x)
[1] "numeric"

The same class function can be used to change the class of an object. In the following example, we change the class of our numeric x value to a new class called superNumber.


> class(x) <- "superNumber"
> x
[1] 5
attr(,"class")
[1] "superNumber"

In this ad-hoc manner, we can change the class of any object to anything we like, whether we have defined the new class or not. Note that the class of an object is returned as an attribute. Objects can have several attributes that are returned via the attributes function:

> attributes(x)
$class
[1] "superNumber"

Tip: Removing a Class

We can return an object without its class attribute using the unclass function. The unclass function removes the class attribute, leaving only the underlying object and any other attributes, as shown here:


> aDF <- data.frame(X = 1:3, Y = rnorm(3))
> aDF
  X           Y
1 1  0.52409671
2 2 -2.26076788
3 3 -0.01967972
> unclass(aDF)
$X
[1] 1 2 3

$Y
[1]  0.52409671 -2.26076788 -0.01967972

attr(,"row.names")
[1] 1 2 3

Note that unclass returns a new object and does not affect the original object.

A More Formal Approach to Creating Classes

As you have seen, it is very easy to change the class of an object. However, it is not considered good practice to do so, nor is it particularly useful, especially if our goal is writing packages. A more standard approach is to define the structure that our class should take and then write a function that creates objects of that class. This is known as a “constructor” function. Traditionally, functions that generate objects of a particular class are named after the class of object that they create. For example, the ts function creates time series (ts) objects.
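A short sketch shows why assigning classes ad hoc is risky. Here we deliberately mislabel an integer vector as an lm object; R accepts the label, but the print method for lm objects fails because the expected structure is missing.

```r
# Mislabel a plain integer vector as a linear model (deliberately wrong)
fake <- 1:3
class(fake) <- "lm"

# R takes the class attribute at face value
inherits(fake, "lm")

# The lm print method expects model components that are not there
result <- try(print(fake), silent = TRUE)
inherits(result, "try-error")
```

A constructor function avoids this problem by being the one place where objects of the class are built.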

Because we are introducing a formal method for creating a class, let’s start with a more formal example and write a class for modular arithmetic. If you are not familiar with modular arithmetic, consider time as specified by a typical 12-hour clock. Imagine it is three o’clock (we ignore a.m. and p.m. for this example). In 10 hours’ time, we will say it’s one o’clock. We won’t say it’s 13 o’clock. A 12-hour clock is an example of “mod 12” arithmetic. We call the number 12 our “modulus.” Numbers must always be between 0 and 11 (when we hit 12, we restart at zero). We now define this formally in R using an S3 class structure. In lines 1 to 11 in Listing 21.1, we create a new class called modInt. Our object consists of an integer value and a modulus attribute. Some examples are also provided to illustrate the behavior of the constructor function.
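The clock arithmetic described above can be checked directly with R's remainder operator, %%, before wrapping it in a class:

```r
# Three o'clock plus ten hours on a 12-hour clock
(3 + 10) %% 12   # 1

# Reaching the modulus restarts at zero
12 %% 12         # 0

# Going back one hour from zero wraps around to 11
(-1) %% 12       # 11
```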

LISTING 21.1 Writing a Function to Generate a New Class


1: > modInt <- function(x, modulus) {
2: +   # Create the object from the starting number, x, and the modulus
3: +   # Take the remainder after division by the modulus
4: +   object <- x %% modulus
5: +   # Assign a class attribute to the object
6: +   class(object) <- "modInt"
7: +   # Store the modulus as an attribute
8: +   attr(object, "modulus") <- modulus
9: +   # Return the new object
10: +   object
11: + }
12: > # Examples
13: > modInt(3, 12)
14: [1] 3
15: attr(,"class")
16: [1] "modInt"
17: attr(,"modulus")
18: [1] 12
19: > modInt(13, 12)
20: [1] 1
21: attr(,"class")
22: [1] "modInt"
23: attr(,"modulus")
24: [1] 12

We have now created a constructor function that generates objects of our chosen modInt class. On its own this could perhaps be a useful function. However, to really see the benefit of the S3 class structure, we need to define some generic functions.

Generic Functions and Methods

Generic functions are functions that can behave differently depending on the class of object passed to them. The precise behavior is controlled by further functions known as methods. You saw the generic functions print, plot, and summary in Hour 16. If we inspect the source code of the print function, for example, we see that it calls the UseMethod function. It is the UseMethod function that determines which method function to call.


> print
function (x, ...)
UseMethod("print")
<bytecode: 0x00000000094cda60>
<environment: namespace:base>

As you saw in Hour 16, the S3 class structure provides a simple naming convention that we can use to create methods for a new class. The naming convention is as follows:

[genericFunction].[class]

A dot (.) is used to separate out the generic function from the class. The function print.lm defines what happens when we call the print function on an object with class lm. Let’s return to our sample modInt class that we defined in Listing 21.1. The two examples from line 12 onward were functional but not particularly nice to look at. We start by defining a print method to control the appearance of modInt objects. In order to do so, we create a function called print.modInt, shown next, and let R’s S3 class system do the rest:


> print.modInt <- function(aModIntObject){
+   # Extract the relevant components from the object
+   theValue <- as.numeric(aModIntObject)
+   theModulus <- attr(aModIntObject, "modulus")
+   # Print the object in the desired form
+   cat(theValue, " (mod ", theModulus, ") ", sep = "")
+ }
> x <- modInt(3, 12)
> x
3 (mod 12)

Note: Naming Conventions

In the print.modInt function, we use the argument name aModIntObject. This is to illustrate that we should pass a modInt object to the function. However, it is much better practice to follow the naming convention of the generic function that will call the method (in this case, print). The print function takes x and an ellipsis (...), and in practice these are the arguments that a print.modInt function would take. The primary benefit of following this convention is that the help files are much easier to follow. A user unfamiliar with classes is far more likely to type ?print than they are to type ?print.modInt. Further, the names should be in the same order as the generic and adhere to any default values defined in the generic. Following these conventions will vastly improve the usability of your class.
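A version of the print method that follows the convention described in this note might look like the sketch below. The constructor is repeated so that the example runs on its own. Ending the output with a newline and returning the object invisibly are common choices for print methods rather than requirements taken from the listing above.

```r
# Constructor repeated from Listing 21.1 so the example is self-contained
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

# Same argument names and order as the print generic: x, then ...
print.modInt <- function(x, ...) {
  cat(as.numeric(x), " (mod ", attr(x, "modulus"), ")\n", sep = "")
  # Return the object invisibly so that print(x) can be used in assignments
  invisible(x)
}

x <- modInt(3, 12)
x
```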

Note: Updating Methods

As with any function, the impact of updating a method is immediate. For example, if we update the print method for a class, then the next time we print an object of that class, it will print differently.

We can see what methods have been defined for a class via the class argument to the methods function:


> methods(class = "modInt")
[1] print
see '?methods' for accessing help and source code

The same function can be used to query all methods for a particular generic:


> methods("plot")
[1] plot.acf*           plot.data.frame*    plot.decomposed.ts* plot.default
[5] plot.dendrogram*    plot.density*       plot.ecdf           plot.factor*
[9] plot.formula*       plot.function       plot.hclust*        plot.histogram*
[13] plot.HoltWinters*   plot.isoreg*        plot.lm*            plot.medpolish*
[17] plot.mlm*           plot.ppr*           plot.prcomp*        plot.princomp*
[21] plot.profile.nls*   plot.raster*        plot.spec*          plot.stepfun
[25] plot.stl*           plot.table*         plot.ts             plot.tskernel*
[29] plot.TukeyHSD*
see '?methods' for accessing help and source code

Defining Methods for Arithmetic Operators

Mathematical operators can also be used as generic functions. We define an operator in exactly the same way we do any generic function:

[operator].[class]

Returning to our modInt example, we can use the + operator to define what happens when we add two modInt objects together. The function and some examples are shown in Listing 21.2. Note that when defining methods that involve operators, we place back ticks around the function name to avoid errors.

Caution: Defining Each Operator Separately!

Defining a method for + does not automatically create a method for -, *, or /. These must be defined separately.

LISTING 21.2 Defining Operator Methods


1: > # Define a new 'add' method for the modInt class
2: > `+.modInt` <- function (x, y){
3: +   # We can only add objects that are of the same modulus
4: +   if(attr(x, "modulus") != attr(y, "modulus")){
5: +     stop("Cannot add numbers of differing modulus")
6: +   }
7: +   # Add the numbers together
8: +   totalNumber <- as.numeric(x) + as.numeric(y)
9: +   # Ensure a number in the correct modulus is returned
10: +   theResult <- modInt(totalNumber, attr(x, "modulus"))
11: +   # Next step useful for inheritance (later)
12: +   class(theResult) <- class(x)
13: +   theResult
14: + }
15: >
16: > # Examples
17: > a <- modInt(7, 12)
18: > b <- modInt(9, 12)
19: > a + b
20: 4 (mod 12)
21: > c <- modInt(3, 4)
22: > a + c
23: Error in `+.modInt`(a, c) : Cannot add numbers of differing modulus
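Because each operator needs its own method, multiplication has to be written separately. The following sketch follows the same pattern as the addition method; the `*.modInt` method is not part of the listing above and is shown only as an illustration.

```r
# Constructor repeated from Listing 21.1 so the example is self-contained
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

# Hypothetical multiplication method, mirroring the addition method
`*.modInt` <- function(x, y) {
  if (attr(x, "modulus") != attr(y, "modulus")) {
    stop("Cannot multiply numbers of differing modulus")
  }
  theResult <- modInt(as.numeric(x) * as.numeric(y), attr(x, "modulus"))
  # Preserve the class of the first argument
  class(theResult) <- class(x)
  theResult
}

a <- modInt(7, 12)
b <- modInt(9, 12)
product <- a * b
as.numeric(product)  # 7 * 9 = 63, and 63 %% 12 is 3
```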

Caution: Operations on Different Classes of Objects

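If we try to use an arithmetic operator such as + to combine objects of differing classes, R looks for a method for each of the two objects. If it finds a method for only one of them, that method is called with both objects, even though it may not know how to handle the second. If it finds two different methods, R typically warns about incompatible methods and falls back to its internal version of the operator. Either case often results in an error or an unexpected result. Attempting to combine S3 classes via an operator in this way is generally not recommended.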
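As a sketch of what can go wrong, consider adding a plain number to a modInt object. The addition method is called because one of the operands is a modInt, but the plain number has no modulus attribute, so the comparison inside the method fails.

```r
# Constructor and addition method repeated so the example is self-contained
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

`+.modInt` <- function(x, y) {
  if (attr(x, "modulus") != attr(y, "modulus")) {
    stop("Cannot add numbers of differing modulus")
  }
  theResult <- modInt(as.numeric(x) + as.numeric(y), attr(x, "modulus"))
  class(theResult) <- class(x)
  theResult
}

a <- modInt(7, 12)

# The plain number 1 has no "modulus" attribute, so the check itself errors
mixed <- try(a + 1, silent = TRUE)
inherits(mixed, "try-error")
```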

Lists vs. Attributes

Usually S3 classes are generated as lists (for example, the data.frame and lm classes). However, to create our modInt example, we used an attribute. This slightly simplifies numeric operations on objects of the modInt class and ensures that our numbers behave like regular integers in cases where we have not defined a method. However, it is just as easy to define the structure as a list, as the following example shows. Here, we create a modIntList class and a suitable print method:


> # Define a new modIntList class using a list structure
> modIntList <- function(x, modulus) {
+   # Define a list with two elements containing the number and modulus
+   object <- list(number = x %% modulus,
+                  modulus = modulus)
+   # Assign a class attribute to the object
+   class(object) <- "modIntList"
+   # Return the new object
+   object
+ }
>
> # Now define the print method
> print.modIntList <- function(aModIntListObject){
+   # Extract the relevant components from the object
+   theValue <- aModIntListObject$number
+   theModulus <- aModIntListObject$modulus
+   # Print the object in the desired form
+   cat(theValue, " (mod ", theModulus, ") ", sep = "")
+ }
>
> # Examples
> modIntList(14, 6)
2 (mod 6)

The modInt and modIntList examples are relatively straightforward examples of using classes. Generally we recommend using lists to create S3 classes. A list enables us to easily store different types of objects within our class. The list approach is also more similar to the S4 “slot” approach that we will discuss in Hour 22.
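The practical difference between the two designs shows up when an operation has no method defined. In the sketch below, no multiplication method exists for either class: the attribute-based object falls back to ordinary numeric arithmetic, whereas the list-based object produces an error.

```r
# Attribute-based class, as in Listing 21.1
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

# List-based class, as defined above
modIntList <- function(x, modulus) {
  object <- list(number = x %% modulus, modulus = modulus)
  class(object) <- "modIntList"
  object
}

# No `*` method is defined here, so R uses its internal arithmetic
doubled <- modInt(3, 12) * 2
as.numeric(doubled)

# A list cannot be multiplied, so the same operation fails
listResult <- try(modIntList(3, 12) * 2, silent = TRUE)
inherits(listResult, "try-error")
```

Note that the fallback arithmetic does not apply the modulus, so "behaving like a regular number" is a convenience rather than a guarantee of correct modular results.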

Creating New Generics

When generating your own classes, you might find it sufficient to use existing generics such as print, plot, and summary. However, it can sometimes be useful to define new generic functions, particularly if you want others to build on your work.

We can use the UseMethod function to create our own generic functions. New generics should call the UseMethod function and do nothing else. The methods themselves should do all the work. Always define a default method using [genericFunction].default. The default method is invoked in the absence of any other methods. If there is no obvious “one size fits all” default, then a default method that returns a sensible error message should be defined.
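As a sketch of a default method that returns a sensible error, consider a hypothetical generic named modulus that extracts the modulus from an object. The generic and its methods are illustrative only.

```r
# Constructor repeated from Listing 21.1 so the example is self-contained
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

# Hypothetical generic: it calls UseMethod and does nothing else
modulus <- function(x, ...) UseMethod("modulus")

# No sensible "one size fits all" behavior, so the default explains the problem
modulus.default <- function(x, ...) {
  stop("modulus() is not defined for objects of class ", class(x)[1])
}

# The method for our class does the real work
modulus.modInt <- function(x, ...) attr(x, "modulus")

modulus(modInt(3, 12))
noModulus <- try(modulus("A"), silent = TRUE)
inherits(noModulus, "try-error")
```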

Consider writing a generic function that mimics the mathematical square operation. For a numeric value x, this is just x². But what would such a function do for a character value or an object in our modInt class? In Listing 21.3 we define a new generic named square along with some methods for the cases we have just highlighted. Having very simply defined the generic in line 2, we proceed to define some methods starting with the default method. Some examples of the new generic are shown toward the end of the listing.

LISTING 21.3 Creating a New Generic


1: > # Define a new generic
2: > square <- function(x) { UseMethod("square", x) }
3: >
4: > # Define default method!
5: > square.default <- function(x) x^2
6: >
7: > # Define some more methods
8: > square.character <- function(x) paste(x, x, sep = "")
9: >
10: > square.modInt <- function(x) {
11: +   # Standard square
12: +   simpleSquare <- as.numeric(x)^2
13: +   # Use correct modulus
14: +   modInt(simpleSquare, attr(x, "modulus"))
15: + }
16: >
17: > # Check functionality
18: > square(2)
19: [1] 4
20: > square("A")
21: [1] "AA"
22: > x <- modInt(3, 4)
23: > square(x)
24: 1 (mod 4)

Inheritance in S3

One of the primary reasons for implementing a class structure is that it enables others to build upon it. Inheritance is a concept that allows us to take a class that has previously been defined and extend it. The benefit is that we need only define a handful of new methods. The rest are inherited from the base class. As we discussed earlier in the hour, a good example of this is the data.table class of object used by the data.table package. The data.table class extends/inherits from the data.frame class. We can see this inheritance when looking at the class of a data.table object:


> airDT <- data.table(airquality)
> class(airDT)
[1] "data.table" "data.frame"

As you saw in Hour 12, the data.table class changes the way a data frame prints. This is because the author has written a new print method specifically for the class. Other data.frame operations are unaffected by the extension. The summary and plot functions behave in exactly the same way for a data.table object as they do for a data.frame object.

When we query the class of a data.table object, a vector of classes is returned. To construct a new class that inherits from an existing class, we overwrite the class of our object with a vector of classes. For example, if we want to create a clockTime class representing integers as “mod 12” from our modInt class, we do so as follows:


> clockTime <- function(x){
+   # Fix x as mod 12
+   x <- modInt(x, 12)
+   # Define inheritance
+   class(x) <- c("clockTime", class(x))
+   x
+ }
> theTime <- clockTime(13)
> class(theTime)
[1] "clockTime" "modInt"

Earlier in the hour we defined a print method for our modInt class. We also defined methods for the + operator and for the new square generic. All of these are perfectly functional for our new class, though for a clockTime class we expect a slightly different print method. In Listing 21.4 we define a new print method and add two instances of this class together. When we add them together, the modInt method is used because we haven’t defined a `+.clockTime`. However, the result still prints in the clockTime format because the returned object keeps its clockTime class.

LISTING 21.4 Inheritance in Action


1: > # Define a new print method for the clockTime class
2: > print.clockTime <- function(aClockTimeObject){
3: + cat(as.numeric(aClockTimeObject), ":00 ", sep = "")
4: + }
5: >
6: > # Examples
7: > time1 <- clockTime(5)
8: > time2 <- clockTime(42)
9: > time1
10: 5:00
11: > time2
12: 6:00
13: >
14: > # Add together to demonstrate inheritance
15: > time1 + time2
16: 11:00

The example on line 15 works because of a sensible step that we took earlier when defining the `+.modInt` method in Listing 21.2. In line 12 we overwrote the class of the return object with the original class of one of the two objects we started with. If we hadn’t done so, then adding the two clockTime objects would return a modInt object, and we would lose one of the primary benefits of inheritance.
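The effect of preserving the class can be checked directly. The sketch below repeats the relevant definitions so that it runs on its own, then compares the class of the sum with the class we would get by calling the modInt constructor alone.

```r
# Definitions repeated from earlier in the hour
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

clockTime <- function(x) {
  x <- modInt(x, 12)
  class(x) <- c("clockTime", class(x))
  x
}

`+.modInt` <- function(x, y) {
  if (attr(x, "modulus") != attr(y, "modulus")) {
    stop("Cannot add numbers of differing modulus")
  }
  theResult <- modInt(as.numeric(x) + as.numeric(y), attr(x, "modulus"))
  # The step that keeps the child class on the result
  class(theResult) <- class(x)
  theResult
}

time1 <- clockTime(5)
time2 <- clockTime(42)

# With the class-preserving step, the full class vector survives
class(time1 + time2)

# Without it, the result would be whatever the constructor alone returns
class(modInt(as.numeric(time1) + as.numeric(time2), 12))
```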

Note: Extending the Class Hierarchy

We can continue to extend classes indefinitely. However, it is rare to see S3 classes extended more than three or four times.

Tip: Checking Inheritance

Occasionally we may need to check that an object inherits from a particular class in order to ensure that a particular method will behave as expected.
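One way to perform such a check is with the base R inherits function, which tests whether a class name appears anywhere in an object's class vector. The sketch below repeats the constructors so that it runs on its own.

```r
# Constructors repeated from earlier in the hour
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

clockTime <- function(x) {
  x <- modInt(x, 12)
  class(x) <- c("clockTime", class(x))
  x
}

theTime <- clockTime(13)

# TRUE: clockTime objects inherit from modInt
inherits(theTime, "modInt")

# FALSE: a plain number does not
inherits(5, "modInt")
```

A method could use such a check at the top of its body and call stop with an informative message when the check fails.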

Documenting S3

When building packages, it is important to document everything you can. You will see in Hour 22 that documenting more complex classes requires us to use new roxygen2 tags; S3, on the other hand, is much more straightforward. To start with, the class itself has no formal definition, so the only things we can document are the class constructor function, the methods, and any generics that we define. Each of these is a regular R function, and so we use standard tags such as @param and the others listed in Table 19.1 of Hour 19.
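As an illustration, roxygen2 comments for the modInt constructor and a print method might look like the following sketch. The wording of the titles and descriptions is our own, and the @return and @export tags are standard roxygen2 tags assumed to be appropriate here.

```r
#' Create a modInt object
#'
#' Constructor for the modInt class, which stores a number under a modulus.
#'
#' @param x A numeric value.
#' @param modulus The modulus to apply to \code{x}.
#' @return An object of class modInt.
#' @export
modInt <- function(x, modulus) {
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

#' Print method for modInt objects
#'
#' @param x A modInt object.
#' @param ... Further arguments (currently unused).
#' @export
print.modInt <- function(x, ...) {
  cat(as.numeric(x), " (mod ", attr(x, "modulus"), ")\n", sep = "")
  invisible(x)
}
```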

Technically we don’t have to generate help files for every method that we define, particularly if the method follows the argument-naming structure of the generic; you may notice that several of the methods in base R do not have help files (try ?print.lm, for example). However, it’s always good practice to create documentation, and roxygen2 makes it so easy, so why wouldn’t you?! Though this may be obvious, it is also helpful to mention in the title and description that the method relates to a particular class of object.

Limitations of S3

One of the reasons that the S3 concept is not popular among software developers is that we cannot formally define a new class of object before instantiating the object, whereas in most class implementations it is common to check that the components of an object are of the expected structure for the class object. The lack of a formal class definition leaves S3 open to user error, unless we decide to go the extra mile and write checks for both the constructor function and the individual methods. Not only does this involve a lot of duplication, we may soon find that half our code base is dedicated to error handling. If the prevention of user error matters that much, it’s time to step up to S4 classes or beyond.
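Going the extra mile in the constructor might look like the sketch below. The specific checks and error messages are assumptions made for illustration; every method that relies on the structure would need similar checks to be fully protected against objects created without the constructor.

```r
# A constructor for modInt with hypothetical input checks added
modInt <- function(x, modulus) {
  if (!is.numeric(x)) {
    stop("'x' must be numeric")
  }
  if (!is.numeric(modulus) || length(modulus) != 1 ||
      is.na(modulus) || modulus <= 0) {
    stop("'modulus' must be a single positive number")
  }
  object <- x %% modulus
  class(object) <- "modInt"
  attr(object, "modulus") <- modulus
  object
}

# Valid input still works
ok <- modInt(13, 12)

# Invalid input is rejected with an informative error
badValue   <- try(modInt("13", 12), silent = TRUE)
badModulus <- try(modInt(13, -1), silent = TRUE)
```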

The concept of inheritance is also fairly weak in S3; we have to be very careful to ensure that our methods allow for inheritance and do not force the creation of objects of one particular class. In class systems such as S4, inheritance is more formal, and type checking and validity are passed from the parent class through to the child class.

Summary

Following on from Hours 19 and 20, where you saw how to construct an R package, you have now seen how classes—and S3 classes in particular—can be used to improve package maintainability and add structure to our code base.

In Hour 22, we look at the more formal forms of object orientation available in R, starting with S4 classes. This will open the door to new concepts such as validity checking, multiple dispatch, and message-passing object orientation.

Q&A

Q. If S3 was the first implementation in S, isn’t it time to move on to something more advanced?

A. Perhaps. Many people don’t like S3, saying, “It’s lazy,” “It’s not a proper class implementation,” and so on. However, most of the good bits of R use S3 classes, and it’s usually better to try to build on top of the good bits!

Q. I’ve heard that S3 isn’t actually a class system at all. Is this true?

A. It’s not a very strict system, but it is, nevertheless, a class system. Technically it is an informal form of generic function object-oriented programming.

Q. If an S3 method takes the form [genericFunction].[class], what is going on with data.frame?

A. R has its quirks! It can be confusing to understand what is going on with functions such as print.data.frame. To confuse things even more, it is entirely possible to create a frame class and define a print.data method for that class, but I suggest you don’t! The overall message here is that R is flexible, and though a period can indicate the presence of an S3 class implementation, it can also just be part of an object’s name. That said, it’s good practice not to use periods when naming variables.

Workshop

The workshop contains quiz questions and exercises to help you solidify your understanding of the material covered. Try to answer all questions before looking at the “Answers” section that follows.

Quiz

1. True or false? S3 and S4 classes were first introduced in S version 3 and S version 4, respectively.

2. Which of the following should be used to plot the object myLm of lm class?

A. plot

B. plot.lm

C. plot.myLm

D. myLm.plot

3. How do you find out what methods are available for an S3 class?

4. What is the name of the function used to define new generics?

5. True or false? You must document an S3 method when building an R package.

Answers

1. True. This is another case of R inheriting behavior from S.

2. A. Technically plot.lm can be used directly; however, directly invoking a method is generally discouraged.

3. You use the methods function and specify the class= option.

4. The UseMethod function enables us to create new generics. We define a generic by writing a function that calls UseMethod.

5. False. However, you really should document it, particularly if the method does anything sophisticated.

Activities

1. Define a new S3 class. The aim of the class is to store simulated data from various known statistical distributions. In order to construct the new class, create the following items:

A constructor function that takes inputs n and distribution, representing the number of values to sample and the distribution to sample from. Ensure that the function has the option for other parameter arguments, as needed.

A print method that displays a table of summary statistics for the simulated data (mean, median, standard deviation, min, and max).

A plot method that draws a histogram of the random numbers, with a default title that states from which distribution the data has been simulated and how many values have been simulated.
