In this part of the book, you’ll learn about data wrangling, the art of getting your data into R in a useful form for visualization and modeling. Data wrangling is very important: without it you can’t work with your own data! There are three main parts to data wrangling: importing, tidying, and transforming your data.
This part of the book proceeds as follows:
In Chapter 7, you’ll learn about the variant of the data frame that we use in this book: the tibble. You’ll learn what makes them different from regular data frames, and how you can construct them “by hand.”
In Chapter 8, you’ll learn how to get your data from disk and into R. We’ll focus on plain-text rectangular formats, but will give you pointers to packages that help with other types of data.
In Chapter 9, you’ll learn about tidy data, a consistent way of storing your data that makes transformation, visualization, and modeling easier. You’ll learn the underlying principles, and how to get your data into a tidy form.
Data wrangling also encompasses data transformation, which you’ve already learned a little about. Now we’ll focus on new skills for four specific kinds of data you will frequently encounter in practice:
Chapter 10 will give you tools for working with multiple interrelated datasets.
Chapter 11 will introduce regular expressions, a powerful tool for manipulating strings.
Chapter 12 will introduce factors, which are how R stores categorical data. Factors are used when a variable has a fixed set of possible values, or when you want to use a nonalphabetical ordering of a string.
Chapter 13 will give you the key tools for working with dates and date-times.