Appendix A. Real-Life Resources

One of the greatest aspects of R is the surrounding community, both online and in person. This includes Web resources such as Twitter and Stack Overflow, as well as meetups and textbooks.

A.1. Meetups

Meetup.com is a fantastic resource for finding like-minded people and learning experiences for just about anything, including programming, statistics, video games, cupcakes and beer. Meetups are so pervasive that as of late July 2013, there were over 126,000 meetup groups in nearly 200 countries. Data meetups draw particularly large crowds and usually take the format of socializing, a talk for 45 to 90 minutes, and then more socializing. Meetups are not only great for learning, but also for hiring or getting hired.

R meetups are very common, although some are starting to rebrand from R meetups to statistical programming meetups. Some popular meetups take place in New York, Chicago, Boston, Amsterdam, Washington D.C., San Francisco, London, Cleveland, Singapore and Melbourne. The talks generally show cool features in R, new packages or software or just an interesting analysis performed in R. The focus is usually on programming more than statistics. Table A.1 lists a number of popular meetups but it is an incredibly short list compared to how many meetups exist for R.

Table A.1 R and Related Meetups

Machine Learning meetups are also good for finding presentations on R, although they will not necessarily be as focused on R. They are located in many of the same cities as R meetups and draw similar speakers and audiences. These meetups tend to be more academic than programming focused.

The third core meetup type is Predictive Analytics. While they may seem similar to Machine Learning meetups, they cover different material. The focus is somewhere in between that of R and Machine Learning meetups. And yes, there is significant overlap in the audiences for these meetups.

Other meetup groups that might be of interest are data science, big data and data visualization.

A.2. Stack Overflow

Sometimes when confronted with a burning question that cannot be solved alone, a good place to turn for help is Stack Overflow (http://stackoverflow.com/). Previously the R mailing list was the best, or only, online resource for help, but that has since been superseded by Stack Overflow.

The site is a forum for asking programming questions where both questions and answers are voted on by users and people can build reputations as experts. This is a very quick way to get answers for even difficult questions.

Common search tags related to R are r, statistics, rcpp, ggplot2, shiny and other statistics-related terms.

Many R packages these days are hosted on GitHub, so if a bug is found and confirmed, the best way to address it is not on Stack Overflow but on the GitHub issues list for the package.

A.3. Twitter

Sometimes just a quick answer is needed that would fit in 140 characters. In this case, Twitter is a terrific resource for R questions ranging from simple package recommendations to code snippets.

To reach the widest audience, it is important to use hashtags such as #rstats, #ggplot2, #knitr, #rcpp, #nycdatamafia and #statistics.

Great people to follow are @drewconway, @mikedewar, @harlanharris, @xieyihui, @hadleywickham, @jeffreyhorner, @revodavid, @eddelbuettel, @johnmyleswhite, @Rbloggers, @statalgo, @ProbablePattern, @CJBayesian, @RLangTip, @cmastication, @nyhackr and this book’s author, @jaredlander.

A.4. Conferences

There are a number of conferences where R is either the focus or receives a lot of attention. There are usually presentations about or involving R, and sometimes classes that teach something specific about R.

The main one is the appropriately named useR! conference, which is a yearly event at rotating locations around the world. The Web site is at http://www.r-project.org/conferences.html.

R in Finance is a yearly conference that takes place in Chicago and is co-organized by Dirk Eddelbuettel. It is quantitatively focused and heavy in advanced math. The Web site is at http://www.rinfinance.com/.

Other statistics conferences that are worth attending are the Joint Statistical Meetings organized by the American Statistical Association (http://www.amstat.org/meetings/jsm.cfm) and Strata New York (http://strataconf.com/strata2013/public/content/home).

Data Gotham is a very new data science conference organized by some of the leaders of the data science community like Drew Conway and Mike Dewar. The Web site is at http://www.datagotham.com/.

A.5. Web Sites

Because R is an open-source project with a strong community, it is only appropriate that there is a large ecosystem of Web sites devoted to it. Most of them are maintained by people who love R and want to share their knowledge. Some focus exclusively on R and some only partially.

Besides http://www.jaredlander.com/, some of our favorites are R-Bloggers (http://www.r-bloggers.com/), Zero Intelligence Agents (http://drewconway.com/zia/), R Enthusiasts (http://gallery.r-enthusiasts.com/), Rcpp Gallery (http://gallery.rcpp.org/), Revolution Analytics (http://blog.revolutionanalytics.com/), Andrew Gelman’s site (http://andrewgelman.com/), John Myles White’s site (http://www.johnmyleswhite.com/) and chartsnthings from The New York Times graphics department (http://chartsnthings.tumblr.com/).

A.6. Documents

Over the years, a number of very good documents have been written about R and made freely available.

An Introduction to R, by William N. Venables, David M. Smith and The R Development Core Team, has been around since S, the precursor of R, and can be found at http://cran.r-project.org/doc/manuals/R-intro.pdf.

The R Inferno is a legendary document by Patrick Burns that delves into the nuances and idiosyncrasies of the language. It is available as both a printed book and a free PDF. Its Web site is http://www.burns-stat.com/documents/books/the-r-inferno/.

Writing R Extensions is a comprehensive treatise on building R packages that expands greatly on Chapter 24. It is available at http://cran.r-project.org/doc/manuals/R-exts.html.

A.7. Books

For a serious dose of statistics knowledge, textbooks offer a huge amount of material. Some are old fashioned and obtuse, while others are modern and packed with great techniques and tricks.

Our favorite statistics book, which happens to include a good dose of R code, is Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman and Jennifer Hill. The first half of the book is a good general text on statistics with R used for examples. The second half of the book focuses on Bayesian models using BUGS; the next edition is rumored to use Stan.

For advanced machine learning techniques, but not R code, Hastie, Tibshirani and Friedman’s landmark The Elements of Statistical Learning: Data Mining, Inference, and Prediction details a number of modern algorithms and models. It delves deep into the underlying math and explains how the algorithms, including the Elastic Net, work.

Other books, not necessarily textbooks, have recently come out that are focused primarily on R. Machine Learning for Hackers by Drew Conway and John Myles White uses R as a tool in learning some basic machine learning algorithms. Dynamic Documents with R and knitr by Yihui Xie is an in-depth look at knitr and expands greatly on Chapter 23. Integrating C++ into R, discussed in Section 24.6, receives full treatment in Seamless R and C++ Integration with Rcpp by Dirk Eddelbuettel.

A.8. Conclusion

Making use of R’s fantastic community is an integral part of learning R. Person-to-person opportunities exist in the form of meetups and conferences. The best online resources are Stack Overflow and Twitter. And naturally there are a number of books and documents available both online and in bookstores.
