33 Excellent Free Books to Learn all about R

The R language is the de facto standard among statisticians for developing statistical software, and it is widely used for data analysis. R is a modern dialect of S, one of several statistical programming languages designed at Bell Laboratories.

R is much more than a programming language. It’s an interactive suite of software facilities for data manipulation, calculation, and graphical display. R offers a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The ability to download and install R packages is a key factor which makes R an excellent language to learn. What else makes R awesome? Here’s a taster.

  • It’s free, open source, and available for every major platform. So anyone can repeat your work whatever platform they run.
  • A huge set of high quality packages for statistical modelling, machine learning, visualisation, and importing and manipulating data.
  • Cutting edge tools.
  • A suite of operators for calculations on arrays, in particular matrices.
  • Deep-seated language support for data analysis. This includes features like missing values, data frames, and subsetting.
  • Powerful tools for communicating your results.
  • Produce publication-quality graphs, including mathematical symbols. Dynamic and interactive graphics are available through additional packages. R packages make it easy to produce HTML or PDF, and create interactive websites with Shiny, a sublime R package.
  • A strong foundation in functional programming. The ideas of functional programming are well suited to solving many of the challenges of data analysis. R provides a powerful and flexible toolkit which allows you to write concise yet descriptive code.
  • RStudio, a powerful integrated development environment.
  • Powerful metaprogramming facilities; a fantastic environment for interactive data analysis.
  • Connects to high-performance programming languages like C, Fortran, and C++.
  • An amazingly vibrant and helpful community.
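
A few of the points above (matrix operators, missing values, data frames, subsetting, and functional programming) can be sketched in a handful of lines of base R. This is only an illustrative sketch: the `scores` data frame and its values are made up for the example and do not come from any of the books.

```r
# Matrix operators: %*% is matrix multiplication, t() transposes.
m <- matrix(1:4, nrow = 2)      # filled column by column
gram <- t(m) %*% m

# A small, made-up data frame with one missing value (NA).
scores <- data.frame(
  name  = c("Ann", "Bob", "Cy"),
  score = c(90, NA, 72)
)

# Missing values propagate unless you ask for them to be dropped.
mean(scores$score)                               # NA
mean_score <- mean(scores$score, na.rm = TRUE)   # 81

# Subsetting: keep the rows with a known score above 80.
high <- scores[!is.na(scores$score) & scores$score > 80, ]

# Functional style: apply an anonymous function to each element.
squares <- sapply(1:3, function(x) x^2)          # 1 4 9
```

None of this needs an extra package, which is part of what "deep-seated language support" means in practice.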

Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. The CRAN package repository hosts over 14,000 packages, and Bioconductor is home to over 1,600 packages.

This article recommends 33 free books which will teach you the basics of R, how to produce amazing plots, how to apply R to lots of disciplines, and how to efficiently program in R. Many of the books are open source.

If you’re new to R, we strongly recommend reading our interactive tutorial: Introduction to R and RStudio for Data Science. It focuses on a common task in data science: import a data set, manipulate its structure, and then visualise the data. We use R and RStudio to accomplish this task.
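
To give a flavour of that import, manipulate, visualise workflow, here is a minimal sketch in base R. It is not taken from the tutorial: the tiny `weather` data set, its column names, and the choice of a bar chart are all assumptions made for illustration.

```r
# Import: read CSV data. Inline text is used here so the example is
# self-contained; read.csv("file.csv") works the same way for a file.
csv_text <- "city,temp_f
Oslo,41
Cairo,95
Lima,66"
weather <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Manipulate: add a Celsius column, then sort from warmest to coolest.
weather$temp_c <- round((weather$temp_f - 32) * 5 / 9, 1)
weather <- weather[order(-weather$temp_c), ]

# Visualise: a simple bar chart of temperature by city.
barplot(weather$temp_c, names.arg = weather$city,
        ylab = "Temperature (degrees C)")
```

Packages such as dplyr and ggplot2 offer more expressive ways to do the same steps, but the overall shape of the task stays the same.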

1. R for Data Science by Hadley Wickham & Garrett Grolemund

R for Data Science is the ideal introductory text for learning about what R can do. In fact, we’d go as far as to say it’s the best introductory book for budding R data scientists. It teaches you the basics, including good practices for writing and organizing your R code, and introduces RStudio, a powerful IDE. The focus of this book is on exploration, not confirmation or formal inference.

If you’re looking to grasp how to make simple and elegant plots in R, learn how to transform data, and embark on some data analysis, this is definitely your starting text.

There’s particularly good coverage of data wrangling, and you’ll master the basics of data frames, data importing, and tidy data.

Hadley Wickham has graciously made this book available online. It’s released under an open source license. The book is so good that you’ll probably want to purchase the paperback version.

Read the book

2. Introduction to Data Science by Rafael A Irizarry


This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It’s an exceptionally good read covering concepts from probability, statistical inference, linear regression and machine learning.

It also helps you develop skills such as R programming, data wrangling with dplyr, data visualization with ggplot2, algorithm building with caret, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation with knitr and R markdown.

The book includes dozens of exercises to test whether you have understood the material.

Its suggested price is $49.99, but the book can be downloaded without charge, and it’s released under an open source license.

Read the book

3. Hands-On Programming with R by Garrett Grolemund

As the title suggests, Hands-On Programming with R teaches you how to program in R. It’s expertly crafted and includes hands-on examples.

The book teaches you how to load data, assemble and disassemble data objects, navigate R’s environment system, write your own functions, and use all of R’s programming tools.

The book is released under an open source license.

Read the book

4. ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham

ggplot2 is a widely acclaimed data visualization package for the statistical programming language R. The package lets you create beautiful new plots. We use ggplot2 extensively for our Group Tests charts.

ggplot2 was created by Hadley Wickham, so it’s not surprising that we recommend his ggplot2: Elegant Graphics for Data Analysis book. It expertly teaches you the elements of ggplot2’s grammar and how they fit together. This book helps you understand the theory that underpins ggplot2, and will help you create new types of graphics specifically tailored to your needs.

You can grab the code and text behind the ggplot2 book. ggplot2’s reference website is a welcome resource once you’ve mastered the basics.

Read the book

5. Data Visualization: A practical introduction by Kieran Healy


Data Visualization: A practical introduction offers students and researchers a hands-on introduction to the principles and practice of data visualization. No knowledge of R is assumed.

Data Visualization builds the reader’s expertise in ggplot2, an excellent visualization library for the R programming language. Through a series of worked examples, this accessible primer demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. You’ll learn how to produce and refine plots, and the worked examples are a real godsend.

Topics include plotting continuous and categorical variables; layering information on graphics; producing effective “small multiple” plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible.

Kieran Healy is associate professor of sociology at Duke University.

Read the book

