Getting Started

The world confronts us. Make decisions we must.

Installing R and RStudio

We use R via RStudio. R is to RStudio as a car’s engine is to its dashboard.

More precisely, R is a programming language that runs computations, while RStudio is an integrated development environment (IDE) that provides an interface with many convenient features. Just as having access to a speedometer, rearview mirrors, and a navigation system makes driving much easier, using RStudio’s interface makes using R much easier.

Download and install R and RStudio (Desktop version) on your computer.

  1. Download and install R. If you are using a Mac, make sure to use the correct installation, depending on whether you are using newer “M1 and higher Macs” (first option) or older “Intel Macs” (second option). You can look up the chip used in your Mac by checking “About This Mac” under the Apple menu.

  2. Download and install RStudio Desktop (the free version).

If you are using Windows, download and install R Tools. Again, this is only for Windows users. The Rtools42 installer is what you want. The link is in the fifth paragraph of that page.

If you are using a Chromebook, follow this advice.

Using R via RStudio

Icons of R versus RStudio on your computer.

Much as we don’t drive a car by interacting directly with the engine but rather by interacting with elements on the car’s dashboard, we won’t be using R directly. Instead, we will use RStudio’s interface. After you install R and RStudio on your computer, you’ll have two new programs (also called applications) you can open. Always work in RStudio and not directly in the R application.

Open up RStudio. You should see three panes dividing the screen: the Console pane, the Environment pane, and the Files pane.

This is your workspace. Start with the big pane on the left:

There are three tabs in the Console pane. We’ll be focusing on the Console and Terminal tabs. When you first start R, the Console gives you some information about your version of R. The Console is where you can type and run R code. For example, if you type 1 + 1 and hit return, the Console returns 2.

We often use the phrase “run the following code.” This means that you should type or copy-and-paste the code into the Console and then hit the enter/return key.

Look at the top right:

The main two tabs we will be using are Environment and Git (which is not yet visible). The Environment tab shows you the datasets and variables you currently have loaded into R. In this case, we loaded in a dataset with 3407 rows and 5 columns and a variable x equal to 5. For you, the Environment should be empty. Let’s change that. Go to your Console and type:

x <- 5

This assigned the value 5 to an object, x. <- is the operator used to assign values to objects in R. Now, hit return/enter and you should see a variable x equal to 5 in your Environment tab. You must always hit return/enter after typing a command, otherwise RStudio will not realize that you want R to execute the command.

Look at the bottom right pane:

The Files tab displays your computer’s file system. When you create a project later, this tab will automatically show the contents of your project’s folder.

Package installation

R packages, also known as libraries, extend the power of R by providing additional functions and data.

R is to R packages as a new phone is to apps for that phone.

R is like a new phone. While it has a certain amount of features when you use it for the first time, it doesn’t have everything. R packages are like the apps you download onto your phone.

Consider an analogy to Instagram. If you have a new phone and want to share a photo with friends. You need to:

  1. Install the app: Since your phone is new and does not include the Instagram app, you need to download the app. You do this only once. (You might need to do this again in the future when there is an update to the app.)
  2. Open the app: After you’ve installed Instagram, you need to open it. You need to do this every time you use the app.

The process is very similar for an R package. You need to:

Installing versus loading an R package

  1. Install the package: This is like installing an app on your phone. Most packages are not installed by default when you install R and RStudio. Thus if you want to use a package for the first time, you need to install it. Once you’ve installed a package, you likely won’t install it again unless you want to update it to a newer version.

  2. Load the package: “Loading” a package is like opening an app on your phone. Packages are not loaded by default when you start RStudio. You need to load each package you want to use every time you restart RStudio.

Before installing packages, issue this command:

options(pkgType = "binary")

Doing so just makes various errors less likely. Do NOT do this is if you are installing on a Linux machine, e.g., a Chromebook or on a service like Posit Cloud.

Let’s install a useful package. At the Console pane within RStudio, type:

install.packages("remotes")

And press return/enter on your keyboard. You must include the quotation marks around the name of the package. A package can depend on other packages, which will be automatically installed if needed.

One tricky aspect of this process is that R will occasionally ask you:

Do you want to install from sources the packages which 
need compilation? (Yes/no/cancel)

Unless you have a good reason not to, always answer “no” to this question.

R packages generally live in one of two places:

Package loading

After you install a package, you need to “load” it by using the library() command. To load the remotes package, run the following code in the Console.

After running this code, a blinking cursor should appear next to the > symbol. (The > is the “prompt.”) This means you were successful and the remotes package is now loaded and ready to use. However, you might get a red “error message” which reads:

Error in library(remotes) : there is no package called ‘remotes’

This error message means that you haven’t successfully installed the package. If you get this error message, make sure to install the remotes package before proceeding.

For historical reasons, packages are also known as libraries, which is why the relevant command for loading them is library().

R will occasionally ask you if you want to install some packages. You almost always want to, otherwise R would not be asking you.

Package use

You have to load each package you want to use every time you start RStudio. If you don’t load a package before attempting to use one of its features, you will see an error message like:

Error: could not find function

This is a different error message than the one you just saw about a package not having been installed yet. R is telling you that you are trying to use a function in a package that has not yet been loaded. R doesn’t know where to “find” the function you want to use.

Let’s install a package that is not available from CRAN: all.primer.tutorials. Copy and paste the following to the R Console:

library(remotes)
install_github("PPBDS/all.primer.tutorials")

Many other new packages will be installed, including primer.data, which provides the data sets we use in the Primer. It may take a few minutes. If something gets messed up, it is often useful to read the error messages and see if a specific package is at fault. If so, use the remove.packages() function to remove the problematic package and then install it again.

Once you have primer.tutorials installed, run these commands.

library(all.primer.tutorials)
prep_rstudio_settings()

The prep_rstudio_settings() function sets various R/RStudio options to convenient values.

Tutorials

For each chapter of the textbook, there are one or more tutorials available in the all.primer.tutorials package. In order to access these tutorials, you should run library(all.primer.tutorials) in the R Console.

Access the tutorials via the Tutorial tab in the Environment pane in RStudio. If you don’t see any tutorials, try clicking the “Home” icon – the little house symbol with the thin red roof. You may need to restart your R session. Click on the “Session” menu and select “Restart R”.

In order to expand the window, you can drag and enlarge the Tutorial tab inside RStudio. In order to open a pop-up window, click the “Show in New Window” icon next to the Home icon.

You may notice that the Background Jobs tab in the Console pane will create output as the tutorial is starting up. This is because RStudio is running the code to create the tutorial.

Your work will automatically be saved between RStudio sessions. You can complete the tutorial in multiple sittings.

There are two basic ways you can close out of a tutorial safely before quitting your RStudio session:

  • If you clicked “Show in new window” and were working on the tutorial in a pop-up window, simply close the pop-up window.

  • If you were working on the tutorial inside the Tutorial pane of RStudio, simply press the red stop sign icon.

Basic Commands

There are a few terms, tips, and tricks that you should know before getting started with R.

  • Functions: these perform tasks by taking an input called an argument and returning an output.
sqrt(64)
[1] 8

sqrt() is a function that gives us the square root of the argument. 64 is the argument. Therefore, the output should be 8. Try it for yourself in the Console!

  • Help files: these provide documentation for packages, functions and datasets. You can bring up help files by adding a ? before the name of the object in the Console. Run ?sqrt.

  • Errors, warnings, and messages: these are important communications from R to you. When there is an error, the code will not run. Read (and/or Google) the message and try to fix it. Warnings don’t prevent code from completing. For example, if you create a scatterplot based on data with two missing values, you will see this warning:

    Warning message: Removed 2 rows containing missing values (geom_point).

Messages are similar. In both cases, you should fix the underlying issue until the error/warning/message goes away.

Summary

You should have done the following:

  • Installed the latest versions of R and RStudio.

  • Run, from the Console:

options(pkgType = "binary")
  • Installed, from Github, the all.primer.tutorials package:
remotes::install_github("PPBDS/all.primer.tutorials")
  • Run this command to set some RStudio options:
all.primer.tutorials::prep_rstudio_settings()
  • Learned some basic terminology that will help you when you read the rest of the Primer.

Let’s get started.