CADDIS Volume 4: Data Analysis

R Command Line Tutorial

R is free statistical software that provides access to a broad array of statistical tools. This page provides a brief outline of some commands that will help users begin to work with this software.

Workspace

Before beginning any computations, it is helpful to first set up a working directory. Using Windows Explorer (or any other comparable method), make a new folder for storing your work. Then, after launching R, select File: Change dir...

Navigate to the folder that you just created and select it. R will now store your working data in this directory. It also will automatically look in this directory for scripts and data that you wish to import. Example data will be used to demonstrate most of the scripts in this section. These data can be obtained from the sample data section and should be stored in your working directory.
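
The same setup can also be done from the R Console instead of the menu. The following is a minimal sketch: the folder name caddis-work is a hypothetical placeholder, and it is created under R's temporary directory only so that the example is safe to run anywhere. Substitute the path to the folder you created.

```r
# Hypothetical folder name; substitute the folder you created for your work
work.dir <- file.path(tempdir(), "caddis-work")
dir.create(work.dir, showWarnings = FALSE)

old.dir <- setwd(work.dir)   # Change the working directory; keep the old one
getwd()                      # Show the current working directory
files.here <- list.files()   # Files that R can see in this directory
setwd(old.dir)               # Return to the original directory
```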

Basic syntax

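Note that in this and all subsequent sections, R commands can be run by copying and pasting text directly into the R Console window. Variable names in R can be composed of letters, numbers, underscores, and periods, and must begin with a letter (or with a period that is not followed by a number). They are case sensitive, so each of the following names refers to a different variable: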

x, y, X, Y, flow.rate
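
As a brief illustration of case sensitivity, the sketch below assigns values to names that differ only in capitalization or punctuation. The names and values are made up for this example.

```r
flow.rate  <- 2.5   # Period in the name
Flow.Rate  <- 10    # Different capitalization, so a different variable
flow_rate2 <- 7     # Underscore and number in the name

same.value <- identical(flow.rate, Flow.Rate)   # FALSE: two separate variables
```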

To assign a value to a variable, use the assignment operator, <-.

x <- 1                      # Assign a single value to the variable x
x <- c(1, 3, 2)             # Assign a vector of numbers to x
x <- c(TRUE, FALSE, TRUE)   # Assign a vector of logical values to x
x <- list(colors = c("red", "blue", "black"), numbers = c(1, 3))
                            # Assign a list of dissimilar objects to x
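
To check what kind of object an assignment produced, functions such as class(), length(), and str() can be used. The sketch below repeats the assignments above under separate, illustrative variable names so that each object remains available for inspection.

```r
single  <- 1
numbers <- c(1, 3, 2)
flags   <- c(TRUE, FALSE, TRUE)
mixed   <- list(colors = c("red", "blue", "black"), numbers = c(1, 3))

class(numbers)   # "numeric"
class(flags)     # "logical"
class(mixed)     # "list"
length(mixed)    # 2: the list holds two named elements
mixed$colors     # Extract one element of the list by name
str(mixed)       # Compact summary of the structure of any object
```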

The value of any variable can be examined by typing the variable name, or by using the print command:

x
print(x)

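Simple mathematical and statistical operations can be performed on numerical vectors. Arithmetic operators work element by element, so in the commands below x and y are assumed to be numerical vectors of the same length (for example, x <- c(1,3,2) and y <- c(4,5,6)).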

x + y 		# Addition
x - y		# Subtraction
x * y		# Multiplication
x / y		# Division
mean(x)		# Arithmetic mean
var(x)		# Variance
sum(x)		# The sum of all the elements of x
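
As a worked example, the following sketch applies these operations to two small vectors. The values are chosen only for illustration.

```r
x <- c(1, 3, 2)
y <- c(4, 5, 6)

x + y      # 5 8 8
x - y      # -3 -2 -4
x * y      # 4 15 12
x / y      # 0.25 0.60 0.3333...
mean(x)    # 2
var(x)     # 1 (sample variance, which divides by n - 1)
sum(x)     # 6
```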

The most commonly used format for storing data is the data frame, which is a list of vectors of the same length, one for each column. Data frames allow one to combine logical, numerical, and factor data in a single data structure.

site.name <- factor(c("A", "B", "C", "D"))     # A site label stored as a
                                               #   factor
pH <- c(7.6, 6.0, 4.0, 8.2)                    # Site pH stored as a
                                               #   numerical vector
abund.baetis <- c(103, 204, 602, 301)          # Baetis abundance stored as
                                               #   a numerical vector
sampled.spring <- c(TRUE, TRUE, FALSE, TRUE)   # Sampling season stored as a
                                               #   logical vector
all.data <- data.frame(site.name, pH, abund.baetis, sampled.spring)
                                               # All data combined together
                                               #   as a data frame
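
Once a data frame exists, a few commands give a quick overview of its contents. The sketch below rebuilds the same example data in a single call so that it runs on its own, and converts the site labels with factor() explicitly so that they are stored as a factor regardless of R version.

```r
all.data <- data.frame(
  site.name      = factor(c("A", "B", "C", "D")),
  pH             = c(7.6, 6.0, 4.0, 8.2),
  abund.baetis   = c(103, 204, 602, 301),
  sampled.spring = c(TRUE, TRUE, FALSE, TRUE)
)

nrow(all.data)      # 4 rows, one per site
ncol(all.data)      # 4 columns, one per variable
names(all.data)     # Column names
str(all.data)       # Type of each column
summary(all.data)   # Basic summary of each column
```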

Elements of a vector can be referred to in various ways.

x[1] 		# The first element of the vector x
x[1:3]		# The first three elements of vector x
x[c(TRUE,TRUE,FALSE)]  	# The first two elements of x (assuming that x
                        #   has three elements)
x[-1] 		# All of x except for the first element
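
With a concrete vector, the results of these indexing forms look as follows. The last line is an additional, commonly used form that selects elements by a logical condition. The values are illustrative only.

```r
x <- c(1, 3, 2)

x[1]                        # 1
x[1:3]                      # 1 3 2
x[c(TRUE, TRUE, FALSE)]     # 1 3
x[-1]                       # 3 2
x[x > 1]                    # 3 2: keeps the elements greater than 1
```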

We can also refer to different subsets of a data frame in various ways.

all.data$pH             # The column labeled "pH" from the data frame
                        #   all.data
all.data[, "pH"]        # The same column labeled "pH"
all.data[, 2]           # The second column of the data frame
all.data[1, ]           # The first row of the data frame
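
Row and column selection can be combined with logical conditions to pull out subsets of interest. The sketch below rebuilds the example data frame so that it runs on its own; the variable names high.pH and spring.abund are introduced here only for illustration.

```r
all.data <- data.frame(
  site.name      = factor(c("A", "B", "C", "D")),
  pH             = c(7.6, 6.0, 4.0, 8.2),
  abund.baetis   = c(103, 204, 602, 301),
  sampled.spring = c(TRUE, TRUE, FALSE, TRUE)
)

all.data$pH                                  # 7.6 6.0 4.0 8.2
high.pH <- all.data[all.data$pH > 7, ]       # Rows for sites A and D
spring.abund <- all.data[all.data$sampled.spring, "abund.baetis"]
                                             # 103 204 301
mean(all.data$pH)                            # 6.45
```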

Within R, you can access the help page for a particular command by typing:

help(<command name>)

For example:

help(glm)
help(mean)
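
A few related commands are also useful when exploring unfamiliar functions. The sketch below uses only functions from the standard R installation; the variable names glm.args and found are introduced for this example.

```r
?mean                       # Shorthand for help(mean)
glm.args <- args(glm)       # The arguments that a function accepts
glm.args                    # Print the argument list
example(mean)               # Run the examples from a help page
found <- apropos("mean")    # Names of available objects that contain "mean"
found
```

When the name of a command is not known, apropos() is often a good place to start, because it searches object names by partial match.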
