Working with simple data

Interacting with R

What will happen when you enter these commands? Try it out.

1 + 2
2 * 3
4 ^ 5
6.7 / 8.9

Saving data in variables

a <- 10
a
## [1] 10
b <- a + 11
b
## [1] 21
c <- a / b
c
## [1] 0.4761905

If you're working with very large numbers, you can use scientific notation:

2e10
## [1] 2e+10
2 * 10^10
## [1] 2e+10
2e10 == 2 * 10^10
## [1] TRUE

Everything is a vector

You can see how many elements a vector holds using the length function:

length(10)
## [1] 1
length(c)
## [1] 1
length(1:10)
## [1] 10

Composing vectors

c(1,2,3,4)
## [1] 1 2 3 4
d <- c(5,6,7,8)
d + 10
## [1] 15 16 17 18
d + d
## [1] 10 12 14 16
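
The arithmetic above works element by element, and a shorter vector is reused ("recycled") until it matches the longer one. The sketch below illustrates this with a length-1 and a length-2 vector; the names plus_ten and scaled are made up for the example.

```r
# Same vector as in the example above
d <- c(5, 6, 7, 8)

# A length-1 vector is recycled across every element of d
plus_ten <- d + 10         # behaves like d + c(10, 10, 10, 10)

# A length-2 vector is recycled twice to match the length of d
scaled <- d * c(1, 100)    # behaves like d * c(1, 100, 1, 100)

print(plus_ten)   # 15 16 17 18
print(scaled)     # 5 600 7 800
```

Recycling is convenient for scalars, but it is easy to trigger by accident when two vectors were meant to have the same length.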

Strings

To create strings, surround your text with either double " ... " or single ' ... ' quotes:

"a"
## [1] "a"
"a" == 'a'
## [1] TRUE
c( "a", "b", "c", "d" )
## [1] "a" "b" "c" "d"

Escape characters

s <- "My data are \"awesome\"!"
cat(s)
## My data are "awesome"!

Two other special string characters are tab \t and newline \n:

s <- "a\tb\tc"
cat(s)
## a    b   c
s <- "a\nb\nc"
cat(s)
## a
## b
## c
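
The examples above use cat() rather than just typing the variable name. A small sketch of the difference, assuming base R only: print() shows the escape sequence as you typed it, while cat() writes the actual character. The variable n is just for illustration.

```r
s <- "a\tb"

# print() displays the string with its escape sequence visible
print(s)

# cat() writes the real tab character to the console
cat(s, "\n", sep = "")

# An escape sequence such as \t counts as a single character
n <- nchar(s)
print(n)   # 3
```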

Boolean values

To create Boolean values use TRUE or FALSE:

TRUE
## [1] TRUE
FALSE
## [1] FALSE
TRUE == FALSE
## [1] FALSE

Missing data

c(1, 2, NA, 4)
## [1]  1  2 NA  4
c( "a", NA, "c", NA )
## [1] "a" NA  "c" NA
c(TRUE, FALSE, NA, FALSE)
## [1]  TRUE FALSE    NA FALSE
is.na( c(1, NA) )
## [1] FALSE  TRUE
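
The reason is.na() exists is that comparing anything with NA gives NA rather than TRUE or FALSE. The sketch below shows this, along with the na.rm argument that some base R functions such as sum() accept; the variable names are invented for the example.

```r
x <- c(1, 2, NA, 4)

# Comparing with NA does not find the missing value; every result is NA
compared <- x == NA
print(compared)           # NA NA NA NA

# is.na() is the reliable way to locate missing values
flags <- is.na(x)
print(flags)              # FALSE FALSE TRUE FALSE

# Missing values propagate through arithmetic unless they are dropped
total_with_na <- sum(x)
total_without_na <- sum(x, na.rm = TRUE)
print(total_with_na)      # NA
print(total_without_na)   # 7
```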

A note about NULL

NULL represents the absence of a value (an empty object), which is not the same thing as the missing value NA:

NULL
## NULL
is.null(NULL)
## [1] TRUE
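
One way to see how NULL differs from NA is to look at lengths. This is a minimal sketch using only base R; the names combined and with_na are illustrative.

```r
# NULL holds nothing at all, while NA is a single (missing) value
print(length(NULL))   # 0
print(length(NA))     # 1

# NULL disappears when vectors are combined; NA keeps its slot
combined <- c(1, NULL, 3)
with_na <- c(1, NA, 3)
print(combined)   # 1 3
print(with_na)    # 1 NA 3
```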

Indexing syntax in R

Extracting values

Let's say we have a vector of numbers:

myNumbers <- c( 10, 20, 30, 40, 50 )

We can extract elements from 1D vectors using the index syntax [] and integers:

myNumbers
## [1] 10 20 30 40 50
myNumbers[1]
## [1] 10
myNumbers[3]
## [1] 30

We can also put an integer vector with more than one element inside the index [...]:

myNumbers[ c(1, 3) ]
## [1] 10 30

You can use the : operator to easily create a sequence of numbers:

2:4
## [1] 2 3 4
myNumbers[2:4]
## [1] 20 30 40

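We can also index with a logical vector of the same length; only the elements at TRUE positions are kept:

myNumbers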
## [1] 10 20 30 40 50
myNumbers[ c(FALSE, TRUE,  TRUE,  TRUE,  TRUE ) ]
## [1] 20 30 40 50
myNumbers[ c(TRUE,  FALSE, FALSE, FALSE, FALSE) ]
## [1] 10

Comparison operators return a logical vector, with one value per element compared:

myNumbers > 25
## [1] FALSE FALSE  TRUE  TRUE  TRUE
myNumbers < 25
## [1]  TRUE  TRUE FALSE FALSE FALSE
myNumbers == 30
## [1] FALSE FALSE  TRUE FALSE FALSE
myNumbers != 30
## [1]  TRUE  TRUE FALSE  TRUE  TRUE
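
Logical vectors like these can be summarised as well as printed. The sketch below uses two base R functions that do not appear elsewhere in these notes, sum() and which(), to count the TRUE values and to find their positions.

```r
myNumbers <- c(10, 20, 30, 40, 50)
above <- myNumbers > 25

# In arithmetic, TRUE counts as 1 and FALSE as 0, so sum() counts the TRUEs
n_above <- sum(above)
print(n_above)     # 3

# which() returns the integer positions of the TRUE values
positions <- which(above)
print(positions)   # 3 4 5
```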

The %in% operator asks, for each element of the first vector, whether it can be found in the second:

30 %in% myNumbers
## [1] TRUE
c(10, 100) %in% myNumbers
## [1]  TRUE FALSE

The ! operator negates (flips) each value of a logical vector:

!TRUE
## [1] FALSE
!(myNumbers > 25)
## [1]  TRUE  TRUE FALSE FALSE FALSE

So how can we combine logical comparisons with indexing?

myNumbers[myNumbers > 25]
## [1] 30 40 50
myNumbers[myNumbers < 25]
## [1] 10 20

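You can get fancy. Here %% is the remainder operator, so the condition keeps the even numbers (which, in this case, is all of them):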

myNumbers[ (myNumbers %% 2) == 0 ]
## [1] 10 20 30 40 50
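
Several conditions can also be combined inside one index. The following sketch assumes the element-wise operators & (and) and | (or), which are part of base R but are not introduced above; the result names are made up.

```r
myNumbers <- c(10, 20, 30, 40, 50)

# Both conditions must hold
between <- myNumbers[myNumbers > 15 & myNumbers < 45]
print(between)      # 20 30 40

# Either condition may hold
extremes <- myNumbers[myNumbers < 15 | myNumbers > 45]
print(extremes)     # 10 50

# %in% and ! work inside an index too
picked <- myNumbers[myNumbers %in% c(20, 50, 99)]
not_picked <- myNumbers[!(myNumbers %in% c(20, 50, 99))]
print(picked)       # 20 50
print(not_picked)   # 10 30 40
```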

Assigning values

myNumbers
## [1] 10 20 30 40 50
myNumbers[3]    <- 100
myNumbers
## [1]  10  20 100  40  50
myNumbers[2:3]  <- c(1,2)
myNumbers
## [1] 10  1  2 40 50
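
Assignment works with logical indexes as well as integer ones. The sketch below uses fresh vectors (the names capped and extended are hypothetical) so that it does not depend on the values left over from the example above.

```r
# Replace every element that satisfies a condition
capped <- c(10, 20, 30, 40, 50)
capped[capped > 25] <- 0
print(capped)     # 10 20 0 0 0

# Assigning past the end extends the vector and fills the gap with NA
extended <- c(10, 20, 30)
extended[5] <- 50
print(extended)   # 10 20 30 NA 50
```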

Bigger data structures

Matrix

A matrix is a vector arranged in two dimensions (rows and columns); every element has the same type of data:

m <- matrix(1:8, nrow = 2, ncol = 4)
m
##      [,1] [,2] [,3] [,4]
## [1,]    1    3    5    7
## [2,]    2    4    6    8

You can access values in a matrix with a single index, which refers to the n-th element counting down the columns:

m[2]
## [1] 2
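
To make the counting order concrete, here is a short sketch. It assumes the byrow argument of matrix(), which is part of base R but is not used above; the variable names are illustrative.

```r
m <- matrix(1:8, nrow = 2, ncol = 4)

# Single indexes run down column 1, then column 2, and so on
third <- m[3]
print(third)   # 3, the first element of column 2

# Flattening the matrix shows the underlying order
flat <- as.vector(m)
print(flat)    # 1 2 3 4 5 6 7 8

# Filling by row changes where each value lands, not how indexing counts
by_row <- matrix(1:8, nrow = 2, ncol = 4, byrow = TRUE)
second_by_row <- by_row[2]
print(second_by_row)   # 5, the element in row 2 of column 1
```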

Alternatively you can specify a [row, col]:

m[1, 2]
## [1] 3

Or just a row:

m[1, ]
## [1] 1 3 5 7

Or just a column:

m[, 2]
## [1] 3 4

If you forget this syntax, just pay attention to how R prints out matrices!
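
The same [row, col] syntax accepts vectors in either position. The sketch below also uses dim() and the drop argument, both from base R but not covered above, so treat it as an optional extension.

```r
m <- matrix(1:8, nrow = 2, ncol = 4)

# dim() reports the number of rows and columns
print(dim(m))   # 2 4

# A range in the column position returns a smaller matrix
sub <- m[, 2:3]
print(sub)

# A single column normally simplifies to a vector;
# drop = FALSE keeps it as a one-column matrix
col_kept <- m[, 2, drop = FALSE]
print(dim(col_kept))   # 2 1
```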

Array

array(1:8, dim = c(2, 2, 2))
## , , 1
## 
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
## 
## , , 2
## 
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
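
Arrays are indexed like matrices, with one position per dimension. A minimal sketch, using the same array as above stored in a variable named arr (a name chosen for this example):

```r
arr <- array(1:8, dim = c(2, 2, 2))

# [row, col, slice]: row 1, column 2 of the second slice
value <- arr[1, 2, 2]
print(value)   # 7

# Leaving positions empty keeps that whole dimension;
# here the result is the first 2 x 2 slice
first_slice <- arr[, , 1]
print(first_slice)
```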

Lists

l <- list( a = c(1, 2, 3, 4), b = c("a", "b", "c") )
l
## $a
## [1] 1 2 3 4
## 
## $b
## [1] "a" "b" "c"

You can access individual vectors in a list by indexing with numbers or names:

l[1]
## $a
## [1] 1 2 3 4
l["a"]
## $a
## [1] 1 2 3 4

Did you notice what type of thing was returned there?

To simplify the result of indexing down to a vector (rather than a one-element list), use double brackets [[ ]] or the $ operator:

l[[1]]
## [1] 1 2 3 4
l[["a"]]
## [1] 1 2 3 4
l$a
## [1] 1 2 3 4
l$a == l[["a"]]
## [1] TRUE TRUE TRUE TRUE
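
If the difference between single and double brackets is hard to see from the printed output, class() and length() make it explicit. This is a sketch using only base R; the variable names are invented for the example.

```r
l <- list(a = c(1, 2, 3, 4), b = c("a", "b", "c"))

# Single brackets return a list that contains the selected element
single <- l["a"]
print(class(single))    # "list"
print(length(single))   # 1

# Double brackets return the element itself
double <- l[["a"]]
print(class(double))    # "numeric"
print(length(double))   # 4
```

Single brackets are the right choice when you want a smaller list, for example l[c("a", "b")]; double brackets and $ are for pulling out one element to work with directly.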

Make sure you understand why this fails:

l$a == l["a"]

After class