The data

Load your data here:

# Enter your code here!
##   derisi.05.5.txt derisi.09.5.txt derisi.11.5.txt derisi.13.5.txt
## 1       0.1647863       0.2714257      0.04684025      0.42867841
## 2       0.1003049       0.2363395     -0.10935876      0.14925937
## 3       0.1492594       0.4688439     -0.01449957      0.14404637
## 4      -1.1909972              NA     -0.96296927     -0.51045706
## 5      -0.7417826       0.2265085     -0.98564471     -0.04841221
## 6      -0.3510744       0.4254593     -0.57346686     -0.52076944
##   derisi.15.5.txt derisi.18.5.txt derisi.20.5.txt
## 1      0.14274017      -0.2108968       0.1826923
## 2      0.11103131      -0.1408255       0.2521734
## 3      0.07176267       0.1150332       0.2141248
## 4     -2.39592868              NA      -0.9378783
## 5     -1.27578631      -0.1942948              NA
## 6      0.40490312       0.3196179       0.2485348
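
The output above looks like the first rows of a data frame with one column per array. As a minimal sketch of one way to get there, assuming the values live in a single tab-delimited file (the temporary file and its toy values below are made up for illustration and are not the real data set):

```r
# Hypothetical input: write a tiny tab-delimited table to a temp file
# so that this sketch runs on its own. Replace `path` with your real file.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "derisi.05.5.txt\tderisi.09.5.txt",
  "0.1\t0.2",
  "-1.0\tNA",
  "-0.5\t0.3"
), path)

# read.delim() expects tab-separated columns with a header row by default
data <- read.delim(path, header = TRUE, na.strings = "NA")

head(data)            # first rows, as in the output shown above
colSums(is.na(data))  # number of missing values per array
```

Counting the missing values per array early is useful because the clustering example further down drops every row that contains an NA.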

Explore the raw data

For example:

library(ggplot2)
library(tidyr)

ggplot(gather(data, key = "array", value = "log2.ratio")) +
  geom_density(aes(log2.ratio, color = array))

With your data:

# Enter your code here!

Create a hierarchical cluster

Here are a few tricks for tweaking the clusters you make with the built-in heatmap function:

library(gplots)
heats <- colorpanel( n    = 20
                   , low  = "royalblue3"
                   , mid  = "black"
                   , high = "yellow"
                   )

Make a cluster using heatmap!

Example:

m <- as.matrix(data)

# heatmap requires that there be no missing values, so we'll prune rows with NAs
m <- na.omit(m)

# Depending on the size of your data, this might take a while
heatmap(m, col = heats, scale = "none", labRow="", labCol="",  zlim = c(-4, 4), na.rm = TRUE)
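
To see what heatmap does under the hood, the clustering can also be built by hand. By default heatmap computes distances with dist and clusters them with hclust, then reorders the branches by row and column means. The sketch below uses a random toy matrix as a stand-in for the real data (an assumption, so that it runs on its own):

```r
set.seed(1)
# Toy stand-in for the real matrix: 30 genes x 7 arrays of random values
m <- matrix(rnorm(30 * 7), nrow = 30,
            dimnames = list(paste0("gene", 1:30), paste0("array", 1:7)))

row_dist  <- dist(m)             # Euclidean distances between rows (genes)
row_clust <- hclust(row_dist)    # complete linkage is the hclust default
col_clust <- hclust(dist(t(m)))  # same thing for the columns (arrays)

# Look at the row tree on its own
plot(row_clust, labels = FALSE, main = "Row dendrogram")

# Hand the trees to heatmap; dendrograms passed in this way are used as given
heatmap(m,
        Rowv  = as.dendrogram(row_clust),
        Colv  = as.dendrogram(col_clust),
        scale = "none")
```

Building the trees yourself makes it easy to swap in another distance or linkage, for example hclust(row_dist, method = "average").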

Your data:

# Enter your code here!

Correlation matrix

For example:

interactionMatrix <- cor(m, use = 'pairwise.complete.obs')
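
The method argument of cor selects the correlation coefficient. A minimal sketch comparing Pearson and Spearman on a random toy matrix (the matrix is an assumption made for illustration, not the real data):

```r
set.seed(1)
# Toy stand-in: 50 genes x 4 arrays
m <- matrix(rnorm(50 * 4), nrow = 50,
            dimnames = list(NULL, paste0("array", 1:4)))
m[3, 2] <- NA  # one missing value, to exercise the pairwise handling

# Each pair of columns uses only the rows where both values are present
pearson  <- cor(m, use = "pairwise.complete.obs", method = "pearson")
spearman <- cor(m, use = "pairwise.complete.obs", method = "spearman")

# How much the two coefficients differ for each pair of arrays
round(pearson - spearman, 3)
```

Note that if the rows with NAs have already been removed with na.omit, the use argument makes no difference to the result.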

Try it:

# Enter your code here!

For example, to plot the correlation matrix as a grid of tiles:

interactionTable            <- reshape2::melt(interactionMatrix)
colnames(interactionTable)  <- c( "array.1"
                                , "array.2"
                                , "value"
                                )

ggplot( interactionTable
      , aes(array.1, array.2, fill = value)
      ) + geom_tile() + labs(x = "Arrays", y = "Arrays")

# Enter your code here!

Now experiment!

  • Explore the effects of using different methods to calculate the correlation matrix (the default is Pearson).
  • Explore the data that are invisibly returned from a call to heatmap (hint: the value is invisible, but it can still be captured in a variable).
  • Use cutree to perform node trimming on your cluster.
  • Create "zoomed in" views of subclusters with data subsetting.
  • Play with the effects of scaling.
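
As a starting point for the experiments above, here is a minimal sketch that captures the return value of heatmap, cuts the row tree with cutree, and zooms in on one subcluster. The toy matrix with two obvious groups of rows is an assumption made for illustration:

```r
set.seed(1)
# Toy matrix: 15 rows centred on -2 and 15 rows centred on +2
m <- rbind(matrix(rnorm(15 * 6, mean = -2), nrow = 15),
           matrix(rnorm(15 * 6, mean =  2), nrow = 15))
rownames(m) <- paste0("gene", seq_len(nrow(m)))
colnames(m) <- paste0("array", seq_len(ncol(m)))

# heatmap returns its result invisibly; assign it to keep it
hm <- heatmap(m, scale = "none", keep.dendro = TRUE)
names(hm)        # row/column orderings, plus the dendrograms kept on request
head(hm$rowInd)  # row indices in the order they were drawn

# Cut the row tree into k groups
row_clust <- hclust(dist(m))
groups    <- cutree(row_clust, k = 2)
table(groups)

# Zoom in: redraw only the rows that fell into the first group
sub <- m[groups == 1, , drop = FALSE]
heatmap(sub, scale = "none")

# Scaling: standardise each row before colouring
heatmap(m, scale = "row")
```

cutree also accepts a height with h = instead of a number of groups with k =, which cuts the tree at a fixed distance.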