AlexKnyshov

my GitHub Web Site

R tutorial 2: Ancestral State Reconstruction (ML) and Visualization (simple)

Legend

Code block
Output block
Console command block

Tools needed

We would need the following packages:

  • package APE
  • package phytools
  • package ggplot2
  • package ggtree

And the following files:

  • phylogenetic tree in newick format
  • discrete data in nexus format

Read in the phylogeny and the data matrix

Phylogeny in this example is a bipartitions file output from RAxML. The data matrix is a nexus file, exported from Mesquite in simplified format. We can load the files like that if they are in the working directory:

library(ape)

# load tree and data
tree1 <- ladderize(read.tree("treefile.tre"))
data_nex <- read.nexus.data("matrix.nex")

Matrix was read into a list of vectors each corresponding to row of the original matrix (taxon). To be able to easily access a column of the matrix, I reformat it to data frame. The matrix contains multiple characters. Typically, a single character is reconstructed and vizualized at a time. Thus I will extract a single character (21) from the matrix.

data_df <- as.data.frame(do.call(rbind, data_nex)) #reformat to data frame
char21 <- data_df[ ,21] #subdivide single character (21st)

ASR

The character is coded for all the taxa, has no polymorphisms, and can be straightforwardly analyzed. I perform joint ML ASR using the function ace from the package APE. The lik.anc element will contain the data we need to vizualize on the phylogeny, the proportional likelihoods of each state at each node.

#do asr
asr <- ace(char21, tree1, "discrete")

Visualizing the ASR using pie charts

First I will demonstrate how to plot this data using regular APE toolkit. I plot a phylogeny itself, then set the colors of tip labels and annotate tip labels to show tip character states. Finally the pie charts are plotted, using the same colors.

par(mar=c(0,0,0,0)) #reduce white space around the plot
plot(tree1, cex=0.5) #draw the tree, with tiny tip labels
#set up colors for tip taxa
colors1 <- rep("", length(tree1$tip.label))
colors1[char21=="0"] <- "red"
colors1[char21=="1"] <- "blue"
#plot tip circles
tiplabels(rep("",144), #plot circles for tip states
          match(names(char21), tree1$tip.label),
          bg=colors1,
          frame = "circle",
          cex=0.2)
par(lwd=0.5) #change global line width settings to narrow
#plot pie charts
nodelabels(pie = asr$lik.anc, cex=0.3, piecol = c("red","blue"))

We can generate a similar figure using ggtree. However, data input needs to be formatted a bit different. So first I reformat data, then I create a ggplot object with the tree, data frame attached to it, tip labels (those are plotted separately in ggtree), points annotating tip character states, and color set up. Finally, pies are created using ggtree nodepie function, and those are then inserted onto the previously created ggplot object.

library(ggplot2)
library(ggtree)
## ggtree v1.14.6  For help: https://guangchuangyu.github.io/software/ggtree
## 
## If you use ggtree in published research, please cite the most appropriate paper(s):
## 
## - Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution 2017, 8(1):28-36, doi:10.1111/2041-210X.12628
## 
## - Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Molecular Biology and Evolution 2018, accepted. doi: 10.1093/molbev/msy194
## 
## Attaching package: 'ggtree'
## The following object is masked from 'package:ape':
## 
##     rotate
#reformat data for tip character state annotation
df1 <- as.data.frame(cbind(taxa=rownames(data_df), state=data_df$V21), stringsAsFactors = F)
#create plot with tree, tip labels, and tip states
p <- ggtree(tree1, ladderize=F) %<+%  df1 + geom_tiplab(size=2) + geom_tippoint(aes(fill=state),shape=21) +
  scale_fill_manual(values = c("red","blue"))
#reformat data for pie charts
pie_data <- as.data.frame(cbind(asr$lik.anc, node=144+(1:143)))
#create pie charts
pies <- nodepie(pie_data, 1:2, c("red","blue"))
#plot pie charts on the main plot
inset(p, pies,width = 0.03, height = 0.03)
## Loading required package: ggimage