DATASET_BOXPLOT.Rmd
## Introduction
The `DATASET_BOXPLOT` template visualizes multi-column wide data as box plots displayed outside the tree. Each box plot shows the distribution of the data set attached to a tip, including the maximum, minimum, quartiles, and extreme values. The `DATASET_BOXPLOT` template belongs to the "Basic graphics" class (refer to the Class for detailed information).
This section shows how to use itol.toolkit to prepare a box plot template. Without itol.toolkit, users have to calculate and list the maximum, minimum, quartiles, and extreme values for each data set manually. When the data contain multiple extreme values, the number of columns differs from record to record, which makes the template file inconvenient to prepare by hand. itol.toolkit automatically generates the data format required by the template from the input data, which makes box plots faster to draw and the data easier to prepare.
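To make the manual workload concrete, the following minimal sketch computes the per-tip summary values with base R only. It is not itol.toolkit's internal code: the toy data, the `summarise_box` helper, and the use of `boxplot.stats()` with its default 1.5 x IQR whisker rule are all assumptions made for illustration.

```r
# Toy long-format data: two tips with several observations each (made-up values).
toy <- data.frame(
  tip   = c(rep("tip_A", 6), rep("tip_B", 5)),
  value = c(1, 2, 3, 4, 5, 30, 10, 11, 12, 13, 14)
)

# Hypothetical helper: summarise one group of values.
summarise_box <- function(x) {
  s <- grDevices::boxplot.stats(x)  # base R; default coef = 1.5
  list(
    minimum  = s$stats[1],          # lower whisker end
    q1       = s$stats[2],          # lower hinge
    median   = s$stats[3],
    q3       = s$stats[4],          # upper hinge
    maximum  = s$stats[5],          # upper whisker end
    extremes = s$out                # points beyond the whiskers; length varies
  )
}

# One summary per tip.
box_summary <- lapply(split(toy$value, toy$tip), summarise_box)
```

In this toy data `tip_A` has one extreme value and `tip_B` has none, so the two records would have different lengths if written out by hand. This is the unequal-column situation described above.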
This section uses dataset 1 to draw a box plot (refer to the Dataset for detailed information).
The first step is to load the Newick-format tree file `tree_of_itol_templates.tree` and its corresponding metadata `templates_frequence`. Briefly, `templates_frequence` contains the usage of each template type in 21 published studies.
library(itol.toolkit)
library(data.table)
library(ape)
library(dplyr)
tree <- system.file("extdata",
                    "tree_of_itol_templates.tree",
                    package = "itol.toolkit")
df_frequence <- system.file("extdata",
                            "templates_frequence.txt",
                            package = "itol.toolkit")
df_frequence <- fread(df_frequence)
We apply a simple transformation to the raw input, converting it from wide to long format. The first column of the converted table holds the template name, and the second column holds the frequency with which that template was used in a given study. We then use this table as input data to generate a box plot template.
df_data <- df_frequence %>%
  melt(id.vars = c("templates")) %>%
  na.omit() %>%
  select(templates, value)
unit_31 <- create_unit(data = df_data,
                       key = "E031_boxplot_1",
                       type = "DATASET_BOXPLOT",
                       tree = tree)
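The wide-to-long conversion used above can be illustrated on a small table with base R alone. This is only a sketch: the template names, study columns, and values are made up, and the real `templates_frequence` table may be laid out differently.

```r
# Toy wide table (made-up values): one row per template, one column per study.
wide <- data.frame(
  templates = c("TEMPLATE_X", "TEMPLATE_Y"),
  study_1   = c(2, NA),
  study_2   = c(1, 4),
  study_3   = c(NA, 3)
)

# Every column except the identifier holds measurements.
study_cols <- setdiff(names(wide), "templates")

# Stack the study columns: one row per (template, study) pair.
long <- data.frame(
  templates = rep(wide$templates, times = length(study_cols)),
  value     = unlist(wide[study_cols], use.names = FALSE)
)

# Drop pairs with no recorded usage, mirroring the na.omit() step.
long <- long[!is.na(long$value), ]
```

The result has one row per observed value, so each template contributes as many rows as it has non-missing studies. This two-column shape matches the `df_data` table passed to `create_unit()`.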
We can also transform the original data, for example with a log transformation, and then use the result to draw a box plot.
df_data_log <- df_data %>%
  mutate(value = log(value))
unit_32 <- create_unit(data = df_data_log,
                       key = "E032_boxplot_2",
                       type = "DATASET_BOXPLOT",
                       tree = tree)
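One point to watch with a log transformation is that `log(0)` is `-Inf` in R. Whether the example data contain zeros is not stated here, so the sketch below is only a precaution on made-up values. It shows two common ways to keep the transformed values finite: filtering to positive values, or using `log1p()`.

```r
# Toy long-format data (made-up values), including a zero.
toy <- data.frame(
  templates = c("TEMPLATE_X", "TEMPLATE_X", "TEMPLATE_Y", "TEMPLATE_Y"),
  value     = c(0, 10, 1, 100)
)

# Option 1: keep only strictly positive values, then take the natural log.
toy_pos <- toy[toy$value > 0, ]
toy_pos$value <- log(toy_pos$value)

# Option 2: keep every row and use log1p(), i.e. log(1 + x), so 0 maps to 0.
toy_log1p <- toy
toy_log1p$value <- log1p(toy$value)
```

Option 1 drops observations, which changes the distribution shown in the box plot. Option 2 keeps every row but shifts the scale, so the choice depends on what the plot is meant to convey.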
You can adjust the height of the box plot by setting `unit@specific_themes$basic_plot$width`.
unit_32@specific_themes$basic_plot$width <- 500
IOCAS, weiyLiu@outlook.com
CACMS, njbxhzy@hotmail.com
IOCAS, tongzhou2017@gmail.com