

1. Introduction

1.1 Meta Information

TOmicsVis: TranscriptOmics Visualization.

Website: https://benben-miao.github.io/TOmicsVis/

1.2 GitHub and CRAN Install

New!!! TOmicsVis Shinyapp:

# Start shiny application.

1.2.1 Install required packages from Bioconductor:

# Install required packages from Bioconductor
BiocManager::install(c("ComplexHeatmap", "EnhancedVolcano", "clusterProfiler", "enrichplot", "impute", "preprocessCore", "Mfuzz"))

1.2.2 GitHub: https://github.com/benben-miao/TOmicsVis/

Install from GitHub:


# Resolve network by GitClone

1.2.3 CRAN: https://cran.r-project.org/package=TOmicsVis

Install from CRAN:

# Install from CRAN
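
A minimal sketch of both install routes, assuming the standard `install.packages()` call for CRAN and `remotes::install_github()` for GitHub. The `install_tomicsvis()` wrapper is a hypothetical helper written for illustration and is not part of TOmicsVis.

```r
# Hypothetical helper (not part of TOmicsVis): wraps the two install routes.
# Assumes network access and, for the GitHub route, the 'remotes' package.
install_tomicsvis <- function(from = c("cran", "github")) {
  from <- match.arg(from)
  if (from == "cran") {
    install.packages("TOmicsVis")
  } else {
    if (!requireNamespace("remotes", quietly = TRUE)) {
      install.packages("remotes")
    }
    remotes::install_github("benben-miao/TOmicsVis")
  }
  invisible(from)
}

# Usage (run interactively):
# install_tomicsvis("cran")
```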

2. Library packages

# 1. Load the TOmicsVis package
library(TOmicsVis)
#> Loading required package: Biobase
#> Loading required package: BiocGenerics
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#>     as.data.frame, basename, cbind, colnames, dirname, do.call,
#>     duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
#>     lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#>     pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
#>     tapply, union, unique, unsplit, which.max, which.min
#> Welcome to Bioconductor
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> Loading required package: e1071
#> Registered S3 method overwritten by 'GGally':
#>   method from   
#>   +.gg   ggplot2
#> Attaching package: 'DynDoc'
#> The following object is masked from 'package:BiocGenerics':
#>     path

# 2. Extra package
# install.packages("ggplot2")

3. Usage cases

3.1 Samples Statistics

3.1.1 quantile_plot

Input Data: Dataframe: Weight and Sex traits dataframe (1st-col: Weight, 2nd-col: Sex).

Output Plot: Quantile plot for visualizing data distribution.

# 1. Load example datasets
data(weight_sex)
head(weight_sex)
#>   Weight    Sex
#> 1  36.74 Female
#> 2  38.54 Female
#> 3  44.91 Female
#> 4  43.53 Female
#> 5  39.03 Female
#> 6  26.01 Female

# 2. Run quantile_plot plot function
quantile_plot(
  data = weight_sex,
  my_shape = "fill_circle",
  point_size = 1.5,
  conf_int = TRUE,
  conf_level = 0.95,
  split_panel = "Split_Panel",
  legend_pos = "right",
  legend_dir = "vertical",
  sci_fill_color = "Sci_NPG",
  sci_color_alpha = 0.75,
  ggTheme = "theme_light"
)
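
What a quantile plot compares can be sketched in base R: sorted sample values against theoretical quantiles. The sketch below uses the six Weight values from the example output and assumes a standard normal reference distribution; it illustrates the idea only and is not the package's internal code.

```r
# Six Weight values copied from the example output of weight_sex.
weight <- c(36.74, 38.54, 44.91, 43.53, 39.03, 26.01)

# Plotting positions and the matching standard-normal quantiles
# (assumption: a normal reference distribution).
p <- ppoints(length(weight))
qq <- data.frame(
  theoretical = qnorm(p),
  sample = sort(weight)
)
print(qq)
```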

3.1.2 box_plot

Input Data: Dataframe: Length, Width, Weight, and Sex traits dataframe (1st-col: Value, 2nd-col: Traits, 3rd-col: Sex).

Output Plot: Box plot supporting two levels and multiple groups, with P values.

# 1. Load example datasets
data(traits_sex)
head(traits_sex)
#>   Value Traits    Sex
#> 1 36.74 Weight Female
#> 2 38.54 Weight Female
#> 3 44.91 Weight Female
#> 4 43.53 Weight Female
#> 5 39.03 Weight Female
#> 6 26.01 Weight Female

# 2. Run box_plot plot function
box_plot(
  data = traits_sex,
  test_method = "t.test",
  test_label = "p.format",
  notch = TRUE,
  group_level = "Three_Column",
  add_element = "jitter",
  my_shape = "fill_circle",
  sci_fill_color = "Sci_AAAS",
  sci_fill_alpha = 0.5,
  sci_color_alpha = 1,
  legend_pos = "right",
  legend_dir = "vertical",
  ggTheme = "theme_light"
)
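
The `test_method = "t.test"` and `test_label = "p.format"` options correspond to a two-group comparison whose P value is printed on the plot. A minimal base R sketch of that comparison follows. The data are synthetic and the formatting step is an assumption about what "p.format" roughly produces, not the package's implementation.

```r
# Synthetic data for illustration only (not the traits_sex dataset).
set.seed(1)
toy <- data.frame(
  Value = c(rnorm(10, mean = 40, sd = 5), rnorm(10, mean = 46, sd = 5)),
  Sex = rep(c("Female", "Male"), each = 10)
)

# Welch two-sample t-test, the default of stats::t.test().
res <- t.test(Value ~ Sex, data = toy)

# A formatted P value, similar in spirit to test_label = "p.format".
p_label <- format.pval(res$p.value, digits = 3)
print(p_label)
```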

3.1.3 violin_plot

Input Data: Dataframe: Length, Width, Weight, and Sex traits dataframe (1st-col: Value, 2nd-col: Traits, 3rd-col: Sex).

Output Plot: Violin plot supporting two levels and multiple groups, with P values.

# 1. Load example datasets
data(traits_sex)

# 2. Run violin_plot plot function
violin_plot(
  data = traits_sex,
  test_method = "t.test",
  test_label = "p.format",
  group_level = "Three_Column",
  violin_orientation = "vertical",
  add_element = "boxplot",
  element_alpha = 0.5,
  my_shape = "plus_times",
  sci_fill_color = "Sci_AAAS",
  sci_fill_alpha = 0.5,
  sci_color_alpha = 1,
  legend_pos = "right",
  legend_dir = "vertical",
  ggTheme = "theme_light"
)

3.1.4 survival_plot

Input Data: Dataframe: survival record data (1st-col: Time, 2nd-col: Status, 3rd-col: Group).

Output Plot: Survival plot for analyzing and visualizing survival data.

# 1. Load example datasets
data(survival_data)
head(survival_data)
#>   Time Status Group
#> 1   48      0    CT
#> 2   48      0    CT
#> 3   48      0    CT
#> 4   48      0    CT
#> 5   48      0    CT
#> 6   48      0    CT

# 2. Run survival_plot plot function
survival_plot(
  data = survival_data,
  curve_function = "pct",
  conf_inter = TRUE,
  interval_style = "ribbon",
  risk_table = TRUE,
  num_censor = TRUE,
  sci_palette = "aaas",
  ggTheme = "theme_light",
  x_start = 0,
  y_start = 0,
  y_end = 100,
  x_break = 10,
  y_break = 10
)
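
A survival curve on a percentage scale (as suggested by `curve_function = "pct"` and `y_end = 100`) is typically a Kaplan-Meier estimate. The sketch below computes one by hand in base R on synthetic Time/Status records. It assumes Status 1 means an event and 0 means censored, and it is not the package's internal code.

```r
# Synthetic records for illustration (Status: 1 = event, 0 = censored).
time   <- c(5, 8, 8, 12, 15, 20, 20, 24)
status <- c(1, 1, 0, 1, 0, 1, 1, 0)

# Distinct times at which at least one event occurred.
event_times <- sort(unique(time[status == 1]))
km <- data.frame(
  time = event_times,
  at_risk = sapply(event_times, function(t) sum(time >= t)),
  events = sapply(event_times, function(t) sum(time == t & status == 1))
)

# Kaplan-Meier estimate: cumulative product of (1 - events / at_risk),
# expressed in percent.
km$survival_pct <- 100 * cumprod(1 - km$events / km$at_risk)
print(km)
```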

3.2 Traits Analysis

3.2.1 corr_heatmap

Input Data: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Output Plot: Heatmap filled with Pearson correlation values and P values.

# 1. Load example dataset
data(gene_expression)
head(gene_expression)
#>              Genes   CT_1   CT_2   CT_3 LT20_1 LT20_2 LT20_3 LT15_1 LT15_2
#> 1     transcript_0 655.78 631.08 669.89 654.21 402.56 447.09 510.08 442.22
#> 2     transcript_1  92.72 112.26 150.30  88.35  76.35  94.55 120.24  80.89
#> 3    transcript_10  21.74  31.11  22.58  15.09  13.67  13.24  12.48   7.53
#> 4   transcript_100   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
#> 5  transcript_1000   0.00  14.15  36.01   0.00   0.00 193.59 208.45   0.00
#> 6 transcript_10000  89.18 158.04  86.28  82.97 117.78 102.24 129.61 112.73
#>   LT15_3 LT12_1 LT12_2 LT12_3 LT12_6_1 LT12_6_2 LT12_6_3
#> 1 399.82 483.30 437.89 444.06   405.43   416.63   464.75
#> 2  73.94  96.25  82.62  85.48    65.12    61.94    73.44
#> 3  13.35  11.16  11.36   6.96     7.82     4.01    10.02
#> 4   0.00   0.00   0.00   0.00     0.00     0.00     0.00
#> 5 232.40 148.58   0.00 181.61     0.02    12.18     0.00
#> 6  85.70  80.89 124.11 115.25   113.87   107.69   119.83

# 2. Run corr_heatmap plot function
corr_heatmap(
  data = gene_expression,
  corr_method = "pearson",
  cell_shape = "square",
  fill_type = "full",
  lable_size = 3,
  axis_angle = 45,
  axis_size = 12,
  lable_digits = 3,
  color_low = "blue",
  color_mid = "white",
  color_high = "red",
  outline_color = "white",
  ggTheme = "theme_light"
)
#> Scale for fill is already present.
#> Adding another scale for fill, which will replace the existing scale.
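
The values that fill such a heatmap can be sketched with base R: a sample-by-sample Pearson correlation matrix plus a P value per pair. The snippet uses four sample columns from the six example rows shown above and is an illustration only, not the package's internal code.

```r
# Six example rows of gene_expression, four of the sample columns.
expr <- data.frame(
  CT_1   = c(655.78, 92.72, 21.74, 0.00, 0.00, 89.18),
  CT_2   = c(631.08, 112.26, 31.11, 0.00, 14.15, 158.04),
  LT20_1 = c(654.21, 88.35, 15.09, 0.00, 0.00, 82.97),
  LT20_2 = c(402.56, 76.35, 13.67, 0.00, 0.00, 117.78)
)

# Pearson correlation between samples (columns).
corr <- cor(expr, method = "pearson")
print(round(corr, 3))

# P value for one pair of samples.
pval <- cor.test(expr$CT_1, expr$CT_2, method = "pearson")$p.value
print(pval)
```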

3.2.2 pca_analysis

Input Data1: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Table: PCA dimensional reduction analysis for RNA-Seq.

# 1. Load example datasets
data(gene_expression)
data(samples_groups)
head(samples_groups)
#>   Samples Groups
#> 1    CT_1     CT
#> 2    CT_2     CT
#> 3    CT_3     CT
#> 4  LT20_1   LT20
#> 5  LT20_2   LT20
#> 6  LT20_3   LT20

# 2. Run pca_analysis function
res <- pca_analysis(gene_expression, samples_groups)
head(res)
#>               PC1         PC2         PC3        PC4         PC5       PC6
#> CT_1   -27010.536 -18328.2803   5955.2569 46547.7319  11394.1043 -7197.285
#> CT_2    16248.651  29132.9251   -824.1857 20747.9618 -18798.8755 21096.088
#> CT_3    22421.017 -26832.3964   6789.4490  5864.1171 -15375.3418 17424.861
#> LT20_1 -18587.073   -472.9036 -21638.7836  7765.9575    114.1225 -3943.968
#> LT20_2  33275.933  -9874.9959 -14991.3942 -7443.9250  -4600.8302 -8072.298
#> LT20_3  -1596.255  11683.5426 -10892.8493   381.0795  11080.3560 -8994.187
#>                PC7        PC8        PC9        PC10        PC11       PC12
#> CT_1     2150.6739   4850.320   4051.745   7666.9445  -3141.9327  -2487.939
#> CT_2   -12329.1138  -3353.734   4805.659   1503.8533  11184.0296  -4865.436
#> CT_3    12744.2255 -10037.516 -11468.842    202.4016 -11001.6260  -3847.291
#> LT20_1   8864.7482 -14171.127  -1968.082  -3562.1899   7446.2105  14831.486
#> LT20_2   -941.3943  -5072.401   5345.106   6494.1383  -3954.2153   9351.346
#> LT20_3   7263.9321  -7774.725  -1853.546 -21427.2641    -46.1503 -12507.011
#>              PC13       PC14          PC15
#> CT_1    -2704.613  2396.7383  2.528517e-11
#> CT_2    -2633.057 -1375.3352  6.825657e-11
#> CT_3     5193.978   188.5601  2.255671e-11
#> LT20_1   3937.457 -7871.8062  4.864246e-11
#> LT20_2 -12904.673  6071.6618 -2.020696e-10
#> LT20_3  -5369.380  2606.1762  1.903509e-11
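
A table of per-sample PC scores like the one above can be reproduced in spirit with `stats::prcomp()`. This is a sketch on a small subset of the example values, assuming samples are treated as observations and the data are centered but not scaled; whether `pca_analysis` uses exactly these settings is an assumption.

```r
# Six example genes (rows) by four samples (columns).
expr <- cbind(
  CT_1   = c(655.78, 92.72, 21.74, 0.00, 0.00, 89.18),
  CT_2   = c(631.08, 112.26, 31.11, 0.00, 14.15, 158.04),
  CT_3   = c(669.89, 150.30, 22.58, 0.00, 36.01, 86.28),
  LT20_1 = c(654.21, 88.35, 15.09, 0.00, 0.00, 82.97)
)

# Samples become rows so that each sample gets one score per component.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
scores <- pca$x
explained <- pca$sdev^2 / sum(pca$sdev^2)

print(round(scores, 3))
print(round(explained, 4))
```

With n samples, centering leaves at most n - 1 informative components, so the last component is numerically zero. The same effect explains the PC15 column of order 1e-11 in the 15-sample output above.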

3.2.3 pca_plot

Input Data1: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Plot: PCA dimensional reduction visualization for RNA-Seq.

# 1. Load example datasets
data(gene_expression)
data(samples_groups)
head(samples_groups)
#>   Samples Groups
#> 1    CT_1     CT
#> 2    CT_2     CT
#> 3    CT_3     CT
#> 4  LT20_1   LT20
#> 5  LT20_2   LT20
#> 6  LT20_3   LT20

# 2. Run pca_plot plot function
pca_plot(
  sample_gene = gene_expression,
  group_sample = samples_groups,
  xPC = 1,
  yPC = 2,
  point_size = 5,
  text_size = 5,
  fill_alpha = 0.10,
  border_alpha = 0.00,
  legend_pos = "right",
  legend_dir = "vertical",
  ggTheme = "theme_light"
)

3.2.4 tsne_analysis

Input Data1: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Table: TSNE dimensional reduction coordinates for RNA-Seq samples.

# 1. Load example datasets
data(gene_expression)
data(samples_groups)

# 2. Run tsne_analysis function
res <- tsne_analysis(gene_expression, samples_groups)
head(res)
#>       TSNE1     TSNE2
#> 1 -67.41252 -16.61397
#> 2  43.08349 -34.02654
#> 3 123.32273  54.14358
#> 4 -42.52065 -31.30027
#> 5  94.98790  48.97986
#> 6 -23.90637 -22.26434

3.2.5 tsne_plot

Input Data1: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Plot: TSNE plot for visualizing the dimensional reduction of RNA-Seq samples.

# 1. Load example datasets
data(gene_expression)
data(samples_groups)

# 2. Run tsne_plot plot function
tsne_plot(
  sample_gene = gene_expression,
  group_sample = samples_groups,
  seed = 1,
  multi_shape = FALSE,
  point_size = 5,
  point_alpha = 0.8,
  text_size = 5,
  text_alpha = 0.80,
  fill_alpha = 0.10,
  border_alpha = 0.00,
  sci_fill_color = "Sci_AAAS",
  legend_pos = "right",
  legend_dir = "vertical",
  ggTheme = "theme_light"
)

3.2.6 umap_analysis

Input Data1: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Table: UMAP analysis for analyzing RNA-Seq data.

# 1. Load example datasets
data(gene_expression)
data(samples_groups)

# 2. Run umap_analysis function
res <- umap_analysis(gene_expression, samples_groups)
head(res)
#>             UMAP1       UMAP2
#> CT_1   -0.6752746  0.49425898
#> CT_2    1.0232441  0.03062202
#> CT_3   -0.4722297 -1.32183550
#> LT20_1 -0.2414214  0.13870703
#> LT20_2  0.1991701 -1.23434000
#> LT20_3  0.6431577  1.11879669

3.2.7 umap_plot

Input Data1: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Plot: UMAP plot for visualizing the dimensional reduction of RNA-Seq samples.

# 1. Load example datasets
data(gene_expression)
data(samples_groups)

# 2. Run umap_plot plot function
umap_plot(
  sample_gene = gene_expression,
  group_sample = samples_groups,
  seed = 1,
  multi_shape = TRUE,
  point_size = 5,
  point_alpha = 1,
  text_size = 5,
  text_alpha = 0.80,
  fill_alpha = 0.00,
  border_alpha = 0.00,
  sci_fill_color = "Sci_AAAS",
  legend_pos = "right",
  legend_dir = "vertical",
  ggTheme = "theme_light"
)

3.2.8 dendro_plot

Input Data: Dataframe: All genes in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Output Plot: Dendrogram for clustering multiple samples.

# 1. Load example datasets
data(gene_expression)

# 2. Run dendro_plot plot function
dendro_plot(
  data = gene_expression,
  dist_method = "euclidean",
  hc_method = "ward.D2",
  tree_type = "rectangle",
  k_num = 5,
  palette = "npg",
  color_labels_by_k = TRUE,
  horiz = FALSE,
  label_size = 1,
  line_width = 1,
  rect = TRUE,
  rect_fill = TRUE,
  xlab = "Samples",
  ylab = "Height",
  ggTheme = "theme_light"
)
#> Registered S3 method overwritten by 'dendextend':
#>   method     from 
#>   rev.hclust vegan
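
The `dist_method`, `hc_method` and `k_num` options map naturally onto base R's `dist()`, `hclust()` and `cutree()`. The sketch below clusters six samples using the six example genes shown for gene_expression. It assumes that mapping and uses k = 2 because the toy input has only two groups; it is not the package's internal code.

```r
# Six example genes (rows) by six samples (columns).
expr <- cbind(
  CT_1   = c(655.78, 92.72, 21.74, 0.00, 0.00, 89.18),
  CT_2   = c(631.08, 112.26, 31.11, 0.00, 14.15, 158.04),
  CT_3   = c(669.89, 150.30, 22.58, 0.00, 36.01, 86.28),
  LT15_1 = c(510.08, 120.24, 12.48, 0.00, 208.45, 129.61),
  LT15_2 = c(442.22, 80.89, 7.53, 0.00, 0.00, 112.73),
  LT15_3 = c(399.82, 73.94, 13.35, 0.00, 232.40, 85.70)
)

# Euclidean distance between samples, then Ward clustering.
d <- dist(t(expr), method = "euclidean")
hc <- hclust(d, method = "ward.D2")

# Cut the tree into k clusters (k = 2 for this toy input).
groups <- cutree(hc, k = 2)
print(groups)
```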

3.3 Differential Expression Analysis

3.3.1 venn_plot

Input Data: Dataframe: Differentially expressed genes (DEGs) from paired comparisons among groups (1st-col~: DEGs of each paired comparison).

Output Plot: Venn plot showing the common and unique genes among multiple sets.

# 1. Load example datasets
data(degs_lists)
head(degs_lists)
#>        CT.vs.LT20      CT.vs.LT15       CT.vs.LT12     CT.vs.LT12_6
#> 1 transcript_9024 transcript_4738  transcript_9956 transcript_10354
#> 2  transcript_604 transcript_6050  transcript_7601  transcript_2959
#> 3 transcript_3912 transcript_1039  transcript_5960  transcript_5919
#> 4 transcript_8676 transcript_1344  transcript_3240  transcript_2395
#> 5 transcript_8832 transcript_3069 transcript_10224  transcript_9881
#> 6   transcript_74 transcript_9809  transcript_3151  transcript_8836

# 2. Run venn_plot plot function
venn_plot(
  data = degs_lists,
  title_size = 1,
  label_show = TRUE,
  label_size = 0.8,
  border_show = TRUE,
  line_type = "longdash",
  ellipse_shape = "circle",
  sci_fill_color = "Sci_AAAS",
  sci_fill_alpha = 0.65
)
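
The counts a Venn plot displays come down to set operations. A minimal base R sketch follows, using hypothetical gene IDs (`g1` to `g6`) rather than the degs_lists data:

```r
# Hypothetical gene sets for illustration (not the degs_lists data).
degs <- list(
  A = c("g1", "g2", "g3", "g4"),
  B = c("g2", "g3", "g5"),
  C = c("g3", "g4", "g5", "g6")
)

# Genes shared by every set.
common <- Reduce(intersect, degs)

# Genes found in exactly one set.
unique_to <- lapply(names(degs), function(n) {
  setdiff(degs[[n]], unlist(degs[names(degs) != n]))
})
names(unique_to) <- names(degs)

print(common)
print(unique_to)
```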

3.3.2 upsetr_plot

Input Data: Dataframe: Differentially expressed genes (DEGs) from paired comparisons among groups (1st-col~: DEGs of each paired comparison).

Output Plot: UpSet plot showing the common and unique genes among multiple sets.

# 1. Load example datasets
data(degs_lists)
head(degs_lists)
#>        CT.vs.LT20      CT.vs.LT15       CT.vs.LT12     CT.vs.LT12_6
#> 1 transcript_9024 transcript_4738  transcript_9956 transcript_10354
#> 2  transcript_604 transcript_6050  transcript_7601  transcript_2959
#> 3 transcript_3912 transcript_1039  transcript_5960  transcript_5919
#> 4 transcript_8676 transcript_1344  transcript_3240  transcript_2395
#> 5 transcript_8832 transcript_3069 transcript_10224  transcript_9881
#> 6   transcript_74 transcript_9809  transcript_3151  transcript_8836

# 2. Run upsetr_plot plot function
upsetr_plot(
  data = degs_lists,
  sets_num = 4,
  keep_order = FALSE,
  order_by = "freq",
  decrease = TRUE,
  mainbar_color = "#006600",
  number_angle = 45,
  matrix_color = "#cc0000",
  point_size = 4.5,
  point_alpha = 0.5,
  line_size = 0.8,
  shade_color = "#cdcdcd",
  shade_alpha = 0.5,
  setsbar_color = "#000066",
  setsnum_size = 6,
  text_scale = 1.2
)

3.3.3 flower_plot

Input Data: Dataframe: Differentially expressed genes (DEGs) from paired comparisons among groups (1st-col~: DEGs of each paired comparison).

Output Plot: Flower plot showing the common and unique genes among multiple sets.

# 1. Load example datasets
data(degs_lists)

# 2. Run flower_plot plot function
flower_plot(
  flower_dat = degs_lists,
  angle = 90,
  a = 1,
  b = 2,
  r = 1,
  ellipse_col_pal = "Spectral",
  circle_col = "white",
  label_text_cex = 1
)

3.3.4 volcano_plot

Input Data: Dataframe: Statistics for all DEGs of the paired comparison CT-vs-LT12 (1st-col: Gene, 2nd-col: log2FoldChange, 3rd-col: Pvalue, 4th-col: FDR).

Output Plot: Volcano plot for visualizing differentially expressed genes.

# 1. Load example datasets
data(degs_stats)
head(degs_stats)
#>    Gene log2FoldChange      Pvalue         FDR
#> 1  A1I3    -1.13855748 0.000111040 0.000862478
#> 2   A1M     0.59076131 0.070988041 0.192551708
#> 3   A2M     0.09297827 0.819706797 0.913893947
#> 4 A2ML1    -0.26940689 0.745374782 0.874295125
#> 5  ABAT     1.24811621 0.000001440 0.000016800
#> 6 ABCC3    -0.72947545 0.005171574 0.024228298

# 2. Run volcano_plot plot function
volcano_plot(
  data = degs_stats,
  title = "CT-vs-LT12",
  log2fc_cutoff = 1,
  pq_value = "pvalue",
  pq_cutoff = 0.05,
  cutoff_line = "longdash",
  point_shape = "large_circle",
  point_size = 2,
  point_alpha = 0.5,
  color_normal = "#888888",
  color_log2fc = "#008000",
  color_pvalue = "#0088ee",
  color_Log2fc_p = "#ff0000",
  label_size = 3,
  boxed_labels = FALSE,
  draw_connectors = FALSE,
  legend_pos = "right"
)
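
The four colors (`color_normal`, `color_log2fc`, `color_pvalue`, `color_Log2fc_p`) suggest that each gene is assigned to one of four classes by the two cutoffs. A sketch of that classification on the six example rows follows. The class labels and the use of strict inequalities are assumptions made for illustration.

```r
# Six example rows of degs_stats (FDR column omitted).
degs <- data.frame(
  Gene = c("A1I3", "A1M", "A2M", "A2ML1", "ABAT", "ABCC3"),
  log2FoldChange = c(-1.13855748, 0.59076131, 0.09297827,
                     -0.26940689, 1.24811621, -0.72947545),
  Pvalue = c(0.000111040, 0.070988041, 0.819706797,
             0.745374782, 0.000001440, 0.005171574)
)

log2fc_cutoff <- 1
pq_cutoff <- 0.05

pass_fc <- abs(degs$log2FoldChange) > log2fc_cutoff
pass_p <- degs$Pvalue < pq_cutoff

# Hypothetical class labels, one per color option.
degs$class <- ifelse(pass_fc & pass_p, "log2fc_and_p",
              ifelse(pass_fc, "log2fc_only",
              ifelse(pass_p, "p_only", "normal")))
print(degs[, c("Gene", "class")])
```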

3.3.5 ma_plot

Input Data: Dataframe: Statistics (stats2) for all DEGs of the paired comparison CT-vs-LT12 (1st-col: name, 2nd-col: baseMean, 3rd-col: log2FoldChange, 4th-col: padj).

Output Plot: MA (M-versus-A) plot for visualizing differentially expressed genes.

# 1. Load example datasets
data(degs_stats2)
head(degs_stats2)
#>    name     baseMean log2FoldChange         padj
#> 1  A1I3    0.1184475      0.0000000           NA
#> 2   A1M 1654.4618140      0.6789538 5.280802e-02
#> 3   A2M  681.0463277      1.5263838 3.920000e-07
#> 4 A2ML1  389.7226640      3.8933573 1.180000e-14
#> 5  ABAT  364.7810090     -2.3554014 1.559230e-04
#> 6 ABCC3    1.1346239      1.2932740 4.491812e-01

# 2. Run ma_plot plot function
ma_plot(
  data = degs_stats2,
  foldchange = 2,
  fdr_value = 0.05,
  point_size = 3.0,
  color_up = "#FF0000",
  color_down = "#008800",
  color_alpha = 0.5,
  top_method = "fc",
  top_num = 20,
  label_size = 8,
  label_box = TRUE,
  title = "CT-vs-LT12",
  xlab = "Log2 mean expression",
  ylab = "Log2 fold change",
  ggTheme = "theme_light"
)
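
How `foldchange` and `fdr_value` split genes into up, down and non-significant can be sketched on the six example rows. This assumes that `foldchange = 2` is a linear-scale threshold compared against log2FoldChange after taking log2, and that genes with a missing padj count as non-significant; neither is confirmed by the text above.

```r
# Six example rows of degs_stats2.
degs2 <- data.frame(
  name = c("A1I3", "A1M", "A2M", "A2ML1", "ABAT", "ABCC3"),
  baseMean = c(0.1184475, 1654.4618140, 681.0463277,
               389.7226640, 364.7810090, 1.1346239),
  log2FoldChange = c(0.0000000, 0.6789538, 1.5263838,
                     3.8933573, -2.3554014, 1.2932740),
  padj = c(NA, 5.280802e-02, 3.92e-07, 1.18e-14, 1.55923e-04, 4.491812e-01)
)

foldchange <- 2
fdr_value <- 0.05

# Assumption: the threshold applies on the log2 scale, i.e. log2(2) = 1.
lfc_cut <- log2(foldchange)
sig <- !is.na(degs2$padj) & degs2$padj < fdr_value

degs2$status <- ifelse(sig & degs2$log2FoldChange >= lfc_cut, "up",
                ifelse(sig & degs2$log2FoldChange <= -lfc_cut, "down", "ns"))

# x-axis value: log2 mean expression (the +1 offset is an assumption).
degs2$log2_mean <- log2(degs2$baseMean + 1)
print(degs2[, c("name", "status", "log2_mean")])
```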

3.3.6 heatmap_group

Input Data1: Dataframe: Shared DEGs of all paired comparisons in all samples expression dataframe of RNA-Seq. (1st-col: Genes, 2nd-col~: Samples).

Input Data2: Dataframe: Samples and groups for gene expression (1st-col: Samples, 2nd-col: Groups).

Output Plot: Heatmap group for visualizing grouped gene expression data.

# 1. Load example datasets
data(gene_expression2)
data(samples_groups)

# 2. Run heatmap_group plot function
heatmap_group(
  sample_gene = gene_expression2[1:30, ],
  group_sample = samples_groups,
  scale_data = "row",
  clust_method = "complete",
  border_show = TRUE,
  border_color = "#ffffff",
  value_show = TRUE,
  value_decimal = 2,
  value_size = 5,
  axis_size = 8,
  cell_height = 10,
  low_color = "#00880055",
  mid_color = "#ffffff",
  high_color = "#ff000055",
  na_color = "#ff8800",
  x_angle = 45
)
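
With `scale_data = "row"`, each gene is usually converted to z-scores across samples before coloring, so that genes with very different expression levels share one color scale. A base R sketch follows, using three genes and four samples from the gene_expression2 example rows listed in 3.3.7. It assumes standard z-score scaling and is not the package's internal code.

```r
# Three genes (rows) by four samples (columns) from gene_expression2.
m <- rbind(
  ACAA2 = c(24.50, 39.83, 55.38, 114.11),
  ACAN  = c(14.97, 18.71, 10.30, 71.23),
  ADH1  = c(1.54, 1.56, 2.04, 14.95)
)
colnames(m) <- c("CT_1", "CT_2", "CT_3", "LT20_1")

# scale() works on columns, so transpose, scale, and transpose back.
z <- t(scale(t(m)))
print(round(z, 2))
```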

3.3.7 circos_heatmap

Input Data: Dataframe: Shared DEGs of all paired comparisons in all samples expression dataframe of RNA-Seq (1st-col: Genes, 2nd-col~: Samples).

Output Plot: Circos heatmap plot for visualizing gene expression in multiple samples.

# 1. Load example datasets
data(gene_expression2)
head(gene_expression2)
#>   Genes  CT_1    CT_2  CT_3 LT20_1 LT20_2 LT20_3 LT15_1 LT15_2 LT15_3 LT12_1
#> 1 ACAA2 24.50   39.83 55.38 114.11 159.32  96.88 169.56 464.84 182.66 116.08
#> 2  ACAN 14.97   18.71 10.30  71.23 142.67 213.54 253.15 320.80 104.15 174.02
#> 3  ADH1  1.54    1.56  2.04  14.95  13.60  15.87  12.80  17.74   6.06  10.97
#> 4  AHSG  0.00 1911.99  0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
#> 5 ALDH2  2.07    2.86  2.54   0.85   0.49   0.47   0.42   0.13   0.26   0.00
#> 6 AP1S3  6.62   14.59  9.30  24.90  33.94  23.19  24.00  36.08  27.40  24.06
#>   LT12_2 LT12_3 LT12_6_1 LT12_6_2 LT12_6_3
#> 1 497.29 464.48   471.43   693.62   229.77
#> 2 305.81 469.48  1291.90   991.90   966.77
#> 3  10.71  30.95     9.84    10.91     7.28
#> 4   0.00   0.00     0.00     0.00     0.00
#> 5   0.28   0.11     0.37     0.15     0.11
#> 6  38.74  34.54    62.72    41.36    28.75

# 2. Run circos_heatmap plot function
circos_heatmap(
  data = gene_expression2[1:50, ],
  low_color = "#0000ff",
  mid_color = "#ffffff",
  high_color = "#ff0000",
  gap_size = 25,
  cluster_run = TRUE,
  cluster_method = "complete",
  distance_method = "euclidean",
  dend_show = "inside",
  dend_height = 0.2,
  track_height = 0.3,
  rowname_show = "outside",
  rowname_size = 0.8
)
#> Note: 15 points are out of plotting region in sector 'group', track
#> '3'.
#> Note: 15 points are out of plotting region in sector 'group', track
#> '3'.