vignettes/guides/plotting_multiomic_data.Rmd
plotgardener makes it easy to create reproducible, publication-quality figures from multi-omic data. Because each plot can be placed at an exact location on the page, users can stack multiple types of genomic data so that their axes and data are correctly aligned. In this section we show some examples of plotting multi-omic data and how the pgParams object and the “below” y-coordinate can facilitate this process.

In the following example, we plot the same genomic region (chr21:28000000-30300000) represented in Hi-C data, loop annotations, signal track data, and GWAS data, all along a common gene track and genome label axis:
## Load example data
library(plotgardener)
library(plotgardenerData)
data("IMR90_HiC_10kb")
data("IMR90_DNAloops_pairs")
data("IMR90_ChIP_H3K27ac_signal")
data("hg19_insulin_GWAS")
## Create a plotgardener page
pageCreate(
width = 3, height = 5, default.units = "inches",
showGuides = FALSE, xgrid = 0, ygrid = 0
)
## Plot Hi-C data in region
plotHicSquare(
data = IMR90_HiC_10kb,
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, y = 0.5, width = 2, height = 2,
just = c("left", "top"), default.units = "inches"
)
## Plot loop annotations
plotPairsArches(
data = IMR90_DNAloops_pairs,
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, y = 2.5, width = 2, height = 0.25,
just = c("left", "top"), default.units = "inches",
fill = "black", linecolor = "black", flip = TRUE
)
## Plot signal track data
plotSignal(
data = IMR90_ChIP_H3K27ac_signal,
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, y = 2.75, width = 2, height = 0.5,
just = c("left", "top"), default.units = "inches"
)
## Plot GWAS data
plotManhattan(
data = hg19_insulin_GWAS,
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
ymax = 1.1, cex = 0.20,
x = 0.5, y = 3.5, width = 2, height = 0.5,
just = c("left", "top"), default.units = "inches"
)
## Plot gene track
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)
plotGenes(
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, y = 4, width = 2, height = 0.5,
just = c("left", "top"), default.units = "inches"
)
## Plot genome label
plotGenomeLabel(
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, y = 4.5, length = 2, scale = "Mb",
just = c("left", "top"), default.units = "inches"
)
pgParams object
The pgParams() function creates a pgParams object that can contain any argument from plotgardener functions.

We can recreate and simplify the multi-omic plot above by saving the genomic region, left-based x-coordinate, and width into a pgParams object:
params <- pgParams(
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, just = c("left", "top"),
width = 2, length = 2, default.units = "inches"
)
Since these values are the same for each of the functions we are using to build our multi-omic figure, we can now pass the pgParams object into our functions so we don’t need to write the same parameters over and over again:
## Load example data
data("IMR90_HiC_10kb")
data("IMR90_DNAloops_pairs")
data("IMR90_ChIP_H3K27ac_signal")
data("hg19_insulin_GWAS")
## Create a plotgardener page
pageCreate(
width = 3, height = 5, default.units = "inches",
showGuides = FALSE, xgrid = 0, ygrid = 0
)
## Plot Hi-C data in region
plotHicSquare(
data = IMR90_HiC_10kb,
params = params,
y = 0.5, height = 2
)
## Plot loop annotations
plotPairsArches(
data = IMR90_DNAloops_pairs,
params = params,
y = 2.5, height = 0.25,
fill = "black", linecolor = "black", flip = TRUE
)
## Plot signal track data
plotSignal(
data = IMR90_ChIP_H3K27ac_signal,
params = params,
y = 2.75, height = 0.5
)
## Plot GWAS data
plotManhattan(
data = hg19_insulin_GWAS,
params = params,
ymax = 1.1, cex = 0.20,
y = 3.5, height = 0.5
)
## Plot gene track
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)
plotGenes(
params = params,
y = 4, height = 0.5
)
## Plot genome label
plotGenomeLabel(
params = params,
y = 4.5, scale = "Mb"
)
The pgParams object also simplifies the code for making complex multi-omic figures when we want to change the genomic region of our plots. To change the region for the figure above, we simply update it in the pgParams object and re-run the code to generate the figure:
params <- pgParams(
chrom = "chr21", chromstart = 29000000, chromend = 30000000,
assembly = "hg19",
x = 0.5, just = c("left", "top"),
width = 2, length = 2, default.units = "inches"
)
Alternatively, if we want to plot around a particular gene rather than a genomic region, we can use pgParams() to specify gene and geneBuffer. If geneBuffer is not included, the default buffer adds (gene length) / 2 base pairs to the ends of the gene coordinates.
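The default buffer described here amounts to simple arithmetic on the gene coordinates. The base R sketch below illustrates it with a hypothetical helper, geneWindow(), and made-up coordinates. It is not plotgardener's internal code: it assumes gene length is taken as end minus start and that the result is neither rounded nor clamped.

```r
## Hypothetical helper (not part of plotgardener): compute the plotted
## window around a gene from its start and end coordinates.
geneWindow <- function(geneStart, geneEnd, geneBuffer = NULL) {
    ## Assumption: with no buffer given, use half the gene length,
    ## where gene length is end - start
    if (is.null(geneBuffer)) {
        geneBuffer <- (geneEnd - geneStart) / 2
    }
    c(chromstart = geneStart - geneBuffer, chromend = geneEnd + geneBuffer)
}

## Made-up gene spanning 1,000,000-1,200,000 bp
defaultWindow <- geneWindow(1000000, 1200000)
customWindow <- geneWindow(1000000, 1200000, geneBuffer = 10000)
print(defaultWindow)
print(customWindow)
```

With the default, a 200 kb gene is padded by 100 kb on each side, so the plotted window is twice the gene length. Passing an explicit geneBuffer replaces that padding with a fixed number of base pairs.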
Since multi-omic plots often involve vertical stacking, their placement can be facilitated with the “below” y-coordinate. Rather than providing a numeric value or unit object to the y parameter in plotting functions, we can place a plot below the previously drawn plotgardener plot with a character value consisting of the distance below the last plot, in page units, followed by “b”. For example, on a page made in inches, y = "0.1b" will place a plot 0.1 inches below the last drawn plot.
We can further simplify the placement code of our multi-omic figure above by using the “below” y-coordinate to easily stack our plots:
## Load example data
data("IMR90_HiC_10kb")
data("IMR90_DNAloops_pairs")
data("IMR90_ChIP_H3K27ac_signal")
data("hg19_insulin_GWAS")
## pgParams
params <- pgParams(
chrom = "chr21", chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, just = c("left", "top"),
width = 2, length = 2, default.units = "inches"
)
## Create a plotgardener page
pageCreate(
width = 3, height = 5, default.units = "inches",
showGuides = FALSE, xgrid = 0, ygrid = 0
)
## Plot Hi-C data in region
plotHicSquare(
data = IMR90_HiC_10kb,
params = params,
y = 0.5, height = 2
)
## Plot loop annotations
plotPairsArches(
data = IMR90_DNAloops_pairs,
params = params,
y = "0b",
height = 0.25,
fill = "black", linecolor = "black", flip = TRUE
)
## Plot signal track data
plotSignal(
data = IMR90_ChIP_H3K27ac_signal,
params = params,
y = "0b",
height = 0.5
)
## Plot GWAS data
plotManhattan(
data = hg19_insulin_GWAS,
params = params,
ymax = 1.1, cex = 0.20,
y = "0.25b",
height = 0.5
)
## Plot gene track
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(org.Hs.eg.db)
plotGenes(
params = params,
y = "0b",
height = 0.5
)
## Plot genome label
plotGenomeLabel(
params = params,
y = "0b",
scale = "Mb"
)
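The “below” values in this stack can be checked against the numeric y-coordinates used in the earlier versions of the figure. The sketch below does that arithmetic in base R. The helper resolveY() is hypothetical and only mimics the rule described in the text; it assumes every plot is top-justified, so the bottom of a plot is its top plus its height.

```r
## Hypothetical helper (not part of plotgardener): turn a y specification
## into an absolute top coordinate, given the previous plot's top and height.
resolveY <- function(y, prevTop, prevHeight) {
    if (is.character(y) && grepl("b$", y)) {
        ## "0.25b" means 0.25 page units below the bottom of the previous plot
        offset <- as.numeric(sub("b$", "", y))
        return(prevTop + prevHeight + offset)
    }
    as.numeric(y)
}

## y specifications and heights from the stacked figure, in inches.
## The genome label has no height argument, so its height is left as NA.
tracks <- data.frame(
    plot = c("Hi-C", "arches", "signal", "GWAS", "genes", "label"),
    y = c("0.5", "0b", "0b", "0.25b", "0b", "0b"),
    height = c(2, 0.25, 0.5, 0.5, 0.5, NA),
    stringsAsFactors = FALSE
)

## Walk down the page, resolving each plot relative to the one before it
tops <- numeric(nrow(tracks))
for (i in seq_len(nrow(tracks))) {
    prevTop <- if (i == 1) 0 else tops[i - 1]
    prevHeight <- if (i == 1) 0 else tracks$height[i - 1]
    tops[i] <- resolveY(tracks$y[i], prevTop, prevHeight)
}
tracks$top <- tops
print(tracks)
```

The resolved tops are 0.5, 2.5, 2.75, 3.5, 4, and 4.5 inches, the same values passed explicitly as y in the earlier code. The 0.25-inch gap before the GWAS plot is now stated directly as "0.25b" instead of being implied by the difference between two absolute coordinates.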
In many multi-omic visualizations, multiple signal tracks are aligned and stacked to compare different kinds of signal data and/or signals from different samples. plotgardener does not normalize signal data based on variables like read depth, but it is possible to scale plotgardener signal plots to the same y-axis.
To determine the appropriate y-axis range, we first must get the maximum signal score from all of our datasets to be compared:
library(plotgardenerData)
data("IMR90_ChIP_H3K27ac_signal")
data("GM12878_ChIP_H3K27ac_signal")
maxScore <- max(c(IMR90_ChIP_H3K27ac_signal$score,
GM12878_ChIP_H3K27ac_signal$score))
print(maxScore)
#> [1] 40.91454
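The max() call above scans every score in both datasets. With more than two tracks, or when only the plotted region should set the scale, the same idea can be wrapped in a small helper. The following is a minimal base R sketch on toy data frames; the helper name sharedRange() and the chrom, start, and end column names are assumptions made for illustration, and only the score column is confirmed by the code above.

```r
## Hypothetical helper: common y-axis range across several signal tracks,
## optionally restricted to bins that overlap a genomic region.
sharedRange <- function(tracks, chrom = NULL, chromstart = NULL,
                        chromend = NULL) {
    maxima <- vapply(tracks, function(track) {
        if (!is.null(chrom)) {
            ## Keep only bins overlapping the requested region
            keep <- track$chrom == chrom &
                track$end >= chromstart & track$start <= chromend
            track <- track[keep, , drop = FALSE]
        }
        ## A track with no bins in the region contributes 0
        if (nrow(track) == 0) return(0)
        max(track$score)
    }, numeric(1))
    c(0, max(maxima))
}

## Toy data standing in for real signal tracks (assumed column layout)
trackA <- data.frame(chrom = "chr21", start = c(100, 200, 900),
                     end = c(199, 299, 999), score = c(5, 12, 40))
trackB <- data.frame(chrom = "chr21", start = c(100, 200),
                     end = c(199, 299), score = c(8, 20))

fullRange <- sharedRange(list(trackA, trackB))
regionRange <- sharedRange(list(trackA, trackB), "chr21", 100, 300)
print(fullRange)
print(regionRange)
```

In the toy data the largest score lies outside the requested region, so the region-restricted range is tighter than the one computed over all scores. Which of the two is appropriate depends on whether the figure should be comparable with plots of other regions.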
In each of our signal plotting calls, we will then use the range parameter to set the range of both our y-axes to c(0, maxScore). Here we can do this with our pgParams object:
params <- pgParams(
chrom = "chr21",
chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
range = c(0, maxScore)
)
We are now ready to plot, align, and compare our signal plots along the genomic x-axis and the score y-axis:
## Create a page
pageCreate(width = 7.5, height = 2.1, default.units = "inches",
showGuides = FALSE, xgrid = 0, ygrid = 0)
## Plot and place signal plots
signal1 <- plotSignal(
data = IMR90_ChIP_H3K27ac_signal, params = params,
x = 0.5, y = 0.25, width = 6.5, height = 0.65,
just = c("left", "top"), default.units = "inches"
)
signal2 <- plotSignal(
data = GM12878_ChIP_H3K27ac_signal, params = params,
linecolor = "#7ecdbb",
x = 0.5, y = 1, width = 6.5, height = 0.65,
just = c("left", "top"), default.units = "inches"
)
## Plot genome label
plotGenomeLabel(
chrom = "chr21",
chromstart = 28000000, chromend = 30300000,
assembly = "hg19",
x = 0.5, y = 1.68, length = 6.5,
default.units = "inches"
)
## Add text labels
plotText(
label = "IMR90", fontsize = 10, fontcolor = "#37a7db",
x = 0.5, y = 0.25, just = c("left", "top"),
default.units = "inches"
)
plotText(
label = "GM12878", fontsize = 10, fontcolor = "#7ecdbb",
x = 0.5, y = 1, just = c("left", "top"),
default.units = "inches"
)
sessionInfo()
#> R version 4.3.2 (2023-10-31)
#> Platform: x86_64-apple-darwin20 (64-bit)
#> Running under: macOS Sonoma 14.2.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: America/New_York
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats4 grid stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] org.Hs.eg.db_3.18.0
#> [2] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
#> [3] GenomicFeatures_1.54.4
#> [4] AnnotationDbi_1.64.1
#> [5] Biobase_2.62.0
#> [6] GenomicRanges_1.54.1
#> [7] GenomeInfoDb_1.38.8
#> [8] IRanges_2.36.0
#> [9] S4Vectors_0.40.2
#> [10] BiocGenerics_0.48.1
#> [11] plotgardenerData_1.8.0
#> [12] plotgardener_1.8.2
#>
#> loaded via a namespace (and not attached):
#> [1] DBI_1.2.2 bitops_1.0-7
#> [3] biomaRt_2.58.2 rlang_1.1.3
#> [5] magrittr_2.0.3 matrixStats_1.2.0
#> [7] compiler_4.3.2 RSQLite_2.3.5
#> [9] png_0.1-8 systemfonts_1.0.6
#> [11] vctrs_0.6.5 stringr_1.5.1
#> [13] pkgconfig_2.0.3 crayon_1.5.2
#> [15] fastmap_1.1.1 dbplyr_2.5.0
#> [17] XVector_0.42.0 utf8_1.2.4
#> [19] Rsamtools_2.18.0 rmarkdown_2.26
#> [21] strawr_0.0.91 ragg_1.3.0
#> [23] purrr_1.0.2 bit_4.0.5
#> [25] xfun_0.43 zlibbioc_1.48.2
#> [27] cachem_1.0.8 jsonlite_1.8.8
#> [29] progress_1.2.3 blob_1.2.4
#> [31] highr_0.10 DelayedArray_0.28.0
#> [33] BiocParallel_1.36.0 parallel_4.3.2
#> [35] prettyunits_1.2.0 R6_2.5.1
#> [37] plyranges_1.22.0 bslib_0.7.0
#> [39] stringi_1.8.3 RColorBrewer_1.1-3
#> [41] rtracklayer_1.62.0 jquerylib_0.1.4
#> [43] Rcpp_1.0.12 SummarizedExperiment_1.32.0
#> [45] knitr_1.45 Matrix_1.6-5
#> [47] tidyselect_1.2.1 rstudioapi_0.16.0
#> [49] abind_1.4-5 yaml_2.3.8
#> [51] codetools_0.2-19 curl_5.2.1
#> [53] lattice_0.22-6 tibble_3.2.1
#> [55] withr_3.0.0 KEGGREST_1.42.0
#> [57] evaluate_0.23 gridGraphics_0.5-1
#> [59] desc_1.4.3 BiocFileCache_2.10.1
#> [61] xml2_1.3.6 Biostrings_2.70.3
#> [63] filelock_1.0.3 pillar_1.9.0
#> [65] MatrixGenerics_1.14.0 generics_0.1.3
#> [67] RCurl_1.98-1.14 hms_1.1.3
#> [69] ggplot2_3.5.0 munsell_0.5.0
#> [71] scales_1.3.0 glue_1.7.0
#> [73] tools_4.3.2 BiocIO_1.12.0
#> [75] data.table_1.15.2 GenomicAlignments_1.38.2
#> [77] fs_1.6.3 XML_3.99-0.16.1
#> [79] colorspace_2.1-0 GenomeInfoDbData_1.2.11
#> [81] restfulr_0.0.15 cli_3.6.2
#> [83] rappdirs_0.3.3 textshaping_0.3.7
#> [85] fansi_1.0.6 S4Arrays_1.2.1
#> [87] dplyr_1.1.4 gtable_0.3.4
#> [89] yulab.utils_0.1.4 sass_0.4.9
#> [91] digest_0.6.35 SparseArray_1.2.4
#> [93] ggplotify_0.1.2 rjson_0.2.21
#> [95] memoise_2.0.1 htmltools_0.5.8
#> [97] pkgdown_2.0.7 lifecycle_1.0.4
#> [99] httr_1.4.7 bit64_4.0.5