R/plot_locus_multi.R
plot_locus_multi.Rd
Plot multi-GWAS/QTL or multi-ancestry(i.e. trans-ethnic) fine-mapping results generated by tools like PAINTOR.
plot_locus_multi(
dat_ls,
LD_ls,
locus_dir,
conditions = names(dat_ls),
show_plot = TRUE,
verbose = TRUE,
...
)
A named list of data.tables generated by
echolocatoR::finemap_loci
.
A named list of link disequilibrium (LD) matrices
(one per item in dat_ls
).
Storage directory to use.
Conditions to group dat_ls
by.
The length of conditions
must equal the number of
items in dat_ls
.
Print plot to screen.
Print messages.
Arguments passed on to plot_locus
dataset_type
Dataset type (e.g. "GWAS" or "eQTL").
color_r2
Whether to color data points (SNPs) by how strongly they correlate with the lead SNP (i.g. LD measured in terms of r2).
finemap_methods
Fine-mapping methods to plot tracks for, where the y-axis show the Posterior Probabilities (PP) of each SNP being causal.
track_order
The order in which tracks should appear (from top to bottom).
track_heights
The height of each track (from top to bottom).
plot_full_window
Include a track with a Manhattan plot of the full GWAS/eQTL locus (not just the zoomed-in portion).
dot_summary
Include a dot-summary plot that highlights the Lead, Credible Set, and Consensus SNPs.
qtl_suffixes
If columns with QTL data is included in dat
,
you can indicate which columns those are with one or more string suffixes
(e.g. qtl_suffixes=c(".eQTL1",".eQTL2")
to use the columns
"P.QTL1", "Effect.QTL1", "P.QTL2", "Effect.QTL2").
mean.PP
Include a track showing mean Posterior Probabilities (PP) averaged across all fine-mapping methods.
sig_cutoff
Filters out SNPs to plot based on an (uncorrected) p-value significance cutoff.
gene_track
Include a track showing gene bodies.
max_transcripts
Maximum number of transcripts per gene.
tx_biotypes
Transcript biotypes to include in the gene model track.
By default (NULL
), all transcript biotypes will be included.
See get_tx_biotypes for a full list of
all available biotypes
point_size
Size of each data point.
point_alpha
Opacity of each data point.
snp_group_lines
Include vertical lines to help highlight SNPs belonging to one or more of the following groups: Lead, Credible Set, Consensus.
labels_subset
Include colored shapes and RSID labels to help highlight SNPs belonging to one or more of the following groups: Lead, Credible Set, Consensus.
xtext
Include x-axis title and text for each track (not just the lower-most one).
show_legend_genes
Show the legend for the gene_track
.
zoom_exceptions_str
Names of tracks to exclude when zooming.
xgr_libnames
Passed to XGR_plot.
Which XGR annotations to check overlap with.
For full list of libraries see
here.
Passed to the RData.customised
argument in xRDataLoader.
Examples:
"ENCODE_TFBS_ClusteredV3_CellTypes"
"ENCODE_DNaseI_ClusteredV3_CellTypes"
"Broad_Histone"
xgr_n_top
Passed to XGR_plot. Number of top annotations to be plotted (passed to XGR_filter_sources and then XGR_filter_assays).
nott_epigenome
Include tracks showing brain cell-type-specific epigenomic data from Nott et al. (2019).
nott_regulatory_rects
Include track generated by NOTT2019_epigenomic_histograms.
nott_show_placseq
Include track generated by NOTT2019_plac_seq_plot.
nott_binwidth
When including Nott et al. (2019) epigenomic data in the track plots, adjust the bin width of the histograms.
nott_bigwig_dir
Instead of pulling Nott et al. (2019) epigenomic data from the UCSC Genome Browser, use a set of local bigwig files.
roadmap
Find and plot annotations from Roadmap.
roadmap_n_top
Passed to ROADMAP_plot. Number of top annotations to be plotted (passed to ROADMAP_query).
roadmap_query
Only plot annotations from Roadmap whose
metadata contains a string or any items from a list of strings
(e.g. "brain"
or c("brain","liver","monocytes")
).
credset_thresh
The minimum fine-mapped posterior probability
for a SNP to be considered part of a Credible Set.
For example, credset_thresh=.95
means that all Credible Set SNPs
will be 95% Credible Set SNPs.
facet_formula
Formula to facet plots by. See facet_grid for details.
save_plot
Save plot as RDS file.
plot_format
Format to save plot as when saving with ggsave.
save_RDS
Save the tracks as an RDS file (Warning: These plots take up a lot disk space).
return_list
Return a named list with each track as a separate plot
(default: FALSE
). If TRUE
, will return a merged plot using
wrap_plots.
dat
Data to query transcripts with.
LD_reference
LD reference to use:
"1KGphase1" : 1000 Genomes Project Phase 1 (genome build: hg19).
"1KGphase3" : 1000 Genomes Project Phase 3 (genome build: hg19).
"UKB" : Pre-computed LD from a British European-decent subset of UK Biobank. Genome build : hg19
"<vcf_path>" : User-supplied path to a custom VCF file
to compute LD matrix from.
Accepted formats: .vcf / .vcf.gz / .vcf.bgz
Genome build : defined by user with target_genome
.
"<matrix_path>" : User-supplied path to a pre-computed LD matrix
Accepted formats: .rds / .rda / .csv /
.tsv / .txt
Genome build : defined by user with target_genome
.
conda_env
Conda environments to search in.
If NULL
(default), will search all conda environments.
nThread
Number of threads to parallelize over.
LD_matrix
LD matrix.
zoom
Zoom into the center of the locus when plotting (without editing the fine-mapping results file). You can provide either:
The size of your plot window in terms of basepairs
(e.g. zoom=50000
for a 50kb window).
How much you want to zoom in (e.g. zoom="1x"
for the full locus, zoom="2x"
for 2x zoom into the center of the locus, etc.).
You can pass a list of window sizes (e.g. c(50000,100000,500000)
)
to automatically generate
multiple views of each locus.
This can even be a mix of different style inputs: e.g.
c("1x","4.5x",25000)
.
genomic_units
Which genomic units to return window limits in.
density_adjust
Passed to adjust
argument in
geom_density.
strip.text.y.angle
Angle of the y-axis facet labels.
dpi
dpi to use for raster graphics
height
height (defaults to the height of current plotting window)
width
width (defaults to the width of current plotting window)
consensus_thresh
The minimum number of fine-mapping tools in which a SNP is in the Credible Set in order to be included in the "Consensus_SNP" column.
locus_dir <- file.path(tempdir(),echodata::locus_dir)
#### Make dat_ls ####
dat <- echodata::filter_snps(echodata::BST1, bp_distance = 10000)
#> FILTER:: Filtering by SNP features.
#> + FILTER:: Post-filtered data: 114 x 26
dat_ls <- list(gwas1=dat, gwas2=dat)
#### Make LD_ls ####
LD_matrix <- echodata::BST1_LD_matrix
LD_ls <- list(ancestry1=LD_matrix, ancestry2=LD_matrix)
#### Make plot ####
plot_list <- plot_locus_multi(dat_ls = dat_ls,
LD_ls = LD_ls,
locus_dir = locus_dir)
#> LD_matrix detected. Coloring SNPs by LD with lead SNP.
#> Filling r/r2 NAs with 0
#> LD_matrix detected. Coloring SNPs by LD with lead SNP.
#> Filling r/r2 NAs with 0
#> +-------- Locus Plot: BST1 --------+
#> + support_thresh = 2
#> + Calculating mean Posterior Probability (mean.PP)...
#> + 4 fine-mapping methods used.
#> + 8 Credible Set SNPs identified.
#> + 4 Consensus SNPs identified.
#> + Filling NAs in CS cols with 0.
#> + Filling NAs in PP cols with 0.
#> ++ echoplot:: GWAS full window track
#> ++ echoplot:: GWAS track
#> ++ echoplot:: Merged fine-mapping track
#> Melting PP and CS from 5 fine-mapping methods.
#> + echoplot:: Constructing SNP labels.
#> Adding SNP group labels to locus plot.
#> ++ echoplot:: Adding Gene model track.
#> Converting dat to GRanges object.
#> max_transcripts= 1 .
#> 1 transcripts from 1 genes returned.
#> Fetching data...
#> OK
#> Parsing exons...
#> OK
#> Defining introns...
#> OK
#> Defining UTRs...
#> OK
#> Defining CDS...
#> OK
#> aggregating...
#> Done
#> Constructing graphics...
#> + Adding vertical lines to highlight SNP groups.
#> +>+>+>+>+ zoom = 1x +<+<+<+<+
#> + echoplot:: Get window suffix...
#> + echoplot:: Removing GWAS full window track @ zoom=1x
#> + Removing subplot margins...
#> + Reordering tracks...
#> + Ensuring last track shows genomic units.
#> + Aligning xlimits for each subplot...
#> + Checking track heights...