motifbreakR is a package to predict how much a SNP will disrupt
a transcription factor binding motif (if it falls within one).
Notes:
BSgenomeUsers must manually run library(BSgenome)
before running any motifbreakR functions
to successfully use this tool.
threshold=
If filterp=TRUE
, this argument indicates the p-value threshold.
If filterp=FALSE
, this argument instead indicates the pct threshold.
MOTIFBREAKR(
rsid_list,
results_dir = file.path(tempdir(), "results"),
pwmList = NULL,
pwmList_max = NULL,
genome_build = NULL,
organism = "Hsapiens",
threshold = 0.85,
show.neutral = FALSE,
method = "default",
calculate_pvals = TRUE,
force_new = FALSE,
background = c(A = 0.25, C = 0.25, G = 0.25, T = 0.25),
granularity = NULL,
nThread = 1,
verbose = TRUE
)
RSIDs of SNPs to test for motif disruption between the reference and alternative alleles..
Directory where results should be saved
as a file named:
<results_dir>/_genome_wide/motifbreakR/motifbreakR_results.rds.
If NULL
, results will not be saved to disk.
An object of class MotifList
containing the motifs that
you wish to interrogate
Limit the maximum number of PWM datasets tested
(e.g. 10
). If NULL
, no limit it set.
Genome build to use.
Only include datasets in the pwmList
performed in a particular organism.
Numeric; the maximum p-value for a match to be called or a minimum score threshold
Logical; include neutral changes in the output
Character; one of default
, log
, ic
, or notrans
; see
details.
Calculate p-values for all SNPs tested. WARNING: May take a long time if many SNPs and/or PWM are selected.
If results of the same name already exist,
overwrite them with new analyses (TRUE
).
Otherwise, import the existing results and skip the analyses
(default: FALSE
).
Numeric Vector; the background probabilities of the nucleotides
Numeric Vector; the granularity to which to round the PWM,
larger values compromise full accuracy for speed of calculation. A value of
NULL
does no rounding.
Number of threads to parallelize analyses across.
Print messages.
Other motifbreakR:
MOTIFBREAKR_filter_by_metadata()
,
MOTIFBREAKR_summarize()
library(BSgenome) ## <-- IMPORTANT!
#> Loading required package: BiocGenerics
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
#> lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#> pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
#> tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> Loading required package: stats4
#>
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#>
#> I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: GenomicRanges
#> Loading required package: Biostrings
#> Loading required package: XVector
#>
#> Attaching package: 'Biostrings'
#> The following object is masked from 'package:base':
#>
#> strsplit
#> Loading required package: rtracklayer
#### Example fine-mapping results ####
merged_DT <- echodata::get_Nalls2019_merged()
#### Run motif analyses ####
mb_res <- MOTIFBREAKR(rsid_list = c("rs11175620"),
# limit the number of datasets tested
# for demonstration purposes only
pwmList_max = 4,
calculate_pvals = FALSE)
#> Loading required namespace: motifbreakR
#>
#> genome_build set to hg19 by default.
#> Loading required namespace: SNPlocs.Hsapiens.dbSNP144.GRCh37
#> Loading required namespace: BSgenome.Hsapiens.UCSC.hg19
#> Using genome_build hg19
#> + MOTIFBREAKR:: Converting SNP list into motifbreakR input format.
#> See system.file("LICENSE", package="MotifDb") for use restrictions.
#> Filtering pwmList to only include organism: Hsapiens
#> Only testing the first 4 datasets in pwmList.
#> + MOTIFBREAKR:: Identifying motifs and predicting disruptions.
#> reached end of SNPs list length = 1 with 1 potentially disruptive matches to 1 of 4 motifs.
#> + MOTIFBREAKR:: Saving results ==> /tmp/Rtmp3k34Cf/results/_genome_wide/motifbreakR/motifbreakR_results.rds