Run motifbreakR

motifbreakR is a package to predict how much a SNP will disrupt a transcription factor binding motif (if it falls within one). Notes:

BSgenomeUsers must manually run library(BSgenome) before running any motifbreakR functions to successfully use this tool.
threshold= If filterp=TRUE, this argument indicates the p-value threshold. If filterp=FALSE, this argument instead indicates the pct threshold.

MOTIFBREAKR(
  rsid_list,
  results_dir = file.path(tempdir(), "results"),
  pwmList = NULL,
  pwmList_max = NULL,
  genome_build = NULL,
  organism = "Hsapiens",
  threshold = 0.85,
  show.neutral = FALSE,
  method = "default",
  calculate_pvals = TRUE,
  force_new = FALSE,
  background = c(A = 0.25, C = 0.25, G = 0.25, T = 0.25),
  granularity = NULL,
  nThread = 1,
  verbose = TRUE
)

Source

Publication GitHub Vingette

Arguments

rsid_list: RSIDs of SNPs to test for motif disruption between the reference and alternative alleles..
results_dir: Directory where results should be saved as a file named: <results_dir>/_genome_wide/motifbreakR/motifbreakR_results.rds. If NULL, results will not be saved to disk.
pwmList: An object of class MotifList containing the motifs that you wish to interrogate
pwmList_max: Limit the maximum number of PWM datasets tested (e.g. 10). If NULL, no limit it set.
genome_build: Genome build to use.
organism: Only include datasets in the pwmList performed in a particular organism.
threshold: Numeric; the maximum p-value for a match to be called or a minimum score threshold
show.neutral: Logical; include neutral changes in the output
method: Character; one of default, log, ic, or notrans; see details.
calculate_pvals: Calculate p-values for all SNPs tested. WARNING: May take a long time if many SNPs and/or PWM are selected.
force_new: If results of the same name already exist, overwrite them with new analyses (TRUE). Otherwise, import the existing results and skip the analyses (default: FALSE).
background: Numeric Vector; the background probabilities of the nucleotides
granularity: Numeric Vector; the granularity to which to round the PWM, larger values compromise full accuracy for speed of calculation. A value of NULL does no rounding.
nThread: Number of threads to parallelize analyses across.
verbose: Print messages.

Value

Motif disruption predictions in

GRanges format.

Examples

library(BSgenome) ## <-- IMPORTANT!
#> Loading required package: BiocGenerics
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
#>     as.data.frame, basename, cbind, colnames, dirname, do.call,
#>     duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
#>     lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
#>     pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
#>     tapply, union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> Loading required package: stats4
#> 
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#> 
#>     I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: GenomicRanges
#> Loading required package: Biostrings
#> Loading required package: XVector
#> 
#> Attaching package: 'Biostrings'
#> The following object is masked from 'package:base':
#> 
#>     strsplit
#> Loading required package: rtracklayer
#### Example fine-mapping results ####
merged_DT <- echodata::get_Nalls2019_merged()
#### Run motif analyses ####
mb_res <- MOTIFBREAKR(rsid_list = c("rs11175620"),
                      # limit the number of datasets tested 
                      # for demonstration purposes only
                      pwmList_max = 4,
                      calculate_pvals = FALSE)
#> Loading required namespace: motifbreakR
#> 
#> genome_build set to hg19 by default.
#> Loading required namespace: SNPlocs.Hsapiens.dbSNP144.GRCh37
#> Loading required namespace: BSgenome.Hsapiens.UCSC.hg19
#> Using genome_build hg19
#> + MOTIFBREAKR:: Converting SNP list into motifbreakR input format.
#> See system.file("LICENSE", package="MotifDb") for use restrictions.
#> Filtering pwmList to only include organism: Hsapiens
#> Only testing the first 4 datasets in pwmList.
#> + MOTIFBREAKR:: Identifying motifs and predicting disruptions.
#> reached end of SNPs list length = 1 with 1 potentially disruptive matches to 1 of 4 motifs.
#> + MOTIFBREAKR:: Saving results ==> /tmp/Rtmp3k34Cf/results/_genome_wide/motifbreakR/motifbreakR_results.rds

Source

Arguments

Value

See also

Examples