R/MOTIFBREAKR_filter.R
MOTIFBREAKR_filter.Rd
For each SNP, at least one allele achieves the required p-value threshold (p < 1e-4). The seqMatch column shows the reference genome sequence at that location, with the variant position in uppercase. pctRef and pctAlt give the motif's score in the sequence as a percentage of the best score that motif could achieve on an ideal sequence, that is, (scoreVariant − minscorePWM)/(maxscorePWM − minscorePWM). The absolute scores for our method are given in scoreRef and scoreAlt, along with their respective p-values.
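The percentage score described above is a min-max rescaling of the raw motif score. A minimal sketch in base R, assuming a hypothetical helper named pct_score and invented toy numbers (neither comes from the package):

```r
# Hypothetical helper (not part of the package): rescale a raw motif score
# to the fraction of the best score the PWM could achieve.
pct_score <- function(score, min_score, max_score) {
  stopifnot(max_score > min_score)
  (score - min_score) / (max_score - min_score)
}

# Toy numbers, invented for illustration only.
pct_ref <- pct_score(score = 6, min_score = -4, max_score = 12)
pct_alt <- pct_score(score = 2, min_score = -4, max_score = 12)
```

Here the reference allele scores higher relative to the motif's range than the alternate allele, which is the kind of difference pctRef and pctAlt are meant to expose.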
MOTIFBREAKR_filter(
  mb_res,
  merged_DT,
  filter_by_locus = NULL,
  remove_NA_geneSymbol = TRUE,
  pct_threshold = NULL,
  pvalue_threshold = 1e-04,
  qvalue_threshold = 0.05,
  effect_strengths = NULL,
  snp_filter = "Support>0",
  top_geneSymbol_hits = NULL,
  no_no_loci = NULL,
  verbose = TRUE
)
mb_res: Results generated by MOTIFBREAKR, in GRanges format.
merged_DT: Table with columns Locus and SNP to filter mb_res by.
filter_by_locus: Filter mb_res to only include SNPs present in a given Locus (e.g. "BST1"). Set to NULL (default) to skip this filtering. Requires the merged_DT argument to be supplied.
remove_NA_geneSymbol: Remove results where geneSymbol is NA.
pct_threshold: Remove rows below the percentage of the optimal binding score (PCT) threshold.
pvalue_threshold: Remove rows that do not pass the raw significance (p-value) threshold.
qvalue_threshold: Remove rows that do not pass the multiple-testing-corrected significance (q-value) threshold.
effect_strengths: Only include results with certain effect strengths.
snp_filter: Condition to filter SNPs by, after mb_res and merged_DT have been merged into one table.
top_geneSymbol_hits: Only include the top N results per gene symbol, ranked by absolute risk_score, where N=top_geneSymbol_hits. If top_geneSymbol_hits=NULL, no such filtering is performed.
no_no_loci: Filter out SNPs contained within specific loci in the merged_DT table.
verbose: Print messages.
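To illustrate how several of these filters could compose, here is a rough base R sketch on a toy table. It is not the package's implementation: the pvalue column name, the gene names, and all values are invented, and evaluating the snp_filter string with eval(parse()) is only one plausible way such a condition might be applied.

```r
# Toy merged table; all values and the `pvalue` column name are invented.
toy <- data.frame(
  SNP        = c("rs1", "rs1", "rs2", "rs3", "rs4"),
  geneSymbol = c("GENE_A", "GENE_A", "GENE_B", NA, "GENE_B"),
  pvalue     = c(1e-6, 5e-5, 2e-5, 1e-7, 0.2),
  risk_score = c(-2.0, 0.5, 1.2, 3.0, 0.1),
  Support    = c(2, 2, 1, 3, 0),
  stringsAsFactors = FALSE
)

# snp_filter-style condition: a string evaluated against the table's columns.
snp_filter <- "Support>0"
keep <- eval(parse(text = snp_filter), envir = toy)
res <- toy[keep, ]

# remove_NA_geneSymbol = TRUE: drop rows without a gene symbol.
res <- res[!is.na(res$geneSymbol), ]

# pvalue_threshold = 1e-04: keep rows whose p-value is under the threshold.
res <- res[res$pvalue < 1e-4, ]

# top_geneSymbol_hits = 1: keep the top row per gene by absolute risk_score.
top_n <- 1
res <- res[order(res$geneSymbol, -abs(res$risk_score)), ]
res <- do.call(rbind, lapply(split(res, res$geneSymbol), head, n = top_n))
```

Ranking by abs(risk_score) means a strongly negative score can outrank a weaker positive one for the same gene, which matches the "absolute risk_score" wording in the top_geneSymbol_hits description.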
merged_DT <- echodata::get_Nalls2019_merged()
mb_res <- MOTIFBREAKR(rsid_list = c("rs11175620"),
                      # limit the number of datasets tested
                      # for demonstration purposes only
                      pwmList_max = 4,
                      calculate_pvals = FALSE)
#> genome_build set to hg19 by default.
#> Using genome_build hg19
#> + MOTIFBREAKR:: Using pre-existing tmp file.
mb_res_filt <- MOTIFBREAKR_filter(mb_res = mb_res,
                                  merged_DT = merged_DT)
#> Creating 'risk_' columns
#> + MOTIFBREAKR 1 SNPs in results.
#> + MOTIFBREAKR 1 SNPs in results @ `snp_filter='Support>0'`
#> + MOTIFBREAKR 0 SNPs in results @ `pvalue_threshold=1e-04`
#> + MOTIFBREAKR 0 SNPs in results @ `qvalue_threshold=0.05`