Query IMPACT annotations — IMPACT

Query annotations and LD scores generated by IMPACT (Inference and Modeling of Phenotype-related ACtive Transcription). IMPACT predicts transcription factor (TF) binding at motif sites by learning the epigenomic profiles at those sites (primarily from ENCODE). All data are aligned to the hg19 genome build. The data have also been reformatted as tabix-indexed files and uploaded to Zenodo to allow rapid querying.

IMPACT_query(
  query_dat,
  types = c("annot", "ldscore"),
  populations = c("EAS", "EUR"),
  query_genome = "hg19",
  target_genome = "hg19",
  overlapping_only = TRUE,
  output_format = c("wide", "long", "list"),
  add_metadata = FALSE,
  conda_env = "echoR_mini",
  nThread = 1,
  verbose = TRUE
)

Arguments

query_dat

Variant-level summary statistics.

types

File types to include.

populations

Population ancestries to include ("EAS" = East Asian; "EUR" = European).

query_genome

Genome build that query_dat is aligned to.

target_genome

Genome build of the IMPACT annotation files being queried (hg19).

overlapping_only

Remove variants that do not overlap with the positions in query_dat.
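The example log on this page notes that a single range per chromosome is queried, "plus anything in between", so a raw query can return positions that are not in query_dat. The following is a minimal sketch of the kind of filtering overlapping_only = TRUE describes, using toy data. The column names (CHR, POS, score) and the merge-based approach are assumptions for illustration, not the function's actual implementation.

```r
library(data.table)

# Toy query: two variants on chromosome 4 (hypothetical IDs and positions)
query_dat <- data.table(SNP = c("rs1", "rs2"), CHR = 4, POS = c(100, 200))

# Toy result of a block query: includes positions between the query variants
hits <- data.table(CHR = 4,
                   POS = c(100, 150, 200, 250),
                   score = c(0.1, 0.2, 0.3, 0.4))

# Keep only the rows whose chromosome and position appear in query_dat
overlapping <- merge(hits, query_dat, by = c("CHR", "POS"))
```

Here overlapping keeps the rows at positions 100 and 200 and drops the in-between rows at 150 and 250.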

output_format

Output format options:

"wide" : Spread annotation across columns and keep 1 row/SNP.
"long" : Melt annotation across rows and allow multiple rows/SNP.
"list" : Do not perform merging of queries and instead return results as a named list, where the name is the file the annotation came from.
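To make the difference between the "wide" and "long" layouts concrete, here is a small sketch with toy data. The SNP IDs, annotation names (annot_A, annot_B) and the score column are hypothetical and do not reflect real IMPACT output or the function's internal code; the sketch only shows the two shapes using data.table.

```r
library(data.table)

# "wide" layout: one row per SNP, one column per annotation (toy values)
wide <- data.table(
  SNP = c("rs1", "rs2"),
  annot_A = c(0.10, 0.80),
  annot_B = c(0.30, 0.05)
)

# "long" layout: one row per SNP/annotation pair, so multiple rows per SNP
long <- melt(wide, id.vars = "SNP",
             variable.name = "annotation", value.name = "score")

# Spreading the long table again recovers one row per SNP
wide_again <- dcast(long, SNP ~ annotation, value.var = "score")
```

With two SNPs and two annotations, long has four rows while wide and wide_again each have two.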

add_metadata

Add metadata about each sample (Warning: can substantially increase the dataset size).

conda_env

Conda environment to search in (default: "echoR_mini"). If NULL, will search all conda environments.

nThread

Number of threads to use.

verbose

Print messages.

Value

A data.table of annotations merged with query_dat, or a named list of unmerged query results when output_format = "list".

Examples

query_dat <- echodata::BST1[1:50,]
annot_dt <- IMPACT_query(query_dat=query_dat, populations="EUR")
#> Constructing GRanges query using min/max ranges across one or more chromosomes.
#> + as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
#> ========= echotabix::query =========
#> query_dat is already a GRanges object. Returning directly.
#> Inferred format: 'table'
#> Querying tabular tabix file using: Rsamtools.
#> Checking query chromosome style is correct.
#> Chromosome format: 1
#> Retrieving data.
#> Converting query results to data.table.
#> Processing query: 4:14884541-16649679
#> Adding 'query' column to results.
#> Retrieved data with 50 rows
#> Saving query ==> /tmp/Rtmp3k34Cf/file5da929a41f76.gz
#> ========= echotabix::query =========
#> query_dat is already a GRanges object. Returning directly.
#> Inferred format: 'table'
#> Querying tabular tabix file using: Rsamtools.
#> Checking query chromosome style is correct.
#> Chromosome format: 1
#> Retrieving data.
#> Converting query results to data.table.
#> Processing query: 4:14884541-16649679
#> Adding 'query' column to results.
#> Retrieved data with 13 rows
#> Saving query ==> /tmp/Rtmp3k34Cf/file5da91d558a3f.gz