Rapid querying, colocalization, and plotting of summary stats from the eQTL Catalogue.
eQTL Catalogue currently contains 110 QTL datasets (full, genome-wide, standardized summary statistics and metadata) from 20 different studies (including GTEx V8), across many tissues/cell types/conditions (updated: 7/5/20).
The functions in catalogueR are partly derived from the eQTL Catalogue tutorial.
Additional eQTL Catalogue resources:
NOTE: The ALT allele is always the effect allele in eQTL Catalogue.
To ensure all dependencies are installed and don’t conflict with each other, you can create a conda environment by downloading this yaml file and entering the following in the command line:
conda env create -f <path_to_yml>/catalogueR.yml
If you don’t use the conda env, you will also need to make sure tabix is installed.
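A quick way to confirm that tabix is visible from your R session is sketched below. It uses only base R (`Sys.which`) and is not part of catalogueR; the variable names are illustrative.

```r
# Look up the tabix executable on the PATH.
# Sys.which() returns an empty string when the program is not found.
tabix_path <- unname(Sys.which("tabix"))
tabix_found <- nzchar(tabix_path)

if (tabix_found) {
  message("tabix found at: ", tabix_path)
} else {
  message("tabix not found on PATH; install it (or use the conda env) before querying.")
}
```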
if(!"devtools" %in% installed.packages()){install.packages("devtools")}
devtools::install_github("RajLabMSSM/catalogueR")
Supply one or more paths to [GWAS] summary stats files (one per locus) and automatically download any eQTL data within that range. The files can be in any of these formats, either gzip-compressed (.gz) or uncompressed: .csv, .tsv, space-separated.
The summary stats files must have the following column names (order doesn’t matter):
SNP (rsid for each SNP)
CHR (chromosome; with or without the “chr” prefix is fine)
POS (basepair position)
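Before running a query, it can help to check that a summary stats table has the required columns. The following is a minimal base-R sketch: `check_sumstats_cols` is a hypothetical helper (not a catalogueR function), and the toy table uses placeholder values rather than real GWAS results.

```r
# Hypothetical helper: return the required columns that are missing from a table.
check_sumstats_cols <- function(dat, required_cols = c("SNP", "CHR", "POS")) {
  setdiff(required_cols, colnames(dat))
}

# Toy summary stats table with placeholder values (illustration only).
sumstats <- data.frame(
  SNP = c("rs1", "rs2"),
  CHR = c("chr1", "1"),   # "chr" prefix is optional
  POS = c(1000L, 2000L),
  P   = c(0.01, 0.5),
  stringsAsFactors = FALSE
)

missing_cols <- check_sumstats_cols(sumstats)
if (length(missing_cols) > 0) {
  message("Missing required columns: ", paste(missing_cols, collapse = ", "))
} else {
  message("All required columns are present.")
}
```

For a file on disk, the same check could be applied to the table after reading it in with whichever reader you normally use.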
sumstats_paths <- example_sumstats_paths()
gwas.qtl_paths <- eQTL_Catalogue.query(sumstats_paths = sumstats_paths,
                                       qtl_search = c("myeloid", "Alasoo_2018"),
                                       output_dir = "./catalogueR_queries",
                                       split_files = TRUE,
                                       merge_with_gwas = TRUE,
                                       force_new_subset = TRUE,
                                       nThread = 4)
GWAS.QTL <- gather_files(file_paths = gwas.qtl_paths)
# Interactive datatable of results
## WARNING: Don't use this function on large datatables, might cause freezing.
createDT(head(GWAS.QTL))