Compute LD from 1000 Genomes

Downloads the subset of the 1000 Genomes (1KG) VCF that matches your locus coordinates, then uses `ld` to calculate LD on the fly.

LD_1KG(
  locus_dir,
  dat,
  LD_reference = "1KGphase1",
  superpopulation = NULL,
  samples = NULL,
  local_storage = NULL,
  leadSNP_LD_block = FALSE,
  force_new_vcf = FALSE,
  force_new_MAF = FALSE,
  fillNA = 0,
  stats = "R",
  verbose = TRUE
)

Arguments

locus_dir	Storage directory to use.
dat	GWAS summary statistics subset to query the LD panel with.
LD_reference	LD reference to use. "1KGphase1": 1000 Genomes Project Phase 1; "1KGphase3": 1000 Genomes Project Phase 3; "UKB": pre-computed LD from a British, European-descent subset of UK Biobank.
superpopulation	Superpopulation to subset the LD panel by (used only if `LD_reference` is "1KGphase1" or "1KGphase3").
samples	Sample names to subset the VCF by before computing LD.
local_storage	Storage folder for previously downloaded LD files. If `LD_reference` is "1KGphase1" or "1KGphase3", `local_storage` is where VCF files are stored. If `LD_reference` is "UKB", `local_storage` is where compressed NumPy LD array (npz) files are stored. Set to `NULL` to download the VCF or npz files from the remote storage system instead.
leadSNP_LD_block	Only return SNPs within the same LD block as the lead SNP (the SNP with the smallest p-value).
fillNA	Value that replaces the pairwise LD (r) between two SNPs when it is `NA` (default: 0).
verbose	Print messages.
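
To illustrate what `fillNA` does to an LD matrix, here is a minimal sketch on invented genotypes. It uses base R's `cor()` as a stand-in for the LD routine, so it is not the function's actual implementation; the matrix `geno` and the names `ld_r`, `fill_value` and `ld_filled` are hypothetical.

```r
# Toy genotype matrix: rows are samples, columns are SNPs (0/1/2 allele counts).
# All values are invented for illustration.
geno <- cbind(
  rs1 = c(0, 1, 2, 1, 0, 2),
  rs2 = c(0, 1, 2, 1, 0, 2),  # identical to rs1
  rs3 = c(2, 1, 0, 1, 2, 0),  # mirror image of rs1
  rs4 = c(1, 1, 1, 1, 1, 1)   # monomorphic: r with any other SNP is undefined
)

# Pairwise Pearson correlation (r) between SNP columns.
# cor() warns about the zero-variance column; that warning is expected here.
ld_r <- suppressWarnings(cor(geno))

# Replace undefined entries with the fill value, mirroring `fillNA = 0`.
fill_value <- 0
ld_filled <- ld_r
ld_filled[is.na(ld_filled)] <- fill_value

print(round(ld_filled, 2))
```

Filling with 0 treats an undefined pair as unlinked, which keeps the matrix free of `NA` values for downstream steps that cannot handle them.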

Details

This approach is taken because other API query tools limit the size of the window that can be queried. Downloading the VCF subset directly avoids that limitation, allowing you to fine-map loci more completely.
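
A call covering a whole locus might be set up as in the sketch below. The column names in `dat` (`SNP`, `CHR`, `POS`, `P`), the `"EUR"` superpopulation value and the temporary `locus_dir` are assumptions made for illustration, not details confirmed by this page. The call itself is guarded so the snippet also runs when `LD_1KG()` is not loaded.

```r
# Hypothetical summary statistics for one locus.
# Column names (SNP, CHR, POS, P) are assumptions, not taken from this page.
dat <- data.frame(
  SNP = c("rs1", "rs2", "rs3"),
  CHR = c(4, 4, 4),
  POS = c(1000000, 1250000, 2000000),
  P   = c(1e-3, 5e-9, 2e-4),
  stringsAsFactors = FALSE
)

# The lead SNP is the one with the smallest p-value.
lead_snp <- dat$SNP[which.min(dat$P)]

# Width of the locus in base pairs (here a 1 Mb window).
window_bp <- diff(range(dat$POS))

# Arguments named as in the usage block above; the values are illustrative.
ld_args <- list(
  locus_dir = file.path(tempdir(), "example_locus"),
  dat = dat,
  LD_reference = "1KGphase3",
  superpopulation = "EUR",   # assumed value
  leadSNP_LD_block = FALSE,
  fillNA = 0,
  verbose = TRUE
)

# Only attempt the call when LD_1KG() is available in the session.
if (exists("LD_1KG", mode = "function")) {
  ld_result <- do.call(LD_1KG, ld_args)
}
```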
