Query the 1000 Genomes Project for a subset of its individual-level VCF files.

LD_1KG_download_vcf(
  dat,
  LD_reference = "1KGphase1",
  superpopulation = NULL,
  samples = NULL,
  local_storage = NULL,
  locus_dir,
  force_new_vcf = FALSE,
  verbose = TRUE
)

Arguments

dat

GWAS summary statistics subset to query the LD panel with.

LD_reference

LD reference to use:

  • "1KGphase1" : 1000 Genomes Project Phase 1

  • "1KGphase3" : 1000 Genomes Project Phase 3

  • "UKB" : Pre-computed LD from a British, European-descent subset of UK Biobank.

superpopulation

Superpopulation to subset the LD panel by (used only if LD_reference is "1KGphase1" or "1KGphase3").

samples

Sample names to subset the VCF by before computing LD.

local_storage

Storage folder for previously downloaded LD files. If LD_reference is "1KGphase1" or "1KGphase3", local_storage is where VCF files are stored. If LD_reference is "UKB", local_storage is where the compressed NumPy array (npz) LD files are stored. Set to NULL to download the VCF or npz files from remote storage instead.

locus_dir

Directory in which to store the results for this locus.

force_new_vcf

Force the creation of a new LD file even if one exists.

verbose

Print messages.
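The reuse-or-recreate behaviour described for force_new_vcf can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper get_or_create_file(), its arguments, and the demo file path are all hypothetical.

```r
# Hypothetical helper (not part of the package): reuse an existing file
# unless the caller explicitly asks for a fresh one.
get_or_create_file <- function(path, create_fun, force_new = FALSE,
                               verbose = TRUE) {
  if (file.exists(path) && !force_new) {
    if (verbose) message("Using existing file: ", path)
    return(path)
  }
  if (verbose) message("Creating new file: ", path)
  # Make sure the parent directory exists before writing.
  dir.create(dirname(path), showWarnings = FALSE, recursive = TRUE)
  create_fun(path)
  path
}

# Demo with a stub "download" that just writes a VCF header line.
# The directory layout below is an assumption made for this example.
demo_path <- file.path(tempdir(), "demo_locus", "LD", "subset.vcf")
write_stub <- function(p) writeLines("##fileformat=VCFv4.2", p)

get_or_create_file(demo_path, write_stub)                    # creates the file
get_or_create_file(demo_path, write_stub)                    # reuses it
get_or_create_file(demo_path, write_stub, force_new = TRUE)  # recreates it
```

Passing the creation step as a function keeps the caching decision separate from whatever actually produces the file.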

Details

data("BST1")
vcf_subset.popDat <- LD_1KG_download_vcf(
  dat = BST1,
  LD_reference = "1KGphase1",
  locus_dir = file.path(tempdir(), locus_dir)
)

See also