Calculate and/or query linkage disequilibrium (LD) from reference panels (UK Biobank, 1000 Genomes) or from a user-supplied pre-computed LD matrix.

load_or_create(
  locus_dir,
  dat,
  force_new_LD = FALSE,
  LD_reference = c("1KGphase1", "1KGphase3", "UKB"),
  ref_genome = "hg19",
  samples = NULL,
  superpopulation = NULL,
  local_storage = NULL,
  leadSNP_LD_block = FALSE,
  fillNA = 0,
  verbose = TRUE,
  remove_tmps = TRUE,
  as_sparse = TRUE,
  download_method = "axel",
  nThread = 1
)

Arguments

locus_dir

Storage directory to use.

dat

GWAS summary statistics subset to query the LD panel with.

force_new_LD

Create a new LD file even if one already exists.

LD_reference

LD reference to use:

  • "1KGphase1" : 1000 Genomes Project Phase 1

  • "1KGphase3" : 1000 Genomes Project Phase 3

  • "UKB" : Pre-computed LD from a British European-descent subset of UK Biobank.

ref_genome

Genome build of the LD panel (used only if providing a custom LD panel).

samples

Sample names to subset the VCF by before computing LD.

superpopulation

Superpopulation to subset the LD panel by (used only if LD_reference is "1KGphase1" or "1KGphase3").

local_storage

Storage folder for previously downloaded LD files. If LD_reference is "1KGphase1" or "1KGphase3", local_storage is where VCF files are stored. If LD_reference is "UKB", local_storage is where LD compressed numpy array (npz) files are stored. Set to NULL to download VCFs/LD npz from remote storage system.

leadSNP_LD_block

Only return SNPs within the same LD block as the lead SNP (the SNP with the smallest p-value).
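
As a minimal sketch of what "lead SNP" means here, the snippet below picks the SNP with the smallest p-value from a toy table. The column names `SNP` and `P` are assumptions made for illustration; the source does not state which columns `dat` must contain.

```r
# Toy summary statistics (hypothetical values; column names are assumptions)
dat <- data.frame(
  SNP = c("rs1", "rs2", "rs3", "rs4"),
  P   = c(0.04, 1e-8, 0.2, 3e-5)
)

# The lead SNP is the row with the smallest p-value
lead_snp <- dat$SNP[which.min(dat$P)]
print(lead_snp)
```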

fillNA

Value to fill LD matrix NAs with.
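
The effect of fillNA can be pictured with a small base R sketch. This illustrates the idea of replacing missing entries under the default value of 0; it is not the package's internal code.

```r
# Toy 3 x 3 LD matrix with one missing pairwise value (hypothetical data)
snps <- c("rs1", "rs2", "rs3")
ld <- matrix(
  c(1.0, 0.8, NA,
    0.8, 1.0, 0.3,
    NA,  0.3, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(snps, snps)
)

fillNA <- 0                             # same default as in the usage above
ld_filled <- ld
ld_filled[is.na(ld_filled)] <- fillNA   # replace every NA with the fill value
print(ld_filled)
```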

verbose

Print messages.

remove_tmps

Remove all intermediate files, such as VCF, npz, and plink files.

as_sparse

Convert the LD matrix to a sparse matrix.
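
A sparse representation stores only the non-zero entries, which helps when many SNP pairs have zero (or zero-filled) LD. The sketch below uses the Matrix package to show the kind of conversion involved; that the function performs it in exactly this way is an assumption.

```r
library(Matrix)

# Toy dense LD matrix in which rs3 is uncorrelated with the others (hypothetical data)
snps <- c("rs1", "rs2", "rs3")
ld <- matrix(
  c(1.0, 0.8, 0.0,
    0.8, 1.0, 0.0,
    0.0, 0.0, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(snps, snps)
)

# Convert to a sparse matrix; zero entries are no longer stored explicitly
ld_sparse <- Matrix::Matrix(ld, sparse = TRUE)
print(ld_sparse)
```

Values can still be looked up by SNP name, for example `ld_sparse["rs1", "rs2"]`.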

download_method

Download method to use:

  • "axel" : Multi-threaded

  • "wget" : Single-threaded

  • "download.file" : Single-threaded

  • "internal" : Single-threaded (passed to download.file)

  • "wininet" : Single-threaded (passed to download.file)

  • "libcurl" : Single-threaded (passed to download.file)

  • "curl" : Single-threaded (passed to download.file)

nThread

Number of threads to parallelize over.

Value

A symmetric LD matrix of pairwise SNP correlations.
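
To illustrate the shape of such a result, the sketch below computes pairwise Pearson correlations from a toy genotype dosage matrix with base R's `cor()`. The genotypes are made up, and using `cor()` here is a stand-in for illustration rather than the method the function itself uses.

```r
# Toy genotype dosages (0/1/2) for six samples at three SNPs (hypothetical data)
geno <- cbind(
  rs1 = c(0, 1, 2, 1, 0, 2),
  rs2 = c(0, 1, 2, 2, 0, 1),
  rs3 = c(2, 0, 1, 0, 2, 1)
)

# Pairwise SNP correlations: a square matrix with SNP names on both axes
ld <- cor(geno)

# The matrix is symmetric and each SNP is perfectly correlated with itself
print(isSymmetric(ld))
print(round(ld, 2))
```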

Details

Options:

  • Download pre-computed LD matrix from UK Biobank.

  • Download raw VCF file from 1KG and compute LD on the fly.

  • Compute LD on the fly from a user-supplied VCF file.

  • Use a user-supplied pre-computed LD-matrix.

Examples

data("BST1")
data("locus_dir")
locus_dir <- file.path(tempdir(), locus_dir)
BST1 <- BST1[seq(1, 50), ]
if (FALSE) {
  LD_matrix <- load_or_create(
    locus_dir = locus_dir,
    dat = BST1,
    LD_reference = "1KGphase1"
  )
}