Convert and query

If the full summary statistics file is not already in tabix format (determined by checking for a .tbi file of the same name in the same directory), it is converted into tabix format for fast querying. A query is then made using the chromosome and the min/max genomic positions to extract a locus-specific summary statistics file.
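The two steps described above can be sketched in base R. This is a minimal illustration and not the package's implementation: `has_tabix_index()` and `subset_locus()` are hypothetical helpers, the index is assumed to be named by appending ".tbi" to the file path, and the range filter runs on an in-memory data.frame rather than on a tabix-indexed file.

```r
# Hypothetical helper: TRUE when a tabix index (.tbi) sits next to `path`.
# Assumes the index is named "<path>.tbi".
has_tabix_index <- function(path) {
  file.exists(paste0(path, ".tbi"))
}

# Hypothetical helper: keep rows on `chrom` that overlap [min_POS, max_POS].
# In-memory stand-in for a tabix range query.
subset_locus <- function(dat, chrom, min_POS, max_POS,
                         chrom_col = "CHR",
                         start_col = "BP",
                         end_col = start_col) {
  # Compare chromosomes without the optional "chr" prefix ("chr4" vs "4").
  strip_chr <- function(x) sub("^chr", "", as.character(x), ignore.case = TRUE)
  same_chrom <- strip_chr(dat[[chrom_col]]) == strip_chr(chrom)
  # A record overlaps the window if it ends at or after min_POS
  # and starts at or before max_POS.
  in_range <- dat[[end_col]] >= min_POS & dat[[start_col]] <= max_POS
  dat[same_chrom & in_range, , drop = FALSE]
}

# Toy summary statistics (made-up values, for illustration only).
toy <- data.frame(
  CHR = c(4, 4, 4, 5),
  BP  = c(100, 200, 300, 200),
  P   = c(0.5, 0.01, 0.2, 0.9)
)
locus <- subset_locus(toy, chrom = "chr4", min_POS = 150, max_POS = 300)
```

For SNPs the start and end columns are the same, so the overlap test reduces to a simple position range check.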

convert_and_query(
  fullSS_path,
  study_dir = NULL,
  subset_path = tempfile(".tsv.gz"),
  chrom_col = "CHR",
  start_col = "BP",
  end_col = start_col,
  min_POS,
  max_POS,
  chrom,
  save_subset = TRUE,
  nThread = 1,
  verbose = TRUE
)

Arguments

fullSS_path: Path to the full summary statistics file (GWAS or QTL). It is usually best to provide the absolute path rather than the relative path.
study_dir: Path to study folder.
subset_path: Path to save queried data subset as.
chrom_col: Name of the chromosome column.
start_col: Name of the start position column.
end_col: Name of the end position column (the same as the start column for SNPs).
min_POS: Minimum genomic position to query.
max_POS: Maximum genomic position to query.
chrom: Chromosome to query (e.g. "chr12" or "12").
save_subset: Whether to save the queried data subset.
nThread: Number of threads to use.
verbose: Whether to print messages.

Value

A data.table containing the summary statistics for the queried locus.

Examples

if (FALSE) {
BST1 <- echodata::BST1
fullSS_path <- echodata::example_fullSS()
subset_path <- file.path(tempdir(), "BST1_Nalls23andMe_2019_subset.tsv.gz")
dat <- convert_and_query(
    fullSS_path = fullSS_path,
    subset_path = subset_path,
    min_POS = min(BST1$POS),
    max_POS = max(BST1$POS),
    chrom = BST1$CHR[1]
)
}
