If the full summary statistics file is not already in tabix format (determined by checking for a .tbi file of the same name in the same directory), it is converted to tabix format for fast querying. A query is then made using the chromosome and the minimum/maximum genomic positions to extract a locus-specific summary statistics file.
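
The index check described above can be sketched in base R. This is a minimal illustration, not the package's internal code: `has_tabix_index()` is a hypothetical helper, and it assumes the index sits next to the data file with `.tbi` appended to the full file name (the usual tabix convention).

```r
# Hypothetical helper (not part of the package): report whether a
# tabix index appears to exist alongside a summary statistics file.
# Assumption: the index is named "<file>.tbi", e.g. "sumstats.tsv.gz.tbi".
has_tabix_index <- function(path) {
  file.exists(paste0(path, ".tbi"))
}

# Demonstration with throwaway (empty) files in a temporary directory.
demo_file <- tempfile(fileext = ".tsv.gz")
invisible(file.create(demo_file))
before <- has_tabix_index(demo_file)            # no index file yet
invisible(file.create(paste0(demo_file, ".tbi")))
after <- has_tabix_index(demo_file)             # index file now present
```

A check like this only tests for the presence of the index file; it does not verify that the index is valid or up to date.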

convert_and_query(
  fullSS_path,
  study_dir = NULL,
  subset_path = tempfile(".tsv.gz"),
  chrom_col = "CHR",
  start_col = "BP",
  end_col = start_col,
  min_POS,
  max_POS,
  chrom,
  save_subset = TRUE,
  nThread = 1,
  verbose = TRUE
)

Arguments

fullSS_path

Path to the full summary statistics file (GWAS or QTL). It is usually best to provide the absolute path rather than the relative path.

study_dir

Path to the study folder.

subset_path

Path to save the queried data subset to.

chrom_col

Name of the chromosome column.

start_col

Name of the start position column.

end_col

Name of the end position column (the same as the start position column for SNPs).

min_POS

Minimum genomic position to query.

max_POS

Maximum genomic position to query.

chrom

Chromosome to query (e.g. "chr12" or "12").

save_subset

Whether to save the queried data subset.

nThread

Number of threads to use.

verbose

Whether to print messages.

Value

A data.table containing the locus subset of the summary statistics.

See also

Other tabix: convert()

Examples

if (FALSE) {
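# Example locus data from echodata (used here for its position range)
BST1 <- echodata::BST1
# Path to an example full summary statistics file
fullSS_path <- echodata::example_fullSS()
subset_path <- file.path(tempdir(), "BST1_Nalls23andMe_2019_subset.tsv.gz")
# Convert to tabix if needed, then extract the locus
dat <- convert_and_query(
    fullSS_path = fullSS_path,
    subset_path = subset_path,
    min_POS = min(BST1$POS),
    max_POS = max(BST1$POS),
    chrom = BST1$CHR[1]
)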
}
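
For intuition, the region query is conceptually a filter on chromosome and position. The sketch below reproduces that filter in base R on a small made-up table. It does not use tabix, and `query_region()` and the toy values are illustrative assumptions rather than package code or real data. It also assumes that chromosome labels may or may not carry a "chr" prefix, and that a record is kept when its interval overlaps the queried range.

```r
# Hypothetical stand-in for a region query (not the package's code):
# keep rows on the requested chromosome whose [start, end] interval
# overlaps [min_POS, max_POS].
query_region <- function(dat, chrom, min_POS, max_POS,
                         chrom_col = "CHR", start_col = "BP",
                         end_col = start_col) {
  # Treat "chr4" and "4" as the same chromosome label.
  strip_chr <- function(x) sub("^chr", "", as.character(x), ignore.case = TRUE)
  same_chrom <- strip_chr(dat[[chrom_col]]) == strip_chr(chrom)
  overlaps <- dat[[end_col]] >= min_POS & dat[[start_col]] <= max_POS
  dat[same_chrom & overlaps, , drop = FALSE]
}

# Made-up toy table, for illustration only.
toy <- data.frame(
  CHR = c(4, 4, 4, 12),
  BP  = c(100, 250, 900, 250),
  P   = c(0.5, 0.01, 0.2, 0.03)
)

# Rows on chromosome 4 with positions between 200 and 1000.
locus <- query_region(toy, chrom = "chr4", min_POS = 200, max_POS = 1000)
```

Unlike this in-memory filter, a tabix query reads only the indexed blocks covering the requested region, which is why the conversion step pays off for large summary statistics files.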