If the full summary statistics file is not already in tabix format
(determined by checking for a .tbi file of the same name in the same directory),
it is converted into tabix format for fast querying.
A query is then made using the min/max genomic positions to extract a
locus-specific summary statistics file.
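The index check described above can be sketched in base R. This is only an illustration: `has_tabix_index()` is a hypothetical helper, not a function exported by the package, and it assumes the usual tabix convention that the index is the data file path with `.tbi` appended.

```r
# Hypothetical helper (not part of the package): report whether a tabix
# index sits next to a summary statistics file, i.e. "<path>.tbi" exists.
has_tabix_index <- function(path) {
  file.exists(paste0(path, ".tbi"))
}

# Demonstration with throwaway files in the session's temporary directory.
ss_path <- tempfile(fileext = ".tsv.gz")
file.create(ss_path)
has_tabix_index(ss_path)  # no index has been written yet

file.create(paste0(ss_path, ".tbi"))
has_tabix_index(ss_path)  # the index file is now present
```

When the check fails, the file would need to be compressed and indexed before it can be queried by region.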
convert_and_query(
  fullSS_path,
  study_dir = NULL,
  subset_path = tempfile(".tsv.gz"),
  chrom_col = "CHR",
  start_col = "BP",
  end_col = start_col,
  min_POS,
  max_POS,
  chrom,
  save_subset = TRUE,
  nThread = 1,
  verbose = TRUE
)
fullSS_path: Path to the full summary statistics file (GWAS or QTL). It is usually best to provide the absolute path rather than a relative path.
study_dir: Path to the study folder.
subset_path: Path to save the queried data subset to.
chrom_col: Name of the chromosome column.
start_col: Name of the start position column.
end_col: Name of the end position column (the same as the start column for SNPs).
min_POS: Minimum genomic position to query.
max_POS: Maximum genomic position to query.
chrom: Chromosome to query (e.g. "chr12" or "12").
save_subset: Whether to save the queried data subset.
nThread: Number of threads to use.
verbose: Whether to print messages.
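To make the roles of `chrom`, `min_POS`, `max_POS` and the column arguments concrete, here is a minimal base R sketch of a region query on an in-memory table. `query_region()` and the toy data are hypothetical and made up for illustration. The function itself queries a tabix-indexed file rather than filtering a data frame, and stripping the "chr" prefix before comparing chromosomes is an assumption about how "chr12" and "12" could be reconciled.

```r
# Hypothetical helper (not part of the package): keep rows on `chrom` whose
# [start, end] interval overlaps the window [min_POS, max_POS].
query_region <- function(dat, chrom, min_POS, max_POS,
                         chrom_col = "CHR", start_col = "BP",
                         end_col = start_col) {
  # Assumption: compare chromosomes without the optional "chr" prefix.
  strip_chr <- function(x) sub("^chr", "", as.character(x), ignore.case = TRUE)
  same_chrom <- strip_chr(dat[[chrom_col]]) == strip_chr(chrom)
  # For SNPs start and end are the same column, so this reduces to
  # min_POS <= BP <= max_POS.
  overlaps <- dat[[start_col]] <= max_POS & dat[[end_col]] >= min_POS
  dat[same_chrom & overlaps, , drop = FALSE]
}

# Toy summary statistics (invented values, for illustration only).
toy <- data.frame(
  CHR = c(4, 4, 4, 12),
  BP  = c(100, 250, 900, 250),
  P   = c(0.5, 1e-8, 0.2, 0.03)
)
locus <- query_region(toy, chrom = "chr4", min_POS = 200, max_POS = 500)
```

Because `end_col` defaults to `start_col`, single-position records such as SNPs need only one position column.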
A data.table of locus subset summary statistics.
Other tabix: convert()
if (FALSE) {
  # Example locus data, used here only for its chromosome and position range
  BST1 <- echodata::BST1
  fullSS_path <- echodata::example_fullSS()
  subset_path <- file.path(tempdir(), "BST1_Nalls23andMe_2019_subset.tsv.gz")
  # Convert to tabix if needed, then extract the locus window
  dat <- convert_and_query(
    fullSS_path = fullSS_path,
    subset_path = subset_path,
    min_POS = min(BST1$POS),
    max_POS = max(BST1$POS),
    chrom = BST1$CHR[1]
  )
}