Infer sample size from summary stats using MungeSumstats.
get_sample_size(
dat,
compute_n = c("ldsc", "giant", "metal", "sum"),
return_only = NULL,
force_new = FALSE,
standardise_headers = FALSE,
verbose = TRUE,
...
)
Fine-mapping results data.
How to compute per-SNP sample size (new column "N").
If the column "N" is already present in dat, this column will be used to extract per-SNP sample sizes and the argument compute_n will be ignored.
If the column "N" is not present in dat, one of the following options can be supplied to compute_n:
0: N will not be computed.
>0: If any number >0 is provided, that value will be set as N for every row.
**Note**: Computing N this way is incorrect and should be avoided if at all possible.
"sum": N will be computed as cases (N_CAS) + controls (N_CON), so long as both columns are present.
"ldsc": N will be computed as effective sample size:
Neff = (N_CAS+N_CON) * (N_CAS/(N_CAS+N_CON)) / mean((N_CAS/(N_CAS+N_CON))[(N_CAS+N_CON)==max(N_CAS+N_CON)]).
"giant": N will be computed as effective sample size:
Neff = 2 / (1/N_CAS + 1/N_CON).
"metal": N will be computed as effective sample size:
Neff = 4 / (1/N_CAS + 1/N_CON).
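The four formulas above can be compared side by side with a minimal base-R sketch. The data frame and its values are hypothetical and exist only for illustration. The code restates the documented formulas directly and does not call MungeSumstats or echodata, whose internals may differ.

```r
# Hypothetical per-SNP case/control counts (illustrative values only).
dat <- data.frame(
  SNP   = c("rs1", "rs2", "rs3"),
  N_CAS = c(1000, 1000, 800),
  N_CON = c(3000, 3000, 2400)
)

# Total sample size per SNP; this is also the "sum" option.
n_total <- dat$N_CAS + dat$N_CON
n_sum <- n_total

# "giant": Neff = 2 / (1/N_CAS + 1/N_CON)
n_giant <- 2 / (1 / dat$N_CAS + 1 / dat$N_CON)

# "metal": Neff = 4 / (1/N_CAS + 1/N_CON)
n_metal <- 4 / (1 / dat$N_CAS + 1 / dat$N_CON)

# "ldsc": case count divided by the mean case proportion
# among the SNPs that have the largest total sample size.
prop_cas <- dat$N_CAS / n_total
n_ldsc <- n_total * prop_cas / mean(prop_cas[n_total == max(n_total)])

# Collect the results for comparison.
res <- data.frame(SNP = dat$SNP, n_sum, n_giant, n_metal, n_ldsc)
print(res)
```

By construction, the "metal" value is always exactly twice the "giant" value for the same SNP, so the two options differ only in scale.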
A function to return only a single value from the inferred/imputed sample size column (e.g. max, min).
If "Neff" (or "N") already exists in dat, replace it with the recomputed version.
Standardise headers first.
Print messages.
Additional arguments passed to the return_only function, if return_only is not NULL.
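The way return_only and ... interact can be illustrated with a small base-R sketch. The helper `summarise_n` and the vector `n` are hypothetical names made up for this illustration. The sketch mirrors the documented behaviour only and is not the echodata implementation.

```r
# Hypothetical helper illustrating the documented behaviour;
# this is NOT the echodata implementation.
summarise_n <- function(n, return_only = NULL, ...) {
  if (is.null(return_only)) {
    return(n)          # no summary function: return the full per-SNP vector
  }
  return_only(n, ...)  # forward extra arguments to the summary function
}

# Illustrative per-SNP sample sizes with one missing value.
n <- c(4000, NA, 3200)

full_n <- summarise_n(n)                                   # vector returned unchanged
max_n  <- summarise_n(n, return_only = max, na.rm = TRUE)  # na.rm is forwarded to max()
```

Without `na.rm = TRUE`, `max()` would return `NA` for this vector, which is the kind of case where forwarding extra arguments through ... is useful.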
dat <- echodata::BST1
dat2 <- echodata::get_sample_size(dat = dat)
#> Preparing sample size column (N).
#> Computing effective sample size using the LDSC method:
#> Neff = (N_CAS+N_CON) * (N_CAS/(N_CAS+N_CON)) / mean((N_CAS/(N_CAS+N_CON))[(N_CAS+N_CON)==max(N_CAS+N_CON)]))
#> + Mapping colnames from MungeSumstats ==> echolocatoR