255 samples | Different brain regions from the same donor | PSI matrix from sQTL. Dream | coef = age.
expression_dir = "~/ad-omics_hydra/microglia_omics/expression_tables/added_pilot_314s/expr_4brain_regions/"
work_plots = "~/pd-omics/katia/Microglia/mic_255s/nat_gen_rev1/PSI_data/"
metadata_path = "~/pd-omics/katia/Microglia/mic_255s/metadata_files/"
PSI = proportion of reads supporting the inclusion of a cassette exon. We are using quantile normalisation to be safe with the regressions.
How should we interpret the results? So each entry is a splicing event, either a cassette exon or an alternative splice site. So it’s about how much the exon or splice site is used changes with age. In either direction.
# load(paste0(expression_dir, "Expression_filt_255s.Rdata")) #gene expression data
# dim(genes_counts_exp_3rd) # 19376 255
load(paste0(metadata_path, "metadata_255filt_eng_29jun2020.Rdata"))
# str(metadata3rd_pass)
load(paste0(work_plots, "all_microglia_filtered_psi_quantile_df.RData"))
# dim(psi_qnorm)
gene_annot <- read.delim2(paste0(work_plots, "all_microglia_filtered_gene_annotation.tsv"), header = T)
metadata <- metadata3rd_pass
ageByDonor = unique(metadata[,c("donor_id", "age", "sex")])
ggplot(ageByDonor, aes(x=age, fill=age)) +
geom_histogram(bins = 25, colour='black', position = "stack", fill="#5c88da99") +
labs(x="Age", y="Donors") +
scale_y_continuous(breaks = (1:20)) +
scale_x_continuous(breaks=seq(20,120,10)) +
theme_classic()
To use cause_of_death in the model, I got the NAs and change it for “Other” category.
To use C1-C4 I calculated the median for the missing samples.
NOTE: The splicing data is already normalised so, we’ll not use voom here.
params = BiocParallel::MulticoreParam(workers=4, progressbar=T)
register(params)
registerDoParallel(4)
metadata$cause_of_death_categories[metadata$cause_of_death_categories %in% NA] <- "Other"
# table(metadata$cause_of_death_categories)
metadata$C1 = metadata$C1 %>% replace_na(median(metadata$C1, na.rm = T))
metadata$C2 = metadata$C2 %>% replace_na(median(metadata$C2, na.rm = T))
metadata$C3 = metadata$C3 %>% replace_na(median(metadata$C3, na.rm = T))
metadata$C4 = metadata$C4 %>% replace_na(median(metadata$C4, na.rm = T))
# Matching the names from the DE analysis to use the same code
genes_counts4deg = psi_qnorm
metadata4deg = metadata
# all(colnames(genes_counts4deg) == metadata$donor_tissue) # Check the order of columns - TRUE
# Check variance partition version
# packageVersion("variancePartition") # Must be 1.17.7
# The variable to be tested should be a fixed effect
form <- ~ sex + (1|donor_id) + age + tissue + (1|cause_of_death_categories) + C1 + C2 + C3 + C4 + picard_pct_mrna_bases + picard_summed_median + picard_pct_ribosomal_bases
# estimate weights using linear mixed model of dream
vobjDream = suppressWarnings( voomWithDreamWeights( genes_counts4deg, form, metadata4deg ) )
# Since the splicing data is already normalised, we'll not use dream voom here. We use the Weights from Dream with the quantile normalised data.
vobjDream$E = as.matrix(genes_counts4deg)
# Fit the dream model on each gene
fitmm = suppressWarnings (dream( vobjDream, form, metadata4deg ))
# Examine design matrix
# createDT(fitmm$design, 3)
res_age <- data.frame(topTable(fitmm, coef='age',
number=nrow(genes_counts4deg), sort.by = "p"), check.names = F)
genes_norm = as.data.frame(vobjDream$E)
res.pca = prcomp(t(genes_norm))
autoplot(res.pca, data = metadata4deg, colour = 'age') +
scale_colour_viridis_c() +
theme_classic()
I’m applying the removeBatchEffect function to get the residuals.
allResiduals <- removeBatchEffect(x = genes_norm,
batch = metadata4deg$tissue,
batch2 = paste(metadata4deg$sex, metadata4deg$cause_of_death_categories),
# design = model.matrix(~ age, data = metadata4deg), #force to not regress age. It didn't change the PCA.
covariates = as.matrix(metadata4deg[, c("picard_pct_mrna_bases", "picard_summed_median", "picard_pct_ribosomal_bases", "C1","C2","C3","C4" )]))
write.table(allResiduals, file = paste0(work_plots, "age_splicing_dream_residuals.txt"), quote = F, sep = "\t")
res.pca = prcomp(t(allResiduals))
autoplot(res.pca, data = metadata4deg, colour = 'age') +
scale_colour_viridis_c() +
theme_classic()
R version 3.6.2 (2019-12-12) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS 10.16
Matrix products: default BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] parallel grid stats graphics grDevices utils datasets [8] methods base
other attached packages: [1] tidyr_1.1.2 dplyr_1.0.2 doParallel_1.0.15
[4] iterators_1.0.12 ggfortify_0.4.10 gridExtra_2.3
[7] RColorBrewer_1.1-2 kableExtra_1.1.0 reshape2_1.4.4
[10] pheatmap_1.0.12 readxl_1.3.1 ggpubr_0.2.4
[13] magrittr_2.0.1 gplots_3.1.0 statmod_1.4.32
[16] edgeR_3.28.0 BiocParallel_1.20.1 variancePartition_1.17.7 [19] Biobase_2.46.0 BiocGenerics_0.32.0 scales_1.1.1
[22] foreach_1.4.8 limma_3.42.0 factoextra_1.0.6
[25] ggplot2_3.3.2 ggsci_2.9 ggeasy_0.1.0
loaded via a namespace (and not attached): [1] nlme_3.1-142 bitops_1.0-6 pbkrtest_0.4-8.6
[4] webshot_0.5.2 progress_1.2.2 httr_1.4.2
[7] numDeriv_2016.8-1.1 tools_3.6.2 DT_0.13
[10] R6_2.5.0 KernSmooth_2.23-16 colorspace_2.0-0
[13] withr_2.3.0 tidyselect_1.1.0 prettyunits_1.1.1
[16] compiler_3.6.2 rvest_0.3.5 xml2_1.2.2
[19] labeling_0.4.2 caTools_1.18.0 readr_1.3.1
[22] stringr_1.4.0 digest_0.6.27 minqa_1.2.4
[25] rmarkdown_2.0 colorRamps_2.3 pkgconfig_2.0.3
[28] htmltools_0.5.0 lme4_1.1-21 htmlwidgets_1.5.2
[31] rlang_0.4.8 rstudioapi_0.13 generics_0.1.0
[34] farver_2.0.3 jsonlite_1.7.1 crosstalk_1.1.0.1
[37] gtools_3.8.2 Matrix_1.2-18 Rcpp_1.0.5
[40] munsell_0.5.0 lifecycle_0.2.0 stringi_1.5.3
[43] yaml_2.2.1 MASS_7.3-51.6 plyr_1.8.6
[46] ggrepel_0.8.2 crayon_1.3.4 lattice_0.20-38
[49] splines_3.6.2 hms_0.5.3 locfit_1.5-9.1
[52] knitr_1.26 pillar_1.4.7 boot_1.3-23
[55] ggsignif_0.6.0 codetools_0.2-16 glue_1.4.2
[58] evaluate_0.14 BiocManager_1.30.10 vctrs_0.3.5
[61] nloptr_1.2.1 cellranger_1.1.0 gtable_0.3.0
[64] purrr_0.3.4 xfun_0.11 viridisLite_0.3.0
[67] tibble_3.0.4 lmerTest_3.1-1 ellipsis_0.3.1