Evaluating the efficacy of methodologies for deconvolution of transcriptional profiles: A benchmarking study
Author(s): Aakanksha Singh, Katharina Imkeller
Affiliation(s): 1. Institute of Neurology (Edinger Institute), University Hospital, Goethe University, Frankfurt, Germany; 2. Frankfurt Cancer Institute (FCI), Frankfurt, Germany; 3. University Cancer Centre (UCT), Frankfurt, Germany; 4. Group of Computational Immunology, Goethe University, Frankfurt, Germany
Quantifying the different immune cell types in the tumor microenvironment is essential for understanding tumor heterogeneity. In many tumor entities, characterization of the tumor microenvironment has enabled more sensitive survival analyses and more accurate tumor stratification. The process of separating a heterogeneous transcriptional mixture signal into its constituent cellular components is called cell-type deconvolution. Many deconvolution methodologies have been developed for different types of transcriptomics data, including bulk RNA-seq and spatial transcriptomics. However, these methodologies differ greatly in their mathematical approach to quantifying cell types in a sample, and it remains unclear how the deconvolution result is affected by different experimental parameters. We aim to evaluate the efficacy of deconvolution methodologies on datasets of various origins and to develop a comprehensive method capable of consistently deconvoluting any type of transcriptional data. For benchmarking, we used publicly available single-cell datasets to generate pseudo-bulk samples with known numbers of specific cell types. We then deconvoluted the pseudo-bulk samples with five publicly available methodologies (quanTIseq, EPIC, MCP, SpotLight and SpatialDecon). We bootstrapped the process 30 times to increase statistical efficiency. We deconvoluted each pseudo-bulk sample on three commonly used amplicon- and probe-based transcript quantification panels, namely BD Rhapsody, Visium TenX Genomics, and Nanostring. This benchmarking analysis allowed us to study how gene panels that differ in the number of genes and in their quantification efficiency affect the deconvolution process. To further explore the effect of gene panels, we deconvoluted gene-panel data from the same tissue, taken from a publicly available human ovarian cancer dataset. Our analysis shows an inherent bias within each methodology.
Our preliminary analysis shows that SpatialDecon performs best with a custom single-cell signature matrix. We are now exploring how the deconvolution methodologies are affected by normalization techniques, signature matrices, differing sequencing depths, and cutoffs in transcriptional data.
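
The pseudo-bulk benchmarking setup described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`make_pseudo_bulk`, `restrict_to_panel`), the toy counts, the toy gene panel, and the choice to draw cells with replacement are all assumptions made for illustration. The sketch only shows the general idea of summing single-cell counts into a mixture with known composition, restricting it to the genes of a panel, and repeating the draw 30 times.

```python
import random
from collections import Counter


def make_pseudo_bulk(counts, labels, n_cells, rng):
    """Sum the counts of n_cells randomly drawn cells (with replacement).

    counts: list of per-cell dicts mapping gene -> count
    labels: list of cell-type labels, one per cell
    Returns (pseudo_bulk, true_fractions).
    """
    picked = [rng.randrange(len(counts)) for _ in range(n_cells)]
    bulk = Counter()
    for i in picked:
        bulk.update(counts[i])  # adds the cell's gene counts to the mixture
    composition = Counter(labels[i] for i in picked)
    fractions = {ct: n / n_cells for ct, n in composition.items()}
    return dict(bulk), fractions


def restrict_to_panel(bulk, panel):
    """Keep only the genes covered by a (hypothetical) gene panel."""
    return {gene: c for gene, c in bulk.items() if gene in panel}


# Toy single-cell data, invented purely for illustration.
toy_counts = [
    {"CD3E": 5, "CD19": 0, "EPCAM": 0},
    {"CD3E": 4, "CD19": 1, "EPCAM": 0},
    {"CD3E": 0, "CD19": 6, "EPCAM": 0},
    {"CD3E": 0, "CD19": 0, "EPCAM": 9},
]
toy_labels = ["T cell", "T cell", "B cell", "Tumor"]
toy_panel = {"CD3E", "CD19"}  # hypothetical panel, not a real product's gene list

rng = random.Random(0)  # fixed seed so the draws are reproducible
replicates = []
for _ in range(30):  # 30 repeated draws, mirroring the 30 repetitions in the text
    bulk, truth = make_pseudo_bulk(toy_counts, toy_labels, n_cells=100, rng=rng)
    replicates.append((restrict_to_panel(bulk, toy_panel), truth))
```

Each entry of `replicates` pairs a panel-restricted pseudo-bulk profile with its known cell-type fractions. In a benchmark of this kind, the profile would be passed to a deconvolution method and the estimated fractions compared with the known ones.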