Differential Accessible Regions analysis of single-cell 10X Genomics multiome data

Author(s): Dario Righelli, Elena Zuin, Davide Risso

Affiliation(s): Department of Statistical Sciences, University of Padova, Padua, Italy

Social media: https://twitter.com/drighelli

Multi-modal single-cell experiments have become increasingly popular for studying the biological mechanisms involved in disease and drug treatment. With the emergence of multi-modal single-cell technologies, such as the 10X Genomics Multiome platform, researchers can measure several cellular characteristics in the same cells, including gene expression (scRNAseq/GEX), chromatin accessibility (scATACseq/ATAC), methylation, and cell surface protein expression. While several computational methods have been proposed to analyze and integrate multi-modal single-cell data, there is still a need for a unified approach and for guidance on current best practices. Here, we present a possible approach for computing Differential Accessible Regions (DARs) in 10X Multiome datasets. We start the analysis with the TENxMultiomeTools package (a newer version of the package presented at EuroBioc22; https://github.com/drighelli/TENxMultiomeTools), which loads the 10X Genomics Cell Ranger output into Bioconductor-friendly data structures for cell quality control, cell filtering, cell labelling, and advanced downstream analyses, including normalization and differential analysis. Among the steps that require particular care, cell type assignment is especially important. We leverage the simultaneous measurement of gene expression and chromatin accessibility to annotate cells based on their RNA-seq profiles, and then project these cell labels onto the scATACseq data, so that we can focus the analysis on individual cell types and build pseudo-bulk samples for this data type. However, the features (peaks) differ across samples, which makes building the pseudo-bulk challenging. We use the Bioconductor DEScan2 package (https://bioconductor.org/packages/DEScan2/) to construct a consensus peak list across samples and the corresponding count matrix per cell type, which allows us to explore typical normalization approaches for this kind of data and to discover DARs between conditions.
We are also exploring multiple approaches for multi-modal data integration, from simple ones such as peak-gene annotation to more complex ones such as gene regulatory network (GRN) construction across multiple modalities. Our proposed workflow provides users with a unified framework and a set of best practices for single-cell 10X Genomics multiome data analysis.
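
As a rough illustration of the consensus-peak and pseudo-bulk steps described above, the following plain-Python sketch merges overlapping per-sample peaks into consensus regions and then counts ATAC fragments per (sample, cell type) over those regions. It is not the DEScan2 or TENxMultiomeTools implementation: the function names (`consensus_peaks`, `pseudobulk_counts`), the `min_samples` threshold, the input layouts, and the toy data are all assumptions made for illustration only.

```python
from collections import defaultdict


def consensus_peaks(peaks_by_sample, min_samples=2):
    """Merge overlapping peaks across samples (hypothetical helper).

    peaks_by_sample: {sample: [(chrom, start, end), ...]}, half-open intervals.
    Keeps only merged regions supported by at least `min_samples` samples.
    """
    intervals = sorted(
        (chrom, start, end, sample)
        for sample, peaks in peaks_by_sample.items()
        for chrom, start, end in peaks
    )
    merged = []  # each entry: [chrom, start, end, set of supporting samples]
    for chrom, start, end, sample in intervals:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlapping or touching the previous region: extend it.
            last[2] = max(last[2], end)
            last[3].add(sample)
        else:
            merged.append([chrom, start, end, {sample}])
    return [(c, s, e) for c, s, e, samples in merged if len(samples) >= min_samples]


def pseudobulk_counts(fragments, labels, regions):
    """Count fragments per (sample, cell type) and consensus region.

    fragments: [(sample, barcode, chrom, start, end), ...]
    labels: {(sample, barcode): cell_type}, e.g. taken from the RNA-based annotation.
    Cells without a label (e.g. removed during quality control) are skipped.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sample, barcode, chrom, start, end in fragments:
        cell_type = labels.get((sample, barcode))
        if cell_type is None:
            continue
        # Linear scan for clarity; real data would need an interval index.
        for region in regions:
            r_chrom, r_start, r_end = region
            if chrom == r_chrom and start < r_end and end > r_start:
                counts[(sample, cell_type)][region] += 1
    return {key: dict(per_region) for key, per_region in counts.items()}


# Toy data (invented for illustration).
peaks_by_sample = {
    "ctrl": [("chr1", 100, 200), ("chr1", 500, 600)],
    "treated": [("chr1", 150, 250), ("chr2", 10, 50)],
}
labels = {
    ("ctrl", "AAAC"): "T cell",
    ("ctrl", "GGGA"): "T cell",
    ("ctrl", "TTTG"): "B cell",
    ("treated", "AAAC"): "T cell",
}
fragments = [
    ("ctrl", "AAAC", "chr1", 120, 180),
    ("ctrl", "AAAC", "chr1", 510, 560),    # falls outside every consensus region
    ("ctrl", "GGGA", "chr1", 200, 230),
    ("ctrl", "TTTG", "chr1", 240, 300),
    ("treated", "AAAC", "chr1", 90, 110),
    ("treated", "CCCC", "chr1", 120, 130),  # unlabeled cell, skipped
]

consensus = consensus_peaks(peaks_by_sample, min_samples=2)
pseudobulk = pseudobulk_counts(fragments, labels, consensus)
print(consensus)
print(pseudobulk)
```

The resulting region-by-group count table plays the role of the per-cell-type count matrix mentioned above and could then be passed to a bulk-style normalization and differential analysis. Requiring support from at least two samples is one simple way to discard sample-specific peaks; the appropriate threshold depends on the experimental design.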