A mixed-cell control design to assess data processing in single-cell proteomics


Author(s): Samuel Grégoire, Sébastien Pyr dit Ruys, Christophe Vanderaa, Didier Vertommen, Laurent Gatto

Affiliation(s): UCLouvain



Single-cell proteomics (SCP) aims to study cellular heterogeneity by focusing on the functional effectors of the cell, the proteins. While this is essential to identify cells undergoing subtle processes and to reveal the underlying protein and proteoform abundance patterns, measuring the protein content of a single cell is challenging. Thanks to recent breakthroughs in mass spectrometry and sample processing, it has become possible to increase the depth of proteome coverage, reduce the time needed to analyse a cell, and make the technology more accessible [1]. However, extracting meaningful biological information from this type of data requires robust and suitable data analysis methods. Progress in this field is hampered by the lack of standardised workflows. Currently, data analysis workflows are custom-made and differ substantially from one research team to another [2]. Moreover, it is difficult to evaluate specific steps or entire pipelines because ground truths are missing. To help standardise SCP data analysis, our team has developed the `scp` package [3], which relies on the `QFeatures` and `SingleCellExperiment` infrastructures to provide a standardised framework for SCP data analysis. In addition, we produced our own SCP datasets to serve as a basis for benchmarking data analysis. To this end, we used a design containing cell lines mixed in known proportions to generate controlled variability [4]. In this work, we used the `scp` package to test different combinations of data processing steps and evaluated them using our ground truth data. We illustrate how we benefited from this modular, standardised framework and highlight some crucial steps.

[1] Slavov, N. Scaling Up Single-Cell Proteomics. Molecular & Cellular Proteomics 21(1), 100179 (2022). https://doi.org/10.1016/j.mcpro.2021.100179
[2] Vanderaa, C. and Gatto, L. The Current State of Single-Cell Proteomics Data Analysis. Current Protocols 3(1), e658 (2023). https://doi.org/10.1002/cpz1.658
[3] Vanderaa, C. and Gatto, L. Replication of Single-Cell Proteomics Data Reveals Important Computational Challenges. Expert Review of Proteomics, 1–9 (2021). https://doi.org/10.1080/14789450.2021.1988571
[4] Tian, L., Dong, X., Freytag, S. et al. Benchmarking single cell RNA-sequencing analysis pipelines using mixture control experiments. Nat Methods 16, 479–487 (2019). https://doi.org/10.1038/s41592-019-0425