STEGO.R for (easy) interrogation of combined scTCR repertoire and scRNA-seq data
Author(s): Kerry Mullan, My Ha, Sebastiaan Valkiers, Kris Laukens, Benson Ogunjimi, and Pieter Meysman
Affiliation(s): University of Antwerp
Social media: https://twitter.com/kerry_mullan
Introduction. T cells are critical for protection against a broad array of threats, including pathogens and cancer. The hypervariable T cell receptor (TCR), created through somatic recombination, allows recognition of a diverse array of antigens. Current technologies can capture single cell expression data (scRNA-seq) together with paired single cell TCR sequencing (scTCR-seq) data to further understand the role of T cells. Current analytical pipelines can analyze either the scRNA-seq data or the TCR repertoire, with limited capacity to analyze both in an integrated fashion. Here we developed STEGO (Single cell TCR and Expression Grouped Ontologies), a Shiny R application, to facilitate the complex analysis required for understanding the role of T cells in various conditions.
Program parameters. The STEGO.R application can process 10x Genomics and BD Rhapsody data. The application includes the Seurat quality control (QC) process, merging with Harmony, followed by annotation with scGATE. In addition, the program includes functionality for TCR clustering (ClusTCR2) and annotation with target epitopes from TCRex predictions. The combined scRNA-seq and scTCR-seq analysis is divided into four sections: top clonotype, expanded clonotypes, clustering and epitope. The Shiny interface also makes the program accessible to novice R coders.
Preliminary analysis. Out of 22 selected public datasets, 12 could be processed with STEGO.R. One dataset concerned colon inflammation (colitis) following melanoma therapies; the original analysis did not integrate the scRNA-seq with the scTCR-seq data. Here STEGO identified a private clonal expansion. The expanded T cells in colitis included more CD8+ T cells expressing cytotoxic markers, including GNLY, PRF1, GZMB, NKG7 and HLA-DR transcripts. The analysis also identified a TRGV4 cluster associated with melanoma cases as well as two TRBV6-2 clusters specific to colitis.
Discussion. STEGO.R facilitates fast and reproducible analysis of complex scRNA-seq data combined with TCR repertoire data. We have demonstrated its utility by extracting novel insights into T cell biology from publicly available datasets. We anticipate that this program will facilitate the identification of subtle differences between T cell populations, and of whether these differences are specific to a TCR clone and/or the expanded repertoire.
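Illustrative sketch. The integration of scRNA-seq annotations with scTCR-seq clonotypes described under Program parameters can be pictured as a join on the cell barcode, followed by counting cells per clonotype. The Python snippet below is only a minimal sketch of that idea and is not STEGO.R code: the barcodes, the CDR3 sequences, the phenotype labels, the function name `summarise_clonotypes` and the expansion threshold of two cells are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical toy data: cell barcode -> phenotype label from the scRNA-seq side.
cell_annotation = {
    "AAAC-1": "CD8_cytotoxic",
    "AAAG-1": "CD8_cytotoxic",
    "AACT-1": "CD4_naive",
    "AAGT-1": "CD8_cytotoxic",
    "ACGT-1": "CD4_naive",  # no TCR recovered for this cell
}

# Hypothetical toy data: cell barcode -> (V gene, CDR3) from the scTCR-seq side.
cell_tcr = {
    "AAAC-1": ("TRBV6-2", "CASSYSGNTEAFF"),
    "AAAG-1": ("TRBV6-2", "CASSYSGNTEAFF"),
    "AACT-1": ("TRBV7-9", "CASSLGQGYEQYF"),
    "AAGT-1": ("TRBV6-2", "CASSYSGNTEAFF"),
    "TTTT-1": ("TRBV7-9", "CASSLGQGYEQYF"),  # no expression data for this cell
}


def summarise_clonotypes(annotation, tcr, min_cells=2):
    """Join both modalities on barcode and summarise each clonotype.

    A clonotype is flagged as expanded when it is seen in at least
    `min_cells` cells (an assumed threshold, chosen for illustration).
    """
    # Keep only cells that have both an annotation and a TCR.
    shared = sorted(set(annotation) & set(tcr))
    counts = Counter(tcr[barcode] for barcode in shared)

    summary = {}
    for clonotype, n_cells in counts.items():
        phenotypes = Counter(
            annotation[barcode] for barcode in shared if tcr[barcode] == clonotype
        )
        summary[clonotype] = {
            "cells": n_cells,
            "expanded": n_cells >= min_cells,
            "phenotypes": dict(phenotypes),
        }
    return summary


summary = summarise_clonotypes(cell_annotation, cell_tcr)

# The "top clonotype" is the one observed in the most cells.
top_clonotype = max(summary, key=lambda c: summary[c]["cells"])

for clonotype, info in summary.items():
    print(clonotype, info)
```

In this toy input the TRBV6-2 clonotype is found in three annotated cells and is therefore flagged as expanded, whereas the TRBV7-9 clonotype is retained in only one cell because its second barcode has no expression data.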