Unraveling Immunogenomic Diversity in Single-Cell Data Author(s): Ahmad Al Ajami,Annekathrin Silvia Ludt,Federico Marini,Katharina Imkeller Affiliation(s): Neurological Institute (Edinger Institute), University Hospital Frankfurt, Goethe University, Germany Immune molecules such as B and T cell receptors, human leukocyte antigens (HLAs), or killer Ig-like receptors (KIRs) are encoded in the most genetically diverse loci of the human genome. Many of these immune genes are hyperpolymorphic – showing high allelic diversity across human populations.
T.A.R.D.I.S.: Targeted Analysis and Raw Data Integration in Mass Spectrometry Author(s): Pablo Vangeenderhuysen,Beata Pomian,Marilyn De Graeve,Lieselot Y. Hemeryck,Lynn Vanhaecke Affiliation(s): Laboratory of Integrative Metabolomics (LIMET), Department of Translational Physiology, Infectiology and Public Health, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium Social media: https://github.com/pablovgd In recent years, ultra-high-performance liquid chromatography coupled to high-resolution mass spectrometry (UHPLC-HRMS) has become the main method for measuring small molecules that directly reflect the outcome of complex biochemical reactions in biological systems.
Rarr: A native R reader for Zarr Author(s): Mike L Smith Affiliation(s): European Molecular Biology Laboratory Social media: https://twitter.com/grimbough The Zarr file format has been developed for the storage of large multi-dimensional arrays. Its design focuses on easy access to datasets stored in the cloud, which has led to its adoption across a wide range of scientific disciplines, including genomics, astrophysics, earth sciences and microscopy imaging.
PredictIO: A Package For Meta-Analysis of Immunotherapy Clinical Trials in Cancer Author(s): Benjamin Haibe-Kains Affiliation(s): University of Toronto Social media: http://twitter.com/bhaibeka Clinical profiling studies have shed light on molecular features and mechanisms that modulate response or resistance to immunotherapy but their predictive value remains largely unclear. We (Bareche et al., Annals of Oncology 2022) and others (Litchfield et al., Cell 2021) have recently curated a compendium of public datasets of DNA, RNA and clinical profiles of patients treated with immunotherapy.
Neutralise: an open science initiative for neutral comparison of two-sample tests Author(s): Leyla KODALCI,Olivier Thas Affiliation(s): I-BioStat, Data Science Institute, Hasselt University, Agoralaan Gebouw D, B-3590 Diepenbeek, Belgium. Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Krijgslaan 281, B-9000 Gent, Belgium. National Institute of Applied Statistics Research Australia (NIASRA), University of Wollongong, Northfields Avenue, NSW 2522, Australia The two-sample problem is one of the earliest problems in statistics: given two samples, the question is whether or not the observations were sampled from the same distribution.
msqrob2PTM: differential abundance and differential usage analysis of MS-based proteomics data at the post-translational modification and peptidoform level Author(s): Nina Demeulemeester,Lennart Martens,Lieven Clement Affiliation(s): Ghent University - VIB Multiple novel open-modification search engines developed in the proteomics community boost the identification of post-translational modifications (PTMs) with mass spectrometry (MS) based technologies. These developments can shift proteomics research into the next gear, as PTMs are key switches in many cellular pathways that play vital roles in cell proliferation, migration, metastasis and ageing.
MoleculeExperiment enables consistent infrastructure for molecule-resolved spatial transcriptomics data Author(s): Bárbara Zita Peters Couto,Ellis Patrick,Shila Ghazanfar Affiliation(s): School of Mathematics and Statistics, University of Sydney Social media: https://www.linkedin.com/in/b%C3%A1rbara-zita-peters-couto-bab4a1127/ Imaging-based spatial transcriptomics technologies have achieved subcellular resolution, enabling detection of individual molecules in their native tissue context. Data from these technologies promise unprecedented opportunities for understanding cellular and subcellular biology. However, there is a dearth of computational infrastructure to represent such data, and particularly to summarise and transform them for use with widely adopted tools in single-cell transcriptomics analysis, including the SingleCellExperiment and SpatialExperiment classes.
Linear models for single-cell proteomics Author(s): Christophe Vanderaa,Laurent Gatto Affiliation(s): UCLouvain Social media: https://twitter.com/c_vanderaa Mass spectrometry (MS)-based single-cell proteomics (SCP) has become a credible player in the single-cell biology arena [1,2]. Continuous technical improvements have pushed the boundaries of sensitivity and throughput. However, the computational efforts to support the analysis of these complex data have been missing. Strong batch effects coupled to high proportions of missing values complicate the analysis, causing strong entanglement between biological and technical variability [3,4].
Juggling with offsets unlocks bulk RNA-seq tools for fast and scalable differential usage and aberrant splicing analyses Author(s): Alexandre Segers,Jeroen Gilis,Mattias Van Heetvelde,Elfride De Baere,Lieven Clement Affiliation(s): Ghent Millions of patients suffer from rare Mendelian diseases. Whole exome sequencing (WES) and whole genome sequencing (WGS) currently identify their pathogenic variants at a diagnostic rate of 15-75%. There is growing evidence that the diagnostic rate can be further improved by discovering mutations in intronic and other non-coding regions that contribute to disease by disrupting transcriptional regulation.
Identification and analysis of gene and genome duplications with the doubletrouble Bioconductor package Author(s): Fabrício Almeida-Silva,Yves Van de Peer Affiliation(s): VIB-UGent Center for Plant Systems Biology Social media: https://twitter.com/almeidasilvaf Gene and genome duplications are a source of raw genetic material for evolution. However, whole-genome duplications (WGD) and small-scale duplications (SSD) contribute to genome evolution in different manners. Here, we present doubletrouble, an R/Bioconductor package that allows the identification and classification of duplicated genes from whole-genome protein sequences.
GeDi - Improving gene set distances accounting for network-based information Author(s): Annekathrin Silvia Ludt,Federico Marini Affiliation(s): Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany Social media: https://twitter.com/AnnekathrinLudt Functional enrichment analysis, performed either via scripted analysis or with web-based tools, is one of the most frequently adopted steps in computational biology, especially when aiming to identify the systems level mechanisms captured by high-dimensional molecular datasets.
Evaluating the efficacy of methodologies for Deconvolution of transcriptional profiles: A benchmarking study Author(s): Aakanksha Singh,Katharina Imkeller Affiliation(s): 1. Institute of Neurology (Edinger Institute), University Hospital, Goethe University, Frankfurt, Germany; 2. Frankfurt Cancer Institute (FCI), Frankfurt, Germany; 3. University Cancer Centre (UCT), Frankfurt, Germany; 4. Group of Computational Immunology, Goethe University, Frankfurt, Germany Quantification of different types of immune cells in the tumor microenvironment is essential to understand tumor heterogeneity.
Differential detection workflows for multi-patient single-cell RNA-seq data Author(s): Jeroen Gilis,Laura Perin,Milan Malfait,Koen Van den Berge,Bie Verbist,Davide Risso,Lieven Clement Affiliation(s): Ghent University Social media: https://twitter.com/GilisJeroen Single-cell RNA-sequencing (scRNA-seq) has improved our understanding of complex biological processes by elucidating cell-level heterogeneity in gene expression. One of the key tasks in the downstream analysis of scRNA-seq data is studying differential gene expression (DGE). Traditional DGE analyses aim to identify genes for which the average expression differs between biological groups of interest, e.
DESpace: a sensitive approach to discover spatially variable genes Author(s): Peiying Cai,Mark Robinson,Simone Tiberi Affiliation(s): University of Zurich Background Spatially resolved transcriptomics (SRT) technologies allow measuring gene expression profiles while also retaining spatial information about the tissue. SRT technologies have led to the release of novel methods that take advantage of the joint availability of mRNA abundance and spatial information. Notably, several computational tools have been developed to identify spatially variable genes (SVGs), i.
demuxSNP: supervised demultiplexing of scRNAseq data using cell hashing and SNPs Author(s): Michael P Lynch,Laurent Gatto,Aedin C Culhane Affiliation(s): University of Limerick Sequencing at single-cell resolution allows unprecedented understanding of biologically relevant differences between individual cells compared to previous bulk methods. The cost of sequencing has dropped considerably in recent years. Multiplexing, that is, the loading of multiple biological samples into each sequencing lane, is widely used to further reduce costs.
Democratising Knowledge Representation with BioCypher Author(s): Sebastian Lobentanzer,Julio Saez-Rodriguez Affiliation(s): Institute for Computational Biomedicine, University Hospital Heidelberg, Germany Standardising the representation of biomedical knowledge among all researchers is an insurmountable task, hindering the effectiveness of many computational methods. To facilitate harmonisation and interoperability despite this fundamental challenge, we propose to standardise the framework of knowledge graph creation instead. We implement this standardisation in BioCypher, a modular and accessible framework to transparently build biomedical knowledge graphs while preserving provenances of the source data.
Data fission for post-clustering differential analysis using dearseq Author(s): Benjamin Hivert,Denis Agniel,Rodolphe Thiébaut,Boris P Hejblum Affiliation(s): Univ. Bordeaux, INSERM, INRIA, SISTM team, BPH, U1219, F-33000 Bordeaux, France Differential expression analysis of gene expression data is crucial to describe the biological phenomena that discriminate between groups of samples at the gene level. Many statistical tests for differential analysis have been proposed. Recently, dearseq, a variance component score test, was developed to ensure better control of the False Discovery Rate in large sample studies than state-of-the-art methods for differential analysis.
COTAN v2: a Comprehensive and Versatile Framework for Single-Cell Gene Co-Expression Studies and Cell Type Identification Author(s): Silvia Giulia Galfre',Marco Fantozzi,Daniel Puttini,Corrado Priami,Francesco Morandin Affiliation(s): University of Pisa The estimation of gene co-expression in single-cell RNA sequencing (scRNA-seq) is a critical step in the analysis of scRNA-seq data. The low efficiency of scRNA-seq methodologies makes sensitive computational approaches crucial to accurately infer transcription profiles in a cell population. COTAN is a statistical and computational method that analyzes the co-expression of gene pairs at the single-cell level.
Assessing differences in cell type/state abundance: compositionality, heteroscedasticity and bias Author(s): Koen Van den Berge,Alemu Takele Assefa,Bie Verbist Affiliation(s): Janssen R&D Social media: https://twitter.com/koenvdberge_be Assessing differences in cellular composition between conditions and disease states is of principal interest in immunology and medicine, helping in unraveling disease and informing drug development. The data for such analyses typically consist of a count matrix, where each element of the matrix denotes the number of cells observed for a particular cell identity (be it cell type or state) in a sample.
Analysis of multi-condition single-cell data with latent embedding multivariate regression Author(s): Constantin Ahlmann-Eltze,Wolfgang Huber Affiliation(s): EMBL Heidelberg Social media: https://twitter.com/const_ae Single-cell RNA sequencing with data from multiple biological conditions enables studying the response heterogeneity of a complex tissue to a treatment. Current approaches divide the cells into discrete groups and identify differentially expressed genes between corresponding groups. Here, we propose a method that operates without such grouping. Latent embedding multivariate regression (LEMUR) factorizes the logarithmized count matrix like principal component analysis (PCA) while at the same time accounting for the known covariates per cell.
Access and use the European prediction service for biological data Author(s): Ludwig Lautenbacher,Wassim Gabriel,Tobias Schmidt,Marco Schmidt,Dulguun Bold,Christian Panse,Tobias Kockmann,Mathias Wilheim Affiliation(s): FGCZ ETHZ|UZH Social media: https://twitter.com/hb9feb DLOmix-serving is an open-source and modular machine learning (ML) inference server for biological data based on NVIDIA Triton. The idea is to implement a standardized interface for accessing various prediction models, see [2, 3]. Furthermore, it can be hosted at different institutional sites using a standard Docker image, which addresses service robustness and throughput.
A novel statistical method for single isoform proteogenomics inference Author(s): Jordy Bollon,Michael Shortreed,Ben T Jordan,Rachel Miller,Colin Dewey,Gloria M Sheynkman,Simone Tiberi Affiliation(s): Department of Statistical Sciences, University of Bologna, Bologna, Italy Social media: https://twitter.com/tiberi_simone Background Currently, the main strategy to infer proteins is via “bottom-up” proteomics, where proteins are only measured indirectly via peptides. However, most peptides (called shared peptides) map to multiple proteins in the database; this results in ambiguous protein identifications, where various protein isoforms cannot be distinguished, and protein inference is typically abstracted to the gene level (NB: most genes are associated with multiple isoforms).