Write, containerize and publish versioned Quarto books with Bioconductor Author(s): Jacques Serizay Affiliation(s): Institut Pasteur Recently, several bioinformatics books have focused on describing state-of-the-art genomic analysis workflows, such as functional analysis of gene sets, single-cell transcriptomics, or spatial transcriptomics. R and the Bioconductor ecosystem have long been at the forefront of computational tools for genomics. With `Bookdown` and more recently with `Quarto`, the authoring of books on bioinformatics topics has been facilitated.
ViScoreR: label-based evaluation of dimensionality reduction by detecting local distortions Author(s): David Novak, Sofie Van Gassen, Yvan Saeys Affiliation(s): FWO & Inflammation Research Center, VIB-UGent Social media: https://twitter.com/dwdnvwk Dimensionality reduction (DR) of single-cell data (flow cytometry, CyTOF, scRNA-seq, CITE-seq) is important for visualisation and, increasingly, downstream structure learning. A growing number of non-linear DR techniques (t-SNE, UMAP, PHATE, TriMap) can transform data into informative low-dimensional embeddings quickly, with different emphasis on local and global structure preservation.
Towards resolving ambiguity in promoter to gene assignment for omics data integration Author(s): Fiona Ross, Charlotte Soneson, Michael B Stadler Affiliation(s): FMI, University of Basel The analysis of transcriptional regulation often involves linking transcriptomic data to measurements of the chromatin state of the corresponding promoters, for example obtained from ChIP-seq or ATAC-seq experiments. The analysis is typically done at the gene level, which raises the question of which promoter region to choose for genes with multiple transcription start sites.
The Bioconductor Teaching Committee Author(s): Charlotte Soneson, Laurent Gatto, The Bioconductor Teaching Committee Affiliation(s): Friedrich Miescher Institute for Biomedical Research; de Duve Institute, UCLouvain Social media: https://fosstodon.org/@csoneson The Bioconductor Teaching Committee was founded in 2020 with the aims of providing networking opportunities and coordinating training activities within the Bioconductor community, establishing a connection with The Carpentries, and assembling and delivering Bioconductor-focused training material. Here, we present an overview of our current activities and describe different ways of engaging with the committee.
STEGO.R for (easy) interrogation of combined scTCR repertoire and scRNA-seq data Author(s): Kerry Mullan, My Ha, Sebastiaan Valkiers, Kris Laukens, Benson Ogunjimi, and Pieter Meysman Affiliation(s): University of Antwerp Social media: https://twitter.com/kerry_mullan Introduction. T cells are critical to protect against a broad array of aberrant threats including pathogens and cancer. The hypervariable T cell receptor (TCR), created through somatic recombination, is what allows for recognition of a diverse array of antigens.
Semi-supervised probabilistic Factor Analysis (spFA) to uncover novel axes of variation in multi-omics data sets Author(s): Tümay Capraz, Wolfgang Huber Affiliation(s): EMBL Heidelberg High-throughput multi-omics techniques have revolutionised our understanding of how cells work at the molecular level. These powerful tools enable comprehensive analysis of genes, proteins, metabolites, and other biological molecules on a large scale, offering exciting possibilities in precision medicine and biomarker discovery. However, the data's complexity makes interpretation challenging due to its high-dimensionality.
ScalablePCA: Benchmarking principal component analysis for large-scale single-cell RNA-sequencing data Author(s): Ilaria Billato, Chiara Romualdi, Gabriele Sales, Davide Risso Affiliation(s): Department of Biology, University of Padova With the advances in sequencing technology, the size and complexity of single-cell RNA-seq data are increasing, to the point that standard workflows are becoming too computationally demanding. In fact, datasets with millions of cells are becoming routine and they require workflows that operate out-of-memory. However, existing software tools for single cells do not scale well to such large datasets.
peakCombiner: An R package to curate and merge enriched genomic regions into consensus peak sets Author(s): Markus Muckenhuber, Michael Stadler, Kathleen Sprouffske Affiliation(s): NIBR, Oncology Data Science Genome-wide epigenomic data sets like ChIP-seq or ATAC-seq typically use peak calling tools to identify genomic regions of interest, called peaks, usually for multiple sample replicates and across experimental conditions. Many downstream analyses require a consensus set of genomic regions relevant to the experiment, but current tools within the R ecosystem to create combined peak sets easily and flexibly from conditions and replicates are limited.
Multi-omics integration: a regression based approach Author(s): Angelo Velle, Nicolò Gnoato, Ilaria Billato, Stefania Pirrotta, Enrica Calura, Chiara Romualdi Affiliation(s): Department of Biology, University of Padova Social media: https://twitter.com/angelo_velle In recent years, a growing number of techniques for omics data acquisition have been developed, so that we now have access to large datasets containing different omics such as gene expression, methylation, copy number variation and miRNA expression data. For example, TCGA gives access to all these kinds of data for thousands of tumor samples.
mitology: a new tool to dissect mitochondrial activity from transcriptome Author(s): Stefania Pirrotta, Laura Masatti, Nicolò Gnoato, Paolo Martini, Massimo Bonora, Enrica Calura Affiliation(s): Biology Department, University of Padova Mitochondria are a main control center for metabolism and OXPHOS. As a consequence, the phenotypic manifestations of an impaired mitochondrial function may be highly heterogeneous. Therefore, an analysis of high-throughput transcriptomic data from a disease condition may show heavy alterations in the mitochondrial activity. Further, with the new technologies of single-cell and spatial transcriptomics it is now possible to explore mitochondrial alterations and dissect their heterogeneity at single-cell resolution.
miaverse – microbiome analytics framework in SummarizedExperiment family Author(s): Tuomas Borman, Leo M Lahti Affiliation(s): University of Turku Because of the complex and high dimensional nature of microbiome profiling data, machine learning and other computational approaches have become an instrumental part of the researcher’s toolkit in this rapidly evolving field. There is an increasing need to develop robust and reproducible methods that take into account current and future trends in microbiome research such as continuously developing methods, multi-omics and expanding datasets.
Methodical: Redefining Promoters Based on Transcriptional Regulation By DNA Methylation Author(s): Richard Heery Affiliation(s): European Institute of Oncology DNA methylation at gene promoters is generally considered to be associated with transcriptional repression. However, lack of a clear picture of where promoter methylation is most important has obscured our understanding of this relationship and resulted in a wide variety of arbitrary promoter definitions being used in different DNA methylation studies. These vary substantially in both their length and location relative to the transcription start site (TSS) and lead to inconsistency between promoter methylation studies.
MeRgeION: a multifunctional R pipeline for small molecule LC-MS/MS data processing, searching, and organizing Author(s): Youzhong Liu Affiliation(s): de Duve Institute, UCLouvain Small molecule structure elucidation using tandem mass spectrometry (MS/MS) plays a crucial role in life science, bioanalytical and pharmaceutical research. There is a pressing need for increased throughput of compound identification and transformation of historical data into information-rich spectral databases. Meanwhile, molecular networking, a recent bioinformatic framework, provides global displays and system-level understanding of complex LC-MS/MS datasets.
Inferring residue-level hydrogen deuterium exchange with ReX Author(s): Oliver Crook Affiliation(s): University of Oxford Hydrogen-Deuterium Exchange mass-spectrometry (HDX-MS) has emerged as a powerful technique to explore the conformational dynamics of proteins and protein complexes in solution. In the bottom-up approach to MS, deuterium uptake is reported at the level of peptides, which complicates interpretation and means ad-hoc approaches are used to resolve contradictions between overlapping peptides. Here we propose to leverage the overlap in peptides, the temporal component of the data and the correlation along the sequence dimension to infer residue-level uptake patterns.
High-resolution coverage analysis detects and quantifies alternative mRNA processing events Author(s): Francesco Dossena Affiliation(s): University of Milan and Human Technopole Gene expression is regulated at multiple levels, starting with transcription and maturation of RNA species in the nucleus, and continuing with protein synthesis in the cytoplasm. Post-transcriptional gene regulation (PTGR) mechanisms, such as splicing, alternative polyadenylation (APA), mRNA decay, or translational control, play a crucial role in ensuring correct protein synthesis.
From Structure to Specificity: Investigating the Molecular Framework of RNF E3 Ligases and Substrate Interactions Author(s): Valentyna Tararina, Oleksandra Makhankova, Oleksandr Zholos Affiliation(s): Taras Shevchenko National University of Kyiv, Ukraine In recent years, monovalent degraders have emerged as a new strategy in drug development, offering the potential to selectively eliminate disease-causing proteins. However, most of the known molecular glues were discovered by chance and there is currently no specific approach to the development of molecular degraders.
From Shiny App to Enterprise SaaS Solution: Lessons Learned and Necessary Tech Stack Author(s): Mauro Masiero Affiliation(s): BigOmics R/Shiny is one of the quickest ways to create simple user interfaces that bring bioinformatics tools to a broad range of users. However, most Shiny applications are monolithic and run locally: they consist of a single folder containing App.R, with little to no deployment structure.
fmsne: fast multi-scale neighbour embedding in R Author(s): Laurent Gatto, Cyril de Bodt Affiliation(s): UCLouvain, Belgium Social media: https://fosstodon.org/@lgatto Dimensionality reduction (DR) has been a workhorse of large scale, multivariate omics data analysis from the early days. Since the advent of single-cell RNA sequencing, non-linear approaches have taken the front stage, with t-distributed stochastic neighbour embedding (t-SNE) [1,2] being one of, if not the main player. Packages such as `Rtsne` and `scater` have made it easy to apply t-SNE in R/Bioconductor workflows.
DifferentialRegulation: a novel approach to identify differentially regulated genes Author(s): Simone Tiberi, Joel Meili, Charlotte Soneson, Dongze He, Hirak Sarkar, Robert Patro, Mark Robinson Affiliation(s): Department of Statistical Sciences, University of Bologna, Bologna, Italy Social media: https://twitter.com/tiberi_simone Background. Technological developments have led to an explosion of high-throughput data, which reveal unprecedented perspectives on cell identity. Recently, significant attention has focused on studying cellular dynamic processes, such as cell differentiation, cell (de)activation, and gene regulation. Aim and impact. We introduce DifferentialRegulation, a novel approach to investigate gene regulation from bulk and single-cell RNA-seq data.
De novo functional transcriptomics with RNA-seq and Ribo-seq Author(s): Roberto Albanese Affiliation(s): Human Technopole Our genome sequencing and assembly capabilities have hugely increased. However, annotating our genome is still challenging. Genes can produce multiple isoforms, which may have different expression levels and roles. Transcript functions can be elucidated by profiling ribosome positions with the Ribo-seq protocol. By using ribosome profiling data, it is possible to study the fates of cytoplasmic transcripts and to quantify translational levels.
CTexploreR: taking on the challenge of Cancer-Testis genes Author(s): Julie Devis, Axelle Loriot, Charles De Smet, Laurent Gatto Affiliation(s): UCLouvain Social media: https://twitter.com/JulieDevis Cancer-Testis (CT) genes are tissue-specific genes whose expression is limited to the germline. They are normally repressed in somatic tissues, but can be aberrantly activated in tumors. For many CT genes, tumoral activation is enabled by loss of promoter DNA methylation. CT genes are of great interest. First, they have clinical potential as cancer-specific antigens, and can thus be used as targets for cancer immunotherapy and as cancer biomarkers.
consICA: multimodal data deconvolution, integration and elucidation of biological processes in cancer research Author(s): Maryna Chepeleva, Tony Kaoma, Arnaud Muller, Sang-Yoon Kim, Vladimir Despotovic, Reka Toth, Petr V Nazarov Affiliation(s): Luxembourg Institute of Health Analysing cancer-related multiomics data, we need to overcome data complexity, tumor heterogeneity and technical biases which mask the important biological signals. This complexity is caused by natural variability in cell type proportions and clonal among tumor cells. Additionally, technical biases between experimental platforms may limit the direct comparison of patient data coming from different sources, especially mapping to large public datasets.
An end-to-end workflow for multiplexed image processing and analysis Author(s): Jonas Windhager, Vito Zanotelli, Daniel Schulz, Lasse Meyer, Michelle Daniel, Bernd Bodenmiller, Nils Eling Affiliation(s): University of Zurich Social media: https://twitter.com/NilsEling Highly multiplexed imaging allows the detection of dozens of biomolecules in single cells across tissue sections. Extracting biologically relevant information such as the spatial distribution of cell phenotypes from multiplexed tissue imaging data involves a number of computational tasks, including image segmentation, feature extraction, and spatially resolved single-cell analysis.
Alignment of Spatial Transcriptomics data with the alignProMises R package Author(s): Daniela Corbetta, Angela Andreella, Davide Risso, Livio Finos Affiliation(s): Department of Statistical Sciences - University of Padua Spatial transcriptomics is a recent genome-sequencing technique that provides information on the spatial organization of tissues while simultaneously obtaining gene expression data. The application of this technology could revolutionize medical research by providing insights into the genomic basis of brain diseases. However, analysis of brains of different subjects is challenging because they are not functionally aligned.
A Practical Strategy for Analysis of Large Cytometry Data through Supercells Author(s): Givanna H. Putri, George Howitt, Felix Marsh-Wakefield, Thomas Ashhurst, Belinda Phipson Affiliation(s): Walter and Eliza Hall Institute of Medical Research The rapid advancements in cytometry technologies have enabled the quantification of up to 50 proteins across millions of cells at a single-cell resolution. The analysis of cytometry data necessitates the use of computational tools for tasks such as data integration, clustering, and dimensionality reduction.
A mixed-cell control design to assess data processing in single-cell proteomics Author(s): Samuel Grégoire, Sébastien Pyr dit Ruys, Christophe Vanderaa, Didier Vertommen, Laurent Gatto Affiliation(s): UCLouvain Single-cell proteomics (SCP) aims at studying cellular heterogeneity by focusing on the functional effectors of the cells, proteins. While this is essential to identify cells undergoing subtle processes and point out underlying relevant protein and proteoform abundance patterns, assessing protein content inside a single cell is challenging. Thanks to recent breakthroughs in mass spectrometry and sample processing, it has become possible to increase the depth of proteome covered, reduce the time needed to analyse a cell and make this technology more accessible.