Single-cell omic technologies provide an unprecedented opportunity to define molecular cell states in a data-driven fashion, but present unique data integration challenges. We developed an online learning algorithm for single-cell multiomic integration, allowing highly scalable integration and the ability to incorporate new datasets without recalculating results from scratch. Furthermore, integration analyses often involve datasets with partially overlapping features, including both shared features that occur in all datasets and features exclusive to a single experiment. Previous computational integration approaches require that the input matrices share the same number of either genes or cells, and thus can use only shared features. To address this limitation, we developed a novel algorithm for “mosaic integration” of single-cell datasets containing both shared and unshared features. Additionally, new technologies for simultaneously profiling gene expression and epigenomic state in the same cells enable investigation of the correspondence between cell states inferred from different molecular layers. To realize this potential, we developed an approach for modeling epigenetic regulation of gene expression from single-cell multiomic data, allowing us to quantify the degree of concordance or decoupling between transcriptomic and epigenomic states.
Joshua D. Welch, Ph.D.
Department of Computational Medicine and Bioinformatics
Department of Computer Science and Engineering
University of Michigan Medical School