Imputing single-cell RNA-seq data by considering cell heterogeneity and prior expression of dropouts

J Mol Cell Biol. 2021 Apr 10;13(1):29-40. doi: 10.1093/jmcb/mjaa052.

Abstract

Single-cell RNA sequencing (scRNA-seq) provides a powerful tool to determine expression patterns of thousands of individual cells. However, the analysis of scRNA-seq data remains a computational challenge due to high technical noise, such as dropout events that produce a large proportion of zeros for expressed genes. Taking into account cell heterogeneity and the relationship between dropout rate and expected expression level, we present a cell sub-population based bounded low-rank (PBLR) method to impute the dropouts of scRNA-seq data. Through application to both simulated and real scRNA-seq datasets, PBLR is shown to be effective in recovering dropout events, and it can dramatically improve the low-dimensional representation and the recovery of gene-gene relationships masked by dropout events compared to several state-of-the-art methods. Moreover, PBLR also detects accurate and robust cell sub-populations automatically, demonstrating its flexibility and generality for scRNA-seq data analysis.
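
The idea of a bounded low-rank imputation can be illustrated with a minimal sketch. This is not the PBLR implementation: it swaps in a simple iterative truncated-SVD heuristic, treats every zero as a candidate dropout, and assumes the per-gene upper bounds are supplied by the caller rather than estimated from the dropout-rate and expression relationship. The function name `bounded_low_rank_impute` and its parameters are hypothetical. In a sub-population based workflow, such a routine would presumably be applied to the cells of each sub-population separately.

```python
import numpy as np


def bounded_low_rank_impute(X, rank=2, upper=None, n_iter=50):
    """Fill zeros of a genes-by-cells matrix with a bounded low-rank estimate.

    Illustrative heuristic only (hypothetical helper, not the PBLR solver).
    """
    X = np.asarray(X, dtype=float)
    observed = X > 0  # nonzero entries are kept as measured
    if upper is None:
        # Assumption: fall back to the global maximum as a bound for every gene.
        upper = np.full(X.shape[0], X.max())
    upper = np.asarray(upper, dtype=float).reshape(-1, 1)  # one bound per gene

    Z = X.copy()
    for _ in range(n_iter):
        # Rank-`rank` approximation of the current estimate.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed values; fill zeros with the estimate clipped to [0, upper].
        Z = np.where(observed, X, np.clip(L, 0.0, upper))
    return Z


# Toy example: a rank-1 "expression" matrix with about 30% of entries zeroed out.
rng = np.random.default_rng(0)
truth = np.outer(rng.uniform(1, 5, size=6), rng.uniform(1, 3, size=8))
dropout = rng.random(truth.shape) < 0.3
X = np.where(dropout, 0.0, truth)

imputed = bounded_low_rank_impute(X, rank=1, upper=truth.max(axis=1))
```

The bound keeps imputed values from exceeding a plausible expression level for each gene, while the observed nonzero entries are left unchanged.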

Keywords: dropout; imputation; low-rank; single-cell RNA-seq; systems biology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Data Analysis
  • Genetic Heterogeneity
  • Humans
  • Mice
  • RNA-Seq / methods*
  • Single-Cell Analysis / methods*
  • Software