Deconvolving cell-type-specific gene expression profiles from bulk RNA-seq samples
by Sichen Zhu, Zhengqi Wang, Kevin Bunting, Peng Qiu
Bulk RNA sequencing (bulk RNA-seq) and single-cell RNA sequencing (scRNA-seq) are two important high-throughput sequencing platforms with wide applications in biomedical research. Bulk RNA-seq measures the average gene expression across all cells in a sample at low experimental cost, whereas scRNA-seq profiles the transcriptome of individual cells at a higher experimental cost. To combine the strengths of both approaches and capitalize on the wealth of existing bulk RNA-seq datasets, we developed BLUE, a U-Net-based deep learning algorithm that deconvolves bulk RNA-seq samples into cell-type proportions and cell-type-specific gene expression profiles. BLUE leverages the feature extraction and representation learning capabilities of its U-Net backbone to predict cell-type-specific gene expression profiles accurately, significantly outperforming existing deconvolution algorithms. Building on these predictions, we developed an integrative framework for subtyping cancer patients and identifying cell-type-specific gene signatures that can serve as prognostic biomarkers for cancer.
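To make the deconvolution problem concrete, the sketch below simulates a bulk profile as a proportion-weighted average of cell-type-specific profiles and recovers the proportions by least squares. It is an illustration of the problem setting only, not the BLUE algorithm: the linear mixing model, the known signature matrix, the synthetic data, and the function names `simulate_bulk` and `estimate_proportions` are all assumptions introduced here.

```python
import numpy as np


def simulate_bulk(signatures: np.ndarray, proportions: np.ndarray) -> np.ndarray:
    """Mix cell-type profiles (genes x cell types) into one bulk profile (genes,).

    Assumes the bulk signal is the proportion-weighted average of the
    cell-type-specific profiles (a linear mixing model).
    """
    return signatures @ proportions


def estimate_proportions(bulk: np.ndarray, signatures: np.ndarray) -> np.ndarray:
    """Estimate cell-type proportions from a bulk profile.

    Solves an ordinary least-squares problem, clips negative coefficients
    to zero, and renormalizes so the proportions sum to one.
    """
    coef, *_ = np.linalg.lstsq(signatures, bulk, rcond=None)
    coef = np.clip(coef, 0.0, None)
    total = coef.sum()
    if total == 0.0:
        raise ValueError("all estimated coefficients are zero")
    return coef / total


# Synthetic example: 50 genes, 3 hypothetical cell types, no noise.
rng = np.random.default_rng(0)
n_genes, n_types = 50, 3
signatures = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_types))
true_props = np.array([0.6, 0.3, 0.1])

bulk = simulate_bulk(signatures, true_props)
est_props = estimate_proportions(bulk, signatures)
print("true proportions:     ", true_props)
print("estimated proportions:", np.round(est_props, 4))
```

This baseline recovers only the proportions and needs the cell-type signatures in advance. The harder task described in the abstract, predicting cell-type-specific gene expression profiles for each bulk sample, is not addressed by this sketch.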