Spectral Clustering of Single-Cell RNA-Sequencing Data by Multiple Feature Sets Affinity

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14088))

Abstract

Cell clustering is a critical stage in the analysis of single-cell RNA-sequencing (scRNA-seq) data. Feature selection is the first step of unsupervised clustering, and its quality directly affects the quality of the downstream analysis. Because gene expression data from scRNA-seq are high dimensional, choosing high-quality features is difficult. Feature selection is often applied to gene expression data to choose highly expressed features, that is, subsets of the original features. The typical approaches either retain a fixed percentage of features or set a threshold number based on experience. Because these choices are subjective, it is hard to guarantee that they yield the best clustering results. In this study, we propose a feature selection method, scMFSA, which overcomes the one-dimensional shortcoming of the traditional PCA method by selecting multiple top-level feature sets. The similarity matrix constructed from each feature set is enhanced by affinity to optimize feature learning. Finally, experiments are carried out on real scRNA-seq datasets using the features selected by scMFSA. The results indicate that, when paired with clustering methods, the features chosen by scMFSA can increase the accuracy of the clustering results. scMFSA can therefore be an effective tool for researchers analyzing scRNA-seq data.
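
The general idea of clustering from several feature sets can be illustrated with a minimal sketch. This is not the scMFSA algorithm, whose details the abstract does not give. It assumes variance ranking as a stand-in for the feature selection, a Gaussian kernel for each similarity matrix, and plain averaging in place of the affinity enhancement. All function names (`top_variance_genes`, `gaussian_similarity`, `fused_similarity`, `spectral_labels`) and the synthetic data are hypothetical.

```python
import numpy as np

def top_variance_genes(X, k):
    """Indices of the k genes (columns) with the largest variance across cells."""
    return np.argsort(X.var(axis=0))[::-1][:k]

def gaussian_similarity(X):
    """Cell-by-cell Gaussian kernel; bandwidth taken from the median squared distance."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))

def fused_similarity(X, set_sizes):
    """Average the similarity matrices built from several top-k feature sets.
    Assumption: simple averaging stands in for the paper's affinity step."""
    mats = [gaussian_similarity(X[:, top_variance_genes(X, k)]) for k in set_sizes]
    return np.mean(mats, axis=0)

def spectral_labels(S, n_clusters, n_iter=50):
    """Normalized spectral clustering on a precomputed similarity matrix S."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    M = S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^{-1/2} S D^{-1/2}
    _, vecs = np.linalg.eigh(M)                        # eigenvalues in ascending order
    U = vecs[:, -n_clusters:]                          # leading eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)   # row-normalize the embedding
    # Deterministic farthest-point initialization, then Lloyd's k-means iterations.
    centers = [U[0]]
    for _ in range(1, n_clusters):
        dist = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels

# Synthetic expression matrix: 40 cells x 50 genes, two groups of 20 cells
# that differ only in the first 10 genes.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(40, 50))
X[:, :10] = rng.normal(0.0, 1.0, size=(40, 10))
X[20:, :10] += 5.0

S = fused_similarity(X, set_sizes=(5, 10, 20))
labels = spectral_labels(S, n_clusters=2)
print(labels)
```

Using several set sizes rather than one threshold reflects the abstract's point that a single cutoff is a subjective choice; the particular sizes used here are arbitrary.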

Acknowledgements

This work has been supported by the National Natural Science Foundation of China (61902216 and 61972226).

Corresponding author

Correspondence to Feng Li.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, Y., Li, F., Shang, J., Ge, D., Ren, Q., Li, S. (2023). Spectral Clustering of Single-Cell RNA-Sequencing Data by Multiple Feature Sets Affinity. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_23

  • DOI: https://doi.org/10.1007/978-981-99-4749-2_23

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4748-5

  • Online ISBN: 978-981-99-4749-2

  • eBook Packages: Computer Science, Computer Science (R0)
