Detecting Inconsistent Vulnerable Software Version in Security Vulnerability Reports

  • Conference paper
Frontiers in Cyber Security (FCS 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1558)

Abstract

Research on vulnerability databases has so far focused mainly on whether the disclosed information is accurate. The differences in information between the various vulnerability databases, however, have received little attention.

This article proposes WITTY (softWare versIon inconsisTency measuremenT sYstem), a system that detects differences between the affected software versions reported by NVD and those reported by vulnerability databases in different languages (including the English CVE and OpenWall, the Chinese CNNVD and CNVD, and others, eight databases in all). WITTY enables large-scale, quantitative measurement of information consistency. We introduce custom designs into deep-learning-based named entity recognition (NER) and relation extraction (RE), enabling WITTY to recognize previously unseen software names and versions from sentence structure and context. Evaluation against ground truth shows that the system is highly accurate (95.3% accuracy, 89.9% recall). We use data from 8 vulnerability databases covering the past 21 years, involving 554,725 vulnerability reports. The results show that inconsistent software versions are prevalent. The average exact match rate of the English vulnerability databases (CVE, OpenWall, and others) is only 22.1%. The average exact match rate of the Chinese CNNVD and CNVD is 49.5%, and the exact match rate of the Russian vulnerability databases is 25.8%.
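The abstract does not define how the exact match rate is computed, so the following is only a minimal sketch of one plausible reading: for each CVE ID present in both a reference database and another database, the sets of extracted (software name, version) pairs are compared, and the rate is the fraction of shared IDs whose sets are identical. The function names, the lowercase normalization, and the sample records are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical representation: CVE ID -> set of (software name, version) pairs,
# e.g. as they might come out of an NER + relation-extraction step.
Pair = Tuple[str, str]
VersionMap = Dict[str, Set[Pair]]


def normalize(pairs: Iterable[Pair]) -> Set[Pair]:
    """Lowercase and trim names and versions (an assumed normalization)."""
    return {(name.strip().lower(), version.strip().lower()) for name, version in pairs}


def exact_match_rate(reference: VersionMap, other: VersionMap) -> float:
    """Fraction of shared CVE IDs whose normalized pair sets are identical."""
    shared = reference.keys() & other.keys()
    if not shared:
        return 0.0
    matches = sum(
        1 for cve_id in shared
        if normalize(reference[cve_id]) == normalize(other[cve_id])
    )
    return matches / len(shared)


# Made-up records for illustration only.
reference_db: VersionMap = {
    "CVE-0000-0001": {("ExampleServer", "1.0"), ("ExampleServer", "1.1")},
    "CVE-0000-0002": {("ExampleLib", "2.3")},
    "CVE-0000-0003": {("ExampleApp", "4.0")},
}
other_db: VersionMap = {
    "CVE-0000-0001": {("exampleserver", "1.0"), ("exampleserver", "1.1")},
    "CVE-0000-0002": {("ExampleLib", "2.3"), ("ExampleLib", "2.4")},  # extra version
    "CVE-0000-0004": {("ExampleTool", "0.9")},  # not in the reference, so ignored
}

rate = exact_match_rate(reference_db, other_db)
print(f"exact match rate: {rate:.1%}")
```

Under this reading, an entry that lists one extra or one missing version counts as a mismatch, which is one reason an exact match rate can be low even when the databases largely agree.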

This work was supported by the National Key Research and Development Program of China (2018YFB0804701), the Key Research and Development Program of Hainan Province (ZDYF202012), and the Guangxi Key Laboratory of Cryptography and Information Security (No. GCIS202123).

Author information

Corresponding author

Correspondence to Xuejun Li.

Copyright information

© 2022 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ren, H. et al. (2022). Detecting Inconsistent Vulnerable Software Version in Security Vulnerability Reports. In: Cao, C., Zhang, Y., Hong, Y., Wang, D. (eds) Frontiers in Cyber Security. FCS 2021. Communications in Computer and Information Science, vol 1558. Springer, Singapore. https://doi.org/10.1007/978-981-19-0523-0_6

  • DOI: https://doi.org/10.1007/978-981-19-0523-0_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-0522-3

  • Online ISBN: 978-981-19-0523-0

  • eBook Packages: Computer Science, Computer Science (R0)
