Detecting Inconsistent Vulnerable Software Version in Security Vulnerability Reports

Hansong Ren^9,11,
Xuejun Li^9,
Liao Lei^11,
Guoliang Ou^9,11,
Hongyu Sun^9,11,
Gaofei Wu^9,10,
Xiao Tian^9,11,
Jinglu Hu^12 &
…
Yuqing Zhang^9,11,13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1558)

Included in the following conference series:

International Conference on Frontiers in Cyber Security

Abstract

Research on vulnerability databases has so far focused mainly on whether the disclosed information is accurate. The differences in information between vulnerability databases, however, have received little attention.

This article proposes WITTY (softWare versIon inconsisTency measuremenT sYstem), a system that detects differences between the affected software versions listed in NVD and those listed in vulnerability databases in different languages (the English CVE and OpenWall, the Chinese CNNVD and CNVD, and others, eight databases in all). WITTY enables large-scale quantitative measurement of information consistency. We build custom designs into deep-learning-based named entity recognition (NER) and relation extraction (RE), enabling WITTY to recognize previously unseen software names and versions from sentence structure and context. Evaluation against ground truth shows that the system is highly accurate (95.3% accuracy, 89.9% recall). We use data from 8 vulnerability databases covering the past 21 years, involving 554,725 vulnerability reports. The results show that inconsistent software versions are prevalent. The average exact match rate of the English vulnerability databases (CVE, OpenWall, and others) is only 22.1%. The average exact match rate of the Chinese databases CNNVD and CNVD is 49.5%, and the exact match rate of the Russian vulnerability databases is 25.8%.
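To make the "exact match rate" concrete, the following is a minimal sketch of how such a rate could be computed once affected versions have been extracted from each database. It is an illustration only, not WITTY's implementation: the input shape (CVE ID mapped to a set of version strings), the `normalize` helper, the function name `exact_match_rate`, and the sample records are all assumptions made here, and the paper's precise matching definition is not given in the abstract.

```python
from typing import Dict, Set

# Hypothetical input shape: CVE ID -> set of affected version strings.
VersionMap = Dict[str, Set[str]]


def normalize(version: str) -> str:
    """Trim and lowercase a version string (an assumed, minimal normalization)."""
    return version.strip().lower()


def exact_match_rate(reference: VersionMap, other: VersionMap) -> float:
    """Fraction of shared CVE IDs whose normalized version sets are identical."""
    shared = reference.keys() & other.keys()
    if not shared:
        return 0.0
    matches = sum(
        1
        for cve_id in shared
        if {normalize(v) for v in reference[cve_id]}
        == {normalize(v) for v in other[cve_id]}
    )
    return matches / len(shared)


# Made-up records for illustration; these are not data from the paper.
nvd = {"CVE-0000-0001": {"1.0", "1.1"}, "CVE-0000-0002": {"2.0"}}
other_db = {"CVE-0000-0001": {"1.0", "1.1 "}, "CVE-0000-0002": {"2.0", "2.1"}}

rate = exact_match_rate(nvd, other_db)
print(f"{rate:.1%}")  # prints "50.0%"
```

In this toy input the first report matches after whitespace normalization, while the second lists an extra version in the other database, so one of the two shared reports is an exact match.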

This work was supported by the National Key Research and Development Program of China (2018YFB0804701), the Key Research and Development Program of Hainan Province (ZDYF202012), and the Guangxi Key Laboratory of Cryptography and Information Security (No. GCIS202123).

Author information

Authors and Affiliations

School of Cyber Engineering, Xidian University, Xi'an, 710126, Shaanxi, China
Hansong Ren, Xuejun Li, Guoliang Ou, Hongyu Sun, Gaofei Wu, Xiao Tian & Yuqing Zhang
Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin, 541010, Guangxi, China
Gaofei Wu
National Computer Network Intrusion Prevention Center, University of Chinese Academy of Sciences, Beijing, 100049, China
Hansong Ren, Liao Lei, Guoliang Ou, Hongyu Sun, Xiao Tian & Yuqing Zhang
School of Information, Production and Systems, Waseda University, Tokyo, 169-805, Japan
Jinglu Hu
College of Computer and Cyberspace Security, Hainan University, Haikou, 570228, China
Yuqing Zhang

Corresponding author

Correspondence to Xuejun Li.

Editor information

Editors and Affiliations

Hainan University, Haikou, China
Chunjie Cao
National Computer Network Intrusion Protection Center (NCNIPC), Beijing, China
Yuqing Zhang
Illinois Institute of Technology, Chicago, IL, USA
Yuan Hong
Nankai University, Tianjin, China
Ding Wang

About this paper

Cite this paper

Ren, H. et al. (2022). Detecting Inconsistent Vulnerable Software Version in Security Vulnerability Reports. In: Cao, C., Zhang, Y., Hong, Y., Wang, D. (eds) Frontiers in Cyber Security. FCS 2021. Communications in Computer and Information Science, vol 1558. Springer, Singapore. https://doi.org/10.1007/978-981-19-0523-0_6

DOI: https://doi.org/10.1007/978-981-19-0523-0_6
Published: 01 March 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0522-3
Online ISBN: 978-981-19-0523-0
eBook Packages: Computer Science, Computer Science (R0)
