Abstract
Music is one of the truest forms of art. People listen to music both for entertainment and as a means of relaxation. Every country and region in the world has its own forms and styles of music. Bangladesh is no exception: it has a rich musical history and a songwriting tradition spanning centuries. Although songs are very popular among enthusiasts, their authors receive little recognition. Identifying the author of a song, and more specifically of its lyrics, is therefore an important and realistic task. Authorship attribution is one way of identifying the author of a text from a linguistic corpus. This paper presents a guideline for identifying the author of a Bengali song from its lyrics using machine learning. It is the first work on a machine learning-based computational approach to author attribution from the lyrics of Bengali songs. Six machine learning methods were used for author identification, and high accuracy was achieved when they were applied to the data sets D2A, D4A, and D7A, which were built from Bengali song lyrics. The Naive Bayes (NB) classifier provides higher accuracy than the other methods, reaching 93.9%, 85%, and 86.7% accuracy on the three data sets, respectively, when stop words are taken into account.
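As an illustration of the kind of classifier the abstract refers to, the following is a minimal sketch of a multinomial Naive Bayes text classifier written with the Python standard library only. It is not the authors' implementation. The toy English lyrics, the author labels, the whitespace tokenizer, and the smoothing constant alpha = 1.0 are all assumptions made for the example, and stop words are kept in the features on the assumption that this is what "considering the stop words" means.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simple whitespace tokenization (an assumption); stop words are kept.
    return text.lower().split()


class MultinomialNB:
    """Bag-of-words multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 is an assumed value

    def fit(self, docs, labels):
        self.doc_counts = Counter(labels)        # documents per author
        self.word_counts = defaultdict(Counter)  # word counts per author
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            for word in tokenize(doc):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical placeholder lyrics; the paper's data sets are Bengali lyrics.
train_docs = [
    "the river flows to the sea",
    "the boat drifts on the river",
    "my heart burns with fire",
    "fire and storm in my heart",
]
train_labels = ["author_a", "author_a", "author_b", "author_b"]

model = MultinomialNB().fit(train_docs, train_labels)
prediction = model.predict("the river and the boat")
```

In practice a library implementation with proper Bengali tokenization and a held-out test split would replace this toy setup; the sketch only shows how word counts per author turn into a predicted label.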
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ontika, N.N., Kabir, M.F., Islam, A., Ahmed, E., Huda, M.N. (2020). A Computational Approach to Author Identification from Bengali Song Lyrics. In: Uddin, M.S., Bansal, J.C. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-13-7564-4_31
DOI: https://doi.org/10.1007/978-981-13-7564-4_31
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7563-7
Online ISBN: 978-981-13-7564-4
eBook Packages: Intelligent Technologies and Robotics (R0)