Abstract
Multimedia-based e-Learning provides virtual classrooms to students. The teacher uploads learning materials, programming assignments and quizzes to the university's Learning Management System (LMS). The students learn lessons from the uploaded videos and then solve the given programming tasks and quizzes. Source code plagiarism is a serious threat to academia, and identifying similar source code fragments across different programming languages is a challenging task. To address this problem, this paper proposes a new semantics-based technique for detecting plagiarism between C++ and Java source code, as part of a multimedia-based e-Learning and smart assessment methodology. First, it transforms the source code into tokens and calculates semantic similarity through token-by-token comparison. It then computes a single scalar semantic similarity value for the complete C++ and Java source files. For the experiment, we use a dataset consisting of four (4) case studies, namely Factorial, Bubble Sort, Binary Search and the Stack data structure, each implemented in both C++ and Java. The entire experiment is carried out in R Studio with R version 3.4.2. On comparison, the experimental results show better semantic similarity scores for plagiarism detection.
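The two steps described in the abstract (tokenizing the source code, then reducing the comparison to one scalar score) can be illustrated with a minimal sketch. It is not the authors' method and does not reproduce their R implementation: the token mapping `NORMALIZE`, the regular expression, the cosine measure over token counts, and the two factorial snippets are all assumptions chosen only to make the idea concrete.

```python
import math
import re
from collections import Counter

# Hypothetical mapping from language-specific tokens to shared labels.
# These pairs are illustrative assumptions, not taken from the paper.
NORMALIZE = {
    "cout": "OUTPUT", "println": "OUTPUT", "print": "OUTPUT",
    "cin": "INPUT", "nextInt": "INPUT",
    "bool": "BOOLEAN", "boolean": "BOOLEAN",
}

# Identifiers/keywords, integer literals, or any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")


def tokenize(source):
    """Split source text into tokens and apply the shared-label mapping."""
    return [NORMALIZE.get(tok, tok) for tok in TOKEN_RE.findall(source)]


def cosine_similarity(tokens_a, tokens_b):
    """Return one scalar in [0, 1] comparing two token frequency vectors."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty input shares nothing with the other side
    return dot / (norm_a * norm_b)


# Made-up example inputs in the spirit of the Factorial case study.
CPP_FACTORIAL = """
int factorial(int n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}
"""

JAVA_FACTORIAL = """
static int factorial(int n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}
"""

score = cosine_similarity(tokenize(CPP_FACTORIAL), tokenize(JAVA_FACTORIAL))
print(round(score, 3))
```

A frequency-based measure such as this ignores token order, so it would also score reordered statements as similar; a real detector would likely need a richer comparison than this sketch provides.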
Acknowledgements
This work was supported by the National Key Research and Development Program (2016QY06X1205, 2016YFB0800605), and the Technology Research and Development Program of Sichuan, China (18DYF2039, 17ZDYF2583).
Cite this article
Ullah, F., Wang, J., Farhan, M. et al. Plagiarism detection in students’ programming assignments based on semantics: multimedia e-learning based smart assessment methodology. Multimed Tools Appl 79, 8581–8598 (2020). https://doi.org/10.1007/s11042-018-5827-6
DOI: https://doi.org/10.1007/s11042-018-5827-6