An LSTM-Based Plagiarism Detection via Attention Mechanism and a Population-Based Approach for Pre-training Parameters with Imbalanced Classes
2021 (English). In: Lect. Notes Comput. Sci., Springer Science and Business Media Deutschland GmbH, 2021, p. 690-701
Conference paper, Published paper (Refereed)
Abstract [en]
Plagiarism is one of the leading problems in academic and industrial environments, and plagiarism detection aims to find similar items in a document or source code. This paper proposes an architecture based on Long Short-Term Memory (LSTM) and an attention mechanism, called LSTM-AM-ABC, boosted by a population-based approach for parameter initialization. Gradient-based optimization algorithms such as back-propagation (BP) are widely used in the literature for the learning process in LSTM, attention mechanism, and feed-forward neural network models, but they suffer from problems such as getting stuck in local optima. Population-based metaheuristic (PBMH) algorithms can be used to tackle this problem. To this end, this paper employs a PBMH algorithm, artificial bee colony (ABC), to mitigate it. The proposed algorithm finds the initial values for model learning in the LSTM, attention mechanism, and feed-forward neural network simultaneously. In other words, the ABC algorithm finds a promising starting point for the BP algorithm. For evaluation, we compare the proposed algorithm with both conventional and population-based methods. The results clearly show that the proposed method can provide competitive performance.
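The two-stage idea in the abstract (a population-based search picks the starting weights, then a gradient-based method refines them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a tiny logistic-regression model stands in for the LSTM, attention, and feed-forward layers, the imbalanced data are synthetic, and the names `abc_init` and `gradient_descent` are hypothetical. It is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss(w, data):
    # Mean binary cross-entropy; w[-1] is the bias term.
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def abc_init(objective, dim, n_food=10, limit=5, cycles=30, bound=1.0, seed=0):
    # Simplified artificial bee colony search for a low-cost starting point.
    rng = random.Random(seed)
    new_source = lambda: [rng.uniform(-bound, bound) for _ in range(dim)]
    foods = [new_source() for _ in range(n_food)]
    costs = [objective(f) for f in foods]
    trials = [0] * n_food
    best_i = min(range(n_food), key=costs.__getitem__)
    best, best_cost = foods[best_i][:], costs[best_i]

    def try_neighbor(i):
        # Perturb one coordinate relative to another random food source.
        k = rng.choice([j for j in range(n_food) if j != i])
        d = rng.randrange(dim)
        cand = foods[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (foods[i][d] - foods[k][d])
        cand[d] = max(-bound, min(bound, cand[d]))
        c = objective(cand)
        if c < costs[i]:
            foods[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_food):  # employed bees
            try_neighbor(i)
        fitness = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_food):  # onlooker bees favour fitter sources
            try_neighbor(rng.choices(range(n_food), weights=fitness)[0])
        i = min(range(n_food), key=costs.__getitem__)
        if costs[i] < best_cost:  # memorize the best source so far
            best, best_cost = foods[i][:], costs[i]
        for i in range(n_food):  # scouts replace exhausted sources
            if trials[i] > limit:
                foods[i] = new_source()
                costs[i] = objective(foods[i])
                trials[i] = 0
    return best


def gradient_descent(w, data, lr=0.3, steps=200):
    # Plain batch gradient descent on the cross-entropy loss.
    w = w[:]
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1]) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


# Synthetic imbalanced data (assumption): 90 negatives, 10 positives.
rng = random.Random(42)
data = [([rng.gauss(-1, 1), rng.gauss(-1, 1)], 0) for _ in range(90)]
data += [([rng.gauss(1, 1), rng.gauss(1, 1)], 1) for _ in range(10)]

w0 = abc_init(lambda w: loss(w, data), dim=3)  # stage 1: ABC picks the start
w_final = gradient_descent(w0, data)           # stage 2: gradient refinement
print(loss(w0, data), loss(w_final, data))
```

In this sketch the ABC stage only has to land in a good region of the weight space; the gradient stage does the fine adjustment, which mirrors the division of labour the abstract describes.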
Place, publisher, year, edition, pages
Springer Science and Business Media Deutschland GmbH, 2021. p. 690-701
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 13110 LNCS
Keywords [en]
Artificial bee colony, Attention mechanism, Back-propagation, LSTM, Plagiarism, Feedforward neural networks, Intellectual property, Learning algorithms, Optimization, Academic environment, Attention mechanisms, Back Propagation, Feed forward neural networks, Imbalanced class, Industrial environments, Meta-heuristics algorithms, Plagiarism detection, Pre-training, Training parameters, Long short-term memory
National Category
Energy Engineering
Identifiers
URN: urn:nbn:se:mdh:diva-56879
DOI: 10.1007/978-3-030-92238-2_57
Scopus ID: 2-s2.0-85121899875
ISBN: 9783030922375 (print)
OAI: oai:DiVA.org:mdh-56879
DiVA, id: diva2:1626798
Conference
28th International Conference on Neural Information Processing, ICONIP 2021
2022-01-12 2022-01-12 2022-03-14 Bibliographically approved