Machine Learning and Deep Learning Based Phishing Websites Detection: The Current Gaps and Next Directions

DOI:

https://doi.org/10.18488/76.v9i1.2983

Abstract

The literature offers many phishing website detection techniques, including white-listing, black-listing, visual-similarity, and heuristic-based approaches, among others. However, the ability to detect zero-hour or newly designed phishing website attacks is inherent to machine learning and deep learning techniques. Regarding these techniques as a promising solution, researchers have made a great deal of effort to tackle this problem, which persists because attackers constantly devise novel strategies to exploit vulnerabilities or gaps in existing anti-phishing measures. This study rigorously reviews recent work on machine learning and deep learning based phishing website detection in order to uncover the root causes of the aforementioned problems and offer suitable solutions. The study followed set criteria to search for, download, and screen relevant studies, and then evaluated the studies selected under those criteria. The findings show that significant research gaps remain in the reviewed studies. These gaps mainly concern imbalanced dataset usage, improper selection of dataset source(s), unjustified choices of train-test dataset split ratio, scientific disputes over which website features to include or exclude, the lack of universal consensus on phishing website lifespans and on what defines a small dataset size, and run-time analysis issues. The study presents a summary of the comparative analysis performed on each reviewed work so that future researchers can use it as a structured guideline for developing novel solutions against phishing website attacks.
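Two of the gaps named in the abstract, imbalanced dataset usage and the choice of train-test split ratio, can be illustrated with a short sketch. The code below is not taken from the reviewed studies; the label counts, the 80/20 ratio, and the helper names `imbalance_ratio` and `stratified_split` are assumptions made only for illustration. It measures how skewed a labelled set is and then splits it so that both parts keep the same class proportions.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return the majority-class count divided by the minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps its share in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)  # shuffle within the class before cutting
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels: 1 = phishing, 0 = legitimate (made-up counts).
labels = [1] * 20 + [0] * 80
ratio = imbalance_ratio(labels)
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

Reporting the imbalance ratio alongside the chosen split ratio is one way a study could make these two choices explicit; which ratio is appropriate remains a judgment the abstract identifies as unsettled.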

Keywords:

Deep learning, Machine learning, Phishing website attack, Phishing website detection, Anti-phishing website, Legitimate website, Phishing website datasets, Phishing website features.

Published

2022-05-06

How to Cite

Adane, K., & Beyene, B. (2022). Machine Learning and Deep Learning Based Phishing Websites Detection: The Current Gaps and Next Directions. Review of Computer Engineering Research, 9(1), 13–29. https://doi.org/10.18488/76.v9i1.2983

Section

Articles