
Phishing email detection using temporal behavioral modeling and transformer architectures

Authors: Hien Dang Thi, Duyen Tran Thi
This study addresses the detection of phishing email campaigns, which increasingly unfold as sophisticated sequences of related messages. The objective is a detection approach that analyzes individual emails and also captures behavioral patterns across sequences of related messages. To this end, a hybrid deep learning framework for multi-email sequence analysis is proposed. The model uses DistilBERT to extract semantic representations from email content and a Bidirectional Long Short-Term Memory (BiLSTM) network to model temporal dependencies within consecutive email streams. The training dataset aggregates four publicly available phishing and spam corpora (CEAS_08, Nazario, Nigerian Fraud, and SpamAssassin), yielding a cleaned dataset of 46,616 emails spanning 2000 to 2022. In addition, two heuristic scoring metrics, Urgency_score and Suspicious_score, are introduced to quantify latent phishing-related cues commonly observed in malicious emails. The proposed framework achieves an accuracy of 99.36% and an AUC-ROC of 0.9991 on the validation set, outperforming several baseline approaches. Ablation experiments verify the contribution of each model component, and a sensitivity analysis provides empirical justification for the selected sequence window size.
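The abstract does not say how Urgency_score and Suspicious_score are computed, nor how consecutive emails are grouped before they reach the BiLSTM. The following is only a minimal sketch of one plausible reading, assuming keyword-based cue counting and fixed-size overlapping windows. The cue lists, the function names `urgency_score`, `suspicious_score`, and `sliding_windows`, and the normalization are all hypothetical and are not taken from the paper.

```python
import re
from typing import List, Sequence

# Hypothetical cue lists: the paper's actual lexicons are not given in the abstract.
URGENCY_TERMS = ("urgent", "immediately", "act now", "expires", "within 24 hours")
SUSPICIOUS_TERMS = ("verify your account", "password", "click here", "wire transfer")
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def urgency_score(text: str) -> float:
    """Fraction of the (assumed) urgency cues that appear in the text, 0.0 to 1.0."""
    lowered = text.lower()
    hits = sum(1 for term in URGENCY_TERMS if term in lowered)
    return hits / len(URGENCY_TERMS)


def suspicious_score(text: str) -> float:
    """Fraction of the (assumed) suspicious cues present; any URL counts as one extra cue."""
    lowered = text.lower()
    hits = sum(1 for term in SUSPICIOUS_TERMS if term in lowered)
    if URL_PATTERN.search(text):
        hits += 1
    return hits / (len(SUSPICIOUS_TERMS) + 1)


def sliding_windows(items: Sequence[str], size: int) -> List[List[str]]:
    """Overlapping windows of `size` consecutive emails, preserving order.

    Returns an empty list when fewer than `size` emails are available.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(items[i:i + size]) for i in range(len(items) - size + 1)]


if __name__ == "__main__":
    stream = [
        "Hello, your invoice is attached.",
        "URGENT: act now to keep your mailbox.",
        "Click here http://example.test to verify your account.",
        "Final notice: access expires immediately.",
    ]
    for window in sliding_windows(stream, size=3):
        # Per-email heuristic features that could accompany the text embeddings.
        features = [(urgency_score(m), suspicious_score(m)) for m in window]
        print(features)
```

In a pipeline of the kind the abstract describes, each window would be one input sequence: every email is embedded (by DistilBERT in the paper), the two scores could be appended as extra features, and the sequence model consumes the window in order. The window size is the hyperparameter that the sensitivity analysis in the abstract refers to.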

Vinh University Journal of Science

Tạp chí khoa học Trường Đại học Vinh

ISSN: 1859 - 2228

Governing body: Vinh University

  • Address: 182 Le Duan - Vinh City - Nghe An province
  • Phone: (+84) 238.3855.452 - Fax: (+84) 238.3855.269
  • Email: vinhuni@vinhuni.edu.vn
  • Website: https://vinhuni.edu.vn

 

License: 163/GP-BTTTT issued by the Minister of Information and Communications on May 10, 2023

Open Access License: Creative Commons CC BY NC 4.0

 

CONTACT

Editor-in-Chief: Assoc. Prof., Dr. Tran Ba Tien
Email: tientb@vinhuni.edu.vn

Deputy editor-in-chief: Assoc. Prof., Dr. Phan Van Tien
Email: vantienkxd@vinhuni.edu.vn

Sub-Editor: Dr. Do Mai Trang
Email: domaitrang@vinhuni.edu.vn

Editorial assistants: MSc. Le Tuan Dung, MSc. Phan The Hoa, MSc. Pham Thi Quynh Nga, MSc. Tran Thi Thai

  • Address: 4th Floor, Executive Building, No. 182, Le Duan street, Vinh city, Nghe An province.
  • Phone: (+84) 238-385-6700 | Hotline: (+84) 97-385-6700
  • Email: editors@vujs.vn
  • Website: https://vujs.vn
