This study addresses the detection of phishing email campaigns, which increasingly unfold as sequences of related messages rather than as isolated emails. The objective is a detection approach that analyzes individual emails and also captures behavioral patterns across sequences of related messages. To this end, a hybrid deep learning framework is proposed for multi-email sequence analysis. The model uses DistilBERT to extract semantic representations from email content and a Bidirectional Long Short-Term Memory (BiLSTM) network to model temporal dependencies across consecutive emails. The training dataset aggregates four publicly available phishing and spam corpora (CEAS_08, Nazario, Nigerian Fraud, and SpamAssassin), yielding a cleaned dataset of 46,616 emails spanning 2000 to 2022. In addition, two heuristic scoring metrics, Urgency_score and Suspicious_score, are introduced to quantify latent phishing-related cues commonly observed in malicious emails. The proposed framework achieves an accuracy of 99.36% and an AUC-ROC of 0.9991 on the validation set, outperforming several baseline approaches. Ablation experiments verify the contribution of each model component, and a sensitivity analysis provides empirical justification for the selected sequence window size.
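
As a rough illustration of the two preprocessing ideas mentioned above (a heuristic cue score and a fixed-size window over consecutive emails), the following minimal Python sketch may help. It is not the authors' implementation: the cue list `URGENCY_CUES`, the token-ratio formula in `urgency_score`, and the stride-1 `sliding_windows` helper are all assumptions made for illustration, since the abstract does not define how Urgency_score is computed or how sequences are formed.

```python
import re
from typing import List

# Hypothetical cue words; the actual Urgency_score definition is not
# given in the abstract, so this list is an assumption.
URGENCY_CUES = ("urgent", "immediately", "verify", "suspended", "expires")


def urgency_score(text: str) -> float:
    """Return the fraction of tokens that are cue words (illustrative only)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in URGENCY_CUES)
    return hits / len(tokens)


def sliding_windows(items: List[str], window: int) -> List[List[str]]:
    """Return overlapping windows of `window` consecutive items (stride 1).

    Streams shorter than `window` yield an empty list.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [items[i:i + window] for i in range(len(items) - window + 1)]


if __name__ == "__main__":
    # A toy, chronologically ordered email stream (made-up text).
    stream = [
        "Your invoice is attached",
        "Urgent: verify your account immediately",
        "Final notice, your access expires today",
        "Meeting moved to Thursday",
    ]
    for seq in sliding_windows(stream, window=3):
        print([round(urgency_score(email), 2) for email in seq])
```

In a pipeline of the kind described, each window would presumably be encoded email by email (for example with DistilBERT) and passed to the sequence model, with the heuristic scores appended as extra features; that wiring is likewise an assumption here.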