Thi Thanh Thuy Pham

Work place: Academy of People Security, 125 Tran Phu Street, Ha Dong District, 12100, Ha Noi, Vietnam

E-mail: thanh-thuy.pham@mica.edu.vn

Website: https://orcid.org/0000-0003-3985-3599

Research Interests: Applied computer science, Computer systems and computational processes, Computer Architecture and Organization, Computer Graphics and Visualization, Information Security, Network Security

Biography

Thi Thanh Thuy Pham is a main lecturer at the Faculty of Information Security, Academy of People Security, Hanoi, Vietnam. She received an Engineering degree in Cryptography Techniques from the Academy of Cryptography Techniques, Vietnam, and a master’s degree in Communication and Information Processing and a PhD degree in Computer Science from Hanoi University of Science and Technology. Her research interests are machine learning and deep learning applied to network and information security, and computer vision.

Author Articles
Evaluation of GAN-based Models for Phishing URL Classifiers

By Thi Thanh Thuy Pham, Tuan Dung Pham, Viet Cuong Ta

DOI: https://doi.org/10.5815/ijcnis.2023.02.01, Pub. Date: 8 Apr. 2023

Phishing attacks that use malicious URLs/web links are common nowadays. User data, such as login credentials and credit card numbers, can be stolen when users carelessly click on these links. Moreover, this can lead to the installation of malware on the target systems to freeze their activities, perform ransomware attacks, or reveal sensitive information. Recently, GAN-based models have attracted attention for anti-phishing URL tasks. The general idea is to use a Generator network (G) to generate fake URL strings and a Discriminator network (D) to distinguish the real URL samples from the fake ones. G and D are trained adversarially, so that the URL samples synthesized by G become more and more similar to the real ones. From the perspective of cybersecurity defense, the GAN approach can be exploited to obtain D as a phishing URL detector or classifier: after training a GAN on both malicious and benign URL strings, a strong classifier/detector D can be achieved. From the perspective of cyberattack, attackers would like to create fake URLs that are as close to the real ones as possible to perform phishing attacks, which makes it easier for them to fool users and detectors. In related proposals, GAN-based models are mainly exploited for anti-phishing URL detection. There have been no evaluations specific to the GAN-generated fake URLs, which an attacker could use for phishing attacks. In this work, we propose to use TLD (Top-level Domain) and SSIM (Structural Similarity Index Score) scores to evaluate the GAN-synthesized URL strings in terms of their structural similarity to the real ones. The more similar the structure of the GAN-generated URLs is to the real ones, the more likely they are to fool the classifiers. Different GAN models, from the basic GAN to the GAN extensions DCGAN, WGAN, and SeqGAN, are explored in this work. Our intensive experiments show that the D classifiers of the basic GAN and DCGAN surpass those of WGAN and SeqGAN. The fake URL patterns generated by SeqGAN are the most effective among the GAN models, both in structural similarity and in the ability to deceive the LSTM (Long Short Term Memory) and RF (Random Forest) phishing URL classifiers.
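
The abstract does not define how the TLD score is computed. As a rough illustration only, one plausible reading is the fraction of generated strings whose host ends in a recognized top-level domain. The sketch below assumes that reading; the names `KNOWN_TLDS`, `has_known_tld`, and `tld_score` are hypothetical and the tiny TLD set is a stand-in for a complete list.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny allow-list used only for illustration.
# A realistic check would load a complete list of top-level domains.
KNOWN_TLDS = {"com", "org", "net", "edu", "vn"}


def has_known_tld(url: str) -> bool:
    """Return True if the URL's host has at least two labels and a known TLD."""
    # urlsplit only populates the host when a scheme separator is present.
    if "://" not in url:
        url = "http://" + url
    try:
        host = urlsplit(url).hostname or ""
    except ValueError:
        # Malformed strings (e.g. unbalanced brackets) count as invalid.
        return False
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and labels[-1].lower() in KNOWN_TLDS


def tld_score(urls) -> float:
    """Fraction of strings with a recognized TLD (assumed reading of 'TLD score')."""
    urls = list(urls)
    if not urls:
        return 0.0
    return sum(has_known_tld(u) for u in urls) / len(urls)
```

Under this assumed definition, a generator whose outputs rarely end in a recognizable TLD would score near 0, while one that reproduces realistic host structure would score near 1. The paper's actual scoring procedure may differ.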
