Behind the Bait: Delving into PhishTank's hidden data
School of Software, Northwestern Polytechnical University, Shaanxi, Xian, 710072, China.
School of Software, Tsinghua University, Beijing, China.
Department of Computer Science, School of Physics, Engineering & Computer Science, University of Hertfordshire, Hatfield, United Kingdom.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems. ORCID iD: 0000-0003-0611-2655
2024 (English). In: Data in Brief, E-ISSN 2352-3409, Vol. 52, article id 109959. Article in journal (Refereed), Published
Abstract [en]

Phishing constitutes a form of social engineering that aims to deceive individuals through email communication. Extensive prior research has underscored phishing as one of the most commonly employed attack vectors for infiltrating organizational networks. A prevalent method involves misleading the target by employing phishing URLs concealed through hyperlink strategies. PhishTank, a website employing the concept of crowd-sourcing, aggregates phishing URLs and subsequently verifies their authenticity. In the course of this study, we leveraged a Python script to extract data from the PhishTank website, amassing a comprehensive dataset comprising over 190,0000 phishing URLs. This dataset is a valuable resource that can be harnessed by both researchers and practitioners for enhancing phishing filters, fortifying firewalls, security education, and refining training and testing models, among other applications.
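
As an illustration of how a dataset of this kind might be used to build a phishing filter, the following is a minimal sketch and not the authors' extraction script. It assumes the collected records are available as CSV text; the column names (phish_id, url, verified, online, target), the sample rows, and the helper names load_verified_urls and hostnames are hypothetical and chosen only for this example.

```python
import csv
import io
from urllib.parse import urlsplit

# Hypothetical sample in the shape of a PhishTank-style CSV export.
# Column names and rows are assumptions made for illustration only.
SAMPLE_CSV = """phish_id,url,verified,online,target
1001,http://login.example-bank.test/verify,yes,yes,Other
1002,http://login.example-bank.test/verify,yes,yes,Other
1003,https://secure.example-pay.test/update?id=7,yes,no,Other
1004,http://unreviewed.example.test/,no,yes,Other
"""


def load_verified_urls(csv_text):
    """Return unique URLs from verified rows, preserving first-seen order."""
    seen = set()
    urls = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["verified"].strip().lower() != "yes":
            continue  # skip submissions that were not confirmed as phishing
        url = row["url"].strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def hostnames(urls):
    """Return the sorted set of lowercase hostnames, e.g. for a blocklist."""
    hosts = {urlsplit(u).hostname for u in urls}
    hosts.discard(None)  # URLs without a parsable host are ignored
    return sorted(hosts)


verified = load_verified_urls(SAMPLE_CSV)
blocklist = hostnames(verified)
```

Deduplicating on the full URL while blocking on the hostname is one possible design choice; a real filter might instead match on registered domains or URL patterns.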

Place, publisher, year, edition, pages
Elsevier Inc., 2024. Vol. 52, article id 109959
Keywords [en]
Artificial intelligence, Computer security, Dataset, Email security, Phished URL, Social engineering, Web security, Application programs, Computer crime, Electronic mail, Hypertext systems, Security of data, Statistical tests, Attack vector, E-mails security, Email communication, Hyperlinks, Organizational network, Phishing, Websites
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-65239
DOI: 10.1016/j.dib.2023.109959
ISI: 001142588900001
Scopus ID: 2-s2.0-85180539147
OAI: oai:DiVA.org:mdh-65239
DiVA, id: diva2:1824004
Available from: 2024-01-03 Created: 2024-01-03 Last updated: 2024-01-31 Bibliographically approved


Authority records

Afzal, Wasif
