mdh.se Publications
Using Artificial Intelligence to Verify Authorship of Anonymous Social Media Posts
Mälardalen University, School of Innovation, Design and Engineering.
2017 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

The widespread use of social media, the ease of concealing one's identity amid ubiquitous technology, and the digitization of crime and terrorism have increased the need for ways to find out who hides behind an anonymous alias. This report deals with authorship verification of posts written on Twitter, with the purpose of investigating whether it is possible to develop an auxiliary tool that can be used in criminal investigations. The main research question is whether a set of tweets written by an anonymous user can be matched to another set of tweets written by a known user and whether, based on their linguistic styles, it is possible to calculate a probability that the authors are the same. The report also examines how linguistic styles can be extracted for use in an artificially intelligent classifier, and how much data is needed to get adequate results. The subject is of interest because the work concerns a potential future scenario in which digital crimes are difficult to investigate with traditional network-based tracking techniques. The approach is to evaluate traditional methods of feature extraction in natural language processing and to classify the extracted features using a type of recurrent neural network called Long Short-Term Memory. While the best result in the experiments carried out was an accuracy of 93.32%, the overall results showed that the choice of representation and the amount of data used are crucial. This thesis complements existing knowledge by focusing on very short texts in the form of social media posts.
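The verification idea in the abstract (compare the linguistic style of an anonymous set of tweets with that of a known author) can be illustrated with a minimal sketch. This is not the thesis's method: the thesis classifies extracted features with a Long Short-Term Memory network, whereas the sketch below swaps in a much simpler baseline, character trigram counts compared by cosine similarity. The function names, the choice of trigrams, and the sample tweets are all assumptions made for illustration, and the similarity score is not a calibrated probability.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lower-cased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def style_profile(tweets, n=3):
    """Sum the n-gram counts of every tweet into one profile per author."""
    profile = Counter()
    for tweet in tweets:
        profile.update(char_ngrams(tweet, n))
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors, in [0, 1]."""
    # Counter returns 0 for missing keys, so unseen n-grams contribute nothing.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample data, invented for this sketch.
known = ["cant wait for the match tonight!!", "the match was great!!"]
anonymous = ["what a match tonight!!"]
score = cosine_similarity(style_profile(known), style_profile(anonymous))
```

A higher score means the two sets share more character-level habits. With texts as short as tweets, such a score is noisy, which is consistent with the abstract's remark that the choice of representation and the amount of data are crucial.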

Place, publisher, year, edition, pages
2017. 43 p.
National Category
Computer Science
Identifiers
URN: urn:nbn:se:mdh:diva-35551
OAI: oai:DiVA.org:mdh-35551
DiVA: diva2:1105458
Subject / course
Computer Science
Presentation
2017-06-02, 11:00 (English)
Available from: 2017-06-19. Created: 2017-06-04. Last updated: 2017-06-19. Bibliographically approved.

Open Access in DiVA

fulltext (3060 kB), 35 downloads
File information
File name: FULLTEXT01.pdf
File size: 3060 kB
Checksum (SHA-512): 7fa6e48f8c79c79a5c8a2a2213bc09634938c4b3559b5a22745e39d02973348961cf4b0421031365d349bea6cc39ae70141042ee55476ee168b4796322d595d2
Type: fulltext
Mimetype: application/pdf

By organisation
School of Innovation, Design and Engineering
Computer Science

Total: 35 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 99 hits