INESC TEC uses statistical and psycholinguistic indicators to detect fake news

In early May, the European Commission recorded more than 2,700 articles a day containing fake news about covid-19 on social networks, including false or misleading posts. Commission officials even called misinformation the “disease of the century”.

26th May 2020

In response, INESC TEC’s Centre for Research in Advanced Computing Systems (CRACS) is developing the project “Detecting Fake News Automatically”. The project will support everyday users, journalists in particular, in analysing and identifying information that is most likely false, as well as in filtering the most relevant content on social networks.

“The system extracts different information from the posts, particularly evidence considered relevant. The indicators (more than 100) can be psycholinguistic (for instance, identifying the predominant emotion in the text) or statistical (the frequency of verbs, adjectives or references to entities). They are then passed to a machine learning model that has learned to draw distinctions from previously known cases already flagged as fake news. Based on this training, the model classifies each new post according to its likelihood of being fake news”, explained Álvaro Figueira, INESC TEC researcher and professor at the Department of Computer Science of the Faculty of Sciences of the University of Porto (FCUP).
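The pipeline described above can be illustrated with a minimal Python sketch. It is not the CRACS system: the three indicators, the `FEAR_WORDS` lexicon, the training posts and the small logistic regression are all assumptions made for illustration, standing in for the 100-plus indicators and the model the researchers actually use.

```python
import math
import re

# Hypothetical mini-lexicon standing in for a real psycholinguistic resource.
FEAR_WORDS = {"deadly", "panic", "danger", "secret", "shocking"}


def extract_indicators(text):
    """Turn a post into a short vector of illustrative indicators."""
    words = re.findall(r"[A-Za-z']+", text)
    n = max(len(words), 1)  # avoid division by zero on empty posts
    return [
        sum(w.lower() in FEAR_WORDS for w in words) / n,     # share of fear-related words
        sum(w.isupper() and len(w) > 1 for w in words) / n,  # share of all-caps words
        text.count("!") / n,                                 # exclamation marks per word
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=500, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the score
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def fake_probability(text, weights, bias):
    """Estimated likelihood (0 to 1) that a new post is fake news."""
    x = extract_indicators(text)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Made-up labelled posts: 1 = previously flagged as fake, 0 = not flagged.
posts = [
    ("SHOCKING secret cure!!! Doctors in panic!", 1),
    ("Deadly danger HIDDEN from you! Share now!!", 1),
    ("The health ministry published updated vaccination figures today.", 0),
    ("Researchers reported results from a clinical trial this week.", 0),
]
weights, bias = train([extract_indicators(t) for t, _ in posts],
                      [label for _, label in posts])

suspicious = fake_probability("PANIC! Secret deadly virus spreading!!", weights, bias)
neutral = fake_probability("The council approved the budget on Tuesday.", weights, bias)
```

With this toy data, the sensational post receives a higher score than the neutral one. A real detector would need far more indicators and a much larger set of labelled cases.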

Using data mining, machine learning, natural language processing, named entity recognition and emotion analysis, among other techniques, the researchers hope that the solution will offer a greater degree of security and help ensure the accuracy of content on social networks.

“Our goal is to develop a system that draws on the written message of each post and all the information associated with it, i.e. likes, shares and comments, as well as information about the user who published it. We believe that this additional information, combined with the message itself, helps increase confidence in the classification given by the system”, added the researcher.
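As a rough illustration of how the message, its engagement data and its author could feed a single classifier, the Python sketch below concatenates three groups of indicators into one feature vector. The `Post` fields, the function names and the chosen indicators are hypothetical; the article does not specify how the project encodes this information.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record of a social media post and its context."""
    text: str
    likes: int
    shares: int
    comments: int
    author_followers: int
    author_verified: bool


def text_indicators(post):
    words = post.text.split()
    n = max(len(words), 1)
    return [float(len(words)), post.text.count("!") / n]


def engagement_indicators(post):
    # log1p compresses heavy-tailed counts so viral posts do not dominate.
    return [
        math.log1p(post.likes),
        math.log1p(post.shares),
        math.log1p(post.comments),
        post.shares / (post.likes + 1),  # shares relative to likes
    ]


def author_indicators(post):
    return [
        math.log1p(post.author_followers),
        1.0 if post.author_verified else 0.0,
    ]


def feature_vector(post):
    """Concatenate message, engagement and author indicators."""
    return text_indicators(post) + engagement_indicators(post) + author_indicators(post)


example = Post("Miracle cure found!", likes=0, shares=9, comments=0,
               author_followers=0, author_verified=False)
vector = feature_vector(example)
```

The resulting vector can be passed to any standard classifier, so the model weighs the wording of a post together with how it spread and who published it.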

Detecting fake news in the context of covid-19

This project grew out of the REMINDS project, which focused on creating a system capable of automatically detecting which social media posts (on Facebook and Twitter) are most relevant to the general public, according to journalistic criteria. That work became particularly relevant after the 2016 US elections, when the issue of fake news escalated, leading tech companies and the scientific community to work on a solution. The current pandemic introduces new variables, and the system must be adapted to these new challenges.

“One of the main challenges that the project faces is the change of domain and chronological context in which fake news can occur. For example, fake news in a political context has some textual and lexical properties that are different from fake news about health issues. Therefore, trying to develop a system that is capable of capturing this type of diversity has been a challenging task. The covid-19 pandemic, having a specific domain and context, has been a very interesting case study in the fake news universe. We believe that the system will be able to adapt and detect fake news in any domain, thus contributing to the moderation of this type of content on social networks”, concluded Álvaro Figueira.

Within the scope of this project, INESC TEC researcher Nuno Guimarães is working on a PhD thesis at FCUP entitled “Analyzing and Developing Veracity Indicators for Building an Automatic Detector of Fake News Online”, supervised by INESC TEC researchers Álvaro Figueira and Luís Torgo.

The INESC TEC researcher mentioned in this news piece is associated with UP-FCUP and INESC TEC.