Publications

Publications by CRACS

2020

Identifying journalistically relevant social media texts using human and automatic methodologies

Authors
Guimaraes, N; Miranda, F; Figueira, A;

Publication
INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING

Abstract
Social networks have provided the means for constant connectivity and fast information dissemination. In addition, real-time posting allows a new form of citizen journalism, where users can report events from a witness perspective. Therefore, information propagates through the network at a faster pace than traditional media reports it. However, relevant information is a small percentage of all the content shared. Our goal is to develop and evaluate models that can automatically detect journalistic relevance. To do so, we need solid and reliable ground truth data with a significantly large quantity of annotated posts, so that the models can learn to detect relevance across the whole spectrum. In this article, we present and compare two different methodologies: an automatic and a human approach. Results on a test data set labelled by experts show that the models trained with the automatic methodology tend to perform better than the ones trained using human-annotated data.

2020

Contribution of Social Tagging to Clustering Effectiveness Using as Interpretant the User's Community

Authors
Cunha, E; Figueira, A;

Publication
TRENDS AND INNOVATIONS IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1

Abstract
In this article we discuss how social tagging can be used to improve the methodology used for clustering evaluation. We analyze the impact of the integration of tags in the clustering process and its effectiveness. Following semiotic theory, the very nature of tags invites reflection on which ones should be considered, depending on the interpretant (community of users, or tag writer). Using a case with the community of users as the interpretant, our novel clustering algorithm (k-C), which is based on community detection on a network of tags, was compared with the standard k-means algorithm. The results indicate that the k-C algorithm created more effective clusters. © 2020, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG.

2020

Analysis and Detection of Unreliable Users in Twitter: Two Case Studies

Authors
Guimaraes, N; Figueira, A; Torgo, L;

Publication
Communications in Computer and Information Science

Abstract
The emergence of online social networks provided users with an easy way to publish and disseminate content, reaching broader audiences than previous platforms (such as blogs or personal websites) allowed. However, malicious users started to take advantage of these features to disseminate unreliable content through the network, such as false information, extremely biased opinions, or hate speech. Consequently, it becomes crucial to try to detect these users at an early stage to avoid the propagation of unreliable content in social networks’ ecosystems. In this work, we introduce a methodology to extract a large corpus of unreliable posts using Twitter and two databases of unreliable websites (OpenSources and Media Bias Fact Check). In addition, we present an analysis of the content and users that publish and share several types of unreliable content. Finally, we develop supervised models to classify a Twitter account according to its reliability. The experiments conducted using two different data sets show performance above 94% using Decision Trees as the learning algorithm. These experiments, although with some limitations, provide some encouraging results for future research on detecting unreliable accounts on social networks. © 2020, Springer Nature Switzerland AG.

2020

A METHODOLOGY TO ASSESS LEARNING PATTERNS IN ONLINE COURSES MEDIATED BY AN LMS

Authors
Figueira, A;

Publication
EDULEARN20 Proceedings

Abstract

2020

REPORT ON THE SELF-STUDY BEHAVIOR IN LEARNING FROM VIDEO LECTURES DURING A CONFINEMENT PERIOD

Authors
Figueira, A;

Publication
EDULEARN20 Proceedings

Abstract

2020

Knowledge-based Reliability Metrics for Social Media Accounts

Authors
Guimaraes, N; Figueira, A; Torgo, L;

Publication
PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES (WEBIST)

Abstract
The growth of social media as an information medium without restrictive measures on the creation of new accounts led to the rise of malicious agents with the intent to diffuse unreliable information in the network, ultimately affecting the perception of users in important topics such as political and health issues. Although the problem is being tackled within the domain of bot detection, the impact of studies in this area is still limited because 1) not all accounts that spread unreliable content are bots, 2) human-operated accounts are also responsible for the diffusion of unreliable information and 3) bot accounts are not always malicious (e.g. news aggregators). Also, most of these methods are based on supervised models that require annotated data and updates to maintain their performance through time. In this work, we build a framework and develop knowledge-based metrics to complement the current research in bot detection and characterize the impact and behavior of a Twitter account, independently of the way it is operated (human or bot). We proceed to analyze a sample of the accounts using the metrics proposed and evaluate the necessity of these metrics by comparing them with the scores from a bot detection system. The results show that the metrics can characterize different degrees of unreliable accounts, from unreliable bot accounts with a high number of followers to human-operated accounts that also spread unreliable content (but with less impact on the network). Furthermore, evaluating a sample of the accounts with a bot detection system showed that bots compose around 11% of the sample of unreliable accounts extracted and that the bot score is not correlated with the proposed metrics. In addition, the accounts that achieve the highest values in our metrics present different characteristics than the ones that achieve the highest bot score. This provides evidence of the usefulness of our metrics in the evaluation of unreliable accounts in social networks.