
Publications by CRACS

2020

Identifying journalistically relevant social media texts using human and automatic methodologies

Authors
Guimaraes, N; Miranda, F; Figueira, A;

Publication
INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING

Abstract
Social networks have provided the means for constant connectivity and fast information dissemination. In addition, real-time posting allows a new form of citizen journalism, where users can report events from a witness perspective. As a result, information propagates through the network at a faster pace than traditional media can report it. However, relevant information is only a small percentage of all the content shared. Our goal is to develop and evaluate models that can automatically detect journalistic relevance. To do so, we need solid and reliable ground-truth data with a significantly large quantity of annotated posts, so that the models can learn to detect relevance across the whole spectrum. In this article, we present and compare two different methodologies: an automatic and a human approach. Results on a test data set labelled by experts show that the models trained with the automatic methodology tend to perform better than the ones trained using human-annotated data.
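At its core, the task described in this abstract is supervised text classification: learn from labelled posts whether a new post is journalistically relevant. A minimal sketch of that idea, assuming a bag-of-words perceptron and entirely hypothetical toy posts (this is not the authors' models or data):

```python
from collections import Counter

# Toy sketch: a bag-of-words perceptron trained on hypothetical
# labelled posts (+1 = journalistically relevant, -1 = not relevant).
def tokenize(text):
    return text.lower().split()

def train_perceptron(posts, labels, epochs=10):
    weights = Counter()   # missing words score 0 by default
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(posts, labels):
            feats = Counter(tokenize(text))
            score = bias + sum(weights[w] * c for w, c in feats.items())
            if y * score <= 0:            # misclassified: nudge weights
                for word, c in feats.items():
                    weights[word] += y * c
                bias += y
    return weights, bias

def predict(weights, bias, text):
    feats = Counter(tokenize(text))
    score = bias + sum(weights[w] * c for w, c in feats.items())
    return 1 if score > 0 else -1

posts = [
    "breaking fire downtown several injured",
    "earthquake hits city buildings collapse",
    "my cat is so cute today",
    "what should i eat for lunch",
]
labels = [1, 1, -1, -1]
w, b = train_perceptron(posts, labels)
print(predict(w, b, "fire reported downtown"))  # prints 1 (relevant)
```

Real systems replace the perceptron with stronger classifiers and far larger annotated corpora, which is precisely where the choice between automatic and human labelling matters.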

2020

Contribution of Social Tagging to Clustering Effectiveness Using as Interpretant the User's Community

Authors
Cunha, E; Figueira, A;

Publication
TRENDS AND INNOVATIONS IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1

Abstract
In this article we discuss how social tagging can be used to improve the methodology used for clustering evaluation. We analyze the impact of integrating tags into the clustering process and its effectiveness. Following semiotic theory, the very nature of tags lets us reflect on which ones should be considered depending on the interpretant (the community of users, or the tag writer). Using a case with the community of users as the interpretant, our novel clustering algorithm (k-C), which is based on community detection on a network of tags, was compared with the standard k-means algorithm. The results indicate that the k-C algorithm created more effective clusters. © 2020, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG.
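To make the idea concrete, here is a toy sketch of clustering documents via a tag co-occurrence network. Connected components stand in for a proper community-detection step, and all documents and tags are invented; this is not the paper's k-C implementation:

```python
from collections import defaultdict
from itertools import combinations

# Illustrative sketch: cluster documents by the community their tags
# belong to in a tag co-occurrence graph. Connected components are a
# crude stand-in for real community detection.
def tag_communities(docs_tags):
    # Build an undirected co-occurrence graph over tags.
    adj = defaultdict(set)
    for tags in docs_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            adj[a].add(b)
            adj[b].add(a)
        for t in tags:
            adj.setdefault(t, set())   # keep tags with no co-occurrences
    # Connected components via depth-first search.
    seen, communities = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            t = stack.pop()
            if t in seen:
                continue
            seen.add(t)
            comp.add(t)
            stack.extend(adj[t] - seen)
        communities.append(comp)
    return communities

def cluster_docs(docs_tags):
    comms = tag_communities(docs_tags)
    clusters = defaultdict(list)
    for doc, tags in docs_tags.items():
        # Assign each document to the community sharing most of its tags.
        best = max(range(len(comms)), key=lambda i: len(comms[i] & set(tags)))
        clusters[best].append(doc)
    return dict(clusters)

docs = {
    "d1": ["python", "code", "software"],
    "d2": ["software", "testing"],
    "d3": ["cooking", "recipes"],
    "d4": ["recipes", "baking"],
}
print(cluster_docs(docs))  # d1/d2 group together, as do d3/d4
```

The choice of interpretant enters in which tags are kept when building the graph; here all tags are used for simplicity.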

2020

A METHODOLOGY TO ASSESS LEARNING PATTERNS IN ONLINE COURSES MEDIATED BY AN LMS

Authors
Figueira, A;

Publication
EDULEARN20 Proceedings

Abstract

2020

REPORT ON THE SELF-STUDY BEHAVIOR IN LEARNING FROM VIDEO LECTURES DURING A CONFINEMENT PERIOD

Authors
Figueira, A;

Publication
EDULEARN20 Proceedings

Abstract

2020

Knowledge-based Reliability Metrics for Social Media Accounts

Authors
Guimaraes, N; Figueira, A; Torgo, L;

Publication
PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES (WEBIST)

Abstract
The growth of social media as an information medium without restrictive measures on the creation of new accounts has led to the rise of malicious agents with the intent to diffuse unreliable information in the network, ultimately affecting users' perception of important topics such as political and health issues. Although the problem is being tackled within the domain of bot detection, the impact of studies in this area is still limited because 1) not all accounts that spread unreliable content are bots, 2) human-operated accounts are also responsible for the diffusion of unreliable information, and 3) bot accounts are not always malicious (e.g. news aggregators). In addition, most of these methods are based on supervised models that require annotated data and updates to maintain their performance over time. In this work, we build a framework and develop knowledge-based metrics to complement current research in bot detection and to characterize the impact and behavior of a Twitter account, independently of the way it is operated (by a human or a bot). We then analyze a sample of accounts using the proposed metrics and evaluate the necessity of these metrics by comparing them with the scores from a bot detection system. The results show that the metrics can characterize different degrees of unreliable accounts, from unreliable bot accounts with a high number of followers to human-operated accounts that also spread unreliable content (but with less impact on the network). Furthermore, evaluating a sample of the accounts with a bot detection system showed that bots make up around 11% of the sample of unreliable accounts extracted and that the bot score is not correlated with the proposed metrics. In addition, the accounts that achieve the highest values on our metrics present different characteristics from the ones that achieve the highest bot score. This provides evidence of the usefulness of our metrics for evaluating unreliable accounts in social networks.
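As a rough illustration of the shape a knowledge-based reliability metric can take (this is not one of the paper's metrics; the domain list and the weighting are invented for the example), one can combine the share of an account's posts linking to known-unreliable domains with a dampened audience factor:

```python
import math

# Hypothetical list of unreliable domains, for illustration only.
UNRELIABLE_DOMAINS = {"fake-news.example", "hoax.example"}

def unreliability_score(posted_domains, followers):
    """Share of posts linking to unreliable domains, scaled by reach."""
    if not posted_domains:
        return 0.0
    share = sum(d in UNRELIABLE_DOMAINS for d in posted_domains) / len(posted_domains)
    reach = math.log10(1 + followers)   # dampen raw follower counts
    return share * reach

# Two of three links unreliable and ~10k followers yield a high score;
# the same share with few followers would score much lower.
print(unreliability_score(
    ["fake-news.example", "news.example", "fake-news.example"], 9999))
```

A metric of this form is independent of whether the account is human- or bot-operated, which is the point the abstract makes against relying on bot detection alone.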

2020

FOCAS: Penalising friendly citations to improve author ranking

Authors
Silva, J; Aparicio, D; Ribeiro, P; Silva, F;

Publication
PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20)

Abstract
Scientific impact is commonly associated with the number of citations received. However, an author can easily boost their own citation count by (i) publishing articles that cite their own previous work (self-citations), (ii) having co-authors cite their work (co-author citations), or (iii) exchanging citations with authors from other research groups (reciprocated citations). Even though these friendly citations inflate an author's perceived scientific impact, author ranking algorithms do not normally address them; at most, they remove self-citations. Here we present Friends-Only Citations AnalySer (FOCAS), a method that identifies friendly citations and reduces their negative effect on author ranking algorithms. FOCAS combines the author citation network with the co-authorship network in order to measure author proximity and penalises citations between friendly authors. FOCAS is general and can be regarded as an independent module applied while running any PageRank-like author ranking algorithm. FOCAS can be tuned to use three different criteria, namely authors' distance, citation frequency, and citation recency, or combinations of these. We evaluate and compare FOCAS against eight state-of-the-art author ranking algorithms, comparing their rankings with a ground truth of best-paper awards. We test our hypothesis on a citation and co-authorship network comprising seven top Information Retrieval conferences. We observed that FOCAS improved author rankings by 25% on average and, in one case, led to a gain of 46%.
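The penalisation idea can be sketched as a weighted PageRank in which a citation's edge weight is scaled down when the citing and cited authors are identical or are co-authors. The penalty value, the toy graphs, and the function name below are assumptions for illustration, not the FOCAS specification:

```python
# citations: (citing_author, cited_author) pairs; coauthors: unordered
# pairs that have co-published. Both graphs are invented for the example.
def focas_like_rank(citations, coauthors, damping=0.85, penalty=0.2, iters=100):
    authors = sorted({a for pair in citations for a in pair})
    n = len(authors)
    # Weighted out-edges: friendly (self or co-author) citations are
    # down-weighted by `penalty` instead of counting fully.
    out = {a: {} for a in authors}
    for src, dst in citations:
        w = penalty if (src == dst or frozenset((src, dst)) in coauthors) else 1.0
        out[src][dst] = out[src].get(dst, 0.0) + w
    rank = {a: 1.0 / n for a in authors}
    for _ in range(iters):
        new = {a: (1.0 - damping) / n for a in authors}
        for src, targets in out.items():
            total = sum(targets.values())
            if total == 0.0:              # dangling author: spread uniformly
                for a in authors:
                    new[a] += damping * rank[src] / n
            else:
                for dst, w in targets.items():
                    new[dst] += damping * rank[src] * w / total
        rank = new
    return rank

# A and B are co-authors citing each other; C is cited independently.
citations = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A"),
             ("D", "C"), ("A", "C")]
coauthors = {frozenset(("A", "B"))}
rank = focas_like_rank(citations, coauthors)
print(max(rank, key=rank.get))  # prints C: independent citations win
```

With `penalty=1.0` the reciprocal A–B citations would dominate, illustrating why unweighted PageRank rewards friendly citation patterns.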
