Publications

Publications by CRACS

2017

A LEARNING AND SOCIAL MANAGEMENT SYSTEM – VERSION 3.0

Authors
Figueira, A; Oliveira, L;

Publication
INTED2017 Proceedings

Abstract

2017

Detecting Journalistic Relevance on Social Media: A two-case study using automatic surrogate features

Authors
Figueira, A; Guimarães, N;

Publication
Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017, Sydney, Australia, July 31 - August 03, 2017

Abstract

2017

Journalistic Relevance Classification in Social Network Messages: an Exploratory Approach

Authors
Sandim, M; Fortuna, P; Figueira, A; Oliveira, L;

Publication
COMPLEX NETWORKS & THEIR APPLICATIONS V

Abstract
Social networks are becoming a wide repository of information, some of which may be of interest to general audiences. In this study we investigate which features can be extracted from a single post propagated through a social network and are indicative of its relevance from a journalistic perspective. We then test these features with a set of supervised learning algorithms in order to evaluate our hypothesis. The main results indicate that if a text fragment is pointed out as being interesting, meaningful for the majority of people, reliable and with a wide scope, then it is more likely to be considered relevant. This approach also presents promising results when validated with several well-known learning algorithms.

2017

Predicting the Relevance of Social Media Posts Based on Linguistic Features and Journalistic Criteria

Authors
Pinto, A; Oliveira, HG; Figueira, A; Alves, AO;

Publication
NEW GENERATION COMPUTING

Abstract
An overwhelming quantity of messages is posted in social networks every minute. To make the utilization of these platforms more productive, it is imperative to filter out information that is irrelevant to the general audience, such as private messages, personal opinions or well-known facts. This work is focused on the automatic classification of public social text according to its potential relevance, from a journalistic point of view, hopefully improving the overall experience of using a social network. Our experiments were based on a set of posts with several criteria, including the journalistic relevance, assessed by human judges. To predict the latter, we rely exclusively on linguistic features, extracted by Natural Language Processing tools, regardless of the author of the message and the author's profile information. In our first approach, different classifiers and feature engineering methods were used to predict relevance directly from the selected features. In a second approach, relevance was predicted indirectly, based on an ensemble of classifiers for other key criteria when defining relevance (controversy, interestingness, meaningfulness, novelty, reliability and scope), also in the dataset. The first approach achieved an F1-score of 0.76 and an Area under the ROC curve (AUC) of 0.63. But the best results were achieved by the second approach, with the best learned model achieving an F1-score of 0.84 with an AUC of 0.78. This confirmed that journalistic relevance can indeed be predicted by the combination of the selected criteria, and that linguistic features can be exploited to classify the latter.
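
For readers unfamiliar with the F1-score reported in this abstract, the following is a minimal sketch of how that metric is commonly computed from a classifier's confusion counts. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from confusion counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 when precision and recall are both undefined or zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a "relevant" vs "not relevant" classifier.
example = f1_score(tp=8, fp=2, fn=2)
print(round(example, 2))
```

With equal precision and recall, as in the hypothetical counts above, the F1-score equals that shared value.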

2017

Communication and resource usage analysis in online environments: An integrated social network analysis and data mining perspective

Authors
Figueira, Alvaro;

Publication
2017 IEEE Global Engineering Education Conference, EDUCON 2017, Athens, Greece, April 25-28, 2017

Abstract

2017

The Complementary Nature of Different NLP Toolkits for Named Entity Recognition in Social Media

Authors
Batista, F; Figueira, A;

Publication
Progress in Artificial Intelligence - 18th EPIA Conference on Artificial Intelligence, EPIA 2017, Porto, Portugal, September 5-8, 2017, Proceedings

Abstract
In this paper we study the combined use of four different NLP toolkits (Stanford CoreNLP, GATE, OpenNLP and Twitter NLP tools) in the context of social media posts. Previous studies have shown performance comparisons between these tools, both on news and social media corpora. In this paper, we go further by trying to understand how differently these toolkits predict Named Entities, in terms of their precision and recall for three different entity types, and how they can complement each other in this task in order to achieve a combined performance superior to each individual one. Experiments on two publicly available datasets from the workshops WNUT-2015 and #MSM2013 show that using an ensemble of toolkits can improve the recognition of specific entity types: up to 10.62% for the entity type Person, 1.97% for the type Location and 1.31% for the type Organization, depending on the dataset and the criteria used for the voting. Our results also showed improvements of 3.76% and 1.69%, in each dataset respectively, on the average performance of the three entity types.
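
The ensemble voting mentioned in this abstract can be illustrated with a small sketch. It assumes each toolkit emits one label per token and that an entity label is kept only when enough toolkits agree. The function name, the `min_votes` threshold, the label set and the sample outputs are all hypothetical, and the paper's actual voting criteria may differ.

```python
from collections import Counter


def vote_entities(predictions, min_votes=2, outside="O"):
    """Combine per-token labels from several taggers by voting.

    predictions: list of label sequences, one per toolkit, same length.
    A token keeps its most frequent entity label only if at least
    min_votes toolkits proposed it; otherwise it is labelled `outside`.
    """
    combined = []
    for labels in zip(*predictions):
        # Count entity labels only; `outside` means "no entity".
        counts = Counter(label for label in labels if label != outside)
        if not counts:
            combined.append(outside)
            continue
        label, votes = counts.most_common(1)[0]
        combined.append(label if votes >= min_votes else outside)
    return combined


# Hypothetical outputs of four toolkits for "Obama visited Porto today".
toolkit_outputs = [
    ["PERSON", "O", "LOCATION", "O"],
    ["PERSON", "O", "O", "O"],
    ["O", "O", "LOCATION", "O"],
    ["PERSON", "O", "ORGANIZATION", "O"],
]
combined = vote_entities(toolkit_outputs)
print(combined)
```

Raising `min_votes` would be expected to favour precision over recall, since fewer entity labels survive the vote.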
