Details

  • Name

    João Cordeiro
  • Cluster

    Computer Science
  • Role

    Affiliated Researcher
  • Since

    1st December 2012
Publications

2019

Socialnetcrawler - Online social network crawler

Authors
Pais, S; Cordeiro, J; Martins, R; Albardeiro, M;

Publication
11th International Conference on Management of Digital EcoSystems, MEDES 2019

Abstract
The emergence and popularization of online social networks suddenly made available a large amount of data from social organization, interaction and human behavior. All this information opens new perspectives and challenges to the study of social systems, being of interest to many fields. Although most online social networks are recent, a vast number of scientific papers have already been published on this topic, dealing with a broad range of analytical methods and applications. Therefore, the development of a tool capable of gathering tailored information from social networks is something that can help many researchers in their work, especially in the area of Natural Language Processing (NLP). Nowadays, social networks are precisely the medium where people most often use written language on a daily basis. Therefore, the ubiquitous crawling of social networks is of the utmost importance for researchers. Such a tool will allow researchers to get the relevant information they need, enabling faster research on what really matters, without losing time developing their own crawler. In this paper, we present an extensive analysis of the existing social networks and their APIs, and also describe the conception and design of a social network crawler which will help NLP researchers. © 2019 Association for Computing Machinery.

2019

Association and temporality between news and tweets

Authors
Moutinho, V; Brazdil, P; Cordeiro, J;

Publication
IC3K 2019 - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management

Abstract
With the advent of social media, the boundaries of mainstream journalism and social networks are becoming blurred. User-generated content is increasing, and hence, journalists dedicate considerable time searching platforms such as Facebook and Twitter to announce, spread, and monitor news and crowd check information. Many studies have looked at social networks as news sources, but the relationship and interconnections between this type of platform and news media have not been thoroughly investigated. In this work, we have studied a series of news articles and examined a set of related comments on a social network during a period of six months. Specifically, a sample of articles from generalist Portuguese news sources published in the first semester of 2016 was clustered, and the resulting clusters were then associated with tweets of Portuguese users using a similarity measure. Focusing on a subset of clusters, we have performed a temporal analysis by examining the evolution of the two types of documents (articles and tweets) and the timing of when they appeared. It appears that for some stories, namely Brexit and the European Football Cup, the publishing of news articles intensifies on key dates (event-oriented), while the discussion on social media is more balanced throughout the months leading up to those events.

2018

ECIR 2018: Text2Story Workshop - Narrative Extraction from Texts

Authors
Jorge, A; Campos, R; Jatowt, A; Nunes, S; Rocha, C; Cordeiro, JP; Pasquali, A; Mangaravite, V;

Publication
SIGIR Forum

Abstract

2018

Extracting Adverse Drug Effects from User Experiences: A Baseline

Authors
Abrantes, D; Cordeiro, J;

Publication
Proceedings - IEEE Symposium on Computer-Based Medical Systems

Abstract
It has been proved that pharmacovigilance benefits from the analysis and extraction of user-generated data from blogs, medical forums or other social networks, regarding adverse effect mentions or complaints that occur from taking certain drugs. Data mining, machine learning, pattern recognition, content summarization and natural language processing techniques are often used in this field with promising results. However, there are still several difficulties concerning the extraction, as the highly domain-specific vocabulary presents a few challenges. This is mainly because patients like to use idiomatic or vernacular expressions along with descriptive symptom explanations, which tend to deviate from grammatical rules or expected terms. To address this issue, we propose a well-curated baseline. We believe that building a specific lexicon, identifying common linguistic patterns and observing certain phrasal structures is key to first understanding how a user generates content online. From there, we can later develop sets of tailored rules that will allow data classification/extraction systems to potentially improve their efficiency at these tasks. © 2018 IEEE.

2015

Fractal Beauty in Text

Authors
Cordeiro, J; Inacio, PRM; Fernandes, DAB;

Publication
Progress in Artificial Intelligence

Abstract
This paper assesses if text possesses fractal properties, namely if several attributes that characterize sentences are self-similar. In order to do that, seven corpora were analyzed using several statistical tools, so as to determine if the empirical sequences for the attributes were Gaussian and self-similar. The Kolmogorov-Smirnov goodness-of-fit test and two Hurst parameter estimators were employed. The results show that there is a fractal beauty in the text produced by humans and suggest that its quality is directly proportional to the self-similarity degree.