Details

  • Name

    João Cordeiro
  • Cluster

    Computer Science
  • Role

    Affiliated Researcher
  • Since

    1st December 2012
Publications

2018

ECIR 2018: Text2Story Workshop - Narrative Extraction from Texts

Authors
Jorge, A; Campos, R; Jatowt, A; Nunes, S; Rocha, C; Cordeiro, JP; Pasquali, A; Mangaravite, V;

Publication
SIGIR Forum

Abstract

2018

Extracting Adverse Drug Effects from User Experiences: A Baseline

Authors
Abrantes, D; Cordeiro, J;

Publication
Proceedings - IEEE Symposium on Computer-Based Medical Systems

Abstract
Pharmacovigilance has been shown to benefit from analyzing and extracting user-generated data from blogs, medical forums and other social networks, specifically mentions of or complaints about adverse effects that occur after taking certain drugs. Data mining, machine learning, pattern recognition, content summarization and natural language processing techniques are often used in this field with promising results. However, extraction remains difficult because the vocabulary is highly domain-specific. This is mainly because patients tend to use idiomatic or vernacular expressions alongside descriptive symptom explanations, which often deviate from grammatical rules or expected terms. To address this issue, we propose a well-curated baseline. We believe that building a specific lexicon, identifying common linguistic patterns and observing certain phrasal structures is key to first understanding how users generate content online. From there, we can later develop sets of tailored rules that may allow data classification and extraction systems to improve their efficiency at these tasks. © 2018 IEEE.

2015

Fractal Beauty in Text

Authors
Cordeiro, J; Inacio, PRM; Fernandes, DAB;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
This paper assesses whether text possesses fractal properties, namely whether several attributes that characterize sentences are self-similar. To that end, seven corpora were analyzed using several statistical tools to determine whether the empirical sequences for the attributes were Gaussian and self-similar. The Kolmogorov-Smirnov goodness-of-fit test and two Hurst parameter estimators were employed. The results show that there is a fractal beauty in the text produced by humans and suggest that its quality is directly proportional to the degree of self-similarity.
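As an illustration of what estimating a Hurst parameter on a sentence attribute can look like, here is a minimal sketch. It assumes sentence length in words as the attribute and uses the classical rescaled-range (R/S) estimator; the abstract does not say which attributes or which two estimators the paper used, so both choices, and the function names `sentence_lengths`, `rescaled_range` and `hurst_rs`, are assumptions made for this example only.

```python
import math
import re


def sentence_lengths(text):
    """Split text on '.', '!' or '?' and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def rescaled_range(chunk):
    """R/S statistic of one chunk: range of cumulative deviations over std dev."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for x in chunk:
        running += x - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    # A constant chunk has no spread to rescale, so it is skipped by the caller.
    return spread / std if std > 0 else None


def hurst_rs(series, min_chunk=8):
    """Estimate H as the least-squares slope of log(R/S) against log(chunk size)."""
    points = []
    size = min_chunk
    while size <= len(series) // 2:
        values = []
        for start in range(0, len(series) - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            points.append((math.log(size), math.log(sum(values) / len(values))))
        size *= 2  # chunk sizes grow geometrically: 8, 16, 32, ...
    if len(points) < 2:
        raise ValueError("series too short for an R/S estimate")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope_den = sum((x - mean_x) ** 2 for x, _ in points)
    return slope_num / slope_den
```

Calling `hurst_rs(sentence_lengths(corpus_text))` on a corpus with a few hundred sentences or more yields a single estimate. Values near 0.5 are what an uncorrelated sequence would give, while values clearly above 0.5 point to long-range dependence. The R/S estimator is known to be biased on short series, so the number is indicative rather than conclusive.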

2013

Rule Induction for Sentence Reduction

Authors
Cordeiro, J; Dias, G; Brazdil, P;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2013

Abstract
Sentence Reduction has recently received great attention from the Automatic Text Summarization research community. Sentence Reduction consists of eliminating sentence components such as words, part-of-speech tag sequences or chunks without severely degrading the information contained in the sentence or its grammatical correctness. In this paper, we present an unsupervised, scalable methodology for learning sentence reduction rules. Paraphrases are first discovered within a collection of automatically crawled Web News Stories and then textually aligned in order to extract interchangeable text fragment candidates, in particular reduction cases. As only positive examples exist, Inductive Logic Programming (ILP) provides an interesting learning paradigm for the extraction of sentence reduction rules. Consequently, reduction cases are transformed into first-order logic clauses to supply a massive set of suitable learning instances, and an ILP learning environment is defined within the context of the Aleph framework. Experiments show good results in terms of irrelevancy elimination, syntactical correctness and reduction rate in a real-world environment, compared with other methodologies proposed so far.
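The step of aligning a paraphrase pair to extract reduction cases can be sketched as follows. This is only an illustration: it uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner over whitespace tokens, which is not the alignment method of the paper, and the function name `reduction_cases` and the (left context, dropped fragment, right context) output format are assumptions introduced here. The ILP rule-learning stage is not covered.

```python
from difflib import SequenceMatcher


def reduction_cases(long_sentence, short_sentence):
    """Return (left_context, dropped_tokens, right_context) for each fragment
    that appears in the longer sentence but is absent from the shorter one."""
    a = long_sentence.split()
    b = short_sentence.split()
    # autojunk=False keeps frequent tokens (e.g. "the") eligible for matching.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    cases = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":  # tokens a[i1:i2] have no counterpart in b
            left = a[i1 - 1] if i1 > 0 else None
            right = a[i2] if i2 < len(a) else None
            cases.append((left, a[i1:i2], right))
    return cases


pair = (
    "the president , who arrived yesterday , met the press",
    "the president met the press",
)
for left, dropped, right in reduction_cases(*pair):
    print(left, dropped, right)
```

Each extracted triple is a candidate reduction case: a fragment that one paraphrase omits, together with the tokens on either side of it. Cases like these are the kind of positive-only examples that the abstract describes turning into first-order logic clauses.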

2012

Sumarização Automática de Texto (Automatic Text Summarization)

Authors
Ângelo Santos; João Cordeiro

Publication
UBIvol.0

Abstract
The work developed in this thesis explored several approaches to extractive text summarization through the implementation of computational methods based on textual statistics and graph theory. A hybrid method was also implemented, combining the previous approaches with other features such as keyword extraction and sentence position in the text.
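A graph-based extractive summarizer of the general kind mentioned above can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: it assumes sentences as graph nodes, Jaccard word overlap as edge weight, and weighted degree as the sentence score, and the names `split_sentences`, `word_set` and `summarize` are hypothetical.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_set(sentence):
    """Lower-cased set of word tokens in a sentence."""
    return set(re.findall(r"\w+", sentence.lower()))


def summarize(text, k=2):
    """Return the k sentences with the highest weighted degree in the
    sentence-similarity graph, in their original order."""
    sentences = split_sentences(text)
    words = [word_set(s) for s in sentences]
    scores = []
    for i, wi in enumerate(words):
        total = 0.0
        for j, wj in enumerate(words):
            if i != j and wi and wj:
                # Edge weight: Jaccard overlap between the two word sets.
                total += len(wi & wj) / len(wi | wj)
        scores.append(total)
    # Rank by score, breaking ties in favour of earlier sentences.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]
```

Sentences that share vocabulary with many others score highest, so an off-topic sentence is dropped first. A hybrid variant could add further terms to the score, such as a bonus for early position or for containing extracted keywords.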