Publications

Publications by LIAAD

2019

BRIGHT - Drift-Aware Demand Predictions for Taxi Networks

Authors
Saadallah, A; Moreira Matias, L; Sousa, R; Khiari, J; Jenelius, E; Gama, J;

Publication
2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019)

Abstract
The dynamic behavior of urban mobility patterns makes matching taxi supply with demand one of the biggest challenges in this industry. Recently, the increasing availability of massive broadcast GPS data has encouraged the exploration of this issue from different perspectives. One possible solution is to build a data-driven real-time taxi-dispatching recommender system. However, existing systems are based on strong assumptions such as stationary demand distributions and finite training sets, which make them inadequate for modeling the dynamic nature of the network. In this paper, we propose BRIGHT: a drift-aware supervised learning framework which aims to provide accurate predictions for short-term horizon taxi demand quantities through a creative ensemble of time series analysis methods that handle distinct types of concept drift. A large experimental set-up, which includes three real-world transportation networks and a synthetic test-bed with artificially inserted concept drifts, was employed to illustrate the advantages of BRIGHT when compared to state-of-the-art methods for this problem.

2019

Impact of Genealogical Features in Transthyretin Familial Amyloid Polyneuropathy Age of Onset Prediction

Authors
Pedroto, M; Jorge, A; Mendes Moreira, J; Coelho, T;

Publication
PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Abstract
Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP) is a neurological genetic disease that propagates from one family generation to the next. The disease can have severe effects on the life of patients after the first symptoms (onset) appear. Accurate prediction of the age of onset for these patients can help the management of the impact. This is, however, a challenging problem since both familial and non-familial characteristics may or may not affect the age of onset. In this work, we assess the importance of sets of genealogical features used for predicting the age of onset of TTR-FAP patients. We study three sets of features engineered from clinical and genealogical data records obtained from Portuguese patients. These feature sets, referred to as Patient, First Level and Extended Level Features, represent sets of characteristics related to each patient's attributes and their familial relations. They were compiled by a Medical Research Center working with TTR-FAP patients. Our results show the importance of genealogical data when clinical records have no information related to the ancestor of the patient, namely the ancestor's gender and age of onset. This is suggested by the improvement of the estimated predictive error results after combining First Level and Extended Level Features with the Patient Features.

2019

Report on the Second International Workshop on Narrative Extraction from Texts (Text2Story 2019)

Authors
Jorge, AM; Campos, R; Jatowt, A; Bhatia, S;

Publication
SIGIR Forum

Abstract
Building upon the success of the first edition, we organize the second edition of the Text2Story Workshop on Narrative Extraction from Texts in conjunction with the 41st European Conference on Information Retrieval (ECIR 2019) on April 14, 2019. Our objective is to further consolidate the efforts of the community and reflect upon the progress made since the last edition. Although the understanding of natural language has improved over the last couple of years, with research works emerging on the grounds of information extraction and text mining, the problem of constructing consistent narrative structures is yet to be solved. It is expected that the state of the art has been advancing in pursuit of methods that automatically identify, interpret and relate the different elements of narratives, which are often spread among different sources. In the second edition of the workshop, we foster the discussion of recent advances in the link between Information Retrieval (IR) and formal narrative representations from text.

2019

Guest Editorial: Special Issue on Data Mining for Geosciences

Authors
Jorge, A; Lopes, RL; Larrazabal, G; Nikhalat Jahromi, H;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract

2019

Classifying Heart Sounds Using Images of Motifs, MFCC and Temporal Features

Authors
Nogueira, DM; Ferreira, CA; Gomes, EF; Jorge, AM;

Publication
JOURNAL OF MEDICAL SYSTEMS

Abstract
Cardiovascular disease is the leading cause of death in the world, and its early detection is key to improving long-term health outcomes. The auscultation of the heart is still an important method in the medical process because it is very simple and cheap. To detect possible heart anomalies at an early stage, an automatic method enabling low-cost cardiac health screening for the general population would be highly valuable. By analyzing the phonocardiogram signals, it is possible to perform cardiac diagnosis and find possible anomalies at an early stage. Therefore, the development of intelligent and automated analysis tools for the phonocardiogram is very relevant. In this work, we use simultaneously collected electrocardiograms and phonocardiograms from the Physionet Challenge database with the main objective of determining whether a phonocardiogram corresponds to a normal or abnormal physiological state. Our main contribution is the methodological combination of time domain features and frequency domain features of phonocardiogram signals to improve automatic classification of cardiac disease. This novel approach is developed using both features. First, the phonocardiogram signals are segmented with an algorithm based on a logistic regression hidden semi-Markov model, which uses electrocardiogram signals as a reference. Then, two groups of features from the time and frequency domain are extracted from the phonocardiogram segments. One group is based on motifs and the other on Mel-frequency cepstral coefficients. After that, we combine these features into a two-dimensional time-frequency heat map representation. Lastly, a binary classifier is applied to both groups of features to learn a model that discriminates between normal and abnormal phonocardiogram signals. In the experiments, three classification algorithms are used: Support Vector Machines, Convolutional Neural Network, and Random Forest. The best results are achieved when both time and Mel-frequency cepstral coefficient features are considered, using a Support Vector Machine with a radial kernel.

2019

Guest Editorial

Authors
Jorge, A; Lopes, RL; Larrazabal, G; Nikhalat-Jahromi, H;

Publication
Data Mining and Knowledge Discovery

Abstract