Publications

Publications by Alípio Jorge

2018

Discovering a taste for the unusual: exceptional models for preference mining

Authors
de Sá, CR; Duivesteijn, W; Azevedo, P; Jorge, AM; Soares, C; Knobbe, A;

Publication
MACHINE LEARNING

Abstract
Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target concept. We employ several quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: two measures look for exceptional overall ranking behavior, one measure indicates whether a particular label stands out from the rest, and a fourth measure highlights subgroups with unusual pairwise label ranking behavior. We explore a few datasets and compare with existing techniques. The results confirm that the new task EPM can deliver interesting knowledge.

2019

Impact of Genealogical Features in Transthyretin Familial Amyloid Polyneuropathy Age of Onset Prediction

Authors
Pedroto, M; Jorge, A; Mendes Moreira, J; Coelho, T;

Publication
PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Abstract
Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP) is a neurological genetic disease that propagates from one family generation to the next. The disease can have severe effects on the life of patients after the first symptoms (onset) appear. Accurate prediction of the age of onset for these patients can help manage that impact. This is, however, a challenging problem, since both familial and non-familial characteristics may or may not affect the age of onset. In this work, we assess the importance of sets of genealogical features used for predicting the age of onset of TTR-FAP patients. We study three sets of features engineered from clinical and genealogical data records obtained from Portuguese patients. These feature sets, referred to as Patient, First Level and Extended Level Features, represent sets of characteristics related to each patient's attributes and their familial relations. They were compiled by a Medical Research Center working with TTR-FAP patients. Our results show the importance of genealogical data when clinical records have no information related to the ancestor of the patient, namely the ancestor's Gender and Age of Onset. This is suggested by the improvement of the estimated predictive error results after combining First and Extended Level with the Patient Features.

2018

Predicting Age of Onset in TTR-FAP Patients with Genealogical Features

Authors
Pedroto, M; Jorge, A; Moreira, JM; Coelho, T;

Publication
CBMS

Abstract
This work describes a problem-oriented approach to analyze and predict the age of onset of patients diagnosed with Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP). From a set of clinical and familial records, we constructed three sets of features that represent different characteristics of a patient before becoming symptomatic. Using those features, we tested a set of machine learning regression methods, namely Decision Tree (Regression Tree), Elastic Net, Lasso, Linear Regression, Random Forest Regressor, Ridge Regression and Support Vector Machine Regressor (SVM). We then defined a baseline model that represents the current medical practice, to serve as a reference against which to measure the accuracy of our approach. Our results show a significant improvement of machine learning methods over the current baseline.

2018

ECIR 2018: Text2Story Workshop - Narrative Extraction from Texts

Authors
Jorge, A; Campos, R; Jatowt, A; Nunes, S; Rocha, C; Cordeiro, JP; Pasquali, A; Mangaravite, V;

Publication
SIGIR Forum

Abstract
The 1st International Workshop on Narrative Extraction from Texts (Text2Story 2018) was held in conjunction with the 40th European Conference on Information Retrieval (ECIR 2018) in Grenoble on 26th March 2018. The workshop aimed to help foster the collaboration of researchers on a wide range of multidisciplinary issues related to the text-to-narrative structure. The program consisted of two keynote talks, six research presentations, a poster session and a slot for demo presentations. This report briefly summarizes the workshop. More information about the workshop is available at http://text2story18.inesctec.pt

2018

Online Gradient Boosting for Incremental Recommender Systems

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
DS

Abstract
Ensemble models have proven successful for batch recommendation algorithms; however, they have not been well studied in streaming applications. Such applications typically use incremental learning, to which standard ensemble techniques are not trivially applicable. In this paper, we study the application of three variants of online gradient boosting to top-N recommendation tasks with implicit data, in a streaming data environment. Weak models are built using a simple incremental matrix factorization algorithm for implicit feedback. Our results show a significant improvement of up to 40% over the baseline standalone model. We also show that the overhead of running multiple weak models is easily manageable in stream-based applications.
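
As an illustration only, the kind of weak model mentioned in this abstract (incremental matrix factorization over positive-only feedback) might be sketched as follows. This is not the authors' code: the class name `IncrementalMF`, its hyperparameters and its defaults are hypothetical, the update is a plain stochastic gradient step toward a target score of 1 for each observed user-item pair, and the online gradient boosting layer studied in the paper is not shown.

```python
import random


class IncrementalMF:
    """Hypothetical sketch: incremental matrix factorization for positive-only feedback."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.01, seed=0):
        self.n_factors = n_factors
        self.lr = lr          # learning rate (assumed value)
        self.reg = reg        # L2 regularization strength (assumed value)
        self.rng = random.Random(seed)
        self.users = {}       # user id -> latent vector
        self.items = {}       # item id -> latent vector

    def _init_vector(self):
        # Small random starting values for a new user or item.
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.n_factors)]

    def score(self, user, item):
        # Predicted preference; unknown users or items score 0.
        p = self.users.get(user)
        q = self.items.get(item)
        if p is None or q is None:
            return 0.0
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item):
        # One stochastic gradient step per observed interaction.
        # Every observation is treated as a positive example with target 1.
        p = self.users.setdefault(user, self._init_vector())
        q = self.items.setdefault(item, self._init_vector())
        err = 1.0 - sum(a * b for a, b in zip(p, q))
        for k in range(self.n_factors):
            pk, qk = p[k], q[k]
            p[k] += self.lr * (err * qk - self.reg * pk)
            q[k] += self.lr * (err * pk - self.reg * qk)

    def recommend(self, user, n=5, seen=()):
        # Top-N items by predicted score, skipping items already seen.
        candidates = [i for i in self.items if i not in seen]
        candidates.sort(key=lambda i: self.score(user, i), reverse=True)
        return candidates[:n]


# Usage: process a stream of (user, item) events one at a time.
model = IncrementalMF()
for user, item in [("u1", "a"), ("u2", "a"), ("u2", "b"), ("u1", "c")]:
    model.update(user, item)
print(model.recommend("u1", n=2, seen={"a", "c"}))
```

Because the model is updated one event at a time and never revisits past data, it fits the streaming setting the abstract describes; a boosting ensemble would maintain several such models and update each on every incoming event.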

2018

A Study on Contextual Influences on Automatic Playlist Continuation

Authors
Gatzioura, A; Marrè, MS; Jorge, AM;

Publication
CCIA

Abstract
Recommender systems still mainly base their reasoning on pairwise interactions or information on individual entities, like item attributes or ratings, without properly evaluating the multiple dimensions of the recommendation problem. However, in many cases, as in music, items are rarely consumed in isolation; users instead need a set of items selected to work well together, serving a specific purpose while having some cognitive properties as a whole, related to their perception of quality and satisfaction under given circumstances. In this paper, we introduce the term playlist concept in order to capture the implicit characteristics of joint music item selections, related to their context, scope and general perception by the users. Although playlist consumption may be associated with contextual attributes, these may be of various types and influence users' preferences differently, based on their character and emotional state, and are therefore reflected differently in their final selections. We highlight the use of this term in HybA, our hybrid recommender system, to identify clusters of similar playlists able to capture inherent characteristics and semantic properties not explicitly described in them. The experimental results presented show that this conceptual clustering results in playlist continuations of improved quality, compared to using explicit contextual parameters or the commonly used collaborative filtering technique.
