Publications

Publications by Alípio Jorge

2018

Online bagging for recommender systems

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
EXPERT SYSTEMS

Abstract
Ensemble methods have been successfully used in the past to improve recommender systems; however, they have never been studied with incremental recommendation algorithms. Many online recommender systems deal with continuous, potentially fast, and unbounded flows of data (big data streams) and often need to be responsive to fresh user feedback, adjusting recommendations accordingly. This is clear in tasks such as social network feeds, news recommender systems, automatic playlist completion, and other similar applications. Batch ensemble approaches are not suitable for continuous learning, given the complexity of retraining new models on demand. In this paper, we adapt a general-purpose online bagging algorithm for top-N recommendation tasks and propose two novel online bagging methods specifically tailored for recommender systems. We evaluate the three approaches, using an incremental matrix factorization algorithm for top-N recommendation with positive-only user feedback data as the base model. Our results show that online bagging is able to improve accuracy by up to 55% over the baseline, with manageable computational overhead.
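The idea of online bagging for top-N recommendation can be illustrated with a minimal sketch. It assumes that the "general purpose online bagging algorithm" is the widely used Poisson(1) resampling scheme, in which each base model sees every incoming event k times with k drawn from Poisson(1); the abstract does not state this. The names `PopularityModel` and `OnlineBagging` are hypothetical, and the popularity counter is only a stand-in for the incremental matrix factorization base model. The two novel methods proposed in the paper are not reproduced here.

```python
import math
import random
from collections import defaultdict


def poisson(lam, rng):
    """Draw one sample from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class PopularityModel:
    """Hypothetical base model: scores an item by how often it was seen.

    Stands in for an incremental matrix factorization model.
    """

    def __init__(self):
        self.counts = defaultdict(int)

    def update(self, user, item):
        self.counts[item] += 1

    def score(self, user, item):
        return self.counts.get(item, 0)


class OnlineBagging:
    """Ensemble of base models trained on Poisson(1)-resampled events."""

    def __init__(self, n_models=8, seed=0):
        self.models = [PopularityModel() for _ in range(n_models)]
        self.rng = random.Random(seed)
        self.seen = defaultdict(set)  # items each user already interacted with
        self.items = set()            # all items observed in the stream

    def update(self, user, item):
        """Process one positive-only feedback event from the stream."""
        self.seen[user].add(item)
        self.items.add(item)
        for model in self.models:
            # Each model sees the event k ~ Poisson(1) times (possibly zero).
            for _ in range(poisson(1.0, self.rng)):
                model.update(user, item)

    def recommend(self, user, n):
        """Return up to n unseen items ranked by average base-model score."""
        candidates = self.items - self.seen.get(user, set())

        def avg_score(item):
            return sum(m.score(user, item) for m in self.models) / len(self.models)

        # Ties are broken by item id so the ranking is deterministic.
        return sorted(candidates, key=lambda i: (-avg_score(i), i))[:n]
```

Each event is processed once and then discarded, so the ensemble never needs to be retrained in batch. Averaging scores is one possible aggregation choice for this sketch, not necessarily the one used in the paper.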

2018

Discovering a taste for the unusual: exceptional models for preference mining

Authors
de Sá, CR; Duivesteijn, W; Azevedo, P; Jorge, AM; Soares, C; Knobbe, A;

Publication
MACHINE LEARNING

Abstract
Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target concept. We employ several quality measures that highlight subgroups featuring exceptional preferences, where the focus of what constitutes 'exceptional' varies with the quality measure: two measures look for exceptional overall ranking behavior, one measure indicates whether a particular label stands out from the rest, and a fourth measure highlights subgroups with unusual pairwise label ranking behavior. We explore a few datasets and compare with existing techniques. The results confirm that the new task EPM can deliver interesting knowledge.

2019

Impact of Genealogical Features in Transthyretin Familial Amyloid Polyneuropathy Age of Onset Prediction

Authors
Pedroto, M; Jorge, A; Mendes Moreira, J; Coelho, T;

Publication
PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Abstract
Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP) is a neurological genetic disease that propagates from one family generation to the next. The disease can have severe effects on the life of patients after the first symptoms (onset) appear. Accurate prediction of the age of onset for these patients can help manage the impact. This is, however, a challenging problem since both familial and non-familial characteristics may or may not affect the age of onset. In this work, we assess the importance of sets of genealogical features used for predicting the age of onset of TTR-FAP patients. We study three sets of features engineered from clinical and genealogical data records obtained from Portuguese patients. These feature sets, referred to as Patient, First Level and Extended Level Features, represent sets of characteristics related to each patient's attributes and their familial relations. They were compiled by a Medical Research Center working with TTR-FAP patients. Our results show the importance of genealogical data when clinical records have no information related to the ancestor of the patient, namely the ancestor's Gender and Age of Onset. This is suggested by the improvement in the estimated predictive error after combining First and Extended Level Features with the Patient Features.

2018

Predicting Age of Onset in TTR-FAP Patients with Genealogical Features

Authors
Pedroto, M; Jorge, A; Moreira, JM; Coelho, T;

Publication
CBMS

Abstract
This work describes a problem-oriented approach to analyze and predict the age of onset of patients diagnosed with Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP). We constructed, from a set of clinical and familial records, three sets of features which represent different characteristics of a patient before becoming symptomatic. Using those features, we tested a set of machine learning regression methods, namely Decision Tree (Regression Tree), Elastic Net, Lasso, Linear Regression, Random Forest Regressor, Ridge Regression and Support Vector Machine Regressor (SVM). Later, we defined a baseline model that represents the current medical practice to serve as a guideline for measuring the accuracy of our approach. Our results show a significant improvement of machine learning methods when compared with the current baseline.

2018

ECIR 2018: Text2Story Workshop - Narrative Extraction from Texts

Authors
Jorge, A; Campos, R; Jatowt, A; Nunes, S; Rocha, C; Cordeiro, JP; Pasquali, A; Mangaravite, V;

Publication
SIGIR Forum

Abstract
The 1st International Workshop on Narrative Extraction from Texts (Text2Story 2018) was held in conjunction with the 40th European Conference on Information Retrieval (ECIR 2018) in Grenoble on 26th March 2018. The workshop aimed to help foster the collaboration of researchers on a wide range of multidisciplinary issues related to the text-to-narrative structure. The program consisted of two keynote talks, six research presentations, a poster session and a slot for demo presentations. This report briefly summarizes the workshop. More information about the workshop is available at http://text2story18.inesctec.pt

2018

Online Gradient Boosting for Incremental Recommender Systems

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
DS

Abstract
Ensemble models have been proven successful for batch recommendation algorithms; however, they have not been well studied in streaming applications. Such applications typically use incremental learning, to which standard ensemble techniques are not trivially applicable. In this paper, we study the application of three variants of online gradient boosting to top-N recommendation tasks with implicit data, in a streaming data environment. Weak models are built using a simple incremental matrix factorization algorithm for implicit feedback. Our results show a significant improvement of up to 40% over the baseline standalone model. We also show that the overhead of running multiple weak models is easily manageable in stream-based applications.
