Publications

Publications by LIAAD

2009

Parallel ILP for distributed-memory architectures

Authors
Fonseca, NA; Srinivasan, A; Silva, F; Camacho, R;

Publication
MACHINE LEARNING

Abstract
The growth of machine-generated relational databases, both in the sciences and in industry, is rapidly outpacing our ability to extract useful information from them by manual means. This has brought into focus machine learning techniques like Inductive Logic Programming (ILP) that are able to extract human-comprehensible models for complex relational data. The price to pay is that ILP techniques are not efficient: they can be seen as performing a form of discrete optimisation, which is known to be computationally hard; and the complexity is usually some super-linear function of the number of examples. While little can be done to alter the theoretical bounds on the worst-case complexity of ILP systems, some practical gains may follow from the use of multiple processors. In this paper we survey the state-of-the-art on parallel ILP. We implement several parallel algorithms and study their performance using some standard benchmarks. The principal findings of interest are these: (1) of the techniques investigated, one that simply constructs models in parallel on each processor using a subset of data and then combines the models into a single one, yields the best results; and (2) sequential (approximate) ILP algorithms based on randomized searches have lower execution times than (exact) parallel algorithms, without sacrificing the quality of the solutions found.


2009

Learning cost-sensitive decision trees to support medical diagnosis

Authors
Freitas, A; Costa Pereira, A; Brazdil, P;

Publication
Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development: Innovative Methods and Applications

Abstract
Classification plays an important role in medicine, especially for medical diagnosis. Real-world medical applications often require classifiers that minimize the total cost, including costs for wrong diagnosis (misclassification costs) and diagnostic test costs (attribute costs). There are indeed many reasons for considering costs in medicine, as diagnostic tests are not free and health budgets are limited. In this chapter, the authors have defined strategies for cost-sensitive learning. They have developed an algorithm for decision tree induction that considers various types of costs, including test costs, delayed costs and costs associated with risk. Then they have applied their strategy to train and to evaluate cost-sensitive decision trees on medical data. Generated trees can be tested following some strategies, including group costs, common costs, and individual costs. Using the factor of "risk" it is possible to penalize invasive or delayed tests and obtain patient-friendly decision trees. © 2010, IGI Global.


2009

Cost-sensitive learning in medicine

Authors
Freitas, A; Brazdil, P; Costa Pereira, A;

Publication
Data Mining and Medical Knowledge Management: Cases and Applications

Abstract
This chapter introduces cost-sensitive learning and its importance in medicine. Health managers and clinicians often need models that try to minimize several types of costs associated with healthcare, including attribute costs (e.g. the cost of a specific diagnostic test) and misclassification costs (e.g. the cost of a false negative test). In fact, as in other professional areas, both diagnostic tests and their associated misclassification errors can have significant financial or human costs, including the use of unnecessary resources and patient safety issues. This chapter presents some concepts related to cost-sensitive learning and cost-sensitive classification and their application to medicine. Different types of costs are also presented, with an emphasis on diagnostic tests and misclassification costs. In addition, an overview of research in the area of cost-sensitive learning is given, including current methodological approaches. Finally, current methods for the cost-sensitive evaluation of classifiers are discussed. © 2009, IGI Global.
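
Illustrative note (not taken from the chapter): the two cost types named in the abstract can be combined in a minimal sketch that picks the diagnosis with the lowest expected misclassification cost and then adds the cost of the test that was run. The cost values, label names, and the function names `expected_cost` and `min_cost_decision` are hypothetical, chosen only for illustration.

```python
def expected_cost(probs, cost_matrix, predicted):
    """Expected misclassification cost of predicting `predicted`.

    probs maps each actual label to its estimated probability;
    cost_matrix[actual][predicted] is the cost of that outcome.
    """
    return sum(p * cost_matrix[actual][predicted] for actual, p in probs.items())


def min_cost_decision(probs, cost_matrix, test_cost=0.0):
    """Return the label with the lowest expected cost and the total cost.

    The total adds the (assumed, fixed) attribute cost of the diagnostic
    test to the expected misclassification cost of the chosen label.
    """
    best = min(cost_matrix, key=lambda label: expected_cost(probs, cost_matrix, label))
    return best, test_cost + expected_cost(probs, cost_matrix, best)


# Hypothetical costs: a false negative is ten times worse than a false positive.
COSTS = {
    "disease": {"disease": 0.0, "healthy": 100.0},  # actual disease, predicted healthy
    "healthy": {"disease": 10.0, "healthy": 0.0},   # actual healthy, predicted disease
}

probs = {"disease": 0.2, "healthy": 0.8}
label, total = min_cost_decision(probs, COSTS, test_cost=5.0)
print(label, total)
```

With these assumed numbers the lower-cost decision is "disease" even though "healthy" is the more probable class, which is the kind of behaviour a cost-sensitive classifier is meant to capture.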


2009

Shopping centre image dynamics of a new entrant

Authors
Brito, PQ;

Publication
International Journal of Retail and Distribution Management

Abstract
Purpose: The purpose of this paper is to investigate how and to what extent the attributes of a new shopping centre entrant evolve during the first seven months of operation, and the implications this has for the incumbents. To capture the strategic relevance of those changes a consumer image tracking analytical tool is developed and applied. Design/methodology/approach: Qualitative research followed by a longitudinal survey. Hypothesis testing approach and descriptive analysis. Findings: The correlates between the magnitudes of shopping centre attribute perception variations, the level of self-confidence in image evaluation, shopping centre frequency of visits, degree of the "halo effect", shopping centre and store consumer's preferences are analysed. Only the self-confidence and store preference did not evolve with the image magnitude changes as hypothesised. Research limitations/implications: The assessment of shopping centre image changes over time, as well as the factors underlying those changes help managers to plan strategy. Some monitoring procedures are proposed and their implications for both marketing and shopping centre operations are discussed. Originality/value: By incorporating the time dimension, the true nature of image variation can only be captured if the identification of attributes, and the amount, intensity and direction of changes are obtained, measured and analysed together. The magnitude of image variation is more associated with a shopping centre than with its stores. © Emerald Group Publishing Limited.


2009

On Mining Protein Unfolding Simulation Data with Inductive Logic Programming

Authors
Camacho, R; Alves, A; Silva, CG; Brito, RMM;

Publication
2ND INTERNATIONAL WORKSHOP ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (IWPACBB 2008)

Abstract
The detailed study of folding and unfolding events in proteins is becoming central to developing rational therapeutic strategies against maladies such as Alzheimer's and Parkinson's disease. A promising approach to studying the unfolding processes of proteins is through computer simulations. However, these computer simulations generate huge amounts of data that require computational methods for their analysis. In this paper we report on the use of Inductive Logic Programming (ILP) techniques to analyse the trajectories of protein unfolding simulations. The paper describes ongoing work on one of several problems of interest in the protein unfolding setting. The problem we address here is that of explaining what makes secondary structure elements break down during the unfolding process. We tackle this problem by collecting examples of contexts where secondary structures break and (automatically) constructing rules that may be used to suggest the explanations.


2009

Assessing the Eligibility of Kidney Transplant Donors

Authors
Reinaldo, F; Fernandes, C; Rahman, MA; Malucelli, A; Camacho, R;

Publication
MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION

Abstract
Organ transplantation is a highly complex decision process that requires expert decisions. The major problem in a transplantation procedure is the possibility that the receiver's immune system will attack and destroy the transplanted tissue. It is therefore of capital importance to find a donor with the highest possible compatibility with the receiver, and thus reduce rejection. Finding a good donor is not a straightforward task because a complex network of relations exists between the immunological and the clinical variables that influence the receiver's acceptance of the transplanted organ. Currently the process of analyzing these variables involves a careful study by the clinical transplant team. The number and complexity of the relations between variables make the manual process very slow. In this paper we propose and compare two Machine Learning algorithms that might help the transplant team in improving and speeding up their decisions. We achieve that objective by analyzing past real cases and constructing models as sets of rules. Such models are accurate and understandable by experts.
