Publications - INESC TEC

Publications

Publications by LIAAD

2010

Active Testing Strategy to Predict the Best Classification Algorithm via Sampling and Metalearning

Authors
Leite, R; Brazdil, P

Publication
ECAI 2010 - 19th European Conference on Artificial Intelligence

Abstract
Many classification algorithms currently exist, and no single algorithm outperforms all the others on all tasks. It is therefore of interest to determine which classification algorithm is best for a given task. Although direct comparisons can be made for any given problem using cross-validation, it is desirable to avoid this because the computational costs are significant. We describe a method that relies on relatively fast pairwise comparisons involving two algorithms. The method exploits sampling landmarks, that is, information about learning curves, in addition to classical data characteristics. One key feature of the method is an iterative procedure for extending the series of experiments used to gather new information in the form of sampling landmarks. Metalearning also plays a vital role. The comparisons between various pairs of algorithms are repeated, and the result is represented as a partially ordered ranking. Evaluation is done by comparing the predicted partial order of algorithms with the partial order representing the supposedly correct result. The results of our analysis show that the method performs well and could be of help in practical applications.
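
As a rough illustration of the last step described in the abstract, the sketch below turns a set of pairwise comparison outcomes into a partially ordered ranking. It is not the authors' implementation: the function names, the +1/0/-1 outcome encoding, and the example algorithm labels are all assumptions made for this example, and the outcomes are hard-coded rather than predicted from sampling landmarks.

```python
from itertools import product


def partial_order(algorithms, pairwise):
    """Build a 'predicted better than' relation from pairwise outcomes.

    pairwise maps (a, b) to +1 if a is predicted better than b,
    -1 if b is predicted better, and 0 if no clear winner is predicted.
    (This encoding is an assumption made for the sketch.)
    """
    better = {a: set() for a in algorithms}
    for (a, b), outcome in pairwise.items():
        if outcome > 0:
            better[a].add(b)
        elif outcome < 0:
            better[b].add(a)
    # Transitive closure, Warshall style: k is the intermediate algorithm.
    # If a beats k and k beats b, then a is taken to beat b.
    for k, a, b in product(algorithms, repeat=3):
        if k in better[a] and b in better[k]:
            better[a].add(b)
    return better


def top_candidates(better):
    """Algorithms that no other algorithm is predicted to beat."""
    beaten = set().union(*better.values())
    return sorted(a for a in better if a not in beaten)


# Hypothetical outcomes for four illustrative algorithm labels.
algorithms = ["tree", "knn", "nb", "svm"]
pairwise = {
    ("tree", "knn"): 1,
    ("knn", "nb"): 1,
    ("tree", "svm"): 0,  # tie: these two stay unordered
    ("svm", "nb"): 1,
}
better = partial_order(algorithms, pairwise)
print(top_candidates(better))
```

Because ties leave pairs unordered, more than one algorithm can sit at the top of the ranking. The sketch also assumes the pairwise outcomes are consistent; contradictory outcomes would produce cycles, which it does not detect.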


2010

Paraphrase alignment for synonym evidence discovery

Authors
Grigonyte, G; Cordeiro, J; Dias, G; Moraliyski, R; Brazdil, P

Publication
Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference

Abstract
We describe a new unsupervised approach for synonymy discovery by aligning paraphrases in monolingual domain corpora. For that purpose, we identify phrasal terms that convey most of the concepts within domains and adapt a methodology for the automatic extraction and alignment of paraphrases to identify paraphrase casts from which valid synonyms are discovered. Experiments on two different domain corpora show that general synonyms as well as synonymic expressions can be identified with 67.27% precision.


2010

The "digitalisation" of youth: How do they manage and integrate digital technologies?

Authors
Brito, PQ

Publication
Handbook of Research on Digital Media and Advertising: User Generated Content Consumption

Abstract
The digitalization of youth signifies their complete immersion, active participation and involvement in the production, consumption and sharing of digital content using various interconnected/interfaced digital devices in their social network interactions. A prerequisite to successful commercial communication with young people is having a good understanding of new media, along with their social and psychological framework. The behaviour, motivation and emotions of youth in general and in relation to digital technologies, especially the meaning attached to mobile phones, the Internet (mainly social network sites) and games (computer-based and portable) should also be addressed if advertisers aim to reach this target group. © 2011, IGI Global.


2010

Distributed Informal Information Systems for Innovation: An Empirical Study of the Role of Social Networks

Authors
Vasconcelos, V; Campos, P

Publication
Enterprise Information Systems, Part II

Abstract
Web 2.0 and Enterprise 2.0 concepts offer a whole new set of collaborative tools that allow new approaches to market research, making it possible to continuously explore ever faster-growing social and media environments. Simultaneously, the exponential growth of online social networks, along with a combination of computer-based tools, is contributing to the construction of new kinds of research communities, in which respondents interact with researchers as well as with each other. Furthermore, by studying the networks, researchers are able to manage multiple data sources (user-generated content). The main purpose of this paper is to propose a new concept of Distributed Informal Information Systems for Innovation that arises from the interaction of the accumulated stock of knowledge emerging at the individual (micro) level. A descriptive study aims to unveil and report when and how market research professionals use social networks for their work, thereby creating distributed information systems for innovation.


2010

The Impact of Pre-processing on the Classification of MEDLINE Documents

Authors
Goncalves, CA; Goncalves, CT; Camacho, R; Oliveira, E

Publication
Pattern Recognition in Information Systems

Abstract
The amount of information available in the MEDLINE database makes it very hard for a researcher to retrieve a reasonable number of relevant documents using a simple query language interface. Automatic classification of documents may be a valuable technology to help reduce the number of documents retrieved for each query. To accomplish this, it is of capital importance to apply appropriate pre-processing techniques to the data. The main goal of this study is to analyse the impact of pre-processing techniques on text classification of MEDLINE documents. We have assessed the effect of combining different pre-processing techniques with several classification algorithms available in the WEKA tool. Our experiments show that applying pruning, stemming and WordNet significantly reduces the number of attributes and improves the accuracy of the results.
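
To make the attribute-reduction effect concrete, here is a minimal sketch of two of the pre-processing steps named in the abstract, stemming and pruning. It does not reproduce the study's WEKA pipeline: `crude_stem` is a toy suffix stripper standing in for a real stemmer, "pruning" is assumed here to mean dropping terms that occur in fewer than `min_df` documents, the WordNet step is omitted, and the three example documents are invented.

```python
from collections import Counter


def crude_stem(token):
    """Toy suffix stripper; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def vocabulary(docs, stem=False, min_df=1):
    """Return the sorted attribute set after optional stemming and pruning."""
    df = Counter()  # document frequency of each term
    for doc in docs:
        tokens = doc.lower().split()
        if stem:
            tokens = [crude_stem(t) for t in tokens]
        df.update(set(tokens))  # count each term once per document
    return sorted(t for t, n in df.items() if n >= min_df)


# Invented example documents, not taken from MEDLINE.
docs = [
    "protein binding regulates transcription",
    "proteins bind regulated transcripts",
    "binding proteins regulate expression",
]
raw = vocabulary(docs)
reduced = vocabulary(docs, stem=True, min_df=2)
print(len(raw), len(reduced))
```

On this toy input the attribute set shrinks from 10 terms to 3. The sketch says nothing about classification accuracy, which the study measured separately with WEKA classifiers.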


2010

Machine learning support for kidney transplantation decision making

Authors
Reinaldo, F; Rahman, MA; Alves, CF; Malucelli, A; Camacho, R

Publication
ISB 2010 Proceedings - International Symposium on Biocomputing

Abstract
Organ transplantation is a highly complex decision process that requires expert decisions. The major problem in a transplantation procedure is the possibility that the receiver's immune system will attack and destroy the transplanted tissue. It is therefore of capital importance to find a donor with the highest possible compatibility with the receiver, and thus reduce the risk of rejection. Finding a good donor is not a straightforward task because a complex network of relations exists between the immunological and the clinical variables that influence the receiver's acceptance of the transplanted organ. Currently, the analysis of these variables involves a careful study by the clinical transplant team. The number and complexity of causal dependencies among variables make the manual process very slow. In this paper we assess the usefulness of Machine Learning algorithms as a tool to improve and speed up the decisions of a transplant team. We achieve that objective by analysing past real cases and constructing models as sets of rules. Such models are accurate and understandable by experts. Copyright 2010 ACM.
