1993
Authors
Torgo, L;
Publication
Machine Learning: ECML-93, European Conference on Machine Learning, Vienna, Austria, April 5-7, 1993, Proceedings
Abstract
This paper introduces a new concept learning system. Its main features are presented and discussed. The controlled use of redundancy is one of the main characteristics of the program. Redundancy, in this system, is used to deal with several types of uncertainty existing in real domains. The problem of the use of redundancy is addressed, namely its influence on accuracy and comprehensibility. Extensive experiments were carried out on three real world domains. These experiments showed clearly the advantages of the use of redundancy. © Springer-Verlag Berlin Heidelberg 1993.
2000
Authors
Torgo, L; da Costa, JP;
Publication
MACHINE LEARNING: ECML 2000
Abstract
This paper presents a new method that deals with a supervised learning task usually known as multivariate regression. The main distinguishing feature of this new technique is the use of a clustering method to obtain subsets of the training data before the learning phase. After this "resampling" process, a different regression model is fitted to each cluster found. We call the resulting method clustered partial linear regression. Predictions using this technique are preceded by a cluster membership query for each test case. The cluster membership probability of a test case is used as a weight in an averaging process that calculates the final prediction. This averaging process involves the predictions of the regression models associated with the clusters to which the test case may belong. We have tested this general multi-strategy approach using several regression techniques and have observed significant accuracy gains on several data sets. We have also compared our method to bagging, which also uses an averaging process to obtain predictions. This experiment showed that the two methods are significantly different. Finally, we present a comparison of our method with several state-of-the-art regression methods.
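The pipeline described in this abstract (cluster the training data, fit one model per cluster, average the per-cluster predictions weighted by cluster membership) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: inputs are one-dimensional, clustering is plain k-means, each cluster gets a simple least-squares line, and the membership probability is a hypothetical normalised exponential of the squared distance to each cluster centre. The names `fit_line`, `kmeans_1d`, `memberships`, `train` and `predict` are invented here.

```python
import math

def fit_line(points):
    """Ordinary least squares for y = a + b*x on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope

def nearest_centre(x, centres):
    """Index of the centre closest to x."""
    return min(range(len(centres)), key=lambda j: abs(x - centres[j]))

def kmeans_1d(xs, centres, iterations=20):
    """Plain Lloyd iterations on scalar inputs; returns the final centres."""
    centres = list(centres)
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for x in xs:
            groups[nearest_centre(x, centres)].append(x)
        # An empty group keeps its previous centre.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres

def memberships(x, centres, scale=1.0):
    """Hypothetical soft membership: normalised exp(-squared distance / scale)."""
    dists = [((x - c) ** 2) / scale for c in centres]
    smallest = min(dists)  # subtracting the minimum avoids underflow to all zeros
    weights = [math.exp(-(d - smallest)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

def train(data, initial_centres):
    """Cluster the inputs, then fit one line per cluster."""
    centres = kmeans_1d([x for x, _ in data], initial_centres)
    groups = [[] for _ in centres]
    for x, y in data:
        groups[nearest_centre(x, centres)].append((x, y))
    # Fall back to all the data if a cluster ends up empty.
    models = [fit_line(g if g else data) for g in groups]
    return centres, models

def predict(x, centres, models):
    """Membership-weighted average of the per-cluster predictions."""
    probs = memberships(x, centres)
    return sum(p * (a + b * x) for p, (a, b) in zip(probs, models))

# Toy data with two regimes: y = 2x for small x, y = 100 - 3x for large x.
data = [(float(x), 2.0 * x) for x in range(5)]
data += [(float(x), 100.0 - 3.0 * x) for x in range(10, 15)]
centres, models = train(data, [0.0, 14.0])
```

A test case close to one cluster is predicted almost entirely by that cluster's model, while a case halfway between the two centres receives an even blend of both predictions.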
2010
Authors
Fürnkranz, J; Chan, PK; Craw, S; Sammut, C; Uther, W; Ratnaparkhi, A; Jin, X; Han, J; Yang, Y; Morik, K; Dorigo, M; Birattari, M; Stützle, T; Brazdil, P; Vilalta, R; Giraud-Carrier, C; Soares, C; Rissanen, J; Baxter, RA; Bruha, I; Webb, GI; Torgo, L; Banerjee, A; Shan, H; Ray, S; Tadepalli, P; Shoham, Y; Powers, R; Scott, S; Blockeel, H; De Raedt, L;
Publication
Encyclopedia of Machine Learning
Abstract
2010
Authors
Buhmann, MD; Melville, P; Sindhwani, V; Quadrianto, N; Buntine, WL; Torgo, L; Zhang, X; Stone, P; Struyf, J; Blockeel, H; Driessens, K; Miikkulainen, R; Wiewiora, E; Peters, J; Tedrake, R; Roy, N; Morimoto, J; Flach, PA; Fürnkranz, J;
Publication
Encyclopedia of Machine Learning
Abstract
2007
Authors
Torgo, L;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS
Abstract
This paper describes an approach to fraud detection targeted at applications where this task is followed by human analysis of the signaled frauds. This is a frequent setup in fraud detection applications (e.g. credit card misuse, telecom fraud). In real-world applications this human inspection is usually constrained by limited resources. In this context, standard fraud detection methods that simply tag each case as being (or not being) a possible fraud are not very useful if the number of tagged cases exceeds the available resources. A much more useful approach is to produce a ranking of the cases by likelihood of fraud, which can be used to optimize the available inspection resources by addressing the highest-ranked cases first. In this paper we propose a method that produces such a ranking. The method is based on the output of standard agglomerative hierarchical clustering algorithms, resulting in no significant additional computational cost. Our comparisons with a state-of-the-art method provide convincing evidence of the competitiveness of our proposal.
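The abstract does not state how the ranking is derived from the clustering output, so the following is only a sketch of the general idea under stated assumptions, not the paper's scoring rule. It runs a naive single-linkage agglomerative clustering on scalar values and assigns each case a hypothetical "merge imbalance" score: a case that sits in a small group when that group is absorbed by a much larger one is treated as more suspicious. The function name `rank_by_merge_imbalance` and the score formula are invented for illustration.

```python
def rank_by_merge_imbalance(values):
    """Rank cases by a hypothetical outlier score read off the merge history."""
    n = len(values)
    clusters = [[i] for i in range(n)]
    scores = [0.0] * n
    while len(clusters) > 1:
        # Single linkage: cluster distance is the distance of the closest pair.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(values[i] - values[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        small, large = sorted((clusters[a], clusters[b]), key=len)
        # Assumed score: relative size difference of the two merged groups,
        # credited to the members of the smaller group.
        imbalance = (len(large) - len(small)) / (len(large) + len(small))
        for i in small:
            scores[i] = max(scores[i], imbalance)
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    ranking = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranking, scores

# Toy example: the last value lies far from the rest.
values = [1.0, 1.5, 2.5, 4.5, 20.0]
ranking, scores = rank_by_merge_imbalance(values)
top_two = ranking[:2]  # cases an inspector with capacity for two would see first
```

Because the score is read from merges the clustering performs anyway, producing the ranking adds little work beyond the clustering itself, which is the property the abstract emphasises. The quadratic pair search here is for readability only.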
2001
Authors
De Almeida, P; Torgo, L;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
Most existing data mining approaches to time series prediction use as training data an embedding of the most recent values of the time series, following the traditional linear auto-regressive methodologies. However, in many time series prediction tasks, the alternative approach of using derived features constructed from the raw data with the help of domain theories can produce significant improvements in prediction accuracy. This is particularly noticeable when the available data includes multivariate information although the aim is still the prediction of one particular time series. This latter situation occurs frequently in financial time series prediction. This paper presents a method of feature construction based on domain knowledge that uses multivariate time series information. We show that this method improves the accuracy of next-day stock quote prediction when compared with the traditional embedding of historical values extracted from the original data. © Springer-Verlag Berlin Heidelberg 2001.
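The contrast drawn in this abstract, between training on a window of raw past values and training on features derived from several series, can be made concrete with a small sketch. The two derived features below (last close relative to its moving average, and the latest relative change in volume) are hypothetical examples chosen for illustration; the abstract does not list the features the authors construct. The function names and the toy series are likewise assumptions.

```python
def embed(series, size):
    """Lagged-value representation: `size` past values paired with the next value."""
    return [(series[t - size:t], series[t]) for t in range(size, len(series))]

def derived_features(close, volume, window):
    """Hypothetical domain-style features built from two series (window >= 2)."""
    rows = []
    for t in range(window, len(close)):
        moving_avg = sum(close[t - window:t]) / window
        # Last known close relative to its recent moving average.
        momentum = close[t - 1] / moving_avg - 1.0
        # Relative change of the second series (volume) on the last known day.
        volume_change = volume[t - 1] / volume[t - 2] - 1.0
        rows.append(([momentum, volume_change], close[t]))
    return rows

# Toy daily series; values are made up.
close = [10.0, 11.0, 12.0, 13.0, 14.0]
volume = [100.0, 110.0, 121.0, 100.0, 90.0]

lagged_rows = embed(close, 3)
feature_rows = derived_features(close, volume, 3)
```

Both functions return rows of the form `(inputs, target)` with the same targets, so the same regression learner can be trained on either representation and the resulting accuracies compared.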