Publications

Publications by João Mendes Moreira

2012

Text categorization using an ensemble classifier based on a mean co-association matrix

Authors
Moreira Matias, L; Mendes Moreira, J; Gama, J; Brazdil, P;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Text Categorization (TC) has attracted the attention of the research community in the last decade. Algorithms like Support Vector Machines, Naïve Bayes or k Nearest Neighbors have been used with good performance, confirmed by several comparative studies. Recently, several ensemble classifiers were also introduced in TC. However, many of those can only provide a category for a given new sample. Instead, in this paper, we propose a methodology, MECAC, to build an ensemble of classifiers that has two advantages over other ensemble methods: 1) it can be run using parallel computing, saving processing time, and 2) it can extract important statistics from the obtained clusters. It uses the mean co-association matrix to solve binary TC problems. Our experiments revealed that our framework performed, on average, 2.04% better than the best individual classifier on the tested datasets. These results were statistically validated for a significance level of 0.05 using the Friedman Test. © 2012 Springer-Verlag.
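
The abstract does not spell out how MECAC builds its matrix, so the following is only a minimal sketch of the general co-association idea from the cluster-ensemble literature, not the authors' implementation. It assumes each base learner outputs one label per sample; the function name `mean_co_association` and the toy partitions are hypothetical. Entry (i, j) is the fraction of partitions in which samples i and j receive the same label.

```python
from typing import List, Sequence


def mean_co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Average, over all partitions, the indicator that samples i and j share a label."""
    if not partitions:
        raise ValueError("at least one partition is required")
    n = len(partitions[0])
    if any(len(p) != n for p in partitions):
        raise ValueError("all partitions must label the same samples")

    matrix = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0

    m = len(partitions)
    return [[value / m for value in row] for row in matrix]


# Hypothetical toy input: three partitions of four samples.
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
co = mean_co_association(partitions)
```

Because each partition contributes independently to the sum, the partitions could in principle be computed in parallel and accumulated afterwards, which is in line with the parallelism the abstract mentions.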

2012

Online predictive model for taxi services

Authors
Moreira Matias, L; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
In recent years, both companies and researchers have been exploring intelligent data analysis to increase the profitability of the taxi industry. Intelligent systems for online taxi dispatching and time-saving route finding have been built to do so. In this paper, we propose a novel methodology to produce online predictions regarding the spatial distribution of passenger demand throughout taxi stand networks. We have done so by assembling two well-known time series short-term forecast models: the time-varying Poisson models and ARIMA models. Our tests were performed using data gathered over a period of 6 months and collected from 63 taxi stands within the city of Porto, Portugal. Our results demonstrate that this model is a major contribution to driver mobility intelligence: 78% of the 253,745 demanded taxi services were correctly forecast within a 30-minute horizon. © Springer-Verlag Berlin Heidelberg 2012.
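
The abstract names the two component models but not how they are combined, so the sketch below rests on stated assumptions rather than on the paper. It estimates a time-varying Poisson rate as the mean historical count per time slot, then blends it with a second forecast using weights inversely proportional to each model's recent absolute error. The names `fit_slot_rates` and `combine`, the weighting rule, and all numbers are hypothetical; `other_forecast` is only a stand-in for an ARIMA output.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def fit_slot_rates(history: Iterable[Tuple[int, int]]) -> Dict[int, float]:
    """Estimate one Poisson rate per time slot as the mean historical count."""
    totals: Dict[int, int] = defaultdict(int)
    seen: Dict[int, int] = defaultdict(int)
    for slot, count in history:
        totals[slot] += count
        seen[slot] += 1
    return {slot: totals[slot] / seen[slot] for slot in totals}


def combine(forecasts: Sequence[float], errors: Sequence[float]) -> float:
    """Weight each forecast by the inverse of its recent absolute error (assumed rule)."""
    weights = [1.0 / (error + 1e-9) for error in errors]
    return sum(w * f for w, f in zip(weights, forecasts)) / sum(weights)


# Hypothetical (slot, pickups) observations for one taxi stand.
history = [(0, 4), (0, 6), (1, 10), (1, 14), (0, 5)]
rates = fit_slot_rates(history)

poisson_forecast = rates[1]   # expected demand for slot 1
other_forecast = 9.0          # stand-in for a second model's forecast
blended = combine([poisson_forecast, other_forecast], errors=[1.0, 2.0])
```

The model with the smaller recent error dominates the blend, which is one simple way to let an ensemble adapt online as new demand data arrives.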

2012

An Online Recommendation System for the Taxi Stand choice Problem

Authors
Moreira Matias, L; Fernandes, R; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
2012 IEEE VEHICULAR NETWORKING CONFERENCE (VNC)

Abstract
Nowadays, Informed Driving is crucial to the transportation industry. We present an online recommendation model to help the driver decide which stand is best to head to at each moment, minimizing the waiting time. Our approach uses time series forecasting techniques to predict the spatiotemporal distribution in real time. Then, we combine this information with the live network status to produce our output. Our online test-beds were carried out using data obtained from a fleet of 441 vehicles running in the city of Porto, Portugal. We demonstrate that our approach can be a major contribution to this industry: 395,361 of the 506,873 services dispatched were correctly predicted. Our tests also highlighted that a fleet equipped with such a framework surpassed a fleet that is not: it experienced an average waiting time to pick up a passenger 5% lower than its competitor.

2012

Bus bunching detection by mining sequences of headway deviations

Authors
Moreira Matias, L; Ferreira, C; Gama, J; Mendes Moreira, J; De Sousa, JF;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
In highly populated urban zones, it is common to notice headway deviations (HD) between pairs of buses. When these events occur at a bus stop, they often cause bus bunching (BB) at the following bus stops. Several proposals have been suggested to mitigate this problem. In this paper, we propose to find BBS (Bunching Black Spots): sequences of bus stops where systematic HD events cause the formation of BB. We run a sequence mining algorithm, named PrefixSpan, to find interesting events available in time series. We prove that we can accurately model the usual BB trip pattern as a frequent sequence mining problem. The subsequences proved to be a promising way to identify the route's schedule points to adjust in order to mitigate such events. © 2012 Springer-Verlag.

2012

Comparing state-of-the-art regression methods for long term travel time prediction

Authors
Mendes Moreira, J; Jorge, AM; de Sousa, JF; Soares, C;

Publication
INTELLIGENT DATA ANALYSIS

Abstract
Long-term travel time prediction (TTP) can be an important planning tool for both freight transport and public transport companies. In both cases it is expected that the use of long-term TTP can improve the quality of the planned services by reducing the error between the actual and the planned travel times. However, for reasons that we try to outline throughout this paper, long-term TTP is almost not mentioned in the scientific literature. In this paper we discuss the relevance of this study and compare three non-parametric state-of-the-art regression methods: Projection Pursuit Regression (PPR), Support Vector Machine (SVM) and Random Forests (RF). For each one of these methods we study the best combination of input parameters. We also study the impact of different methods for the pre-processing tasks (feature selection, example selection and domain values definition) on the accuracy of those algorithms. We use bus travel time data from a bus dispatch system. From an off-the-shelf point of view, our experiments show that RF is the most promising approach of the three we have tested. However, it is possible to obtain more accurate results using PPR but with extra pre-processing work, namely on example selection and domain values definition.

2012

Ensemble Approaches for Regression: A Survey

Authors
Mendes Moreira, J; Soares, C; Jorge, AM; De Sousa, JF;

Publication
ACM COMPUTING SURVEYS

Abstract
The goal of ensemble regression is to combine several models in order to improve the prediction accuracy in learning problems with a numerical target variable. The process of ensemble learning can be divided into three phases: the generation phase, the pruning phase, and the integration phase. We discuss different approaches to each of these phases that are able to deal with the regression problem, categorizing them in terms of their relevant characteristics and linking them to contributions from different fields. Furthermore, this work makes it possible to identify interesting areas for future research.
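
The three phases named in the abstract can be illustrated with a deliberately small sketch. It is not taken from the survey: the toy model family (lines through the origin), the pruning rule (keep the models with the lowest validation error) and the integration rule (plain averaging) are assumptions chosen for brevity, and the names `generate`, `prune` and `integrate` are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], float]
Dataset = Sequence[Tuple[float, float]]


def generate(slopes: Sequence[float]) -> List[Model]:
    """Generation phase: build a pool of candidate models (toy family y = slope * x)."""
    return [lambda x, s=s: s * x for s in slopes]


def mse(model: Model, data: Dataset) -> float:
    """Mean squared error of one model on a dataset."""
    return mean((model(x) - y) ** 2 for x, y in data)


def prune(models: Sequence[Model], validation: Dataset, keep: int) -> List[Model]:
    """Pruning phase: keep the `keep` models with the lowest validation error."""
    return sorted(models, key=lambda m: mse(m, validation))[:keep]


def integrate(models: Sequence[Model], x: float) -> float:
    """Integration phase: combine the surviving models by averaging their predictions."""
    return mean(m(x) for m in models)


# Hypothetical validation data drawn from y = 2x.
validation = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
pool = generate([0.5, 1.5, 2.0, 2.5, 4.0])
selected = prune(pool, validation, keep=3)
prediction = integrate(selected, 4.0)
```

Here pruning discards the two slopes farthest from the true value, and averaging the remaining three cancels their opposite errors; real ensembles replace each phase with the richer strategies the survey categorizes.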
