Publications

Publications by João Gama

2012

Online predictive model for taxi services

Authors
Moreira Matias, L; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
In recent years, both companies and researchers have been exploring intelligent data analysis to increase the profitability of the taxi industry. Intelligent systems for online taxi dispatching and time-saving route finding have been built to this end. In this paper, we propose a novel methodology to produce online predictions of the spatial distribution of passenger demand throughout taxi stand networks. We do so by combining two well-known short-term time series forecasting models: time-varying Poisson models and ARIMA models. Our tests were performed using data gathered over a period of 6 months from 63 taxi stands within the city of Porto, Portugal. Our results demonstrate that this model is a major contribution to driver mobility intelligence: 78% of the 253,745 demanded taxi services were correctly forecast within a 30-minute horizon. © Springer-Verlag Berlin Heidelberg 2012.


2012

An Online Recommendation System for the Taxi Stand choice Problem

Authors
Moreira Matias, L; Fernandes, R; Gama, J; Ferreira, M; Mendes Moreira, J; Damas, L;

Publication
2012 IEEE VEHICULAR NETWORKING CONFERENCE (VNC)

Abstract
Nowadays, Informed Driving is crucial to the transportation industry. We present an online recommendation model that helps the driver decide which stand is best to head to at each moment, minimizing the waiting time. Our approach uses time series forecasting techniques to predict the spatiotemporal distribution in real time. We then combine this information with the live network status to produce our output. Our online test-beds were carried out using data obtained from a fleet of 441 vehicles running in the city of Porto, Portugal. We demonstrate that our approach can be a major contribution to this industry: 395,361 of the 506,873 services dispatched (about 78%) were correctly predicted. Our tests also showed that a fleet equipped with such a framework outperformed a fleet without it: its average waiting time to pick up a passenger was 5% lower than that of its competitor.


2005

An adaptive predictive model for student modeling

Authors
Castillo, G; Gama, J; Breda, AM;

Publication
Advances in Web-Based Education: Personalized Learning Environments

Abstract
This chapter presents an adaptive predictive model for a student modeling prediction task in the context of an adaptive educational hypermedia system (AEHS). The task, which consists of determining what kind of learning resources are most appropriate to a particular learning style, presents two critical issues. The first is the uncertainty of the information about the student's learning style acquired by psychometric instruments. The second is the change over time in the student's preferences (concept drift). To approach this task, we propose a probabilistic adaptive predictive model that includes a method to handle concept drift based on statistical quality control. We claim that our approach is able to adapt quickly to changes in the student's preferences and that it can be used successfully in similar user modeling prediction tasks where uncertainty and concept drift are present. © 2006, Idea Group Inc.


2012

Bus bunching detection by mining sequences of headway deviations

Authors
Moreira Matias, L; Ferreira, C; Gama, J; Mendes Moreira, J; De Sousa, JF;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
In highly populated urban zones, it is common to notice headway deviations (HD) between pairs of buses. When these events occur at a bus stop, they often cause bus bunching (BB) at the following bus stops. Several proposals have been suggested to mitigate this problem. In this paper, we propose to find BBS (Bunching Black Spots): sequences of bus stops where systematic HD events cause the formation of BB. We run a sequence mining algorithm, named PrefixSpan, to find interesting events available in time series. We prove that we can accurately model the usual pattern of BB trips as a frequent sequence mining problem. The subsequences proved to be a promising way of identifying the route's schedule points to adjust in order to mitigate such events. © 2012 Springer-Verlag.
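
The frequent sequence mining step named in this abstract can be illustrated with a minimal sketch of PrefixSpan restricted to sequences of single items. This is not the authors' implementation: the function name `prefixspan`, the stop identifiers, and the toy trips are all hypothetical, and each trip is assumed to be already reduced to the ordered list of stops where a headway deviation was observed.

```python
from typing import Dict, List, Tuple

def prefixspan(sequences: List[List[str]], min_support: int) -> List[Tuple[List[str], int]]:
    """Return every subsequence found in at least min_support sequences."""
    results: List[Tuple[List[str], int]] = []

    def mine(prefix: List[str], projected: List[Tuple[int, int]]) -> None:
        # projected holds (sequence index, start position) pairs: the suffix
        # of each sequence that remains after matching the current prefix.
        counts: Dict[str, int] = {}
        for sid, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[sid][start:]):
                counts[item] = counts.get(item, 0) + 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            new_prefix = prefix + [item]
            results.append((new_prefix, support))
            # Project on the first occurrence of item in each suffix.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            mine(new_prefix, new_projected)

    mine([], [(i, 0) for i in range(len(sequences))])
    return results

# Hypothetical trips: stops (in route order) where an HD event was observed.
trips = [["S1", "S2", "S3"], ["S1", "S3"], ["S2", "S3"]]
patterns = prefixspan(trips, min_support=2)
```

With these toy trips, the sequence S1 followed by S3 is reported with support 2, while S1 followed by S2 is dropped because it occurs in only one trip.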


2010

Monitoring Incremental Histogram Distribution for Change Detection in Data Streams

Authors
Sebastiao, R; Gama, J; Rodrigues, PP; Bernardes, J;

Publication
KNOWLEDGE DISCOVERY FROM SENSOR DATA

Abstract
Histograms are a common technique for density estimation and they have been widely used as a tool in exploratory data analysis. Learning histograms from static and stationary data is a well-known topic. Nevertheless, very few works discuss this problem when we have a continuous flow of data generated from dynamic environments. The scope of this paper is to detect changes in high-speed, time-changing data streams. To address this problem, we construct histograms able to process examples once, at the rate they arrive. The main goal of this work is to continuously maintain a histogram consistent with the current state of nature. We study strategies to detect changes in the distribution generating examples, and adapt the histogram to the most recent data by forgetting outdated data. We use the Partition Incremental Discretization algorithm, which was designed to learn histograms from high-speed data streams. We present a method to detect whenever a change in the distribution generating examples occurs. The basic idea consists of monitoring distributions from two different time windows: the reference window, reflecting the distribution observed in the past; and the current window, which receives the most recent data. The current window is cumulative and can have a fixed or an adaptive step depending on the distance between distributions. We compare both distributions using the Kullback-Leibler divergence, defining a threshold for the change detection decision based on the asymmetry of this measure. We evaluated our algorithm with controlled artificial data sets and compared the proposed approach with nonparametric tests. We also present results with real-world data sets from industrial and medical domains. Those results suggest that an adaptive window step exhibits a high probability of change detection and faster detection rates, with few false positive alarms.
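
The two-window comparison described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' method: it uses fixed equal-width bins instead of the Partition Incremental Discretization algorithm, and the decision rule (flag a change when the absolute difference between the two directed divergences exceeds a threshold) is an assumption, since the abstract does not give the exact rule. The function names, the smoothing constant `eps`, and the default `threshold` are hypothetical.

```python
import math
from typing import List, Sequence

def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[int]:
    """Count values into equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
    return counts

def to_probs(counts: Sequence[int], eps: float = 1e-6) -> List[float]:
    """Normalize counts, adding eps to each bin so no probability is zero."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]

def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def change_detected(reference: Sequence[float], current: Sequence[float],
                    lo: float, hi: float, bins: int = 10,
                    threshold: float = 0.1) -> bool:
    """Assumed rule: compare |KL(p||q) - KL(q||p)| against a threshold."""
    p = to_probs(histogram(reference, lo, hi, bins))
    q = to_probs(histogram(current, lo, hi, bins))
    asymmetry = abs(kl_divergence(p, q) - kl_divergence(q, p))
    return asymmetry > threshold

# Toy data: the reference window is spread over [0, 1); the current window
# is concentrated at a single value.
reference_window = [i / 100 for i in range(100)]
shifted_window = [0.05] * 100
```

Comparing `reference_window` with itself gives identical histograms, so both divergences are zero and no change is flagged; comparing it with `shifted_window` produces a large asymmetry and a detection.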


2011

Correcting streaming predictions of an electricity load forecast system using a prediction reliability estimate

Authors
Bosnic, Z; Rodrigues, PP; Kononenko, I; Gama, J;

Publication
Advances in Intelligent and Soft Computing

Abstract
Accurately predicting values for dynamic data streams is a challenging task in decision and expert systems, due to high data flow rates, limited storage and the requirement to quickly adapt a model to new data. We propose an approach for correcting predictions for data streams that is based on a reliability estimate for individual regression predictions. In our work, we implement the proposed technique and test it on a real-world problem: prediction of the electricity load for a selected European geographical region. To predict the electricity load values we implement two regression models: a neural network and the k-nearest neighbors algorithm. The results show that our method performs better than the reference method (i.e. the Kalman filter), significantly improving the accuracy of the original streaming predictions. © 2011 Springer-Verlag Berlin Heidelberg.
