2016
Authors
Sousa, R; Gama, J;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XV
Abstract
Most data stream systems that use online multi-target regression yield vast amounts of untargeted data, that is, examples with input information only. Targeting this data is usually time-consuming, expensive, or outright impossible. Semi-supervised algorithms have been proposed to use this untargeted data for model improvement. However, most of these algorithms are designed for batch-mode classification and require huge computational and memory resources. Therefore, this paper proposes a semi-supervised algorithm for online processing systems, based on the AMRules algorithm, that handles both targeted and untargeted data and improves the regression model. The proposed method was evaluated by comparing a scenario in which untargeted examples are not used in training with a scenario in which some untargeted examples are used. The evaluation results indicate that using the untargeted examples improved the target predictions by improving the model.
2018
Authors
Sousa, R; Gama, J;
Publication
33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING
Abstract
This paper describes the development of a co-training (semi-supervised) method that uses multiple learners for single-target regression on data streams. The experimental evaluation focused on the comparison between a realistic supervised scenario (all unlabelled examples are discarded) and scenarios where unlabelled examples are used to improve the regression model. The results provide fair evidence of error reduction when using the proposed co-training method. However, the error reduction is still relatively small.
2019
Authors
Sousa, R; Antunes, J; Coutinho, F; Silva, E; Santos, J; Ferreira, H;
Publication
INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY
Abstract
This paper proposes linear frequency cepstral coefficients as highly discriminative features for anomaly detection in ball bearings using vibration sensor data. These features are based on cepstral analysis and are capable of encoding the patterns of a spectral magnitude profile. Incipient damage to bearings can grow rapidly under normal use, resulting in vibration and harsh noise. If left undetected, this damage will worsen, leading to high maintenance costs or even injury. Multiple sources of interference in an industrial environment contaminate the signal, making it a challenge to correctly identify the bearings' condition. Many studies have attempted to overcome this issue at the signal level. However, the discriminative capacity of current vibration signal features is still vulnerable to interference, which motivates this work. In order to demonstrate the benefits of these features, we (1) show that they are computationally efficient and suitable for real-time incremental training; (2) conduct a discriminative analysis by evaluating their separability performance and comparing it with the state of the art; and (3) test the robustness of the proposed features under noise interference, a property that makes them well suited to the harsh operating conditions of industrial machinery. The data was obtained from a laboratory workbench setting that reproduces bearing fault scenarios. Results show that the proposed features are fast, competitive when compared to state-of-the-art features, and resilient to high levels of interference. Despite the higher performance obtained with the quadratic model, the proposed features remain highly discriminative when used with several other discriminant functions.
2019
Authors
Saadallah, A; Matias, LM; Sousa, R; Khiari, J; Jenelius, E; Gama, J;
Publication
35th IEEE International Conference on Data Engineering, ICDE 2019, Macao, China, April 8-11, 2019
Abstract
The dynamic behavior of urban mobility patterns makes matching taxi supply with demand one of the biggest challenges in this industry. Recently, the increasing availability of massive broadcast GPS data has encouraged the exploration of this issue from different perspectives. One possible solution is to build a data-driven real-time taxi-dispatching recommender system. However, existing systems are based on strong assumptions, such as stationary demand distributions and finite training sets, which make them inadequate for modeling the dynamic nature of the network. In this paper, we propose BRIGHT: a drift-aware supervised learning framework which aims to provide accurate predictions of short-term taxi demand quantities through a creative ensemble of time series analysis methods that handle distinct types of concept drift. A large experimental set-up, which includes three real-world transportation networks and a synthetic test-bed with artificially inserted concept drifts, was employed to illustrate the advantages of BRIGHT when compared to state-of-the-art methods for this problem. © 2019 IEEE.
2020
Authors
Saadallah, A; Moreira Matias, L; Sousa, R; Khiari, J; Jenelius, E; Gama, J;
Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Abstract
Massive data broadcast by GPS-equipped vehicles provide unprecedented opportunities. One of the main tasks in optimizing our transportation networks is to build data-driven real-time decision support systems. However, the dynamic environments in which these networks operate rule out the traditional assumptions required to put many off-the-shelf supervised learning algorithms into practice, such as finite training sets or stationary distributions. In this paper, we propose BRIGHT: a drift-aware supervised learning framework to predict demand quantities. BRIGHT aims to provide accurate predictions for short-term horizons through a creative ensemble of time series analysis methods that handles distinct types of concept drift. By selecting neighborhoods dynamically, BRIGHT reduces the likelihood of overfitting. By ensuring diversity among the base learners, BRIGHT achieves a high reduction of variance while keeping bias stable. Experiments were conducted using three large-scale heterogeneous real-world transportation networks in Porto (Portugal), Shanghai (China), and Stockholm (Sweden), as well as controlled experiments using synthetic data in which multiple distinct drifts were artificially induced. The obtained results illustrate the advantages of BRIGHT in relation to state-of-the-art methods for this task.