2018
Authors
Baghoussi, Y; Mendes Moreira, J; Emmerich, MTM;
Publication
2018 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS)
Abstract
Transportation systems are very complex systems due to the characteristics of their components, such as buses. Nowadays, buses are set up to follow a particular schedule that is very sensitive to changes that occur inside the system. These schedules must be updated frequently for many reasons, among them population growth inside cities and traffic congestion caused by unforeseen events. To address system variability, companies such as the Public Transport Company in the city of Porto (STCP) usually fix bus schedules with headways adapted to each type of bus line (i.e., high- or low-frequency bus lines). In this work, we adopt a robust optimization model from the literature to improve bus schedules using Automatic Vehicle Location data collected over the course of a year in the city of Porto. We apply the model to a case study of a high-frequency bus line. We present the model's shortcomings and propose updates.
2018
Authors
Garcia, KD; de Carvalho, ACPLF; Moreira, JM;
Publication
Intelligent Data Engineering and Automated Learning - IDEAL 2018 - 19th International Conference, Madrid, Spain, November 21-23, 2018, Proceedings, Part I
Abstract
Data streams are a challenging research topic in which data arrive continuously with a probability distribution that may change over time. Depending on the changes in the data distribution, different phenomena can occur, for example, concept drift. Concept drift occurs when the concepts associated with a dataset change as new data arrive. This paper proposes a new method based on k-Nearest Neighbors that implements a sliding window requiring fewer stored training instances than existing methods. To this end, a clustering approach is used to summarize data by placing labeled instances considered similar in the same cluster. In addition, instances close to the uncertainty border of existing classes are also stored, in a sliding window, to adapt the model to concept drift. The proposed method is experimentally compared with state-of-the-art classifiers from the data stream literature with respect to accuracy and processing time. According to the experimental results, the proposed method achieves better accuracy and lower time consumption when less information about the concepts is stored in a single sliding window. © 2018, Springer Nature Switzerland AG.
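The sliding-window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it omits the clustering summaries and the uncertainty-border selection, and simply keeps the most recent labeled instances for a plain k-NN vote. The class name `SlidingWindowKNN` and its parameters are hypothetical, chosen for illustration only.

```python
from collections import Counter, deque
import math


class SlidingWindowKNN:
    """Plain k-NN over the most recent `window_size` labeled instances.

    Illustrative only: the paper additionally summarizes data with clusters
    and stores instances near the class borders, which is not modeled here.
    """

    def __init__(self, k=3, window_size=50):
        self.k = k
        # A deque with maxlen silently discards the oldest instance when full.
        self.window = deque(maxlen=window_size)

    def learn(self, x, label):
        """Store one labeled instance in the window."""
        self.window.append((tuple(x), label))

    def predict(self, x):
        """Return the majority label among the k nearest stored instances."""
        if not self.window:
            return None
        point = tuple(x)
        nearest = sorted(self.window, key=lambda item: math.dist(item[0], point))
        votes = Counter(label for _, label in nearest[: self.k])
        return votes.most_common(1)[0][0]


# Simulated concept drift: the same region of the input space changes label.
clf = SlidingWindowKNN(k=3, window_size=4)
for value in (0.0, 0.1, 0.2, 0.3):
    clf.learn([value], "a")
before_drift = clf.predict([0.0])

for value in (0.0, 0.1, 0.2, 0.3):
    clf.learn([value], "b")  # newer instances push the old concept out
after_drift = clf.predict([0.0])
```

Because the window is bounded, the old concept is forgotten once enough new instances arrive; the window size trades memory and stability against how quickly the model adapts.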
2018
Authors
Barbosa, P; Garcia, KD; Moreira, JM; de Carvalho, ACPLF;
Publication
Intelligent Data Engineering and Automated Learning - IDEAL 2018 - 19th International Conference, Madrid, Spain, November 21-23, 2018, Proceedings, Part I
Abstract
Human Activity Recognition has been primarily investigated as a machine learning classification task, which forces it to cope with two main limitations. First, it must assume that the testing data has the same distribution as the training sample. However, activity recognition systems are inherently prone to distribution changes over time: for instance, a specific person can perform physical activities differently from others, and even sensors are prone to malfunction. Second, to model the pattern of activities carried out by each user, a significant amount of data is needed. This is impractical, especially in the current era of Big Data, with its effortless access to public repositories. To deal with these problems, this paper investigates the use of Transfer Learning, specifically Unsupervised Domain Adaptation, within human activity recognition systems. The experimental results reveal a useful transfer of knowledge and, more importantly, the convenience of transfer learning within human activity recognition. Apart from the experiments described, our work also contributes to the field of transfer learning in general through an exhaustive survey of transfer learning for human activity recognition based on wearables. © 2018, Springer Nature Switzerland AG.
2018
Authors
Baghoussi, Y; Moreira, JM;
Publication
Intelligent Data Engineering and Automated Learning - IDEAL 2018 - 19th International Conference, Madrid, Spain, November 21-23, 2018, Proceedings, Part I
Abstract
We present a method for improving prediction accuracy using multiple predictive algorithms. Several techniques have been developed to tackle this issue, such as bagging, boosting and stacking. In contrast to the first two, which usually generate homogeneous ensembles of classifiers, stacking techniques have demonstrated success using heterogeneous ensembles. In our method, we adopt the stacking mechanism. Several models are generated using different learning algorithms. Forward stepwise selection is implemented to link each instance to its appropriate learning model. Experiments with three datasets benchmarked with four learning schemes show that this novel method improves prediction accuracy and can serve as a bridge to transfer knowledge between tasks given the same feature space but different data distributions. © 2018, Springer Nature Switzerland AG.
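The stacking mechanism referred to in the abstract can be sketched generically as follows. This is not the authors' method (it has no forward stepwise selection and no per-instance model linking); it only shows the basic idea of training a meta-learner on the outputs of base models. The helper names `fit_stump` and `fit_stacking` are hypothetical, the base models are simple one-feature threshold rules rather than a heterogeneous ensemble, and, for brevity, the meta-learner is trained on in-sample base predictions, whereas a careful implementation would use held-out predictions.

```python
from collections import Counter, defaultdict


def fit_stump(rows, labels, feature):
    """Fit a one-feature rule: predict 1 when row[feature] > threshold."""
    best_hits, best_threshold = -1, None
    for threshold in sorted({row[feature] for row in rows}):
        hits = sum(int(row[feature] > threshold) == y for row, y in zip(rows, labels))
        if hits > best_hits:  # strict '>' keeps the first (lowest) best threshold
            best_hits, best_threshold = hits, threshold
    return lambda row: int(row[feature] > best_threshold)


def fit_stacking(rows, labels, base_models):
    """Train a lookup-table meta-learner on the base models' predictions."""
    table = defaultdict(Counter)
    for row, y in zip(rows, labels):
        key = tuple(model(row) for model in base_models)
        table[key][y] += 1
    fallback = Counter(labels).most_common(1)[0][0]

    def predict(row):
        key = tuple(model(row) for model in base_models)
        if key in table:
            return table[key].most_common(1)[0][0]
        return fallback  # unseen combination of base predictions

    return predict


# Toy data: the label is 1 only when both features are 1 (logical AND).
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

base_models = [fit_stump(rows, labels, feature) for feature in (0, 1)]
stacked = fit_stacking(rows, labels, base_models)
stacked_predictions = [stacked(row) for row in rows]
```

On this toy data each single-feature rule misclassifies one instance, while the meta-learner, which sees both base predictions at once, recovers the AND relationship.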
2018
Authors
Moreira, JM; de Carvalho, ACPdLF; Horváth, T;
Publication
A General Introduction to Data Analytics
Abstract
A guide to the principles and methods of data analysis that does not require knowledge of statistics or programming. A General Introduction to Data Analytics is an essential guide to understanding and using data analytics. The book is written in easy-to-understand terms and does not require familiarity with statistics or programming. The authors, noted experts in the field, explain the intuition behind the basic data analytics techniques. The text also contains exercises and illustrative examples. Designed to be easily accessible to non-experts, the book motivates the need for analyzing data. It explains how to visualize and summarize data, and how to find natural groups and frequent patterns in a dataset. The book also explores predictive tasks, whether classification or regression. Finally, the book discusses popular data analytics applications, such as mining the web, information retrieval, social network analysis, working with text, and recommender systems. The learning resources offer: a guide to the reasoning behind data mining techniques; a unique illustrative example that extends throughout all the chapters; and exercises at the end of each chapter, with larger projects at the end of each of the text's two main parts. Together with these learning resources, the book can be used as a guide for a 13-week course, with one chapter per course topic. The book is written in a format that makes the main data analytics concepts understandable to non-mathematicians, non-statisticians and non-computer scientists interested in an introduction to data science. A General Introduction to Data Analytics is a basic guide to data analytics written in highly accessible terms. © 2019 John Wiley & Sons, Inc.
2018
Authors
Bhanu, M; Priya, S; Dandapat, SK; Chandra, J; Moreira, JM;
Publication
Advanced Data Mining and Applications - 14th International Conference, ADMA 2018, Nanjing, China, November 16-18, 2018, Proceedings
Abstract
An efficient traffic network is an essential requirement for any smart city. Usually, city traffic forms a huge network with millions of locations and trips. Traffic flow prediction using such large data is a classical problem in intelligent transportation systems (ITS). Many existing models, such as ARIMA, SVR, and ANN, are used to extract important characteristics of the traffic network and to forecast mobility. However, these methods are unable to handle high data dimensionality. Tensor-based approaches have recently outperformed the existing methods thanks to their ability to decompose high-dimensional data into factor components. We present a modified Tucker decomposition method that predicts traffic mobility by approximating very large networks, so as to handle the dimensionality problem. Our experiments on two big-city traffic networks show that our method reduces the forecasting error, for up to 7 days, by around 80% compared to the existing state-of-the-art methods. Further, our method also handles the data dimensionality problem more efficiently than the existing methods. © 2018, Springer Nature Switzerland AG.
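For reference, the standard (unmodified) Tucker decomposition of a third-order tensor can be written as below. This is the textbook form, not the modified variant proposed in the paper; the symbols and the choice of three modes are assumptions made for illustration.

```latex
% Standard Tucker decomposition of a tensor X of size I x J x K
% into a core tensor G of size P x Q x R and factor matrices A, B, C.
\mathcal{X} \approx \mathcal{G} \times_1 A \times_2 B \times_3 C,
\qquad
x_{ijk} \approx \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R}
  g_{pqr} \, a_{ip} \, b_{jq} \, c_{kr},
\qquad
A \in \mathbb{R}^{I \times P},\;
B \in \mathbb{R}^{J \times Q},\;
C \in \mathbb{R}^{K \times R}.
```

When the core sizes P, Q, R are much smaller than I, J, K, the decomposition stores PQR + IP + JQ + KR values instead of IJK, which is the sense in which a tensor factorization can reduce dimensionality.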