About
Areas of research:
- Knowledge discovery
- Supervised learning
- Multiple predictive models
- Applied knowledge discovery
- Intelligent transportation systems
- Planning and operations of public transports
Areas of research: - Knowledge discovery Supervised learning Multiple predictive models Applied knowledge discovery - Intelligent transportation systems Planning and operations of public transports
Areas of research:
- Knowledge discovery
- Intelligent transportation systems
2023
Authors
Mendes-Neves, T; Seca, D; Sousa, R; Ribeiro, C; Mendes-Moreira, J;
Publication
Computational Economics
Abstract
2023
Authors
Kumar, R; Moreira, JM; Chandra, J;
Publication
APPLIED INTELLIGENCE
Abstract
Intelligent transportation systems (ITS) are gaining attraction in large cities for better traffic management. Traffic forecasting is an important part of ITS, but a difficult one due to the intricate spatiotemporal relationships of traffic between different locations. Despite the fact that remote or far sensors may have temporal and spatial similarities with the predicting sensor, existing traffic forecasting research focuses primarily on modeling correlations between neighboring sensors while disregarding correlations between remote sensors. Furthermore, existing methods for capturing spatial dependencies, such as graph convolutional networks (GCNs), are unable to capture the dynamic spatial dependence in traffic systems. Self-attention-based techniques for modeling dynamic correlations of all sensors currently in use overlook the hierarchical features of roads and have quadratic computational complexity. Our paper presents a new Dynamic Graph Convolution LSTM Network (DyGCN-LSTM) to address the aforementioned limitations. The novelty of DyGCN-LSTM is that it can model the underlying non-linear spatial and temporal correlations of remotely located sensors at the same time. Experimental investigations conducted using four real-world traffic data sets show that the suggested approach is superior to state-of-the-art benchmarks by 25% in terms of RMSE.
2023
Authors
Sousa, AO; Veloso, DT; Goncalves, HM; Faria, JP; Mendes Moreira, J; Graca, R; Gomes, D; Castro, RN; Henriques, PC;
Publication
IEEE ACCESS
Abstract
Software estimation is a vital yet challenging project management activity. Various methods, from empirical to algorithmic, have been developed to fit different development contexts, from plan-driven to agile. Recently, machine learning techniques have shown potential in this realm but are still underexplored, especially for individual task estimation. We investigate the use of machine learning techniques in predicting task effort and duration in software projects to assess their applicability and effectiveness in production environments, identify the best-performing algorithms, and pinpoint key input variables (features) for predictions. We conducted experiments with datasets of various sizes and structures exported from three project management tools used by partner companies. For each dataset, we trained regression models for predicting the effort and duration of individual tasks using eight machine learning algorithms. The models were validated using k-fold cross-validation and evaluated with several metrics. Ensemble algorithms like Random Forest, Extra Trees Regressor, and XGBoost consistently outperformed non-ensemble ones across the three datasets. However, the estimation accuracy and feature importance varied significantly across datasets, with a Mean Magnitude of Relative Error (MMRE) ranging from 0.11 to 9.45 across the datasets and target variables. Nevertheless, even in the worst-performing dataset, effort estimates aggregated to the project level showed good accuracy, with MMRE = 0.23. Machine learning algorithms, especially ensemble ones, seem to be a viable option for estimating the effort and duration of individual tasks in software projects. However, the quality of the estimates and the relevant features may depend largely on the characteristics of the available datasets and underlying projects. Nevertheless, even when the accuracy of individual estimates is poor, the aggregated estimates at the project level may present a good accuracy due to error compensation.
2023
Authors
Pedroto, M; Coelho, T; Jorge, A; Mendes Moreira, J;
Publication
FRONTIERS IN NEUROLOGY
Abstract
IntroductionHereditary transthyretin amyloidosis (ATTRv amyloidosis) is a rare neurological hereditary disease clinically characterized as severe, progressive, and life-threatening while the age of onset represents the moment in time when the first symptoms are felt. In this study, we present and discuss our results on the study, development, and evaluation of an approach that allows for time-to-event prediction of the age of onset, while focusing on genealogical feature construction. Materials and methodsThis research was triggered by the need to answer the medical problem of when will an asymptomatic ATTRv patient show symptoms of the disease. To do so, we defined and studied the impact of 77 features (ranging from demographic and genealogical to familial disease history) we studied and compared a pool of prediction algorithms, namely, linear regression (LR), elastic net (EN), lasso (LA), ridge (RI), support vector machines (SV), decision tree (DT), random forest (RF), and XGboost (XG), both in a classification as well as a regression setting; we assembled a baseline (BL) which corresponds to the current medical knowledge of the disease; we studied the problem of predicting the age of onset of ATTRv patients; we assessed the viability of predicting age of onset on short term horizons, with a classification framing, on localized sets of patients (currently symptomatic and asymptomatic carriers, with and without genealogical information); and we compared the results with an out-of-bag evaluation set and assembled in a different time-frame than the original data in order to account for data leakage. ResultsCurrently, we observe that our approach outperforms the BL model, which follows a set of clinical heuristics and represents current medical practice. Overall, our results show the supremacy of SV and XG for both the prediction tasks although impacted by data characteristics, namely, the existence of missing values, complex data, and small-sized available inputs. DiscussionWith this study, we defined a predictive model approach capable to be well-understood by medical professionals, compared with the current practice, namely, the baseline approach (BL), and successfully showed the improvement achieved to the current medical knowledge.
2023
Authors
Bhanu, M; Roy, S; Priya, S; Mendes Moreira, J; Chandra, J;
Publication
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
Abstract
Predicting taxi demands in large cities can help in better traffic management as well as ensure better commuter satisfaction for an intelligent transportation system. However, the traffic demands across different locations have varying spatio-temporal correlations that are difficult to model. Despite the ability of the existing Deep Neural Network (DNN) models to capture the non-linearity in spatial and temporal characteristics of the demand time-series, capturing spatio-temporal characteristics in different real-world scenarios like varying historic and prediction time frame, spatio-temporal variations due to noise or missing data, etc. still remain a big challenge for the state-of-the-art models. In this paper, we introduce Encoder-ApproXimator (EnAppX), an encoder-decoder DNN-based model that uses Chebyshev function approximation in the decoding stage for taxi demand times-series prediction and can better estimate the time-series in the presence of large spatio-temporal variations. Opposed to any existing state-of-the-art model, the proposed model approximates complete spatiotemporal characteristics in the frequency domain which in turn enables the model to make a robust and improved prediction in different scenarios. Validation over two real-world taxi datasets from different cities shows a considerable improvement of around 23% in RMSE scores compared to the state-of-the-art baseline model. Unlike several existing state-of-the-art models, EnAppX also produces improved prediction accuracy across two regions for both to and fro demands.
Supervised Thesis
2022
Author
Maria José Gomes Pedroto
Institution
UP-FEUP
2022
Author
Luís Jorge Machado da Cunha Meireles
Institution
UP-FEUP
2022
Author
Paulo Jorge Silva Ferreira
Institution
UP-FEUP
2022
Author
Duarte Miguel de Novo Faria
Institution
UP-FEUP
2022
Author
Akilu Rilwan Muhammad
Institution
UP-FEUP
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.