Publications

Publications by LIAAD

2023

Applying Machine Learning to Estimate the Effort and Duration of Individual Tasks in Software Projects

Authors
Sousa, AO; Veloso, DT; Gonçalves, HM; Faria, JP; Mendes Moreira, J; Graça, R; Gomes, D; Castro, RN; Henriques, PC;

Publication
IEEE ACCESS

Abstract
Software estimation is a vital yet challenging project management activity. Various methods, from empirical to algorithmic, have been developed to fit different development contexts, from plan-driven to agile. Recently, machine learning techniques have shown potential in this realm but are still underexplored, especially for individual task estimation. We investigate the use of machine learning techniques in predicting task effort and duration in software projects to assess their applicability and effectiveness in production environments, identify the best-performing algorithms, and pinpoint key input variables (features) for predictions. We conducted experiments with datasets of various sizes and structures exported from three project management tools used by partner companies. For each dataset, we trained regression models for predicting the effort and duration of individual tasks using eight machine learning algorithms. The models were validated using k-fold cross-validation and evaluated with several metrics. Ensemble algorithms like Random Forest, Extra Trees Regressor, and XGBoost consistently outperformed non-ensemble ones across the three datasets. However, the estimation accuracy and feature importance varied significantly across datasets, with a Mean Magnitude of Relative Error (MMRE) ranging from 0.11 to 9.45 across the datasets and target variables. Nevertheless, even in the worst-performing dataset, effort estimates aggregated to the project level showed good accuracy, with MMRE = 0.23. Machine learning algorithms, especially ensemble ones, seem to be a viable option for estimating the effort and duration of individual tasks in software projects. However, the quality of the estimates and the relevant features may depend largely on the characteristics of the available datasets and underlying projects. 
Nevertheless, even when the accuracy of individual estimates is poor, the aggregated estimates at the project level may present a good accuracy due to error compensation.
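
The error-compensation effect described in this abstract can be illustrated with a minimal sketch. It assumes the usual definition of MMRE (the mean of |actual - predicted| / actual); the task efforts below are made-up values chosen for illustration, not data from the paper, and `mmre` is a hypothetical helper name.

```python
def mmre(actual, predicted):
    """Mean Magnitude of Relative Error: mean of |actual - predicted| / actual."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical task efforts (hours) for a single project; illustrative only.
actual = [10.0, 20.0, 40.0, 30.0]
predicted = [15.0, 14.0, 50.0, 21.0]

# Task-level error: every task is individually off by 25% to 50%.
task_level = mmre(actual, predicted)

# Project-level error: over- and under-estimates cancel when efforts are summed.
project_level = mmre([sum(actual)], [sum(predicted)])

print(f"task-level MMRE:    {task_level:.4f}")
print(f"project-level MMRE: {project_level:.4f}")
```

In this contrived case the over- and under-estimates cancel exactly, so the project-level error is zero; on real data the cancellation would only be partial.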

2023

An encoder framework for taxi-demand prediction using spatio-temporal function approximation

Authors
Bhanu, M; Roy, S; Priya, S; Mendes Moreira, J; Chandra, J;

Publication
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Abstract
Predicting taxi demand in large cities can support better traffic management and improve commuter satisfaction in an intelligent transportation system. However, traffic demands across different locations have varying spatio-temporal correlations that are difficult to model. Although existing Deep Neural Network (DNN) models can capture the non-linearity in the spatial and temporal characteristics of the demand time-series, capturing spatio-temporal characteristics in different real-world scenarios, such as varying historic and prediction time frames or spatio-temporal variations due to noise or missing data, remains a big challenge for state-of-the-art models. In this paper, we introduce Encoder-ApproXimator (EnAppX), an encoder-decoder DNN-based model that uses Chebyshev function approximation in the decoding stage for taxi demand time-series prediction and can better estimate the time-series in the presence of large spatio-temporal variations. Unlike any existing state-of-the-art model, the proposed model approximates the complete spatio-temporal characteristics in the frequency domain, which in turn enables the model to make robust and improved predictions in different scenarios. Validation over two real-world taxi datasets from different cities shows a considerable improvement of around 23% in RMSE scores compared to the state-of-the-art baseline model. Unlike several existing state-of-the-art models, EnAppX also produces improved prediction accuracy across two regions for both to and fro demands.

2023

ST-A(G)P: Spatio-Temporal aggregator predictor model for multi-step taxi-demand prediction in cities

Authors
Bhanu, M; Priya, S; Moreira, JM; Chandra, J;

Publication
APPLIED INTELLIGENCE

Abstract
Taxi demand prediction in a city is a much sought-after smart city research application for formulating better traffic strategies. It is in the interest of both commuters and taxi companies to have an accurate measure of taxi demand in different regions of a city and at varying time intervals. This reduces the cost of resources and effort and best meets customer satisfaction. Modern predictive models have shown the potency of Deep Neural Networks (DNN) in this domain over traditional, statistical, or tensor-based predictive models in terms of accuracy. Recent DNN models using leading technologies like Convolution Neural Networks (CNN), Graph Convolution Networks (GCN), ConvLSTM, etc. are not able to efficiently capture the existing spatio-temporal characteristics in taxi demand time-series. The feature aggregation techniques in these models lack channeling and uniqueness, causing a less distinctive, overlapping feature space, which results in compromised prediction performance with a high possibility of error propagation. The present work introduces Spatio-Temporal Aggregator Predictor (ST-A(G)P), a DNN model which aggregates spatio-temporal features into a (1) non-redundant and (2) highly distinctive feature space and in turn helps (3) reduce noise propagation for a high-performing multi-step predictive model. The proposed model integrates the effective feature engineering techniques of the machine learning approach with the non-linear capability of a DNN model. Consequently, the proposed model is able to use only the informative features responsible for the objective task, with reduced noise propagation. Unlike existing DNN models, ST-A(G)P is able to induce these qualities of feature aggregation without the use of a Multi-Task Learning (MTL) approach or any additional supervised attention that existing models need for their notable performance. 
A considerable performance gain of 25-37% on two real-world city taxi datasets by ST-A(G)P over the state-of-the-art models on standard benchmark metrics establishes the efficacy of the proposed model over the existing ones.

2023

A Survey of Advanced Computer Vision Techniques for Sports

Authors
Neves, TM; Meireles, L; Moreira, JM;

Publication
CoRR

Abstract

2023

A Hybrid BRKGA for Joint Scheduling Production, Transport, and Storage/Retrieval in Flexible Job Shops

Authors
Homayouni, SM; Fontes, DBMM; Fontes, FACC;

Publication
PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2023 COMPANION

Abstract
This paper addresses the joint scheduling of production operations, transport tasks, and storage/retrieval activities in flexible job shop systems where the production operations and transport tasks can be done by one of the several resources available. Jobs need to be retrieved from storage and delivered to a load/unload area; from there, they are transported to and between the machines on which their operations are processed. Once all operations of a job are processed, the job is taken back to the load/unload area and then returned to the storage cell. Therefore, the problem under study requires concurrently solving job routing, machine scheduling, transport allocation, vehicle scheduling, and shuttle scheduling. To this end, we propose a hybrid biased random-key genetic algorithm (BRKGA) in which the mutation operator resorts to six local search heuristics. The computational experiments conducted on a set of benchmark instances show the effectiveness of the proposed mutation operator.
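
As background for the BRKGA mentioned in this abstract, the two generic building blocks of a biased random-key genetic algorithm can be sketched as follows. This is a textbook-style illustration, not the encoding or operators used in the paper (which the abstract does not specify); the function names and the parameter `rho` are hypothetical.

```python
import random


def decode_random_keys(keys):
    """Turn a vector of random keys in [0, 1) into a permutation.

    Item i is placed earlier in the sequence the smaller its key is.
    """
    return sorted(range(len(keys)), key=lambda i: keys[i])


def biased_crossover(elite, non_elite, rho, rng):
    """Parameterized uniform crossover: each gene is inherited from the
    elite parent with probability rho, otherwise from the non-elite parent."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]


# Illustrative chromosome for four operations (values are made up).
keys = [0.62, 0.11, 0.87, 0.45]
order = decode_random_keys(keys)
print(order)  # [1, 3, 0, 2]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
child = biased_crossover(keys, [0.30, 0.90, 0.20, 0.70], rho=0.7, rng=rng)
print(decode_random_keys(child))
```

Because any key vector decodes to a valid permutation, crossover never produces an infeasible ordering, which is the usual motivation for the random-key representation.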

2023

Job Deterioration Effects in Job-shop Scheduling Problems

Authors
Campinho, DG; Fontes, DBMM; Ferreira, AFP; Fontes, FACC;

Publication
IEEM

Abstract
This article addresses the significant issue of job deterioration effects in job-shop scheduling problems and aims to raise awareness of its impact within the manufacturing industry. While previous studies have explored deteriorating effects in various production configurations, research on scheduling problems in complex settings, particularly job-shop, is very limited. Thus, we address and optimize the impact of job deterioration in a generic job-shop scheduling problem (JSP). The JSP with job deterioration is harder than the classical JSP, as the processing time of an operation is only known when the operation is started. Hence, we propose a biased random key genetic algorithm to find good quality solutions quickly. Through computational experiments, the effectiveness of the algorithm and its multi-population variant is demonstrated. Further, we investigate several deterioration functions, including linear, exponential, and sigmoid. Job deterioration increases operations' processing time, which leads to an increase in the total production time (makespan). Therefore, management should not ignore deterioration effects, as they may lead to a decrease in productivity, to an increase in production time, costs, and waste production, as well as to a deterioration in customer relations due to frequent disruptions and delays. Finally, the computational results reported clearly show that the proposed approach is capable of mitigating (almost nullifying) such impacts.
