2023
Authors
Gonçalves, CA; Vieira, AS; Gonçalves, CT; Borrajo, L; Camacho, R; Iglesias, EL;
Publication
Hybrid Artificial Intelligent Systems - 18th International Conference, HAIS 2023, Salamanca, Spain, September 5-7, 2023, Proceedings
Abstract
The rapid growth of the scientific literature makes text classification essential, especially in the biomedical research domain, to help researchers focus on the latest findings quickly and efficiently. This study presents the potential benefits of using semantic text enrichment to enhance biomedical document classification. We show the importance of enriching the corpora with semantic information to improve full-text classification. The approach involves the semantic enrichment of a Medline corpus with a Semantic Repository (SemRep), which extracts semantic predications from biomedical text. The study also addresses the problem of handling high-dimensional data while maintaining the semantic structure of the corpus. Experimental results consistently support the conclusion that better results are achieved with full text than with only abstracts and titles. We also conclude that applying enrichment techniques to full texts significantly improves text classification, providing a significant contribution to biomedical text mining research. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
2023
Authors
Andrade, L; Camacho, R; Oliveira, J;
Publication
2023 13TH INTERNATIONAL CONFERENCE ON BIOSCIENCE, BIOCHEMISTRY AND BIOINFORMATICS, ICBBB 2023
Abstract
As the leading cause of death worldwide, cardiovascular diseases are responsible for about 17.9 million deaths per year. Research on new technologies and methodologies has allowed the acquisition of reliable data in several high-income countries; in various developing countries, however, poverty and a general scarcity of resources mean that this has not yet been achieved. In this work, cardiovascular data acquired through cardiac auscultation is used to detect cardiac murmurs with an innovative deep learning approach. The proposed screening algorithm was built using pre-trained models comprising Residual Neural Networks, namely ResNet50, and Visual Geometry Group (VGG) networks, namely VGG16 and VGG19. Furthermore, to the best of our knowledge, our proposal is the first to characterize heart murmurs based on their frequency components, i.e., the murmur pitch. Such analysis may be used to improve the system's ability to detect heart diseases. A novel decision-making function based on the murmur's pitch was also proposed. In our experiments, low-pitch murmurs were more difficult to detect, with final F1-scores near 0.40 for all three models, while high-pitch murmurs reached a higher F1-score of about 0.80. This might be because low-pitch murmurs share their frequency range with the normal, fundamental heart sounds, making it harder for the model to detect their presence, whereas the frequencies of high-pitch murmurs lie farther from those sounds.
2023
Authors
Kumar, R; Moreira, JM; Chandra, J;
Publication
APPLIED INTELLIGENCE
Abstract
Intelligent transportation systems (ITS) are gaining traction in large cities for better traffic management. Traffic forecasting is an important part of ITS, but a difficult one due to the intricate spatiotemporal relationships of traffic between different locations. Although remote sensors may have temporal and spatial similarities with the sensor being predicted, existing traffic forecasting research focuses primarily on modeling correlations between neighboring sensors while disregarding correlations between remote sensors. Furthermore, existing methods for capturing spatial dependencies, such as graph convolutional networks (GCNs), are unable to capture the dynamic spatial dependence in traffic systems. Current self-attention-based techniques for modeling dynamic correlations among all sensors overlook the hierarchical features of roads and have quadratic computational complexity. Our paper presents a new Dynamic Graph Convolution LSTM Network (DyGCN-LSTM) to address these limitations. The novelty of DyGCN-LSTM is that it can simultaneously model the underlying non-linear spatial and temporal correlations of remotely located sensors. Experiments conducted on four real-world traffic data sets show that the proposed approach outperforms state-of-the-art benchmarks by 25% in terms of RMSE.
2023
Authors
Sousa, AO; Veloso, DT; Goncalves, HM; Faria, JP; Mendes Moreira, J; Graca, R; Gomes, D; Castro, RN; Henriques, PC;
Publication
IEEE ACCESS
Abstract
Software estimation is a vital yet challenging project management activity. Various methods, from empirical to algorithmic, have been developed to fit different development contexts, from plan-driven to agile. Recently, machine learning techniques have shown potential in this realm but are still underexplored, especially for individual task estimation. We investigate the use of machine learning techniques in predicting task effort and duration in software projects to assess their applicability and effectiveness in production environments, identify the best-performing algorithms, and pinpoint key input variables (features) for predictions. We conducted experiments with datasets of various sizes and structures exported from three project management tools used by partner companies. For each dataset, we trained regression models for predicting the effort and duration of individual tasks using eight machine learning algorithms. The models were validated using k-fold cross-validation and evaluated with several metrics. Ensemble algorithms like Random Forest, Extra Trees Regressor, and XGBoost consistently outperformed non-ensemble ones across the three datasets. However, the estimation accuracy and feature importance varied significantly across datasets, with a Mean Magnitude of Relative Error (MMRE) ranging from 0.11 to 9.45 across the datasets and target variables. Nevertheless, even in the worst-performing dataset, effort estimates aggregated to the project level showed good accuracy, with MMRE = 0.23. Machine learning algorithms, especially ensemble ones, seem to be a viable option for estimating the effort and duration of individual tasks in software projects. However, the quality of the estimates and the relevant features may depend largely on the characteristics of the available datasets and underlying projects. Nevertheless, even when the accuracy of individual estimates is poor, the aggregated estimates at the project level may present a good accuracy due to error compensation.
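The error-compensation effect mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition of MMRE as the mean of |actual - predicted| / actual; the exact aggregation procedure used in the paper is not stated here, so the `aggregated_mre` helper and the task values below are hypothetical and serve only as an illustration.

```python
def mmre(actual, predicted):
    """Mean Magnitude of Relative Error over paired task values.

    Assumes the common definition: mean(|actual - predicted| / actual).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def aggregated_mre(actual, predicted):
    """Relative error of the summed estimate (hypothetical project-level view)."""
    total_actual = sum(actual)
    return abs(total_actual - sum(predicted)) / total_actual


# Hypothetical task efforts in hours (not data from the paper).
actual_hours = [2.0, 4.0, 8.0, 16.0]
predicted_hours = [4.0, 2.0, 12.0, 12.0]

task_level = mmre(actual_hours, predicted_hours)
project_level = aggregated_mre(actual_hours, predicted_hours)

print(f"task-level MMRE: {task_level:.4f}")
print(f"project-level relative error: {project_level:.4f}")
```

In this toy case the over- and under-estimates cancel when summed, so the project-level error is far smaller than the task-level MMRE, which is the kind of compensation the abstract describes.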
2023
Authors
Bhanu, M; Roy, S; Priya, S; Mendes Moreira, J; Chandra, J;
Publication
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
Abstract
Predicting taxi demand in large cities can help improve traffic management and commuter satisfaction in an intelligent transportation system. However, traffic demands across different locations have varying spatio-temporal correlations that are difficult to model. Although existing Deep Neural Network (DNN) models can capture the non-linearity in the spatial and temporal characteristics of the demand time series, capturing spatio-temporal characteristics in different real-world scenarios, such as varying historic and prediction time frames or spatio-temporal variations due to noise or missing data, still remains a big challenge for state-of-the-art models. In this paper, we introduce Encoder-ApproXimator (EnAppX), an encoder-decoder DNN-based model that uses Chebyshev function approximation in the decoding stage for taxi demand time-series prediction and can better estimate the time series in the presence of large spatio-temporal variations. Unlike any existing state-of-the-art model, the proposed model approximates the complete spatio-temporal characteristics in the frequency domain, which in turn enables it to make robust and improved predictions in different scenarios. Validation on two real-world taxi datasets from different cities shows a considerable improvement of around 23% in RMSE scores compared to the state-of-the-art baseline model. Unlike several existing state-of-the-art models, EnAppX also produces improved prediction accuracy across two regions for both to and fro demands.
2023
Authors
Neves, TM; Meireles, L; Moreira, JM;
Publication
CoRR
Abstract