2024
Authors
Castilho, D; Souza, TTP; Kang, SM; Gama, J; de Carvalho, ACPLF;
Publication
KNOWLEDGE AND INFORMATION SYSTEMS
Abstract
We propose a model that forecasts market correlation structure from link- and node-based financial network features using machine learning. To this end, market structure is modeled as a dynamic asset network by quantifying time-dependent co-movement of asset price returns across company constituents of major global market indices. We provide empirical evidence using three different network filtering methods to estimate market structure, namely Dynamic Asset Graph, Dynamic Minimal Spanning Tree and Dynamic Threshold Networks. Experimental results show that the proposed model can forecast market structure with high predictive performance, with up to 40% improvement over a time-invariant correlation-based benchmark. Non-pair-wise correlation features proved important compared to traditionally used pair-wise correlation measures for all markets studied, particularly in the long-term forecasting of stock market structure. Evidence is provided for stock constituents of the DAX30, EUROSTOXX50, FTSE100, HANGSENG50, NASDAQ100 and NIFTY50 market indices. Findings can be useful to improve portfolio selection and risk management methods, which commonly rely on a backward-looking covariance matrix to estimate portfolio risk.
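As an illustration of one of the filtering methods named in this abstract, the following is a minimal sketch of a minimal spanning tree built from return correlations for a single time window. The abstract does not specify the distance metric or the algorithm; the distance d = sqrt(2(1 - rho)), Kruskal's algorithm, and the names `pearson` and `correlation_mst` are assumptions made for illustration only.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_mst(returns):
    """Build a minimal spanning tree over assets.

    returns: dict mapping asset name -> list of returns (equal lengths).
    Result: list of (asset_a, asset_b, distance) edges, in the order added.
    """
    # Assumed distance: d = sqrt(2 * (1 - rho)); it is 0 for rho = 1 and 2 for rho = -1.
    edges = []
    for a, b in combinations(sorted(returns), 2):
        rho = pearson(returns[a], returns[b])
        edges.append((math.sqrt(max(0.0, 2.0 * (1.0 - rho))), a, b))
    edges.sort()

    # Kruskal's algorithm with a simple union-find structure.
    parent = {a: a for a in returns}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = []
    for d, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two components, so it creates no cycle
            parent[ra] = rb
            tree.append((a, b, d))
    return tree
```

Applying such a function to successive rolling windows of returns would give one tree per window, which is one plausible reading of a "dynamic" network; the windowing scheme used in the paper is not stated in the abstract.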
2024
Authors
Cabezas, MP; Fonseca, NA; Muñoz-Mérida, A;
Publication
ENVIRONMENTAL MICROBIOME
Abstract
Motivation: Accurate determination and quantification of the taxonomic composition of microbial communities, especially at the species level, is one of the major issues in metagenomics. This is primarily due to the limitations of commonly used 16S rRNA reference databases, which either contain a lot of redundancy or a high percentage of sequences with missing taxonomic information. This may lead to erroneous identifications and, thus, to inaccurate conclusions regarding the ecological role and importance of those microorganisms in the ecosystem. Results: The current study presents MIMt, a new 16S rRNA database for the identification of archaea and bacteria, encompassing 47 001 sequences, all precisely identified at species level. In addition, a MIMt2.0 version was created with only curated sequences from RefSeq Targeted loci, comprising 32 086 sequences. MIMt aims to be updated twice a year to include all newly sequenced species. We evaluated MIMt against Greengenes, RDP, GTDB and SILVA in terms of sequence distribution and taxonomic assignment accuracy. Our results showed that MIMt contains less redundancy and, despite being 20 to 500 times smaller than existing databases, outperforms them in completeness and taxonomic accuracy, enabling more precise assignments at lower taxonomic ranks and thus significantly improving species-level identification.
2024
Authors
Guedes, JG; Ribeiro, R; Carqueijeiro, I; Guimaraes, AL; Bispo, C; Archer, J; Azevedo, H; Fonseca, NA; Sottomayor, M;
Publication
JOURNAL OF EXPERIMENTAL BOTANY
Abstract
Catharanthus roseus leaves produce a range of monoterpenoid indole alkaloids (MIAs) that include low levels of the anticancer drugs vinblastine and vincristine. The MIA pathway displays a complex architecture spanning different subcellular and cell type localizations, and is under complex regulation. As a result, the development of strategies to increase the levels of the anticancer MIAs has remained elusive. The pathway involves mesophyll specialized idioblasts where the late unsolved biosynthetic steps are thought to occur. Here, protoplasts of C. roseus leaf idioblasts were isolated by fluorescence-activated cell sorting, and their differential alkaloid and transcriptomic profiles were characterized. This involved the assembly of an improved C. roseus transcriptome from short- and long-read data, IDIO+. It was observed that C. roseus mesophyll idioblasts possess a distinctive transcriptomic profile that is associated with protection against biotic and abiotic stresses and indicates that this cell type is a carbon sink, in contrast to surrounding mesophyll cells. Moreover, it is shown that idioblasts are a hotspot of alkaloid accumulation, suggesting that their transcriptome may hold the key to the in-depth understanding of the MIA pathway and the success of strategies leading to higher levels of the anticancer drugs. Catharanthus mesophyll idioblasts are a hotspot of alkaloid accumulation. The idioblast transcriptome is associated with stress responses and provides a roadmap towards the increase of anticancer alkaloid levels.
2024
Authors
Pereira, RC; Abreu, PH; Rodrigues, PP;
Publication
JOURNAL OF COMPUTATIONAL SCIENCE
Abstract
Missing data is an issue that can negatively impact any task performed with the available data, and it is often found in real-world domains such as healthcare. One of the most common strategies to address this issue is to perform imputation, where the missing values are replaced by estimates. Several approaches based on statistics and machine learning techniques have been proposed for this purpose, including deep learning architectures such as generative adversarial networks and autoencoders. In this work, we propose a novel siamese neural network suitable for missing data imputation, which we call Siamese Autoencoder-based Approach for Imputation (SAEI). Besides having a deep autoencoder architecture, SAEI also has a custom loss function and triplet mining strategy that are tailored for the missing data issue. The proposed SAEI approach is compared to seven state-of-the-art imputation methods in an experimental setup that comprises 14 heterogeneous datasets of the healthcare domain injected with Missing Not At Random values at a rate between 10% and 60%. The results show that SAEI significantly outperforms all the remaining imputation methods for all experimented settings, achieving an average improvement of 35%. This work is an extension of the article Siamese Autoencoder-Based Approach for Missing Data Imputation [1] presented at the International Conference on Computational Science 2023. It includes new experiments focused on runtime, generalization capabilities, and the impact of the imputation on classification tasks, where the results show that SAEI is the imputation method that induces the best classification results, improving the F1 scores for 50% of the used datasets.
2024
Authors
Pereira, RC; Abreu, PH; Rodrigues, PP; Figueiredo, MAT;
Publication
EXPERT SYSTEMS WITH APPLICATIONS
Abstract
Experimental assessments of different missing data imputation methods often compute error rates between the original values and the estimated ones. This experimental setup relies on complete datasets that are injected with missing values. The injection process is straightforward for the Missing Completely At Random and Missing At Random mechanisms; however, the Missing Not At Random mechanism poses a major challenge, since the available artificial generation strategies are limited. Furthermore, the studies focused on this latter mechanism tend to disregard a comprehensive baseline of state-of-the-art imputation methods. In this work, both challenges are addressed: four new Missing Not At Random generation strategies are introduced and a benchmark study is conducted to compare six imputation methods in an experimental setup that covers 10 datasets and five missingness levels (10% to 80%). The overall findings are that, for most missing rates and datasets, the best imputation method to deal with Missing Not At Random values is Multiple Imputation by Chained Equations, whereas for higher missingness rates autoencoders show promising results.
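To make the injection setup described in this abstract concrete, the following is a minimal sketch of one generic Missing Not At Random scheme, in which the largest values of a feature are the ones removed, so that missingness depends on the unobserved value itself. It is not one of the four strategies introduced in the paper, which the abstract does not describe; the function name `inject_mnar_upper` and the choice of masking the upper tail are assumptions made for illustration only.

```python
def inject_mnar_upper(values, rate):
    """Return a copy of `values` with the largest `rate` fraction set to None.

    Because the probability of being missing depends on the value itself,
    this is a simple (hypothetical) instance of a Missing Not At Random mechanism.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    n_missing = round(rate * len(values))
    # Indices ordered from the largest value to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    to_mask = set(order[:n_missing])
    return [None if i in to_mask else v for i, v in enumerate(values)]
```

An imputation method can then be scored by comparing its estimates at the masked positions against the original values, which is the error-rate setup the abstract refers to.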
2024
Authors
Rodrigues, MG; Rodrigues, JD; Moreira, JA; Clemente, F; Dias, CC; Azevedo, LF; Rodrigues, PP; Areias, JC; Areias, ME;
Publication
CHILD CARE HEALTH AND DEVELOPMENT
Abstract
Purpose: To develop, implement and assess the results of psychoeducation to improve the QoL of parents with CHD newborns. Methods: Participants were parents of inpatient newborns with the diagnosis of non-syndromic CHD. We conducted a parallel RCT with an allocation ratio of 1:1 (intervention vs. control), considering the newborns, using mixed methods research. The intervention group received psychoeducation (Parental Psychoeducation in CHD [PPeCHD]) and the usual routines, and the control group received just the regular practices. The allocation concealment was assured. The PI was involved in enrolling participants, developing and implementing the intervention, data collection and data analysis. We followed the Consolidated Standards of Reporting Trials (CONSORT) guidelines. Results: Parents of eight newborns were allocated to the intervention group (n = 15 parents) and parents of eight newborns to the control group (n = 13 parents). An intention-to-treat (ITT) analysis was performed. In M2 (4 weeks), the intervention group presented better QoL levels in the physical, psychological, and environmental domains of the World Health Organization Quality of Life instrument (WHOQOL-Bref). In M3 (16 weeks), scores in the physical and psychological domains maintained a statistically significant difference between the groups. Conclusions: The PPeCHD, the psychoeducational intervention we developed, positively impacted parental QoL. These results support the initial hypothesis. This study is a fundamental milestone in this research field, adding new essential information to the literature.