2022
Authors
Pinto, H; Pernice, R; Silva, ME; Javorka, M; Faes, L; Rocha, AP;
Publication
PHYSIOLOGICAL MEASUREMENT
Abstract
Objective. In this work, an analytical framework for the multiscale analysis of multivariate Gaussian processes is presented, whereby Partial Information Decomposition measures are computed while accounting for the simultaneous presence of short-term dynamics and long-range correlations. Approach. We consider physiological time series mapping the activity of the cardiac, vascular and respiratory systems in the field of Network Physiology. In this context, the multiscale representation of transfer entropy within the network of interactions among systolic arterial pressure (S), respiration (R) and heart period (H), as well as its decomposition into unique, redundant and synergistic contributions, is obtained using a Vector AutoRegressive Fractionally Integrated (VARFI) framework for Gaussian processes. This novel approach makes it possible to quantify the directed information flow while accounting for the simultaneous presence of short-term dynamics and long-range correlations among the analyzed processes. Additionally, it provides analytical expressions for the computation of the information measures by exploiting the theory of state space models. The approach is first illustrated on simulated VARFI processes and then applied to H, S and R time series measured in healthy subjects monitored at rest and during mental and postural stress. Main Results. We demonstrate the ability of the VARFI modeling approach to account for the coexistence of short-term and long-range correlations in the study of multivariate processes. Physiologically, we show that postural stress induces larger redundant and synergistic effects from S and R to H at short time scales, while mental stress induces larger information transfer from S to H at longer time scales, thus evidencing the different nature of the two stressors. Significance. The proposed methodology makes it possible to extract useful information about how the information transfer depends on the balance between short-term and long-range correlations in coupled dynamical systems, a dependence that cannot be observed using standard methods that do not consider long-range correlations.
2022
Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;
Publication
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
Being able to capture the characteristics of a time series with a feature vector is a very important task with a multitude of applications, such as classification, clustering or forecasting. Usually, the features are obtained from linear and nonlinear time series measures, which may present several data-related drawbacks. In this work we introduce NetF as an alternative set of features, incorporating several representative topological measures of different complex network mappings of the time series. Our approach does not require data preprocessing and is applicable regardless of any data characteristics. Exploring our novel feature vector, we are able to connect mapped network features to properties inherent in diversified time series models, showing that NetF can be useful to characterize time series data. Furthermore, we demonstrate the applicability of our methodology in clustering synthetic and benchmark time series sets, comparing its performance with more conventional features and showcasing how NetF can achieve high-accuracy clusters. Our results are very promising, with network features from different mapping methods capturing different properties of the time series, adding a different and rich feature set to the literature.
2022
Authors
Sousa, R; Pereira, I; Silva, ME;
Publication
RECENT DEVELOPMENTS IN STATISTICS AND DATA SCIENCE, SPE2021
Abstract
Often, real-life problems require modelling several response variables together. This work analyses a multivariate linear regression model when the data are censored. Censoring distorts the correlation structure of the underlying variables and increases the bias of the usual estimators. Thus, we propose three methods to deal with multivariate data under left censoring, namely Expectation Maximization (EM), Data Augmentation (DA) and Gibbs Sampler with Data Augmentation (GDA). Results from a simulation study show that both DA and GDA estimates are consistent for low and moderate correlation. Under high correlation scenarios, EM estimates present a lower bias.
2022
Authors
Silva, ME; Campos, P;
Publication
Proceedings of the IASE 2021 Satellite Conference
Abstract
2022
Authors
Cunha, LFD; Ramalho, JC;
Publication
MACHINE LEARNING AND KNOWLEDGE EXTRACTION
Abstract
The amount of information preserved in Portuguese archives has increased over the years. These documents represent a national heritage of high importance, as they portray the country's history. Currently, most Portuguese archives have made their finding aids available to the public in digital format; however, these data do not have any annotation, so it is not always easy to analyze their content. In this work, Named Entity Recognition solutions were created that allow the identification and classification of several named entities from the archival finding aids. These named entities translate into crucial information about their context and, with high-confidence results, they can be used for several purposes, for example, the creation of smart browsing tools using entity linking and record linking techniques. In order to achieve high result scores, we annotated several corpora to train our own Machine Learning algorithms in this context domain. We also used different architectures, such as CNNs, LSTMs, and Maximum Entropy models. Finally, all the created datasets and ML models were made available to the public through a web platform developed for this purpose, NER@DI.
2022
Authors
Cunha, LFD; Ramalho, JC;
Publication
INFORMATION SYSTEMS AND TECHNOLOGIES, WORLDCIST 2022, VOL 2
Abstract
Currently, there is a vast amount of archival finding aids in Portuguese archives; however, these documents lack structure (they are not annotated), making them hard to process and work with. We therefore intend to extract and classify entities of interest, such as geographical locations, people's names, dates, etc. For this, we will use an architecture that has been revolutionizing several NLP tasks, Transformers, presenting several models in order to achieve high results. We also intend to understand the degree of improvement that this new mechanism offers in comparison with previous architectures. Can Transformer-based models replace LSTMs in NER? We intend to answer this question throughout this paper.