Publications

Publications by LIAAD

2018

Using multi-relational data mining to discriminate blended therapy efficiency on patients based on log data

Authors
Rocha, A; Camacho, R; Ruwaard, J; Riper, H;

Publication
INTERNET INTERVENTIONS-THE APPLICATION OF INFORMATION TECHNOLOGY IN MENTAL AND BEHAVIOURAL HEALTH

Abstract
Introduction: Clinical trials of blended Internet-based treatments deliver a wealth of data from various sources, such as self-report questionnaires, diagnostic interviews, treatment platform log files and Ecological Momentary Assessments (EMA). Mining these complex data for clinically relevant patterns is a daunting task for which no definitive best method exists. In this paper, we explore the expressive power of the multi-relational Inductive Logic Programming (ILP) data mining approach, using combined trial data of the EU E-COMPARED depression trial.
Methods: We explored the capability of ILP to handle and combine (implicit) multiple relationships in the E-COMPARED data. This data set has the following features that favor ILP analysis: 1) time reasoning is involved; 2) there is a reasonable number of explicit useful relations to be analyzed; 3) ILP is capable of building comprehensible models that might be perceived as putative explanations by domain experts; 4) both numerical and statistical models may coexist within ILP models if necessary. In our analyses, we focused on scores of the PHQ-8 self-report questionnaire (which taps depressive symptom severity), and on EMA of mood and various other clinically relevant factors. Both measures were administered during treatment, which lasted between 9 and 16 weeks.
Results: E-COMPARED trial data revealed different individual improvement patterns: PHQ-8 scores suggested that some individuals improved quickly during the first weeks of the treatment, while others improved at a (much) slower pace, or not at all. Combining self-reported EMA, PHQ-8 scores and log data about the usage of the ICT4D platform in the context of blended care, we set out to unveil possible causes for these different trajectories.
Discussion: This work complements other studies into alternative data mining approaches to E-COMPARED trial data analysis, all of which aim to identify clinically meaningful predictors of system use and treatment outcome. Strengths and limitations of the ILP approach given this objective will be discussed.

2018

LearnSec: A Framework for Full Text Analysis

Authors
Goncalves, C; Iglesias, EL; Borrajo, L; Camacho, R; Vieira, AS; Goncalves, CT;

Publication
HYBRID ARTIFICIAL INTELLIGENT SYSTEMS (HAIS 2018)

Abstract
Large corpora of scientific research papers have been available for a long time. However, most of those corpora store only the title and the abstract of each paper. For some domains this information may not be enough to achieve high performance in text mining tasks. This problem has recently been reduced by the growing availability of full text scientific research papers. A full text version provides more detailed information but, on the other hand, a large amount of data needs to be processed. A priori, it is difficult to know whether the extra work of full text analysis has a significant impact on the performance of text mining tasks, or whether the effect depends on the scientific domain or the specific corpus under analysis. The goal of this paper is to present a framework for full text analysis, called LearnSec, which incorporates domain specific knowledge and information about the content of the document sections to improve the classification process with propositional and relational learning. To demonstrate the usefulness of the tool, we process a scientific corpus based on OHSUMED, generating an attribute/value dataset in Weka format and a First Order Logic dataset in Inductive Logic Programming (ILP) format. Results show a successful assessment of the framework.
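
As an illustration of what an attribute/value dataset in Weka's ARFF format might look like when section information is kept, the following is a minimal sketch. It is not LearnSec's code: the function name `to_arff`, the per-section term-count attributes, and the toy documents are all assumptions made for this example.

```python
def to_arff(relation, sections, vocabulary, documents):
    """Build an ARFF string with one numeric attribute per (section, term) pair.

    `documents` is a list of (text_by_section, label) pairs, where
    text_by_section maps a section name to its raw text. This layout is a
    hypothetical simplification, not the format produced by LearnSec.
    """
    attributes = [f"{s}_{t}" for s in sections for t in vocabulary]
    classes = sorted({label for _, label in documents})

    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]

    for text_by_section, label in documents:
        row = []
        for s in sections:
            # Missing sections contribute zero counts.
            tokens = text_by_section.get(s, "").lower().split()
            row += [str(tokens.count(t)) for t in vocabulary]
        lines.append(",".join(row + [label]))
    return "\n".join(lines)


# Toy usage with made-up documents.
docs = [
    ({"abstract": "gene protein", "methods": "gene gene"}, "relevant"),
    ({"abstract": "protein"}, "irrelevant"),
]
print(to_arff("toy", ["abstract", "methods"], ["gene", "protein"], docs))
```

Keeping a separate attribute per section lets a propositional learner weigh a term in the methods section differently from the same term in the abstract, which is one plausible way to use section information.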

2018

Autoencoders as Weight Initialization of Deep Classification Networks Applied to Papillary Thyroid Carcinoma

Authors
Ferreira, MF; Camacho, R; Teixeira, LF;

Publication
PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)

Abstract
Cancer is one of the most serious health problems of our time. One approach for automatically classifying tumor samples is to analyze derived molecular information. Previous work by Teixeira et al. compared different methods of Data Oversampling and Feature Reduction, as well as Deep (Stacked) Denoising Autoencoders followed by a shallow layer for classification. In this work, we compare the performance of 6 different types of Autoencoder (AE), combined with two different approaches when training the classification model: (a) fixing the weights, after pretraining an AE, and (b) allowing fine-tuning of the entire network. We also apply two different strategies for embedding the AE into the classification network: (1) by only importing the encoding layers, and (2) by importing the complete AE. Our best result was the combination of unsupervised feature learning through a single-layer Denoising AE, followed by its complete import into the classification network, and subsequent fine-tuning through supervised training, achieving an F1 score of 99.61% +/- 0.54. We conclude that a reconstruction of the input space, combined with a deeper classification network, outperforms previous work, without resorting to data augmentation techniques.

2018

Enhancing traffic model of big cities: Network Skeleton & Reciprocity

Authors
Bhanu, M; Chandra, J; Mendes Moreira, J;

Publication
2018 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS)

Abstract
Approaches to major challenges in mobility networks, such as traffic volume estimation, mobility pattern detection and feature extraction, usually strike only a weak balance among them. Most works focus on one of these areas and fail to improve all of them together. In this paper, we present a model that modifies conventional methods to meet all three challenges to an extent. Extracting a new temporal and directional feature, we introduce the Reciprocity metric. It proves to be more informative and efficient in capturing the mobility pattern of the network than existing metrics. We introduce the idea of a network skeleton, which is a reduced form of the mobility network that captures approximately 90% of its inherent characteristics. The network skeleton can extract a higher level of information from the network while enhancing the network's short-term predictability. Our work has the following steps: 1) extracting and building "link reciprocity", a more informative feature; 2) pattern detection in random mobility introduced by "convergence of mobility network"; and 3) estimation of the network skeleton formed using a link-based approach for short-term forecasting. Our network convergence method outperforms conventional approaches and detects active regions much faster than other approaches. Long Short-Term Memory (LSTM), a kind of Recurrent Neural Network (RNN) capable of learning long-term dependencies, is used to estimate network traffic. Indicating the link-based network skeleton helps to reduce short-term forecasting error by up to 6% and 3/4 times in different time slots. Our network skeleton approach can be used to address general problems of traffic-rule formulation by characterizing important routes (links), detecting regions of high importance in less time and predicting short-term traffic volume more accurately. Moreover, the network skeleton, with its reduced network size, can easily be used with existing methodologies, which is another essential contribution of our work.
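
The abstract does not define its Reciprocity metric, so the sketch below uses a common weighted definition instead: for each pair of nodes, the ratio of the smaller directed flow to the larger one. The function name `link_reciprocity` and the trip-list input are assumptions for illustration, and the paper's metric, which also has a temporal component, may differ.

```python
from collections import defaultdict


def link_reciprocity(trips):
    """Return a reciprocity score in [0, 1] for every connected node pair.

    `trips` is an iterable of (origin, destination) pairs. The score is
    min(flow a->b, flow b->a) / max(flow a->b, flow b->a): 1.0 means the
    traffic is perfectly balanced, 0.0 means it flows one way only.
    This is a generic definition, not necessarily the one used in the paper.
    """
    flow = defaultdict(int)
    for origin, dest in trips:
        flow[(origin, dest)] += 1

    scores = {}
    for (a, b), w_ab in flow.items():
        w_ba = flow.get((b, a), 0)  # .get avoids inserting keys while iterating
        pair = tuple(sorted((a, b)))  # undirected key for the node pair
        scores[pair] = min(w_ab, w_ba) / max(w_ab, w_ba)
    return scores


# Toy usage: four trips A->B, two trips B->A, one trip A->C.
trips = [("A", "B")] * 4 + [("B", "A")] * 2 + [("A", "C")]
print(link_reciprocity(trips))
```

Computing the same score separately for each time slot would give a temporal view of how balanced each link is over the day.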

2018

Updating a Robust Optimization Model for Improving Bus Schedules

Authors
Baghoussi, Y; Mendes Moreira, J; Emmerich, MTM;

Publication
2018 10TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS)

Abstract
Transportation systems are very complex due to the characteristics of their components, such as buses. Nowadays, buses are set up to follow a particular schedule that is very sensitive to changes that occur inside the system. These schedules often need to be updated for many reasons, among them population growth inside cities and traffic congestion caused by unforeseen events. To solve the problem of system variability, companies such as the Public Transport Company in the city of Porto (STCP) usually fix bus schedules with headways adapted to each type of bus line (i.e., high/low-frequency bus lines). In this work, we adopt a robust optimization model from the literature to improve the bus schedules using Automatic Vehicle Location data collected throughout the year in the city of Porto. We apply the model to a high-frequency bus line case study. We present the model's imperfections and propose new updates.

2018

A Cluster-Based Prototype Reduction for Online Classification

Authors
Garcia, KD; de Carvalho, ACPLF; Moreira, JM;

Publication
Intelligent Data Engineering and Automated Learning - IDEAL 2018 - 19th International Conference, Madrid, Spain, November 21-23, 2018, Proceedings, Part I

Abstract
Data streams are a challenging research topic in which data can arrive continuously with a probability distribution that may change over time. Depending on the changes in the data distribution, different phenomena can occur, for example, a concept drift. A concept drift occurs when the concepts associated with a dataset change as new data arrive. This paper proposes a new method based on k-Nearest Neighbors that implements a sliding window requiring fewer stored training instances than existing methods. To this end, a clustering approach is used to summarize data by placing labeled instances considered similar in the same cluster. In addition, instances close to the uncertainty border of existing classes are also stored, in a sliding window, to adapt the model to concept drift. The proposed method is experimentally compared with state-of-the-art classifiers from the data stream literature, regarding accuracy and processing time. According to the experimental results, the proposed method has better accuracy and lower time consumption when less information about the concepts is stored in a single sliding window. © 2018, Springer Nature Switzerland AG.
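
The general idea of combining cluster summaries with a sliding window of border instances can be sketched as follows. This is a deliberately simplified illustration under stated assumptions, not the authors' algorithm: it keeps a single running-mean centroid per class instead of a real clustering step, uses 1-nearest-neighbor prediction, and the class name `PrototypeWindowKNN`, the `margin` rule for what counts as a border instance, and all parameter values are hypothetical.

```python
import math
from collections import deque


class PrototypeWindowKNN:
    """Toy online classifier: class centroids plus a window of border points."""

    def __init__(self, window_size=20, margin=1.0):
        self.centroids = {}  # label -> (mean vector, instance count)
        self.window = deque(maxlen=window_size)  # oldest border points drop out
        self.margin = margin

    def learn(self, x, label):
        if label not in self.centroids:
            self.centroids[label] = (tuple(x), 1)
            return
        mean, count = self.centroids[label]
        others = [m for l, (m, _) in self.centroids.items() if l != label]
        if others:
            d_own = math.dist(x, mean)
            d_other = min(math.dist(x, m) for m in others)
            # Assumed border rule: almost as close to another class as to its own.
            if d_other - d_own < self.margin:
                self.window.append((tuple(x), label))
        # Summarize the instance by folding it into the running mean.
        new_mean = tuple(m + (v - m) / (count + 1) for m, v in zip(mean, x))
        self.centroids[label] = (new_mean, count + 1)

    def predict(self, x):
        candidates = [(mean, label) for label, (mean, _) in self.centroids.items()]
        candidates += list(self.window)
        if not candidates:
            raise ValueError("predict() called before any instance was learned")
        # 1-NN over the prototypes and the stored border instances.
        return min(candidates, key=lambda c: math.dist(x, c[0]))[1]
```

Only the centroids and at most `window_size` border instances are kept in memory, so storage stays bounded however long the stream runs, while the window lets recent border instances shift the decision boundary after a drift.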
