Publications - INESC TEC

Publications

Publications by Vítor Santos Costa

2017

Managing Diabetes: Pattern Discovery and Counselling supported by user data in a mobile platform

Authors
Machado, D; Paiva, T; Dutra, I; Costa, VS; Brandao, P;

Publication
2017 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC)

Abstract
Diabetes management is a complex and sensitive problem, as each diabetic is a unique case with particular needs. The optimal solution would be constant monitoring of the diabetic's values, with automatic action taken accordingly. We propose an approach that guides the user and analyses the gathered data to give individual advice. By using data mining algorithms and methods, we uncover hidden behaviour patterns that may lead to crisis situations. These patterns can then be transformed into logical rules that trigger in a particular context and advise the user. We believe that this solution is beneficial not only for the diabetic but also for the doctor following the case. The advice and rules are useful input that the medical expert can use when prescribing a particular treatment. During the data gathering phase, when the number of records is not enough to reach useful conclusions, a base set of logical rules, defined from medical protocols, directives and/or advice, is responsible for advising and guiding the user. The proposed system will start by accompanying the user with generic advice and, through constant learning, will advise the user more specifically over time. We discuss this approach, describing the architecture of the system, its base rules and its data mining component. The system is to be incorporated into a diabetes management application for Android that is currently under development.
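The idea of a base set of logical rules that trigger in a particular context can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Rule` class, the `glucose_mg_dl` field, the threshold values and the advice strings are hypothetical, are not taken from the paper, and are not medical guidance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of the user's current context (field names are assumptions).
Context = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Context], bool]  # returns True when the rule should trigger
    advice: str


# Illustrative base rules only; thresholds are placeholders, not clinical values
# from the paper.
BASE_RULES: List[Rule] = [
    Rule("low_glucose",
         lambda c: c["glucose_mg_dl"] < 70,
         "Low glucose reading: follow your hypoglycaemia plan."),
    Rule("high_glucose",
         lambda c: c["glucose_mg_dl"] > 180,
         "High glucose reading: review recent meals and medication."),
]


def advise(context: Context, rules: List[Rule] = BASE_RULES) -> List[str]:
    """Return the advice of every rule whose condition holds in this context."""
    return [rule.advice for rule in rules if rule.condition(context)]
```

In such a design, rules mined later from the user's own records could be appended to the same list, so generic and learned advice share one triggering mechanism.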


2017

Markov logic networks for adverse drug event extraction from text

Authors
Natarajan, S; Bangera, V; Khot, T; Picado, J; Wazalwar, A; Costa, VS; Page, D; Caldwell, M;

Publication
KNOWLEDGE AND INFORMATION SYSTEMS

Abstract
Adverse drug events (ADEs) are a major concern and point of emphasis for the medical profession, government, and society. A diverse set of techniques from epidemiology, statistics, and computer science are being proposed and studied for ADE discovery from observational health data (e.g., EHR and claims data), social network data (e.g., Google and Twitter posts), and other information sources. Methodologies are needed for evaluating, quantitatively measuring and comparing the ability of these various approaches to accurately discover ADEs. This work is motivated by the observation that text sources such as the Medline/Medinfo library provide a wealth of information on human health. Unfortunately, ADEs often result from unexpected interactions, and the connection between conditions and drugs is not explicit in these sources. Thus, in this work, we address the question of whether we can quantitatively estimate relationships between drugs and conditions from the medical literature. This paper proposes and studies a state-of-the-art NLP-based extraction of ADEs from text.


2016

Predicting Wildfires: Propositional and Relational Spatio-Temporal Pre-processing Approaches

Authors
Oliveira, M; Torgo, L; Costa, VS;

Publication
DISCOVERY SCIENCE (DS 2016)

Abstract
We present and evaluate two different methods for building spatio-temporal features: a propositional method and a method based on propositionalisation of relational clauses. Our motivating application, a regression problem, requires the prediction of the fraction of each Portuguese parish burnt yearly by wildfires, a problem with a strong socio-economic and environmental impact in the country. We evaluate and compare how these methods perform individually and in combination. We successfully use under-sampling to deal with the high skew in the data set. We find that combining the approaches significantly improves on the similar results obtained by each method individually.
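Under-sampling for a skewed data set can be sketched as follows. This is a generic random under-sampling routine, not the procedure used in the paper: the `under_sample` function, the `is_rare` predicate, the `keep_ratio` parameter and the `burnt_fraction` field are all assumptions made for illustration.

```python
import random


def under_sample(rows, is_rare, keep_ratio=1.0, seed=0):
    """Keep every rare row and a random subset of the common rows.

    keep_ratio is the number of common rows kept per rare row.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    rare = [r for r in rows if is_rare(r)]
    common = [r for r in rows if not is_rare(r)]
    n_keep = min(len(common), int(round(keep_ratio * len(rare))))
    return rare + rng.sample(common, n_keep)


# Hypothetical data: most parishes have no burnt area in a given year.
rows = [{"parish": i, "burnt_fraction": 0.0} for i in range(95)]
rows += [{"parish": 95 + i, "burnt_fraction": 0.3} for i in range(5)]

balanced = under_sample(rows, is_rare=lambda r: r["burnt_fraction"] > 0.1)
```

With the default `keep_ratio` of 1.0, the result holds the five rare rows plus five randomly chosen common rows.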


2016

Relational Learning with GPUs: Accelerating Rule Coverage

Authors
Alberto Martinez Angeles, CA; Wu, HC; Dutra, I; Costa, VS; Buenabad Chavez, J;

Publication
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING

Abstract
Relational learning algorithms mine complex databases for interesting patterns. Usually, the search space of patterns grows very quickly as the data size increases, making it impractical to solve important problems. In this work we present the design of a relational learning system that takes advantage of graphics processing units (GPUs) to perform the most time-consuming function of the learner, rule coverage. To evaluate performance, we use four applications: a widely used relational learning benchmark for predicting carcinogenesis in rodents, an application in chemo-informatics, an application in opinion mining, and an application in mining health record data. We compare results using a single CPU and multiple CPUs in a multicore host with results using the GPU version. Results show that the GPU version of the learner is up to eight times faster than the best CPU version.
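Rule coverage, the operation named in the abstract, asks which examples satisfy the body of a candidate rule given a set of facts. The sketch below is a naive sequential version written only to show what is being computed; it is not the GPU implementation described in the paper. The tuple encoding of facts, the convention that variables start with an uppercase letter, and all function and predicate names are assumptions.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def match(literal, fact, binding):
    """Try to extend `binding` so that `literal` equals `fact`; None on failure."""
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    new = dict(binding)
    for term, value in zip(literal[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to a different constant
        elif term != value:
            return None  # constant mismatch
    return new


def satisfiable(body, facts, binding):
    """Depth-first search for a binding that satisfies every body literal."""
    if not body:
        return True
    first, rest = body[0], body[1:]
    for fact in facts:
        extended = match(first, fact, binding)
        if extended is not None and satisfiable(rest, facts, extended):
            return True
    return False


def coverage(head_var, body, examples, facts):
    """Return the examples for which the rule body can be proven."""
    return [e for e in examples if satisfiable(body, facts, {head_var: e})]


# Hypothetical facts about two molecules, m1 and m2.
facts = [
    ("has_atom", "m1", "a1"), ("element", "a1", "cl"),
    ("has_atom", "m2", "a2"), ("element", "a2", "c"),
]
# Rule body: the molecule M has some atom A whose element is chlorine.
body = [("has_atom", "M", "A"), ("element", "A", "cl")]
covered = coverage("M", body, ["m1", "m2"], facts)
```

In this sketch each example is tested independently of the others, which suggests why the operation lends itself to parallel execution.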


2014

Towards using Probabilities and Logic to Model Regulatory Networks

Authors
Goncalves, A; Ong, I; Lewis, JA; Costa, VS;

Publication
2014 IEEE 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS)

Abstract
Transcriptional regulation plays an important role in every cellular decision. Unfortunately, understanding the dynamics that govern how a cell will respond to diverse environmental cues is difficult using intuition alone. We introduce logic-based regulation models based on state-of-the-art work on statistical relational learning, and validate our approach by using it to analyze time-series gene expression data of the Hog1 pathway. Our results show that plausible regulatory networks can be learned from time-series gene expression data using a probabilistic logical model. Hence, network hypotheses can be generated from existing gene expression data for use by experimental biologists.


2014

Couillard: Parallel programming via coarse-grained Data-flow Compilation

Authors
Marzulo, LAJ; Alves, TAO; Franca, FMG; Costa, VS;

Publication
PARALLEL COMPUTING

Abstract
Data-flow is a natural approach to parallelism. However, describing dependencies and control between fine-grained data-flow tasks can be complex and introduce unwanted overheads. TALM (TALM is an Architecture and Language for Multi-threading) introduces a user-defined coarse-grained parallel data-flow model, where programmers identify code blocks, called super-instructions, to be run in parallel and connect them in a data-flow graph. TALM has been implemented as a hybrid Von Neumann/data-flow execution system: the Trebuchet. We have observed that TALM's usefulness largely depends on how programmers specify and connect super-instructions. Thus, we present Couillard, a full compiler that creates, based on an annotated C program, a data-flow graph and C code corresponding to each super-instruction. We show that our toolchain allows one to benefit from data-flow execution and explore sophisticated parallel programming techniques with little effort. To evaluate our system we have executed a set of real applications on a large multi-core machine. Comparison with popular parallel programming methods shows competitive speedups, while providing an easier parallel programming approach. More specifically, for an application that follows the wavefront method, running with big inputs, Trebuchet achieved up to 4.7% speedup over the novel flow-graph approach of Intel(R) TBB and up to 44% over OpenMP.
