Publications - INESC TEC

Publications

2015

Very fast decision rules for classification in data streams

Authors
Kosina, P; Gama, J;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Data stream mining is the process of extracting knowledge structures from continuous, rapid data records. Many decision tasks can be formulated as stream mining problems, and therefore many new algorithms for data streams are being proposed. Decision rules are one of the most interpretable and flexible models for predictive data mining. Nevertheless, few algorithms have been proposed in the literature to learn rule models for time-changing and high-speed flows of data. In this paper we present the very fast decision rules (VFDR) algorithm and discuss interesting extensions to the base version. All the proposed versions are one-pass and any-time algorithms. They work on-line and learn ordered or unordered rule sets. Algorithms designed to work with data streams should be able to detect changes and quickly adapt the decision model. To manage these situations, we also present the adaptive extension (AVFDR), which detects changes in the process generating data and adapts the decision model. Detecting local drifts takes advantage of the modularity of the rule sets. In AVFDR, each individual rule monitors the evolution of performance metrics to detect concept drift. AVFDR prunes rules whenever a drift is signaled. This explicit change detection mechanism provides useful information about the dynamics of the process generating data, enables faster adaptation to changes, and generates more compact rule sets. The experimental evaluation demonstrates that the algorithms achieve competitive results in comparison to alternative methods and that the adaptive methods are able to learn fast and compact rule sets from evolving streams.
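
The per-rule drift monitoring described in the abstract above can be illustrated with a minimal sketch. The abstract does not name the change detector, so this example uses a Page-Hinkley test as a stand-in; the class names, the `learn_one` helper, and the parameter values (`delta`, `threshold`) are assumptions made for illustration, not the authors' implementation.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in a stream of values."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm level (assumed value)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.min_cum = 0.0

    def update(self, x):
        """Feed one value; return True when a drift is signaled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


class Rule:
    """A rule with its own detector, so drift is monitored locally."""

    def __init__(self, condition, label):
        self.condition = condition  # callable: example -> bool
        self.label = label          # class predicted by the rule
        self.detector = PageHinkley()


def learn_one(rules, x, y):
    """Update the detectors of covering rules and prune drifting ones."""
    kept = []
    for rule in rules:
        if rule.condition(x):
            error = float(rule.label != y)  # 0/1 loss of this rule
            if rule.detector.update(error):
                continue  # drift signaled: drop the rule
        kept.append(rule)
    return kept
```

Because each rule carries its own detector, a change that affects only part of the input space removes only the rules covering that region, which is the modularity argument the abstract makes.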

2015

Discrimination and characterisation of extra virgin olive oils from three cultivars in different maturation stages using Fourier transform infrared spectroscopy in tandem with chemometrics

Authors
Gouvinhas, I; de Almeida, JMMM; Carvalho, T; Machado, N; Barros, AIRNA;

Publication
FOOD CHEMISTRY

Abstract
A methodology based on Fourier transform infrared (FTIR) spectroscopy, combined with multivariate analysis methods, was applied to monitor extra virgin olive oils (EVOOs) produced from three distinct cultivars at different maturation stages. For the first time, this kind of methodology is used for the simultaneous discrimination of maturation stage and cultivar. Principal component analysis and discriminant analysis were used to create a model for the discrimination of olive oil samples. Partial least squares regression was employed to design calibration models for the determination of chemical parameters. The performance of these models was assessed using the multiple coefficient of determination (R²), the root mean square error of calibration (RMSEC) and the root mean square error of cross validation (RMSECV). The prediction models for the chemical parameters yielded R² values ranging from 0.93 to 0.99, RMSEC from 1% to 4% and RMSECV from 2% to 5%. This kind of approach makes it possible to distinguish the different cultivars and to clearly discern the different maturation stages within each of them. Furthermore, the results demonstrated that FTIR spectroscopy in tandem with chemometric techniques allows the creation of viable and accurate models, suitable for correlating the data collected by FTIR spectroscopy with the chemical composition of the EVOOs obtained by standard methods.
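
As a reading aid for the performance figures in the abstract above, the sketch below computes R² and a root mean square error in plain Python. It assumes the common definitions (R² = 1 - SS_res / SS_tot, RMSE with a divisor of n); the paper may use a degrees-of-freedom correction for RMSEC, and the function names are illustrative.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error (assumed divisor: number of samples)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))
```

Under these assumptions, RMSEC is `rmse` evaluated on the calibration samples with predictions from the fitted model, while RMSECV is `rmse` evaluated on predictions made for samples held out during cross validation.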

2015

A Mobile Sensing Approach to Stress Detection and Memory Activation for Public Bus Drivers

Authors
Rodrigues, JGP; Kaiseler, M; Aguiar, A; Cunha, JPS; Barros, J;

Publication
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS

Abstract
The experience of daily stress among bus drivers has been shown to affect physical and psychological health, and can impact driving behavior and overall road safety. Although previous research consistently supports these findings, little attention has been dedicated to the design of a stress detection method able to synchronize physiological and psychological stress responses of public bus drivers in their day-to-day routine work. To overcome this limitation, we propose a mobile sensing approach to detect georeferenced stress responses and facilitate memory recall of the stressful situations. Data were collected among public bus drivers in the city of Porto, Portugal (145 h, 36 bus drivers, more than 2300 km), and the results supported the validation of our approach among this population and allowed us to determine specific stressor categories within certain areas of the city. Furthermore, data collected throughout the city allowed us to produce a citywide "stress map" that can be used for spotting areas in need of local authority intervention. These findings suggest that our system can be a promising tool to support applied occupational health interventions for public bus drivers and to guide authorities' interventions to improve these aspects in "future" cities.

2015

ROBOTICS: A TEACHING TOOL FOR STEM EDUCATION IN HIGH SCHOOL

Authors
Costa, V; Sousa, A; Cunha, T; Morais, C;

Publication
EDULEARN15: 7TH INTERNATIONAL CONFERENCE ON EDUCATION AND NEW LEARNING TECHNOLOGIES

Abstract
This article describes an experience in university and high school cooperation. The cooperation is expected to foster knowledge and deep learning in secondary schools by turning extra-curricular activities into an articulated subject. The robotics area is very useful and generates interest and enthusiasm, even more so when associated with competition. The experience used the Lego EV3 robot, and the students learned to program with a sound technical approach called state machine programming and the easy-to-use Lego Software programming tool. Students participate enthusiastically because of their involvement in the national robotics festival, which leads into international RoboCup Federation robotics competitions. The article proposes a set of sessions suitable for secondary school students that constitute the initial step towards a robotics curriculum, in order to simultaneously learn robotics and foster interconnections with the curricular courses in STEM areas, even extending into structured programming issues. The test involved two entries in the national robotics competition, one by a team of 3 girls and another by a team of 3 boys, although more students were involved during the year that the experience lasted. Statements from the stakeholders involved are reported, allowing a brief discussion of women in STEM areas and of the distance between young (wo)men and technology. Some hints, issues and lessons learned are presented. The article advocates this informal learning strategy and discusses its advantages and limitations.

2015

A Biased Random-key Genetic Algorithm for Placement of Virtual Machines across Geo-Separated Data Centers

Authors
Stefanello, F; Aggarwal, V; Buriol, LS; Goncalves, JF; Resende, MGC;

Publication
GECCO'15: PROCEEDINGS OF THE 2015 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE

Abstract
Cloud computing has recently emerged as a new technology for hosting and supplying services over the Internet. This technology has brought many benefits, such as eliminating the need to maintain expensive computing hardware and allowing business owners to start small and increase resources only when there is a rise in service demand. With an increasing demand for cloud computing, providing performance guarantees for applications that run over the cloud becomes important. Applications can be abstracted into a set of virtual machines with certain guarantees depicting the quality of service of the application. In this paper, we consider the placement of these virtual machines across multiple data centers, meeting the quality of service requirements while minimizing the bandwidth cost of the data centers. This problem is a generalization of the NP-hard Generalized Quadratic Assignment Problem (GQAP). We formalize the problem and propose a novel algorithm based on a biased random-key genetic algorithm (BRKGA) to find near-optimal solutions for the problem. The experimental results show that the proposed algorithm is effective in quickly finding feasible solutions and that it produces better results than a baseline approach provided by a commercial solver and a multi-start algorithm.
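
The general shape of a biased random-key genetic algorithm, as mentioned in the abstract above, can be sketched as follows. This is a generic BRKGA loop on a toy placement problem, not the algorithm from the paper: the decoder, the cost function with its capacity penalty, the toy data (`TRAFFIC`, `CAPACITY`) and all parameter values are assumptions made for illustration.

```python
import random


def decode(keys, n_centers):
    """Hypothetical decoder: map each key in [0, 1) to a data-center index."""
    return [min(int(k * n_centers), n_centers - 1) for k in keys]


def placement_cost(placement, traffic, capacity, penalty=100):
    """Toy cost: inter-center traffic plus a penalty per VM over capacity."""
    bandwidth = sum(t for (a, b), t in traffic.items()
                    if placement[a] != placement[b])
    load = [placement.count(c) for c in range(len(capacity))]
    excess = sum(max(0, l - cap) for l, cap in zip(load, capacity))
    return bandwidth + penalty * excess


def biased_crossover(elite, non_elite, rho, rng):
    """Child inherits each key from the elite parent with probability rho."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]


def brkga(fitness, n_keys, pop_size=30, elite_frac=0.2, mutant_frac=0.1,
          rho=0.7, generations=50, seed=0):
    """Minimize fitness over random-key vectors; return the best vector."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_keys)] for _ in range(pop_size)]
    n_elite = max(1, int(pop_size * elite_frac))
    n_mutants = max(1, int(pop_size * mutant_frac))
    for _ in range(generations):
        pop.sort(key=fitness)                      # lower cost first
        elite, rest = pop[:n_elite], pop[n_elite:]
        mutants = [[rng.random() for _ in range(n_keys)]
                   for _ in range(n_mutants)]      # fresh random vectors
        children = [biased_crossover(rng.choice(elite), rng.choice(rest),
                                     rho, rng)
                    for _ in range(pop_size - n_elite - n_mutants)]
        pop = elite + mutants + children           # elite survives unchanged
    return min(pop, key=fitness)


# Toy instance: 4 VMs, 2 data centers holding at most 2 VMs each.
TRAFFIC = {(0, 1): 10, (2, 3): 10, (0, 2): 1}
CAPACITY = [2, 2]
```

A call such as `brkga(lambda keys: placement_cost(decode(keys, 2), TRAFFIC, CAPACITY), n_keys=4)` returns a key vector whose decoded placement can then be inspected. The genetic operators work only on the key vectors, so all problem-specific knowledge lives in the decoder and the cost function.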

2015

Collaborative filtering with recency-based negative feedback

Authors
Vinagre, J; Jorge, AM; Gama, J;

Publication
30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II

Abstract
Many online communities and services continuously generate data that can be used by recommender systems. When explicit ratings are not available, rating prediction algorithms are not directly applicable. Instead, data consists of positive-only user-item interactions, and the task is therefore not to predict ratings, but rather to predict good items to recommend - item prediction. One particular challenge of positive-only data is how to interpret absent user-item interactions. These can either be seen as negative or as unknown preferences. In this paper, we propose a recency-based scheme to perform negative preference imputation in an incremental matrix factorization algorithm designed for streaming data. Our results show that this approach substantially improves the accuracy of the baseline method, outperforming both classic and state-of-the-art algorithms.
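
One way to picture recency-based negative imputation in an incremental matrix factorization, as outlined in the abstract above, is the sketch below. It is an illustration under stated assumptions rather than the authors' algorithm: the class name, the queue policy (impute the least recently seen items as negatives, then move them to the recent end), the SGD update and all hyperparameters are hypothetical.

```python
import random
from collections import OrderedDict


class RecencyNegativeMF:
    """Incremental matrix factorization with imputed negative feedback."""

    def __init__(self, k=8, lr=0.05, reg=0.01, n_neg=2, seed=0):
        self.k, self.lr, self.reg, self.n_neg = k, lr, reg, n_neg
        self.rng = random.Random(seed)
        self.P, self.Q = {}, {}      # user and item latent factors
        self.queue = OrderedDict()   # items ordered oldest -> most recent

    def _vec(self):
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]

    def score(self, u, i):
        return sum(a * b for a, b in zip(self.P[u], self.Q[i]))

    def _sgd(self, u, i, target):
        """One regularized SGD step towards the given target value."""
        p, q = self.P[u], self.Q[i]
        err = target - sum(a * b for a, b in zip(p, q))
        self.P[u] = [a + self.lr * (err * b - self.reg * a)
                     for a, b in zip(p, q)]
        self.Q[i] = [b + self.lr * (err * a - self.reg * b)
                     for a, b in zip(p, q)]

    def update(self, u, i):
        """Process one positive-only interaction (u, i) from the stream."""
        if u not in self.P:
            self.P[u] = self._vec()
        if i not in self.Q:
            self.Q[i] = self._vec()
        # Assumed policy: the n_neg least recently seen items become negatives.
        stale = [j for j in self.queue if j != i][: self.n_neg]
        for j in stale:
            self._sgd(u, j, 0.0)
            self.queue.move_to_end(j)   # avoid re-imputing the same items
        self._sgd(u, i, 1.0)            # the observed interaction is positive
        self.queue[i] = None
        self.queue.move_to_end(i)
```

Items that have not appeared in the stream for a long time are treated as likely negatives for the current user, which gives the positive-only model explicit contrast without scanning the full item set.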

