Publications

Publications by CRACS

2015

Guest editors' introduction: special issue on Inductive Logic Programming and on Multi-Relational Learning

Authors
Zaverucha, G; Costa, VS;

Publication
MACHINE LEARNING

Abstract

2015

Predicting Adverse Drug Events from Electronic Medical Records

Authors
Davis, J; Costa, VS; Peissig, PL; Caldwell, M; Page, D;

Publication
Foundations of Biomedical Knowledge Representation - Methods and Applications

Abstract

2015

Automatic network configuration in virtualized environment using GNS3

Authors
Emiliano, R; Antunes, M;

Publication
10TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2015)

Abstract
Computer networking is a central topic in the computer science curricula offered by higher education institutions. Network virtualization and simulation tools, such as GNS3, allow students and practitioners to test real-world networking scenarios and to build complex network topologies by configuring virtualized equipment, such as routers and switches, through each device's virtual console. Configuring advanced network topics in GNS3 requires students to apply basic and very repetitive IP configuration tasks to every network device. As the network topology grows, so does the number of devices to be configured, which may lead to logical configuration errors. In this paper we propose an extension to the GNS3 network virtualizer that automatically generates a valid configuration for all the network equipment in a GNS3 scenario. Our implementation automatically produces an initial IP and routing configuration for all the Cisco virtual equipment by using the GNS3 specification files. We tested this extension against a set of network scenarios, which demonstrated its robustness and readiness and the speedup it brings to the overall configuration tasks. In a learning environment, this feature may save time for all networking practitioners, both beginners and advanced, who aim to configure and test network topologies, since it automatically produces a valid and operational configuration for all the equipment designed in a GNS3 environment.
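To make the idea of generating an initial IP configuration from a topology more concrete, here is a minimal sketch. It is not the authors' extension and does not read GNS3 specification files: the `links` structure, the `assign_link_addresses` helper, the router names and the `10.0.0.0/24` address pool are all assumptions chosen for illustration. It gives each point-to-point link its own /30 subnet and emits Cisco IOS-style interface lines.

```python
import ipaddress


def assign_link_addresses(links, base="10.0.0.0/24"):
    """Give each point-to-point link its own /30 and return IOS-style lines per router.

    `links` is a hypothetical in-memory topology: a list of
    ((device, interface), (device, interface)) pairs.
    """
    subnets = ipaddress.ip_network(base).subnets(new_prefix=30)
    configs = {}
    for (dev_a, if_a), (dev_b, if_b) in links:
        net = next(subnets)            # next unused /30 from the pool
        host_a, host_b = net.hosts()   # a /30 has exactly two usable hosts
        for dev, ifname, addr in ((dev_a, if_a, host_a), (dev_b, if_b, host_b)):
            configs.setdefault(dev, []).extend([
                f"interface {ifname}",
                f" ip address {addr} {net.netmask}",
                " no shutdown",
            ])
    return configs


# Hypothetical three-router chain: R1 -- R2 -- R3
links = [
    (("R1", "FastEthernet0/0"), ("R2", "FastEthernet0/0")),
    (("R2", "FastEthernet0/1"), ("R3", "FastEthernet0/0")),
]
configs = assign_link_addresses(links)
for router, lines in configs.items():
    print(f"! {router}")
    print("\n".join(lines))
```

Because the addresses are derived from the link list rather than typed by hand, two ends of a link cannot end up in different subnets, which is the kind of logical error the abstract mentions. Routing configuration is left out of this sketch.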

2015

The Impact of Longstanding Messages In Micro-Blogging Classification

Authors
Costa, J; Silva, C; Antunes, M; Ribeiro, B;

Publication
2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)

Abstract
Social networks have become part of the daily routine of millions of users. Twitter is, along with Facebook and Instagram, one of the most used, and can be seen as a relevant source of information, as users not only share daily status updates but also rapidly propagate news and events that occur worldwide. Considering the dynamic nature of social networks and their potential for spreading information, it is imperative to find learning strategies able to learn in these environments and cope with their dynamic nature. Time plays an important role because it quickly makes information outdated, so it is crucial to understand how informative past events can be to current learning models and for how long it is relevant to store previously seen information, in order to avoid the computational burden associated with the amount of data produced. In this paper we study the impact of longstanding messages in micro-blogging classification by using different training time-window sizes in the learning process. Since there are few studies dealing with drift in Twitter, and thus little is known about the types of drift that may occur, we simulate different types of drift in an artificial dataset to evaluate and validate our strategy. Results shed light on the relevance of previously seen examples according to different types of drift.
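The effect of the training time-window size can be illustrated with a toy example. The sketch below is not the classifier or dataset used in the paper: the token-voting model, the `train_on_window` and `predict` helpers, and the five-message stream with a sudden drift at time step 4 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_on_window(stream, now, window_size):
    """Count token/label co-occurrences using only messages from the last
    `window_size` time steps, i.e. with timestamp in (now - window_size, now]."""
    counts = defaultdict(Counter)
    for t, text, label in stream:
        if now - window_size < t <= now:
            for token in text.lower().split():
                counts[token][label] += 1
    return counts


def predict(counts, text):
    """Let every known token vote with its label counts; None if nothing is known."""
    votes = Counter()
    for token in text.lower().split():
        votes.update(counts.get(token, Counter()))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical stream of (time step, message, label) with a sudden drift:
# "apple" is labelled "tech" up to step 3 and "food" afterwards.
stream = [
    (1, "apple launch", "tech"),
    (2, "apple keynote", "tech"),
    (3, "apple event", "tech"),
    (4, "apple pie", "food"),
    (5, "apple recipe", "food"),
]

short_model = train_on_window(stream, now=5, window_size=2)
long_model = train_on_window(stream, now=5, window_size=5)
print(predict(short_model, "apple"))  # only post-drift messages vote
print(predict(long_model, "apple"))   # older messages still outvote the new ones
```

In this toy stream the short window follows the new concept while the long window is still dominated by older messages. With other kinds of drift, for example a concept that recurs, the longer window could be the better choice.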

2015

Health Twitter Big Data Management with Hadoop Framework

Authors
Cunha, J; Silva, C; Antunes, M;

Publication
CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES, CENTERIS/PROJMAN / HCIST 2015

Abstract
Social media advancements and the rapid increase in the volume and complexity of data generated by Internet services are becoming challenging not only technologically, but also in terms of application areas. Performance and availability of data processing are critical factors that need to be evaluated, since conventional data processing mechanisms may not provide adequate support. Apache Hadoop with Mahout is a framework to store and process data at large scale, including different tools to distribute processing. It has been considered an effective tool, currently used by both small and large businesses and corporations, like Google and Facebook, but also by public and private healthcare institutions. Given its recent emergence and the increasing complexity of the associated technological issues, a variety of holistic framework solutions have been put forward for each specific application. In this work, we propose a generic functional architecture with the Apache Hadoop framework and Mahout for handling, storing and analyzing big data that can be used in different scenarios. To demonstrate its value, we will show its features, advantages and applications on health Twitter data. We show that big health social data can generate important information, valuable both for common users and practitioners. Preliminary results of data analysis on Twitter health data using Apache Hadoop demonstrate the potential of the combination of these technologies. (C) 2015 The Authors. Published by Elsevier B.V.

2015

Active Manifold Learning with Twitter Big Data

Authors
Silva, C; Antunes, M; Costa, J; Ribeiro, B;

Publication
INNS CONFERENCE ON BIG DATA 2015 PROGRAM

Abstract
The data produced by Internet applications have increased substantially. Big data is a rapidly growing field that deals with this deluge of data by using storage techniques, dedicated infrastructures and development frameworks for the parallelization of defined tasks and their consequent reduction. These solutions, however, fall short in online and highly data-demanding scenarios, since users expect swift feedback. Reduction techniques are efficiently used in big data online applications to improve classification problems. Reduction in big data usually falls into one of two main approaches: (i) reduce the dimensionality by pruning or reformulating the feature set; (ii) reduce the sample size by choosing the most relevant examples. Both approaches have benefits, not only in the time needed to build a model, but eventually also performance-wise, usually by reducing overfitting and improving generalization capabilities. In this paper we investigate reduction techniques that tackle both the dimensionality and the size of big data. We propose a framework that combines a manifold learning approach to reduce dimensionality and an active learning SVM-based strategy to reduce the size of the labeled sample. Results on Twitter data show the potential of the proposed active manifold learning approach.
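The two kinds of reduction named in the abstract can be sketched with deliberately simple stand-ins. This is not the proposed framework: variance-based column pruning replaces manifold learning, and a fixed linear scorer with hand-picked weights replaces a trained SVM. The helper names `prune_features` and `most_uncertain`, the data and the weights are all assumptions made for illustration.

```python
from statistics import pvariance


def prune_features(rows, keep):
    """Reduction (i): keep the `keep` columns with the highest variance.

    A simple stand-in for dimensionality reduction, not manifold learning.
    """
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    kept = sorted(ranked[:keep])
    return [[r[j] for j in kept] for r in rows], kept


def most_uncertain(rows, weights, bias, k):
    """Reduction (ii): indices of the k rows closest to a linear decision
    boundary, i.e. the examples an active learner would ask to have labelled."""
    margins = [abs(sum(w * x for w, x in zip(weights, r)) + bias) for r in rows]
    return sorted(range(len(rows)), key=lambda i: margins[i])[:k]


# Hypothetical unlabelled pool: the first column is constant and carries no information.
pool = [
    [1.0, 5.0, 0.2],
    [1.0, -4.0, 0.1],
    [1.0, 0.5, -0.3],
    [1.0, -0.2, 0.4],
]
reduced, kept = prune_features(pool, keep=2)

# Assumed linear scorer over the two remaining features.
queries = most_uncertain(reduced, weights=[1.0, 0.5], bias=0.0, k=2)
print(kept, queries)
```

Only the rows returned by `most_uncertain` would be sent for labelling, so the labelled sample stays small while the feature space has already been narrowed. In an active learning loop the scorer would be retrained after each batch of new labels.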
