Publications by LIAAD

2015

Are Rankings of Financial Analysts Useful to Investors?

Authors
Aiguzhinov, A; Serra, AP; Soares, C;

Publication
SSRN Electronic Journal

2015

A Bounded Neural Network for Open Set Recognition

Authors
Cardoso, DO; Franca, F; Gama, J;

Publication
2015 International Joint Conference on Neural Networks (IJCNN)

Abstract
Open set recognition is more than an interesting research subject: it is a component of various machine learning applications that is sometimes neglected. It is not unusual for learning systems to be built on closed-set assumptions, ignoring the error risk involved in a prediction. This risk is strictly related to the location in feature space where the prediction has to be made, compared to the location of the training data: the more distant the training observations are, the less is known and the higher the risk. Proper handling of this risk can be necessary in various situations where classification and its variants are employed. This paper presents an approach to open set recognition based on an elaborate distance-like computation provided by a weightless neural network model. The results obtained in the proposed test scenarios place the proposed method among the current best ones.
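
The idea that risk grows with distance from the training data can be illustrated with a minimal sketch. It is not the weightless neural network model from the paper: it swaps in a plain nearest-neighbour rule with a rejection threshold, and the names `predict_open_set` and `max_distance` are hypothetical.

```python
import math

def predict_open_set(train, query, max_distance):
    """Return the label of the nearest training point, or None ("unknown")
    when even the nearest point lies farther away than max_distance."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        dist = math.dist(features, query)  # Euclidean distance in feature space
        if dist < best_dist:
            best_label, best_dist = label, dist
    # Reject the prediction when the query is too far from all training data.
    return best_label if best_dist <= max_distance else None

# Toy training set: two points of class "a", one point of class "b".
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((5.0, 5.0), "b")]
near = predict_open_set(train, (0.2, 0.1), max_distance=1.0)     # close to class "a"
far = predict_open_set(train, (20.0, -20.0), max_distance=1.0)   # far from everything
```

A closed-set classifier would assign the second query to one of the known classes; here it is rejected because no training observation lies within the assumed threshold.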

2015

A framework for analysing dynamic communities in large-scale social networks

Authors
Cerqueira, V; Oliveira, M; Gama, J;

Publication
ICEIS 2015 - 17th International Conference on Enterprise Information Systems, Proceedings

Abstract
Telecommunications companies must process large-scale social networks that reveal the communication patterns among their customers. These networks are dynamic in nature as new customers appear, old customers leave, and the interaction among customers changes over time. One way to uncover the evolution patterns of such entities is by monitoring the evolution of the communities they belong to. Large-scale networks typically comprise thousands, or hundreds of thousands, of communities and not all of them are worth monitoring, or interesting from the business perspective. Several methods have been proposed for tracking the evolution of groups of entities in dynamic networks but these methods lack strategies to effectively extract knowledge and insight from the analysis. In this paper we tackle this problem by proposing an integrated business-oriented framework to track and interpret the evolution of communities in very large networks. The framework encompasses several steps such as network sampling, community detection, community selection, monitoring of dynamic communities and rule-based interpretation of community evolutionary profiles. The usefulness of the proposed framework is illustrated using a real-world large-scale social network from a major telecommunications company.

2015

EigenEvent: An algorithm for event detection from complex data streams in syndromic surveillance

Authors
Fanaee T, H; Gama, J;

Publication
Intelligent Data Analysis

Abstract
Syndromic surveillance systems continuously monitor multiple pre-diagnostic daily streams of indicators from different regions with the aim of early detection of disease outbreaks. The main objective of these systems is to detect outbreaks hours or days before clinical and laboratory confirmation. The data generated by these systems is usually multivariate and seasonal, with spatial and temporal dimensions. The algorithm What's Strange About Recent Events (WSARE) is the state-of-the-art method for such problems. It exhaustively searches for contrast sets in the multivariate data and signals an alarm when it finds statistically significant rules. This bottom-up approach has a much lower detection delay than the existing top-down approaches. However, WSARE is very sensitive to small-scale changes and consequently comes with a relatively high rate of false alarms. We propose a new approach called EigenEvent that is neither fully top-down nor bottom-up. In this method, instead of a top-down or bottom-up search, we track changes in the data correlation structure via eigenspace techniques. This new methodology enables us to detect both overall changes (via the eigenvalue) and dimension-level changes (via the eigenvectors). Experimental results on a hundred sets of benchmark data reveal that EigenEvent offers better overall performance than the state of the art, in particular in terms of the false alarm rate.
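
The notion of tracking a correlation structure through its leading eigenvalue and eigenvector can be sketched as follows. This is a simplified illustration under stated assumptions, not the EigenEvent algorithm itself: the window layout, the power-iteration solver, and the names `correlation_matrix`, `leading_eigenpair`, and `structure_change` are all hypothetical.

```python
import math

def correlation_matrix(rows):
    """Pearson correlations between the columns (streams) of `rows`.

    Assumes no stream is constant within the window (non-zero spread).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(d)]
    return [[sum(c[a] * c[b] for c in centered) / (norms[a] * norms[b])
             for b in range(d)] for a in range(d)]

def leading_eigenpair(matrix, iterations=200):
    """Power iteration for the largest eigenvalue and its eigenvector.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    vec = [float(i + 1) for i in range(d)]
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        value = math.sqrt(sum(x * x for x in product))
        vec = [x / value for x in product]
    return value, vec

def structure_change(baseline_rows, current_rows):
    """Compare two windows: overall shift (eigenvalue) and per-stream shift
    (eigenvector loadings). Assumes both eigenvectors share a sign convention."""
    base_value, base_vec = leading_eigenpair(correlation_matrix(baseline_rows))
    cur_value, cur_vec = leading_eigenpair(correlation_matrix(current_rows))
    eigenvalue_shift = cur_value - base_value
    loading_shift = [abs(c - b) for b, c in zip(base_vec, cur_vec)]
    return eigenvalue_shift, loading_shift

# Each row is one day; each column is one indicator stream (made-up counts).
baseline = [(11, 12, 11), (9, 8, 11), (11, 12, 9), (9, 8, 9)]   # stream 2 moves independently
current = [(1, 2, 1), (2, 4, 2), (3, 6, 3), (4, 8, 5)]          # all streams rise together
eigenvalue_shift, loading_shift = structure_change(baseline, current)
```

In this toy data the leading eigenvalue grows because the third stream starts moving with the other two, and the largest loading shift points at that third stream, which mirrors the overall versus dimension-level distinction made in the abstract.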

2015

Eigenspace method for spatiotemporal hotspot detection

Authors
Fanaee T, H; Gama, J;

Publication
Expert Systems

Abstract
Hotspot detection aims at identifying sub-groups in the observations that are unexpected with respect to some baseline information. For instance, in disease surveillance, the purpose is to detect sub-regions in spatiotemporal space where the count of reported diseases (e.g. cancer) is higher than expected with respect to the population. The state-of-the-art method for this kind of problem is space-time scan statistics, which exhaustively search the whole space through a sliding window looking for significant spatiotemporal clusters. Space-time scan statistics make some restrictive assumptions about the distribution of data, the shape of the hotspots and the quality of data, which can be unrealistic for some non-traditional data sources. A novel methodology called EigenSpot is proposed that, instead of searching the space exhaustively, tracks changes in the space-time occurrence structure. The new approach is not only much more computationally efficient but also makes no assumptions about the data distribution, hotspot shape or data quality. The principal idea is that, by jointly combining the abnormal elements in the principal spatial and temporal singular vectors, the location of hotspots in the spatiotemporal space can be approximated. The experimental evaluation, on both simulated and real data sets, reveals the effectiveness of the proposed method.
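
The principal idea of combining abnormal elements of the spatial and temporal singular vectors can be sketched as below. This is an illustration under assumptions rather than the published EigenSpot procedure: it uses a single region-by-time count matrix, a power iteration for the principal singular vectors, and a simple z-score rule, and the function names and `z_threshold` are hypothetical.

```python
import math

def _normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def leading_singular_vectors(matrix, iterations=100):
    """Approximate the principal left (spatial) and right (temporal)
    singular vectors of a regions-by-time matrix by power iteration."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [0.0] * n_rows
    v = _normalize([1.0] * n_cols)
    for _ in range(iterations):
        u = _normalize([sum(matrix[i][j] * v[j] for j in range(n_cols))
                        for i in range(n_rows)])
        v = _normalize([sum(matrix[i][j] * u[i] for i in range(n_rows))
                        for j in range(n_cols)])
    return u, v

def abnormal_indices(vec, z_threshold=1.0):
    """Indices whose component sits more than z_threshold standard
    deviations above the mean of the vector."""
    mean = sum(vec) / len(vec)
    std = math.sqrt(sum((x - mean) ** 2 for x in vec) / len(vec))
    if std == 0:
        return []
    return [i for i, x in enumerate(vec) if (x - mean) / std > z_threshold]

def approximate_hotspot(counts, z_threshold=1.0):
    """Return (abnormal regions, abnormal time steps); their combination
    approximates where the hotspot lies in space and time."""
    u, v = leading_singular_vectors(counts)
    return abnormal_indices(u, z_threshold), abnormal_indices(v, z_threshold)

# Toy data: 5 regions x 6 time steps, with raised counts in region 2 at times 3 and 4.
counts = [[1.0] * 6 for _ in range(5)]
counts[2][3] = 10.0
counts[2][4] = 10.0
regions, times = approximate_hotspot(counts)
```

No window is slid over the data: the hotspot is read off from two vectors, which is where the efficiency argument in the abstract comes from.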

2015

Exploring multi-relational temporal databases with a propositional sequence miner

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
Progress in Artificial Intelligence

Abstract
In this work, we introduce MuSer, a propositional framework that explores temporal information available in multi-relational databases. At the core of this system is an encoding technique that translates the temporal information into a propositional sequence of events. By using this technique, we are able to explore the temporal information using a propositional sequence miner. With this framework, we mine each class partition individually and we do not use classical aggregation strategies, like window aggregation. Moreover, in this system we combine feature selection and propositionalization techniques to cast a multi-relational classification problem into a propositional one. We empirically evaluate the MuSer framework using two real databases. The results show that mining each partition individually is a time- and memory-efficient strategy that generates a high number of highly discriminative patterns.
