2014
Authors
Domingues, MA; Soares, C; Jorge, AM; Rezende, SO;
Publication
Journal of the Brazilian Computer Society
Abstract
Background: Due to the constant demand for new information and timely updates of services and content in order to satisfy the user’s needs, web site automation has emerged as a solution to automate several personalization and management activities of a web site. One goal of automation is to reduce the editor’s effort and, consequently, the owner’s costs. The other goal is for the site to adapt more promptly to the behavior of the user, improving the browsing experience and helping users achieve their own goals. Methods: A database to store rich web data is an essential component for web site automation. In this paper, we propose a data warehouse developed to be a repository of information that supports different web site automation and monitoring activities. We implemented our data warehouse and used it as a repository of information in three different case studies related to the areas of e-commerce, e-learning, and e-news. Results: The case studies showed that our data warehouse is appropriate for web site automation in different contexts. Conclusion: In all cases, the data warehouse was quite simple to use and had a good response time, mainly because of the simplicity of its structure. © 2014, Domingues et al.; licensee Springer.
2014
Authors
Carneiro, AR; Jorge, AM; Brito, PQ; Domingues, MA;
Publication
Springer Proceedings in Mathematics and Statistics
Abstract
2014
Authors
Pereira, P; Ribeiro, RP; Gama, J;
Publication
Discovery Science, DS 2014
Abstract
Machine or system failures have a high impact at both the technical and the economic level. Most modern equipment has logging systems that allow us to collect a diversity of data regarding its operation and health. Using data mining models for novelty detection enables us to explore those datasets, building classification systems that can detect and issue an alert when a failure starts evolving, so that it does not develop unnoticed into a breakdown. In the present case we use a failure detection system to predict train door breakdowns before they happen, using data from the doors' logging system. We study three methods for failure detection: outlier detection, novelty detection and a supervised SVM. Given the problem's features, namely the possibility of a passenger interrupting the movement of a door, the three predictors are prone to false alarms. The main contribution of this work is the use of a low-pass filter to process the output of the predictors, leading to a strong reduction in the false alarm rate.
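The idea of low-pass filtering a predictor's output can be illustrated with a minimal sketch. It assumes an exponential moving average as the filter and a fixed alarm threshold; the abstract does not specify the filter design, so the function names, the smoothing factor `alpha`, the threshold, and the sample data below are all hypothetical.

```python
def low_pass(raw_outputs, alpha=0.3):
    """Exponentially smooth a sequence of raw 0/1 predictor outputs.

    alpha is an assumed smoothing factor: smaller values filter more strongly.
    """
    smoothed = []
    state = 0.0
    for value in raw_outputs:
        # Blend the new observation with the running state.
        state = alpha * value + (1 - alpha) * state
        smoothed.append(state)
    return smoothed


def filtered_alarms(raw_outputs, alpha=0.3, threshold=0.5):
    """Raise an alarm only while the smoothed signal is at or above the threshold."""
    return [s >= threshold for s in low_pass(raw_outputs, alpha)]


# Invented example: one isolated detection (e.g. a passenger blocking a door)
# followed by a sustained run of detections.
raw = [0, 1, 0, 0, 0, 1, 1, 1, 1, 0]
flags = filtered_alarms(raw)
```

In this sketch the isolated detection never pushes the smoothed signal over the threshold, while the sustained run does, which is the kind of false-alarm suppression the abstract describes. The price is a short delay before a genuine alarm is raised.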
2014
Authors
Brito, P;
Publication
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Abstract
Symbolic Data Analysis (SDA) provides a framework for the representation and analysis of data that comprehends inherent variability. While in Data Mining and classical Statistics the data to be analyzed usually presents one single value for each variable, that is no longer the case when the entities under analysis are not single elements, but groups gathered on the basis of some given criteria. Then, for each variable, variability inherent to each group should be taken into account. Also, when analysing concepts, such as botanic species, disease descriptions, car models, and so on, data entail intrinsic variability, which should be explicitly considered. To this purpose, new variable types have been introduced, whose realizations are not single real values or categories, but sets, intervals, or, more generally, distributions over a given domain. SDA provides methods for the (multivariate) analysis of such data, where the variability expressed in the data representation is taken into account, using various approaches. © 2014 John Wiley & Sons, Ltd.
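As a small illustration of an interval-valued variable, the sketch below aggregates individual observations into one [min, max] interval per group. This is only one simple way to build such a description, and it is not taken from the article; the function name and the data are invented for the example.

```python
from collections import defaultdict


def to_intervals(records):
    """Aggregate (group, value) pairs into one (min, max) interval per group."""
    grouped = defaultdict(list)
    for group, value in records:
        grouped[group].append(value)
    # Each group is now described by an interval instead of a single value.
    return {group: (min(values), max(values)) for group, values in grouped.items()}


# Hypothetical measurements for two groups, "A" and "B".
records = [("A", 1.2), ("A", 3.4), ("B", 2.0), ("A", 2.1), ("B", 2.5)]
intervals = to_intervals(records)
```

Here group "A" is described by the interval (1.2, 3.4) rather than by a single summary such as its mean, so the within-group variability stays part of the data representation.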
2014
Authors
Giordano, G; Brito, P;
Publication
Analysis and Modeling of Complex Data in Behavioral and Social Sciences
Abstract
Starting from the main idea of Symbolic Data Analysis to extend Statistics and Data Mining methods from first-order to second-order objects, we focus on network data (as defined in the framework of Social Network Analysis) to define a graph structure and the underlying network in the context of complex data objects. A Network Symbolic description is defined according to the statistical characterization of the network topological properties. We use suitable network measures, which are represented by means of symbolic variables. Their study through multidimensional data analysis allows for the synthetic representation of a network as a point in a metric space. The proposed approach is discussed on the basis of a simulation study considering three classical network growth processes.
2014
Authors
Strecht, P; Mendes Moreira, J; Soares, C;
Publication
Advanced Data Mining and Applications, ADMA 2014
Abstract
Predicting the failure of students in university courses can provide useful information for course and programme managers and can help explain the dropout phenomenon. While it is important to have models at course level, their number makes it hard to extract knowledge that can be useful at the university level. Therefore, to support decision making at this level, it is important to generalize the knowledge contained in those models. We propose an approach to group and merge interpretable models in order to replace them with more general ones without compromising predictive performance. We evaluate our approach using data from U. Porto. The results obtained are promising, although they suggest alternative approaches to the problem.