Publications

Publications by Alípio Jorge

2008

A methodology for exploring association models

Authors
Jorge, A; Pocas, J; Azevedo, PJ;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Visualization in data mining is typically related to data exploration. In this chapter we present a methodology for the post-processing and visualization of association rule models. One aim is to provide the user with a tool that enables the exploration of a large set of association rules. The method is inspired by the hypertext metaphor. The initial set of rules is dynamically divided into small comprehensible sets or pages, according to the interest of the user. From each set, the user can move to other sets by choosing an appropriate operator. The available operators transform sets of rules into sets of rules, allowing the user to focus on interesting regions of the rule space. Each set of rules can then also be viewed with different graphical representations. The tool is web-based and dynamically generates SVG pages to represent graphics. Association rules are given in PMML format. © 2008 Springer-Verlag Berlin Heidelberg.

2006

Semi-automatic creation and maintenance of web resources with webTopic

Authors
Escudeiro, NF; Jorge, AM;

Publication
Semantics, Web and Mining

Abstract
In this paper we propose a methodology for automatically retrieving document collections from the web on specific topics, and for organizing them and keeping them up to date over time, according to user-specific persistent information needs. The documents collected are organized according to user specifications and are classified partly by the user and partly automatically. A presentation layer enables the exploration of large sets of documents and, simultaneously, monitors and records user interaction with these document collections. The quality of the system is permanently monitored; the system periodically measures and stores the values of its quality parameters. Using this quality log it is possible to maintain the quality of the resources by triggering procedures aimed at correcting or preventing quality degradation.

2005

Monitoring the quality of meta-data in web portals using statistics, visualization and data mining

Authors
Soares, C; Jorge, AM; Domingues, MA;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS

Abstract
We propose a methodology to monitor the quality of the meta-data used to describe content in web portals. It is based on the analysis of the meta-data using statistics, visualization and data mining tools. The methodology enables the site's editor to detect and correct problems in the description of contents, thus improving the quality of the web portal and the satisfaction of its users. We also define a general architecture for a platform to support the proposed methodology. We have implemented this platform and tested it on a Portuguese portal for management executives. The results validate the methodology proposed.

2005

An experiment with association rules and classification: Post-bagging and conviction

Authors
Jorge, AM; Azevedo, PJ;

Publication
DISCOVERY SCIENCE, PROCEEDINGS

Abstract
In this paper we study a new technique we call post-bagging, which consists in resampling parts of a classification model rather than the data. We do this with a particular kind of model: large sets of classification association rules, and in combination with ordinary best rule and weighted voting approaches. We empirically evaluate the effects of the technique in terms of classification accuracy. We also discuss the predictive power of different metrics used for association rule mining, such as confidence, lift, conviction and χ². We conclude that, for the described experimental conditions, post-bagging improves classification results and that the best metric is conviction.
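
For readers unfamiliar with the rule metrics named in this abstract, the following minimal sketch computes confidence, lift and conviction for a rule A → C using their standard textbook definitions. The function name `rule_metrics` and the example support values are illustrative assumptions and are not taken from the paper; the sketch also assumes the antecedent support is greater than zero.

```python
import math


def rule_metrics(supp_a: float, supp_c: float, supp_ac: float) -> dict:
    """Standard interest metrics for a rule A -> C from relative supports.

    supp_a, supp_c: fraction of transactions containing A and C, respectively.
    supp_ac: fraction of transactions containing both A and C.
    Assumes supp_a > 0 and supp_c > 0 (hypothetical helper, not from the paper).
    """
    confidence = supp_ac / supp_a
    lift = confidence / supp_c
    # Conviction is unbounded when the rule never fails (confidence == 1).
    if confidence == 1.0:
        conviction = math.inf
    else:
        conviction = (1.0 - supp_c) / (1.0 - confidence)
    return {"confidence": confidence, "lift": lift, "conviction": conviction}


# Made-up supports, for illustration only.
metrics = rule_metrics(supp_a=0.4, supp_c=0.5, supp_ac=0.3)
print(metrics)
```

Unlike lift, conviction is asymmetric in A and C: it compares how often the rule would be wrong if A and C were independent with how often it is actually wrong.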

2002

Remote collaborative data mining through online knowledge sharing

Authors
Jorge, A; Moyle, S; Voss, A;

Publication
COLLABORATIVE BUSINESS ECOSYSTEMS AND VIRTUAL ENTERPRISES

Abstract
The basic principles of a methodology for remote collaborative data mining are proposed. Starting from CRISP-DM, a general data mining process designed to carry out data mining projects, it is described how the principles of knowledge sharing and ease of communication can be embedded in the data mining process. The aim is to allow the execution of data mining projects with the participation of multiple experts working from distant locations. All the participants in such a project can profit from the knowledge produced by others and share their knowledge online with the other participants. The produced knowledge (for example data transformations, working hypotheses, models, results of experiments) is also stored for future inspection and use, in pursuit of organizational learning. A prototypical implementation (RAMSYS) of the remote collaborative methodology is described with examples.

2005

Knowledge Discovery in Databases: PKDD 2005, 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal, October 3-7, 2005, Proceedings

Authors
Jorge, A; Torgo, L; Brazdil, P; Camacho, R; Gama, J;

Publication
PKDD

Abstract