Publications

Publications by LIAAD

2004

Nonlinear sea level trends from European tide gauge records

Authors
Barbosa, SM; Fernandes, MJ; Silva, ME;

Publication
ANNALES GEOPHYSICAE

Abstract
Mean sea level is a variable of considerable interest in meteorological and oceanographic studies, particularly long-term sea level variation and its relation to climate changes. This study concerns the analysis of monthly mean sea level data from tide gauge stations in the Northeast Atlantic with long and continuous records. Much research effort on mean sea level studies has been focused on identifying long-term linear trends, usually estimated through least-squares fitting of a deterministic function. Here, we estimate nonparametric and robust trends using lowess, a robust smoothing procedure based on locally weighted regression. This approach is more flexible than a linear trend to describe the deterministic part of the variation in tide gauge records, which has a complex structure. A common trend pattern of reduced sea levels around 1975 is found in all the analysed records and interpreted as the result of hydrological and atmospheric forcing associated with drought conditions at the tide gauge sites. This feature is overlooked by a linear regression model. Moreover, nonlinear deterministic behaviour in the time series, such as the one identified, introduces a bias in linear trends determined from short and noisy records.
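
The locally weighted regression idea behind lowess can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `frac` and `robust_iters` parameters, and their defaults are assumptions chosen for illustration. Each point is fitted with a tricube-weighted local line, and bisquare robustness weights then downweight points with large residuals.

```python
from statistics import median


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Bisquare weight used to downweight large residuals."""
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0


def lowess_sketch(x, y, frac=0.5, robust_iters=2):
    """Smooth y over x with local weighted lines (needs at least 2 points).

    frac (share of points per window) and robust_iters are assumed defaults.
    """
    n = len(x)
    k = max(2, min(n, round(frac * n)))   # points per local window
    robust_w = [1.0] * n                  # robustness weights start at 1
    fitted = list(y)
    for it in range(robust_iters + 1):
        for i in range(n):
            # Bandwidth: distance to the k-th nearest point (self included).
            dists = sorted(abs(xj - x[i]) for xj in x)
            h = dists[k - 1] or 1.0
            w = [tricube((xj - x[i]) / h) * rw for xj, rw in zip(x, robust_w)]
            sw = sum(w)
            if sw == 0.0:
                continue                  # keep the previous fit for this point
            # Weighted least-squares line evaluated at x[i].
            mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
            my = sum(wj * yj for wj, yj in zip(w, y)) / sw
            sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
            sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
            slope = sxy / sxx if sxx > 0.0 else 0.0
            fitted[i] = my + slope * (x[i] - mx)
        if it == robust_iters:
            break
        # Update robustness weights from the residuals of this pass.
        resid = [yj - fj for yj, fj in zip(y, fitted)]
        s = median(abs(r) for r in resid)
        if s == 0.0:
            break
        robust_w = [bisquare(r / (6.0 * s)) for r in resid]
    return fitted
```

Calling `lowess_sketch(months, sea_level, frac=0.3)` on a monthly series (both names hypothetical) returns one smoothed value per observation. A smaller `frac` follows local features more closely, while a larger one approaches a single straight-line fit.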

2003

The use of Ada, GNAT.Spitbol, and XML in the Sol-Eu-Net project

Authors
Alves, MA; Jorge, A; Heaney, M;

Publication
RELIABLE SOFTWARE TECHNOLOGIES - ADA-EUROPE 2003

Abstract
We report the use of Ada in the European research project Sol-Eu-Net. Ada was used in a web mining subproject, mainly for data preparation, and also for web system development. Open source Ada resources, e.g. GNAT.Spitbol, were used. Some of these resources were modified and others were created anew. XML and SQL were also used in association with Ada.

2003

Automatic selection of table areas in documents for information extraction

Authors
Silva, ACE; Jorge, A; Torgo, L;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
The information contained in companies' financial statements is valuable to several users. Much of the relevant information in such documents is contained in tables and is currently mainly extracted by hand. We propose a method that accomplishes a prior step of the task of automatically extracting information from tables in documents: selecting the lines that are likely to belong to tables. Our method has been developed by empirically analyzing a set of Portuguese companies' financial statements using statistical and data mining techniques. Empirical evaluation indicates that more than 99% of table lines are selected after discarding at least 50% of all lines. The method can cope with the complexity of styles used in assembling information on paper and adapt its performance accordingly, thus maximizing its results.

2003

Visualization and evaluation support of knowledge discovery through the predictive model markup language

Authors
Wettschereck, D; Jorge, A; Moyle, S;

Publication
KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS

Abstract
PMML (Predictive Model Markup Language), the emerging standard for the platform- and system-independent representation of data mining models, is currently supported by a number of knowledge discovery support engines. The primary purpose of the PMML standard is to separate model generation from model storage in order to enable users to view, post-process, and utilize data mining models independently of the tool that generated the model. In this paper two systems, called VizWiz and PEAR, are described. These software packages allow for the visualization and evaluation of data mining models that are specified in PMML. They can be viewed as decision support systems, since they enable non-expert users of data mining results to interactively inspect and evaluate these results.

2003

Predicting outliers

Authors
Torgo, L; Ribeiro, R;

Publication
KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2003, PROCEEDINGS

Abstract
This paper describes a method designed for data mining applications where the main goal is to predict extreme and rare values of a continuous target variable, as well as to understand under which conditions these values occur. Our objective is to induce models that are accurate at predicting these outliers but are also interpretable from the user perspective. We describe a new splitting criterion for regression trees that enables the induction of trees achieving these goals. We evaluate our proposal on several real world problems and contrast the obtained models with standard regression trees. The results of this evaluation show the clear advantage of our proposal in terms of the evaluation statistics that are relevant for these applications.

2003

Predicting harmful algae blooms

Authors
Ribeiro, R; Torgo, L;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
In several applications the main interest resides in predicting rare and extreme values. This is the case of the prediction of harmful algae blooms. Though rare, these blooms have a strong impact on river life forms and water quality, and they constitute a serious ecological problem. In this paper, we describe a data mining method whose main goal is to accurately predict this kind of rare extreme value. We propose a new splitting criterion for regression trees that enables the induction of trees achieving these goals. We carry out an analysis of the results obtained with our method on this application domain and compare them to those obtained with standard regression trees. We conclude that this new method achieves better results in terms of the evaluation statistics that are relevant for this kind of application.
