Publications

Publications by LIAAD

2004

Exploring the Linear Relations in the Estimation of Matrices B and D in Subspace Identification Methods

Authors
Delgado, CJM; Santos, PLd;

Publication
ICINCO 2004, Proceedings of the First International Conference on Informatics in Control, Automation and Robotics, Setúbal, Portugal, August 25-28, 2004

Abstract

2004

Clickstreams, the basis to establish user navigation patterns on web sites

Authors
Alves, R; Belo, O; Cavalcanti, F; Ferreira, P;

Publication
DATA MINING V: DATA MINING, TEXT MINING AND THEIR BUSINESS APPLICATIONS

Abstract
Collecting and mining clickstream data from e-commerce sites has become increasingly important for marketing, advertising, and traffic analysis activities. Organizations are promoting many initiatives concerning the discovery of user navigation patterns, in order to implement better sites that are more functional and closer to customers' needs. The main idea is to provide better quality of service on their sites and, consequently, to increase profitability. However, clickstream processing is not a simple task. Sequences of clicks are very difficult to handle using conventional techniques, essentially due to their diversity and nature. They include many aspects that reveal the multidimensional perspective of web data. OLAP technology today provides the means and techniques to represent, store and analyse such multidimensional data. However, it does not offer discovery-driven analysis to support traversal pattern identification processes on web sites. Traversal pattern mining techniques can be applied in conjunction with OLAP as an integrated alternative for understanding those particular sequences of clicks. In this paper we present an integrated OLAP and mining approach specially conceived for exploring user navigation patterns based on clickstreams. We also describe the multidimensional structure provided for modelling click sequences and the OLAP operations and mining techniques that can be pushed over data cubes to bring up navigation patterns.

2004

Difference equations for the higher-order moments and cumulants of the INAR(1) model

Authors
Da Silva, ME; Oliveira, VL;

Publication
JOURNAL OF TIME SERIES ANALYSIS

Abstract
Recently, as a result of the growing interest in modelling stationary processes with discrete marginal distributions, several models for integer-valued time series have been proposed in the literature. One of these models is the INteger-AutoRegressive (INAR) model. Here we consider the higher-order moments and cumulants of the INAR(1) process and show that they satisfy a set of Yule-Walker type difference equations. We also obtain the spectral and bispectral density functions, thus characterizing the INAR(1) process in the frequency domain. We use a frequency domain approach, namely the Whittle criterion, to estimate the parameters of the model. The estimation theory and associated asymptotic theory of this estimation method are illustrated numerically.
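
As an illustration of the model named in this abstract, the following is a minimal sketch of an INAR(1) simulation, X_t = alpha o X_{t-1} + e_t, where "o" denotes binomial thinning. It is not the authors' code. The Poisson innovations, the parameter names `alpha` and `lam`, and the helper names are assumptions made for the example; the abstract does not specify the innovation distribution. Under these assumptions the stationary mean is lam / (1 - alpha) and the lag-k autocorrelation is alpha**k, which is the familiar second-order Yule-Walker relation that the paper extends to higher orders.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method, adequate for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_inar1(alpha, lam, n, seed=0):
    """Simulate n steps of a Poisson INAR(1) process (assumed innovation law)."""
    rng = random.Random(seed)
    # Start from the stationary marginal, Poisson(lam / (1 - alpha)).
    x = poisson(rng, lam / (1.0 - alpha))
    series = []
    for _ in range(n):
        # Binomial thinning: each of the x counts survives with probability alpha.
        survivors = sum(1 for _ in range(x) if rng.random() < alpha)
        x = survivors + poisson(rng, lam)
        series.append(x)
    return series


def sample_autocorr(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag)) / n
    return cov / var


if __name__ == "__main__":
    xs = simulate_inar1(alpha=0.5, lam=2.0, n=20000, seed=1)
    print("sample mean:", sum(xs) / len(xs))
    print("lag-1 autocorrelation:", sample_autocorr(xs, 1))
```

Comparing the printed sample mean and lag-1 autocorrelation with lam / (1 - alpha) and alpha gives a quick check of the second-order structure.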

2004

Nonlinear sea level trends from European tide gauge records

Authors
Barbosa, SM; Fernandes, MJ; Silva, ME;

Publication
ANNALES GEOPHYSICAE

Abstract
Mean sea level is a variable of considerable interest in meteorological and oceanographic studies, particularly long-term sea level variation and its relation to climate changes. This study concerns the analysis of monthly mean sea level data from tide gauge stations in the Northeast Atlantic with long and continuous records. Much research effort on mean sea level studies has been focused on identifying long-term linear trends, usually estimated through least-squares fitting of a deterministic function. Here, we estimate nonparametric and robust trends using lowess, a robust smoothing procedure based on locally weighted regression. This approach is more flexible than a linear trend to describe the deterministic part of the variation in tide gauge records, which has a complex structure. A common trend pattern of reduced sea levels around 1975 is found in all the analysed records and interpreted as the result of hydrological and atmospheric forcing associated with drought conditions at the tide gauge sites. This feature is overlooked by a linear regression model. Moreover, nonlinear deterministic behaviour in the time series, such as the one identified, introduces a bias in linear trends determined from short and noisy records.
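
To make the smoothing procedure named in this abstract concrete, here is a minimal pure-Python sketch of lowess: a locally weighted linear fit with tricube weights, followed by robustness passes that downweight large residuals with bisquare weights. It is not the authors' implementation and uses no tide gauge data. The function name `lowess`, the parameters `frac` and `robust_iters`, and the synthetic series are assumptions for the example, and the sketch assumes distinct x values.

```python
import math


def tricube(u):
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0


def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


def lowess(x, y, frac=0.5, robust_iters=3):
    """Return smoothed values at each x (assumes distinct x values)."""
    n = len(x)
    k = max(2, math.ceil(frac * n))      # number of neighbours per local fit
    robust_w = [1.0] * n                 # robustness weights, updated each pass
    fitted = [0.0] * n
    for _ in range(robust_iters + 1):
        for i in range(n):
            # Bandwidth: distance to the k-th nearest neighbour of x[i].
            h = sorted(abs(xj - x[i]) for xj in x)[k - 1]
            w = [tricube((xj - x[i]) / h) * rw for xj, rw in zip(x, robust_w)]
            sw = sum(w)
            if sw == 0.0:
                continue                 # keep the previous fitted value
            mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
            my = sum(wj * yj for wj, yj in zip(w, y)) / sw
            sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
            sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
            slope = sxy / sxx if sxx > 0.0 else 0.0
            fitted[i] = my + slope * (x[i] - mx)
        # Downweight points with large residuals before the next pass.
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = median([abs(r) for r in resid])
        if s == 0.0:
            break
        robust_w = [bisquare(r / (6.0 * s)) for r in resid]
    return fitted


if __name__ == "__main__":
    # Synthetic series: linear trend, small oscillation, one outlier at x = 10.
    xs = [float(i) for i in range(21)]
    ys = [2.0 * v + 1.0 + 0.5 * math.sin(3.0 * v) for v in xs]
    ys[10] += 50.0
    print("plain fit at x=10: ", lowess(xs, ys, robust_iters=0)[10])
    print("robust fit at x=10:", lowess(xs, ys, robust_iters=3)[10])
```

Setting `robust_iters=0` gives an ordinary locally weighted regression, so the two printed values show how much the robustness passes reduce the influence of the single outlier.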

2003

The use of Ada, GNAT.Spitbol, and XML in the Sol-Eu-Net project

Authors
Alves, MA; Jorge, A; Heaney, M;

Publication
RELIABLE SOFTWARE TECHNOLOGIES - ADA-EUROPE 2003

Abstract
We report the use of Ada in the European research project Sol-Eu-Net. Ada was used in a web mining subproject, mainly for data preparation, and also for web system development. Open-source Ada resources, e.g. GNAT.Spitbol, were used. Some of these resources were modified, and some were created anew. XML and SQL were also used in association with Ada.

2003

Automatic selection of table areas in documents for information extraction

Authors
Silva, ACE; Jorge, A; Torgo, L;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
The information contained in companies' financial statements is valuable to several users. Much of the relevant information in such documents is contained in tables and is currently mainly extracted by hand. We propose a method that accomplishes a prior step of the task of automatically extracting information from tables in documents: selecting the lines that are likely to belong to tables. Our method has been developed by empirically analyzing a set of Portuguese companies' financial statements using statistical and data mining techniques. Empirical evaluation indicates that more than 99% of table lines are selected after discarding at least 50% of all lines. The method can cope with the complexity of styles used in assembling information on paper and adapt its performance accordingly, thus maximizing its results.
