Publications by LIAAD

2011

Clustering distributed sensor data streams using local processing and reduced communication

Authors
Gama, J; Rodrigues, PP; Lopes, L;

Publication
INTELLIGENT DATA ANALYSIS

Abstract
Nowadays, applications produce infinite streams of data distributed across wide sensor networks. In this work we study the problem of continuously maintaining a cluster structure over the data points generated by the entire network. Usual techniques operate by forwarding and concentrating the entire data in a central server, processing it as a multivariate stream. In this paper, we propose DGClust, a new distributed algorithm which reduces both the dimensionality and the communication burdens by allowing each local sensor to keep an online discretization of its data stream, which operates with constant update time and (almost) fixed space. Each new data point triggers a cell in this univariate grid, reflecting the current state of the data stream at the local site. Whenever a local site changes its state, it notifies the central server about the new state it is in. This way, at each point in time, the central site has the global multivariate state of the entire network. To avoid monitoring all possible states, whose number is exponential in the number of sensors, the central site keeps a small list of counters of the most frequent global states. Finally, a simple adaptive partitional clustering algorithm is applied to the central points of the frequent states in order to provide an anytime definition of the cluster centers. The approach is evaluated in the context of distributed sensor networks, focusing on three outcomes: loss to real centroids, communication prevention, and processing reduction. The experimental work on synthetic data supports our proposal, presenting robustness to a high number of sensors, and the application to real data from physiological sensors exposes the aforementioned advantages of the system.
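
The local-grid and frequent-state ideas described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `LocalGrid` and `CentralSite`, the fixed equal-width grid, and the eviction rule for the bounded counter list (a Space-Saving style replacement) are all assumptions made for illustration, and the final partitional clustering step is omitted.

```python
from collections import Counter


class LocalGrid:
    """Equal-width univariate grid kept at one sensor (hypothetical helper)."""

    def __init__(self, low, high, cells):
        self.low, self.high, self.cells = low, high, cells
        self.state = None  # index of the cell triggered by the last reading

    def update(self, value):
        """Return the new cell index if the local state changed, else None."""
        width = (self.high - self.low) / self.cells
        cell = int((value - self.low) / width)
        cell = min(max(cell, 0), self.cells - 1)  # clamp out-of-range readings
        if cell != self.state:
            self.state = cell
            return cell
        return None


class CentralSite:
    """Tracks the global state and a bounded list of frequent-state counters."""

    def __init__(self, n_sensors, max_counters):
        self.global_state = [None] * n_sensors
        self.counts = Counter()
        self.max_counters = max_counters
        self.messages = 0  # number of notifications received

    def notify(self, sensor_id, cell):
        """Called only when a sensor's local state changes."""
        self.global_state[sensor_id] = cell
        self.messages += 1

    def tick(self):
        """Count the current global state once per time step."""
        if None in self.global_state:
            return  # some sensor has not reported yet
        state = tuple(self.global_state)
        if state in self.counts or len(self.counts) < self.max_counters:
            self.counts[state] += 1
        else:
            # Assumed eviction rule: replace the least frequent state and
            # let the newcomer inherit its count plus one.
            victim, count = min(self.counts.items(), key=lambda kv: kv[1])
            del self.counts[victim]
            self.counts[state] = count + 1
```

In this sketch a sensor sends a message only when `update` returns a cell index, so readings that stay inside the same cell cost no communication, and the central site never stores more than `max_counters` global states.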

2011

Data Mining Applied on Grain Data Mart

Authors
Correa, FE; Oliveira, MDB; Alves, LRA; Gama, J; Correa, PLP;

Publication
EFITA/WCCA '11

Abstract
Agribusiness, like many other activities, produces huge amounts of spatio-temporal data. We need a system to store, analyze, and mine this data. In previous work, we developed data warehouse tools to store, organize and query Brazilian agribusiness data from several regions over 10 years. In this paper, we go a step further and propose specific data mining techniques to discover marks and evolution patterns from agribusiness data. We propose the use of Tucker decomposition to automatically detect short time windows that exhibit large changes in the correlation structure between the time series of prices from the Brazilian grain market.

2011

L2GClust: local-to-global clustering of stream sources

Authors
Rodrigues, PP; Gama, J; Araújo, J; Lopes, LMB;

Publication
Proceedings of the 2011 ACM Symposium on Applied Computing (SAC), TaiChung, Taiwan, March 21 - 24, 2011

Abstract
In ubiquitous streaming data sources, such as sensor networks, clustering nodes by the data they produce is an important problem that gives insights into the phenomenon being monitored by such networks. However, if these techniques require data to be gathered centrally, communication and storage requirements are often unbounded. The goal of this paper is to assess the feasibility of computing local clustering at each node, using only neighbors' centroids, as an approximation of the global clustering computed by a centralized process. A local algorithm is proposed to perform clustering of sensors based on the moving average of each node's data over time: the moving average of each node is approximated using a memory-less fading average; clustering is based on the furthest point algorithm applied to the centroids computed by the node's direct neighbors. The algorithm was evaluated on a state-of-the-art sensor network simulator, measuring the agreement between local and global clustering. Experimental work on synthetic data with spherical Gaussian clusters is consistently analyzed for different network sizes, numbers of clusters and degrees of cluster overlap. Results show a high level of agreement between each node's clustering definitions and the global clustering definition, with special emphasis on separability agreement. Overall, local approaches are able to keep a good approximation of the global clustering, improving privacy among nodes, and decreasing communication and computation load in the network. Hence, the basic requirements for distributed clustering of streaming data sensors recommend that clustering in these settings should be performed locally. © 2011 ACM.
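
The two building blocks named in the abstract above, a memory-less fading average and furthest-point clustering, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's code: the function names, the fading factor `alpha`, the choice of the first point as the initial center, and the use of Euclidean distance are all hypothetical, and the exchange of centroids between neighboring nodes is not modeled.

```python
import math


def fading_average(values, alpha=0.9):
    """Memory-less fading average: only a faded sum and a faded count are kept.

    Older observations are down-weighted by alpha at every step, so no
    window of past values has to be stored.
    """
    faded_sum = 0.0
    faded_count = 0.0
    for x in values:
        faded_sum = x + alpha * faded_sum
        faded_count = 1.0 + alpha * faded_count
    return faded_sum / faded_count


def furthest_point_centers(points, k):
    """Greedy furthest-point heuristic: pick k centers from a list of tuples.

    Starts from the first point (an arbitrary choice) and repeatedly adds
    the point whose distance to its nearest chosen center is largest.
    """
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers
```

A node could, for example, summarize its own stream with `fading_average` and then run `furthest_point_centers` over the summaries or centroids it receives from its direct neighbors; how those values are exchanged is outside the scope of this sketch.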

2011

Data Streams

Authors
Gama, J; Rodrigues, PP;

Publication
Encyclopedia of Data Warehousing and Mining, Second Edition

Abstract

2011

Learning from Data Streams

Authors
Gama, J; Rodrigues, PP;

Publication
Encyclopedia of Data Warehousing and Mining, Second Edition

Abstract

2011

Contributions to a Decision Support System Based on Depth of Anesthesia Signals

Authors
Sebastiao, R; Silva, MM; Gama, J; Mendonca, T;

Publication
2012 25TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS)

Abstract
In clinical practice, concerns about the administration of hypnotics and analgesics for minimally invasive diagnostic and therapeutic procedures have increased enormously in recent years. The automatic detection of changes in the signals used to evaluate the depth of anesthesia is hence of foremost importance in order to decide how to adapt the doses of hypnotics and analgesics that should be administered to patients. The aim of this work is to detect, online, drifts in the depth of anesthesia signals of patients undergoing general anesthesia. The performance of the proposed method is illustrated using BIS records previously collected from patients subject to abdominal surgery. The results show that the drifts detected by the proposed method are in accordance with the actions of the clinicians, in terms of the times at which a change in the hypnotic or analgesic rates occurred. This detection was performed in the presence of noise and sensor faults. The presented algorithm was also validated online. The results encourage the inclusion of the proposed algorithm in a decision support system based on depth of anesthesia signals.
