Details
Name
Rita Paula Ribeiro
Position
Senior Researcher
Since
01 January 2008
Nationality
Portugal
Centre
Laboratório de Inteligência Artificial e Apoio à Decisão
Contacts
+351220402963
rita.p.ribeiro@inesctec.pt
2025
Authors
Silva, I; Ribeiro, RP; Gama, J;
Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II
Abstract
Pet owners are increasingly becoming conscious of their pet's necessities and are paying more attention to their overall wellness. The well-being of their pets is intricately linked to their own emotional and physical well-being. Some veterinary system solutions are emerging to provide proactive healthcare options for pets. One such solution offers the continuous monitoring of a pet's activity through accelerometer tracking devices. Based on data collected by this application, in this paper, we study different time aggregations and three unsupervised machine learning techniques to identify anomalies in pet behaviour data. Specifically, we apply three algorithms, Isolation Forest, Local Outlier Factor, and K-Nearest Neighbour, with various thresholds to differentiate between normal and abnormal events. Experiments conducted on ten pets (five cats and five dogs) show that the most effective approach is to use daily data divided into periods. Moreover, the Local Outlier Factor is the best algorithm for detecting anomalies when prioritizing the identification of true positives. However, it also produces a high false positive ratio.
2025
Authors
Barbosa, M; Ribeiro, C; Gomes, F; Ribeiro, RP; Gama, J;
Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II
Abstract
The rise of environmental crimes has become a major global concern, as they cause significant damage to ecosystems and public health and result in economic losses. The availability of vast sensor data provides an opportunity to analyze environmental data proactively, which helps to detect irregularities and uncover potential criminal activities. This paper highlights the critical role played by machine learning (ML) and remote sensing technologies in the continuously evolving scenarios of environmental crime. By examining case studies on detecting illegal fishing, illegal oil spills, illegal landfills, and illegal logging, we delve into the practical implementation of data-driven approaches for environmental crime detection. Our goal with this study is to provide an overview of the existing research in this area and foster the use of ML and data science techniques to enhance environmental crime detection.
2025
Authors
Andrade, C; Ribeiro, RP; Gama, J;
Publication
INTELLIGENT SYSTEMS, BRACIS 2024, PT III
Abstract
Latent Dirichlet Allocation (LDA) is a fundamental method for clustering short text streams. However, when applied to large datasets, it often faces significant challenges, and its performance is typically evaluated on domain-specific datasets such as news and tweets. This study aims to fill this gap by evaluating the effectiveness of short text clustering methods on a large and diverse e-commerce dataset. We specifically investigate how well these clustering algorithms adapt to the complex dynamics and larger scale of e-commerce text streams, which differ from their usual application domains. Our analysis focuses on the impact of high homogeneity scores on the reported Normalized Mutual Information (NMI) values. We particularly examine whether these scores are inflated due to the prevalence of single-element clusters. To address potential biases in clustering evaluation, we propose using the Akaike Information Criterion (AIC) as an alternative metric to reduce the formation of single-element clusters and provide a more balanced measure of clustering performance. We present new insights for applying short text clustering methodologies in real-world situations, especially in sectors like e-commerce, where text data volumes and dynamics present unique challenges.
2025
Authors
Shaji, N; Tabassum, S; Ribeiro, P; Gama, J; Santana, P; Garcia, A;
Publication
Studies in Computational Intelligence
Abstract
Waste transport management is a critical sector where maintaining accurate records and preventing fraudulent or illegal activities is essential for regulatory compliance, environmental protection, and public safety. However, monitoring and analyzing large-scale waste transport records to identify suspicious patterns or anomalies is a complex task. These records often involve multiple entities and exhibit variability in waste flows between them. Traditional anomaly detection methods that rely solely on individual transaction data may struggle to capture the deeper, network-level anomalies that emerge from the interactions between entities. To address this complexity, we propose a hybrid approach that integrates network-based measures with machine learning techniques for anomaly detection in waste transport data. Our method leverages advanced graph analysis techniques, such as sub-graph detection, community structure analysis, and centrality measures, to extract meaningful features that describe the network's topology. We also introduce novel metrics for edge weight disparities. Further, advanced machine learning techniques, including clustering, neural network, density-based, and ensemble methods, are applied to these structural features to enhance and refine the identification of anomalous behaviors.
2025
Authors
Chandramohan, MS; da Silva, IM; Ribeiro, RP; Jorge, A; da Silva, JE;
Publication
ENVIRONMENTS
Abstract
This study investigates the spatial distribution and chemical elemental composition of soils in Rome (Italy) using X-ray fluorescence analysis. Fifty-nine soil samples were collected from various locations within the urban areas of the Rome municipality and were analyzed for 19 elements. Multivariate statistical techniques, including nonlinear mapping, principal component analysis, and hierarchical cluster analysis, were employed to identify clusters of similar soil samples and their spatial distribution, and to obtain environmental quality information. The soil sample clusters reflect the influence of natural geological processes and anthropogenic activities on soil contamination patterns. Spatial clustering using the k-means algorithm further identified six distinct clusters, each with specific geographical distributions and elemental characteristics. The findings underscore the importance of targeted soil assessments to ensure the sustainable use of land resources in urban areas.
Supervised theses
2023
Author
Amanda Custódio Tavares
Institution
UP-FCUP
2023
Author
Nirbhaya Shaji
Institution
UP-FCUP
2023
Author
Sérgio Gabriel Pontes de Jesus
Institution
UP-FCUP
2023
Author
Cesar Henrique Goersch Andrade
Institution
UP-FCUP
2023
Author
Ehsan Aminian
Institution
UP-FCUP