2023
Authors
Gama, J; Nowaczyk, S; Pashami, S; Ribeiro, RP; Nalepa, GJ; Veloso, B;
Publication
PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023
Abstract
The field of Explainable Predictive Maintenance (PM) is concerned with developing methods that can clarify how AI systems operate in the PM domain. One of the challenges of creating maintenance plans is integrating AI output with human decision-making processes and expertise. For AI to be helpful and trustworthy, fault predictions must be contextualized and easily comprehensible to humans. This involves providing tailored explanations to different actors depending on their roles and needs. For example, engineers can be connected to technical installation blueprints, while managers can evaluate system downtime costs, and lawyers can assess safety-threatening failures' potential liability. In many industries, black-box AI systems analyze sensor data to predict failures by detecting anomalies and deviations from typical behavior with impressive accuracy. However, PM is just one part of a broader context that aims to identify the most probable causes, develop a recovery plan, and estimate remaining useful life while providing alternative solutions. Achieving this requires complex interactions among various actors in industrial and decision-making processes. Our tutorial explores current trends, promising research directions in Explainable AI (XAI) relevant to Explainable Predictive Maintenance (XPM), and future challenges and open issues on this topic. We will also present three case studies that highlight XPM's challenges in bus and train operations and steel factories.
2008
Authors
Ribeiro, R; Torgo, L;
Publication
ECOLOGICAL MODELLING
Abstract
Algae blooms are ecological events associated with extremely high abundance values of certain algae. These rare events have a strong impact on the river's ecosystem. In this context, the prediction of such events is of special importance. This paper addresses the problems that result from evaluating and comparing models for the prediction of rare extreme values using standard evaluation statistics. In this context, we describe a new evaluation statistic that we have proposed in Torgo and Ribeiro [Torgo, L., Ribeiro, R., 2006. Predicting rare extreme values. In: Ng, W., Kitsuregawa, M., Li, J., Chang, K. (Eds.), Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'2006). Springer, pp. 816-820 (number 3918 in LNAI)], which can be used to identify the best models for predicting algae blooms. We apply this new statistic in a comparative study involving several models for predicting the abundance of different groups of phytoplankton in water samples collected in the Douro River, Porto, Portugal. Results show that the proposed statistic identifies a variant of a Support Vector Machine as outperforming the other models that were tried in the prediction of algae blooms.
2012
Authors
Ribeiro, RP;
Publication
12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012)
Abstract
Utility-based learning is a key technique for addressing many real-world data mining applications, where the costs/benefits are not uniform across the domain of the target variable. Still, most of the existing research has been focused on classification problems. In this paper we address a related problem. There are many relevant domains (e.g. ecological, meteorological, finance) where decisions are based on the forecast of a numeric quantity (i.e. the result of a regression model). The goal of this paper is to present an evaluation framework for applications where the numeric outcome of a regression model may lead to different costs/benefits as a consequence of the actions it entails. The new metric provides a more informed estimate of the utility of any regression model, given the application-specific preference biases, and hence makes the comparison and selection between alternative regression models more reliable. We illustrate the objective of our evaluation methodology on a real-life application and also carry out a set of experiments over a subset of our target regression tasks: the prediction of rare and extreme values. Results show the effectiveness of our proposed utility metric for identifying the models that perform better in this type of application.
2007
Authors
Torgo, L; Ribeiro, R;
Publication
Knowledge Discovery in Databases: PKDD 2007, Proceedings
Abstract
Cost-sensitive learning is a key technique for addressing many real-world data mining applications. Most existing research has been focused on classification problems. In this paper we propose a framework for evaluating regression models in applications with non-uniform costs and benefits across the domain of the continuous target variable. Namely, we describe two metrics for assessing the costs and benefits of the predictions of any model given a set of test cases. We illustrate the use of our metrics in the context of a specific type of application where non-uniform costs are required: the prediction of rare extreme values of a continuous target variable. Our experiments provide clear evidence of the utility of the proposed framework for evaluating the merits of any model in this class of regression domains.
2003
Authors
Torgo, L; Ribeiro, R;
Publication
KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2003, PROCEEDINGS
Abstract
This paper describes a method designed for data mining applications where the main goal is to predict extreme and rare values of a continuous target variable, as well as to understand under which conditions these values occur. Our objective is to induce models that are accurate at predicting these outliers but are also interpretable from the user's perspective. We describe a new splitting criterion for regression trees that enables the induction of trees achieving these goals. We evaluate our proposal on several real-world problems and contrast the obtained models with standard regression trees. The results of this evaluation show the clear advantage of our proposal in terms of the evaluation statistics that are relevant for these applications.
2003
Authors
Ribeiro, R; Torgo, L;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE
Abstract
In several applications the main interest resides in predicting rare and extreme values. This is the case for the prediction of harmful algae blooms. Though rare, the occurrence of these blooms has a strong impact on river life forms and water quality, and it constitutes a serious ecological problem. In this paper, we describe a data mining method whose main goal is to accurately predict this kind of rare extreme value. We propose a new splitting criterion for regression trees that enables the induction of trees achieving these goals. We carry out an analysis of the results obtained with our method on this application domain and compare them to those obtained with standard regression trees. We conclude that this new method achieves better results in terms of the evaluation statistics that are relevant for this kind of application.