2021
Authors
Bécue, A; Praça, I; Gama, J;
Publication
Artif. Intell. Rev.
Abstract
2021
Authors
Paraíso, P; Ruiz, S; Gomes, P; Rodrigues, L; Gama, J;
Publication
Int. J. Data Sci. Anal.
Abstract
2021
Authors
Cerri, R; Costa Jínior, JD; Faria, ER; Gama, J;
Publication
SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing, Virtual Event, Republic of Korea, March 22-26, 2021
Abstract
Several algorithms have been proposed for offline multi-label classification. However, applications in areas such as traffic monitoring, social networks, and sensors produce data continuously, the so called data streams, posing challenges to batch multi-label learning. With the lack of stationarity in the distribution of data streams, new algorithms are needed to online adapt to such changes (concept drift). Also, in realistic applications, changes occur in scenarios with infinitely delayed labels, where the true classes of the arrival instances are never available. We propose an online unsupervised incremental method based on self-organizing maps for multi-label stream classification in scenarios with infinitely delayed labels. We consider the existence of an initial set of labeled instances to train a self-organizing map for each label. The learned models are then used and adapted in an evolving stream to classify new instances, considering that their classes will never be available. We adapt to incremental concept drifts by online updating the weight vectors of winner neurons and the dataset label cardinality. Predictions are obtained using the Bayes rule and the outputs of each neuron, adapting the prior probabilities and conditional probabilities of the classes in the stream. Experiments using synthetic and real datasets show that our method is highly competitive with several ones from the literature, in both stationary and concept drift scenarios. © 2021 ACM.
2021
Authors
Fernandes, S; Fanaee T, H; Gama, J;
Publication
ARTIFICIAL INTELLIGENCE REVIEW
Abstract
Social networks are becoming larger and more complex as new ways of collecting social interaction data arise (namely from online social networks, mobile devices sensors, ...). These networks are often large-scale and of high dimensionality. Therefore, dealing with such networks became a challenging task. An intuitive way to deal with this complexity is to resort to tensors. In this context, the application of tensor decomposition has proven its usefulness in modelling and mining these networks: it has not only been applied for exploratory analysis (thus allowing the discovery of interaction patterns), but also for more demanding and elaborated tasks such as community detection and link prediction. In this work, we provide an overview of the methods based on tensor decomposition for the purpose of analysing time-evolving social networks from various perspectives: from community detection, link prediction and anomaly/event detection to network summarization and visualization. In more detail, we discuss the ideas exploited to carry out each social network analysis task as well as its limitations in order to give a complete coverage of the topic.
2021
Authors
Paraiso, P; Ruiz, S; Gomes, P; Rodrigues, L; Gama, J;
Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS
Abstract
The usage of non-traditional data in credit scoring, from microfinance institutions, is very useful when trying to address the problem, very common in emerging markets, of the lack of a verifiable customers' credit history. In this context, this paper relies on data acquired from smartphones in a loan classification problem. We conduct a set of experiments concerning feature selection, strategies to deal with imbalanced datasets and algorithm choice, to define a baseline model. This model is, then, compared to others adding network features to the original ones. For that comparison, we generate a network that links a given user to its phone book contacts which are users of a given mobile application, taking into account the ethics and privacy concerns involved, and use some feature extraction techniques, such as the introduction of centrality measures and the definition of node embeddings, in order to capture certain aspects of the network's topology. Several node embedding algorithms are tested, but only Node2Vec proves to be significantly better than the baseline model, applying Friedman's post hoc tests. This node embedding algorithm outperforms all the other, representing a relative improvement, in comparison with the baseline model, of 5.74% on the mean accuracy, 7.13% on the area under the Receiver Operating Characteristic curve and 30.83% on the Kolmogorov-Smirnov statistic scores. This method, therefore, proves to be very promising when trying to discriminate between "good" and "bad" customers, in credit scoring classification problems.
2021
Authors
Becue, A; Praca, I; Gama, J;
Publication
ARTIFICIAL INTELLIGENCE REVIEW
Abstract
This survey paper discusses opportunities and threats of using artificial intelligence (AI) technology in the manufacturing sector with consideration for offensive and defensive uses of such technology. It starts with an introduction of Industry 4.0 concept and an understanding of AI use in this context. Then provides elements of security principles and detection techniques applied to operational technology (OT) which forms the main attack surface of manufacturing systems. As some intrusion detection systems (IDS) already involve some AI-based techniques, we focus on existing machine-learning and data-mining based techniques in use for intrusion detection. This article presents the major strengths and weaknesses of the main techniques in use. We also discuss an assessment of their relevance for application to OT, from the manufacturer point of view. Another part of the paper introduces the essential drivers and principles of Industry 4.0, providing insights on the advent of AI in manufacturing systems as well as an understanding of the new set of challenges it implies. AI-based techniques for production monitoring, optimisation and control are proposed with insights on several application cases. The related technical, operational and security challenges are discussed and an understanding of the impact of such transition on current security practices is then provided in more details. The final part of the report further develops a vision of security challenges for Industry 4.0. It addresses aspects of orchestration of distributed detection techniques, introduces an approach to adversarial/robust AI development and concludes with human-machine behaviour monitoring requirements.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.