Publications

Publications by LIAAD

2023

Trustworthy artificial intelligence and machine learning: Implications on users' security and privacy perceptions

Authors
Do Espírito Santo Faria, RM; Torres, AI; Beirão, G;

Publication
Confronting Security and Privacy Challenges in Digital Marketing

Abstract
Artificial intelligence (AI) has altered our world in numerous ways. Although its application has benefits, the underlying issues surrounding privacy and security in AI need to be understood, not only by the organizations that use it but also by the users that are susceptible to its vulnerabilities. To better understand the impact of privacy and security in AI, this chapter reviews the current literature on artificial intelligence, trustworthiness, and privacy and security concepts and uses bibliometric techniques to understand and identify current trends in the field. Finally, the authors highlight the challenges facing AI and machine learning and discuss the results obtained from the bibliometric analysis, which provide insight into several implications for managers and contributions to future research and policy. © 2023, IGI Global. All rights reserved.

2023

Causal Reasoning in Data

Authors
Nogueira, AR;

Publication

Abstract

2023

Automatic Classification of Bird Sounds: Using MFCC and Mel Spectrogram Features with Deep Learning

Authors
Carvalho, S; Gomes, EF;

Publication
VIETNAM JOURNAL OF COMPUTER SCIENCE

Abstract
Bird species identification is a relevant and time-consuming task for ornithologists and ecologists. With growing amounts of audio-annotated data, automatic bird classification using machine learning techniques is an important trend in the scientific community. Analyzing bird behavior and population trends helps detect other organisms in the environment and is an important problem in ecology. Bird populations react quickly to environmental changes, which makes their real-time counting and tracking challenging and very useful. A reliable methodology that automatically identifies bird species from audio would therefore be a valuable tool for experts in different scientific and application domains. The goal of this work is to propose a methodology to identify bird sounds. In this paper, we explore deep learning techniques that are being used in this domain, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), to classify the data. In deep learning, audio problems are commonly approached by converting them into images using audio feature extraction techniques such as Mel Spectrograms and Mel Frequency Cepstral Coefficients (MFCCs). We propose and test multiple deep learning and feature extraction combinations in order to find the most suitable approach to this problem.

2023

A systematic evaluation of deep learning methods for the prediction of drug synergy in cancer

Authors
Baptista, D; Ferreira, PG; Rocha, M;

Publication
PLOS COMPUTATIONAL BIOLOGY

Abstract
Author summary: Cancer therapies often fail because tumor cells become resistant to treatment. One way to overcome resistance is by treating patients with a combination of two or more drugs. Some combinations may be more effective than when considering individual drug effects, a phenomenon called drug synergy. Computational drug synergy prediction methods can help to identify new, clinically relevant drug combinations. In this study, we developed several deep learning models for drug synergy prediction. We examined the effect of using different types of deep learning architectures, and different ways of representing drugs and cancer cell lines. We explored the use of biological prior knowledge to select relevant cell line features, and also tested data-driven feature reduction methods. We tested both precomputed drug features and deep learning methods that can directly learn features from raw representations of molecules. We also evaluated whether including genomic features, in addition to gene expression data, improves the predictive performance of the models. Through these experiments, we were able to identify strategies that will help guide the development of new deep learning models for drug synergy prediction in the future.
One of the main obstacles to the successful treatment of cancer is the phenomenon of drug resistance. A common strategy to overcome resistance is the use of combination therapies. However, the space of possibilities is huge and efficient search strategies are required. Machine Learning (ML) can be a useful tool for the discovery of novel, clinically relevant anti-cancer drug combinations. In particular, deep learning (DL) has become a popular choice for modeling drug combination effects. Here, we set out to examine the impact of different methodological choices on the performance of multimodal DL-based drug synergy prediction methods, including the use of different input data types, preprocessing steps and model architectures.
Focusing on the NCI ALMANAC dataset, we found that feature selection based on prior biological knowledge has a positive impact: limiting gene expression data to cancer or drug response-specific genes improved performance. Drug features appeared to be more predictive of drug response, with a 41% increase in coefficient of determination (R²) and 26% increase in Spearman correlation relative to a baseline model that used only cell line and drug identifiers. Molecular fingerprint-based drug representations performed slightly better than learned representations: ECFP4 fingerprints increased R² by 5.3% and Spearman correlation by 2.8% relative to the best learned representations. In general, fully connected feature-encoding subnetworks outperformed other architectures. DL outperformed other ML methods by more than 35% (R²) and 14% (Spearman). Additionally, an ensemble combining the top DL and ML models improved performance by about 6.5% (R²) and 4% (Spearman). Using a state-of-the-art interpretability method, we showed that DL models can learn to associate drug and cell line features with drug response in a biologically meaningful way. The strategies explored in this study will help to improve the development of computational methods for the rational design of effective drug combinations for cancer therapy.

2023

Soteria: Preserving Privacy in Distributed Machine Learning

Authors
Brito, C; Ferreira, P; Portela, B; Oliveira, R; Paulo, J;

Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023

Abstract
We propose Soteria, a system for distributed privacy-preserving Machine Learning (ML) that leverages Trusted Execution Environments (e.g. Intel SGX) to run code in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The conducted experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41%, when compared to previous related work. Our protocol is accompanied by a security proof, as well as a discussion regarding resilience against a wide spectrum of ML attacks.

2023

Privacy-Preserving Machine Learning on Apache Spark

Authors
Brito, CV; Ferreira, PG; Portela, BL; Oliveira, RC; Paulo, JT;

Publication
IEEE ACCESS

Abstract
The adoption of third-party machine learning (ML) cloud services is highly dependent on the security guarantees and the performance penalty they incur on workloads for model training and inference. This paper explores security/performance trade-offs for the distributed Apache Spark framework and its ML library. Concretely, we build upon a key insight: in specific deployment settings, one can reveal carefully chosen non-sensitive operations (e.g. statistical calculations). This allows us to considerably improve the performance of privacy-preserving solutions without exposing the protocol to pervasive ML attacks. In more detail, we propose Soteria, a system for distributed privacy-preserving ML that leverages Trusted Execution Environments (e.g. Intel SGX) to run computations over sensitive information in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41% when compared to previous related work. Our protocol is accompanied by a security proof and a discussion regarding resilience against a wide spectrum of ML attacks.
