2022
Authors
Lopes, D; Medeiros, P; Dong, JD; Barradas, D; Portela, B; Vinagre, J; Ferreira, B; Christin, N; Santos, N;
Publication
Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS 2022, Los Angeles, CA, USA, November 7-11, 2022
Abstract
Tor is the most popular anonymity network in the world. It relies on advanced security and obfuscation techniques to ensure the privacy of its users and free access to the Internet. However, the investigation of traffic correlation attacks against Tor Onion Services (OSes) has been relatively overlooked in the literature. In particular, determining whether it is possible to emulate a global passive adversary capable of deanonymizing the IP addresses of both the Tor OSes and of the clients accessing them has remained, so far, an open question. In this paper, we present ongoing work toward addressing this question and reveal some preliminary results on a scalable traffic correlation attack that can potentially be used to deanonymize Tor OS sessions. Our attack is based on a distributed architecture involving a group of colluding ISPs from across the world. After collecting Tor traffic samples at multiple vantage points, ISPs can run them through a pipeline where several stages of traffic classifiers employ complementary techniques that result in the deanonymization of OS sessions with high confidence (i.e., low false positives). We have responsibly disclosed our early results to the Tor Project team and are currently working not only on improving the effectiveness of our attack but also on developing countermeasures to preserve Tor users' privacy.
2022
Authors
Silva, E; Ferreira-Coimbra, J; Oliveira, E; Henriques, M; Rodrigues, NF;
Publication
SSRN Electronic Journal
Abstract
2022
Authors
Morais, J; Simões, J; Lourenço, J; Sargo, S;
Publication
Revista EDaPECI
Abstract
2022
Authors
Vaz, B; Bernardes, V; Figueira, A;
Publication
INFORMATION SYSTEMS AND TECHNOLOGIES, WORLDCIST 2022, VOL 3
Abstract
The use of Generative Adversarial Networks (GANs) is almost traditional in creating synthetic images for medical purposes. This is probably the best use of GANs so far, as the results can easily be checked by the eye of specialists. In fake news detection, we have seen lately that neural models (and deep learning) can provide a considerable improvement over standard classifiers. Yet the main problem is still the lack of data, mostly fake news data, to feed these models. In this paper, we address that by proposing the use of a GAN. Results show a better capacity to generalize when models are trained on an extended dataset that includes synthetic samples created by this GAN.
2022
Authors
Pais, S; Cordeiro, J; Jamil, ML;
Publication
JOURNAL OF BIG DATA
Abstract
Natural language processing (NLP) refers to the field of study that focuses on the interactions between human language and computers. It has recently gained much attention for analyzing human language computationally and has been applied to various tasks such as machine translation, information extraction, summarization, question answering, and others. With the rapid growth of cloud computing services, merging NLP with the cloud is a significant benefit. It allows researchers to conduct NLP-related experiments on large amounts of data handled by big data techniques while harnessing the cloud's vast, on-demand computing power. However, NLP tools and applications are not yet widely offered as a service in the cloud, and there is little literature available that discusses the scope of this interdisciplinary work. NLP, cloud computing, and big data are vast domains, each with its own challenges and potential. By overcoming those challenges and integrating these fields, great potential for NLP and its applications can be unleashed. This paper presents a survey of NLP in cloud computing with a key focus on the comparison of cloud-based NLP services and the challenges of NLP and big data, while emphasizing the necessity of viable cloud-based NLP services. In the first part of this paper, an overview of NLP is presented by discussing different levels of NLP and components of natural language generation (NLG), followed by the applications of NLP. In the second part, the concept of cloud computing is discussed, highlighting the architectural layers and deployment models of cloud computing and cloud-hosted NLP services. In the third part, the field of big data in the cloud is discussed with an emphasis on NLP. Furthermore, information extraction via NLP techniques within big data is introduced.
2022
Authors
Pereira, K; Vinagre, J; Alonso, AN; Coelho, F; Carvalho, M;
Publication
Machine Learning and Principles and Practice of Knowledge Discovery in Databases - International Workshops of ECML PKDD 2022, Grenoble, France, September 19-23, 2022, Proceedings, Part II
Abstract
The application of machine learning to insurance risk prediction requires learning from sensitive data. This raises multiple ethical and legal issues. One of the most relevant ones is privacy. However, privacy-preserving methods can potentially hinder the predictive potential of machine learning models. In this paper, we present preliminary experiments with life insurance data using two privacy-preserving techniques: discretization and encryption. Our objective with this work is to assess the impact of such privacy preservation techniques on the accuracy of ML models. We instantiate the problem in three general but plausible use cases involving the prediction of insurance claims within a 1-year horizon. Our preliminary experiments suggest that discretization and encryption have negligible impact on the accuracy of ML models. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.