2024
Authors
dos Santos, AF; Leal, JP;
Publication
OpenAccess Series in Informatics
Abstract
Semantic measure (SM) algorithms allow software to mimic the human ability to assess the strength of the semantic relations between elements such as concepts, entities, words, or sentences. SM algorithms are typically evaluated by comparison against gold standard datasets built by human annotators. These datasets are composed of pairs of elements and an averaged numeric rating. Building such datasets usually requires asking human annotators to assign a numeric value to their perception of the strength of the semantic relation between two elements. Large language models (LLMs) have recently been used successfully to perform tasks which previously required human intervention, such as text summarization, essay writing, image description, image synthesis, and question answering. In this paper, we present ongoing research on the capabilities of LLMs for semantic relation assessment. We queried several LLMs to rate the relationship between pairs of elements from existing semantic measure evaluation datasets, and measured the correlation between the LLMs' results and the gold standard datasets. Furthermore, we performed additional experiments to evaluate which other factors can influence LLM performance in this task. We present and discuss the results obtained so far. © André Fernandes dos Santos and José Paulo Leal.
2024
Authors
Filgueiras, A; Marques, ERB; Lopes, LMB; Marques, M; Silva, H;
Publication
CoRR
Abstract
2024
Authors
Moreno, P; Areias, M; Rocha, R; Costa, VS;
Publication
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING
Abstract
Prolog systems rely on an atom table for symbol management, usually implemented as a dynamically resizable hash table. This is ideal for single-threaded execution, but can become a bottleneck in a multi-threaded scenario. In this work, we replace the original atom table implementation in the YAP Prolog system with a lock-free hash-based data structure, named Lock-free Hash Tries (LFHT), in order to provide efficient and scalable symbol management. Being lock-free, the new implementation also provides stronger guarantees, namely immunity to priority inversion, deadlocks, and livelocks. Performance results show that the new lock-free LFHT implementation performs better in single-threaded execution and scales much better than the original lock-based dynamically resizing hash table.
2024
Authors
Machado, D; Costa, VS; Brandao, P;
Publication
BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, BIOSTEC 2023
Abstract
Diabetes management data is composed of diverse factors and glycaemia indicators. Glycaemia predictive models tend to focus solely on glycaemia values, yet a comprehensive understanding of diabetes management requires considering several of its aspects beyond glycaemia. However, including every aspect of diabetes management can create an overly high-dimensional data set. Excessive feature spaces increase computational complexity and may introduce over-fitting. Additionally, the inclusion of inconsequential features introduces noise that hinders a model's performance. Feature importance is a process that evaluates a feature's value and can be used to identify optimal feature sub-sets. Depending on the context, multiple methods can be used. In the literature, the drop-feature method is considered the best approach to evaluating individual feature importance. To reach an optimal set, the best approach is branch and bound, albeit at a heavy computational cost. This overhead can be addressed through a trade-off between the feature set's optimisation level and the process's computational feasibility. Improving the feature space has implications for the effectiveness of data balancing approaches. While the impact observed in this study was not substantial, it warrants reconsidering the balancing approach given a superior feature space.
2024
Authors
Fernandes, P; Ciardhuáin, SO; Antunes, M;
Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I
Abstract
The data exchange between different sectors of society has led to the development of electronic documents supported by different reading formats, namely the portable PDF format. These documents have characteristics similar to those used in programming languages, allowing the incorporation of potentially malicious code, which makes them a vector for cyberattacks. Thus, detecting anomalies in digital documents, such as PDF files, has become crucial in several domains, such as finance, digital forensic analysis and law enforcement. Currently, detection methods are mostly based on machine learning and are characterised by being complex, slow and largely ineffective at detecting zero-day attacks. This paper proposes a Benford's Law (BL) based model to uncover manipulated PDF documents by analysing potential anomalies in the first digits extracted from the PDF document's characteristics. The proposed model was evaluated using the CIC Evasive PDFMAL-2022 dataset, consisting of 1191 documents (278 benign and 918 malicious). To classify the PDF documents, based on BL, as malicious or benign, three statistical models were used in conjunction with the mean absolute deviation: the parametric Pearson model and the non-parametric Spearman and Cramér-von Mises models. The results show a maximum F1 score of 87.63% in detecting malicious documents using Pearson's model, demonstrating the suitability and effectiveness of applying Benford's Law to detecting anomalies in digital documents, maintaining the accuracy and integrity of information and promoting trust in systems and institutions.
2024
Authors
Areia, J; Santos, B; Antunes, M;
Publication
Proceedings of the 21st International Conference on Security and Cryptography, SECRYPT 2024, Dijon, France, July 8-10, 2024.
Abstract
Memorising passwords poses a significant challenge for individuals, leading to the increasing adoption of password managers, particularly browser password managers. Despite their benefits to users' daily routines, these tools introduce new vulnerabilities to web and network security. This paper investigates these vulnerabilities and analyses the security mechanisms of the browser-based password managers integrated into Google Chrome, Microsoft Edge, Opera GX, Mozilla Firefox, and Brave. To this end, we developed and deployed Dvorak, a malware capable of extracting the essential files from the browser's password manager for subsequent decryption. To assess Dvorak's functionality, we conducted a controlled security analysis across all the aforementioned browsers. Our findings reveal that the malware successfully retrieves all stored passwords from the tested browsers when no master password is used; the results differ when a master password is in place. A comparison between the browsers is made based on the results of the malware. The paper ends with recommendations for potential strategies to mitigate these security concerns. © 2024 by SCITEPRESS – Science and Technology Publications, Lda.