Publications

Publications by CRACS

2021

A Graph Database Representation of Portuguese Criminal-Related Documents

Authors
Carnaz, G; Nogueira, VB; Antunes, M;

Publication
INFORMATICS-BASEL

Abstract
Organizations have been challenged by the need to process an increasing amount of data, both structured and unstructured, retrieved from heterogeneous sources. Criminal investigation police are among these organizations, as they have to manually process a vast number of criminal reports, news articles related to crimes, occurrence and evidence reports, and other unstructured documents. Automatic extraction and representation of data and knowledge in such documents is an essential task to reduce the manual analysis burden and to automate the discovery of names and entity relationships that may exist in a case. This paper presents SEMCrime, a framework used to extract and classify named entities and relations in Portuguese criminal reports and documents, and to represent the retrieved data in a graph database. A 5W1H (Who, What, Why, Where, When, and How) information extraction method was applied, and a graph database representation was used to store and visualize the relations extracted from the documents. Promising results were obtained with a prototype developed to evaluate the framework, namely named-entity recognition with an F-Measure of 0.73, and 5W1H information extraction with an F-Measure of 0.65.
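As a rough illustration of the idea in this abstract, the sketch below turns one 5W1H record into graph edges held in a plain adjacency list. It is not the SEMCrime implementation: the record fields, the relation names, and the helper functions `to_edges` and `build_graph` are all hypothetical, and an in-memory dictionary stands in for a real graph database.

```python
from collections import defaultdict

# Hypothetical 5W1H record; field names and values are illustrative only.
record = {
    "who": "suspect A",
    "what": "burglary",
    "where": "Lisbon",
    "when": "2020-01-15",
    "why": "unknown",
    "how": "forced entry",
}


def to_edges(doc_id, rec):
    """Turn one 5W1H record into (source, relation, target) edges.

    The event (the "what" answer) acts as the hub node; every other
    answer is linked to it with a relation named after its question.
    """
    event = f"{doc_id}:{rec['what']}"
    edges = [(doc_id, "REPORTS", event)]
    for key in ("who", "where", "when", "why", "how"):
        value = rec.get(key)
        if value:  # skip questions the extractor could not answer
            edges.append((event, key.upper(), value))
    return edges


def build_graph(edges):
    """Build an adjacency list: node -> list of (relation, neighbour)."""
    graph = defaultdict(list)
    for src, rel, dst in edges:
        graph[src].append((rel, dst))
    return dict(graph)


edges = to_edges("doc1", record)
graph = build_graph(edges)
```

Using the event as the hub keeps every answer one hop away from the document that reported it, which is one plausible way to make the extracted relations easy to traverse and visualize.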

2021

Shedding light on the African enigma: In vitro testing of Homo sapiens-Helicobacter pylori coevolution

Authors
Cavadas, B; Leite, M; Pedro, N; Magalhaes, AC; Melo, J; Correia, M; Maximo, V; Camacho, R; Fonseca, NA; Figueiredo, C; Pereira, L;

Publication
Microorganisms

Abstract
The continuous characterization of genome-wide diversity in population and case-cohort samples, allied to the development of new algorithms, is shedding light on host ancestry impact and selection events on various infectious diseases. Especially interesting are the longstanding associations between humans and certain bacteria, such as the case of Helicobacter pylori, which could have been strong drivers of adaptation leading to coevolution. Some evidence on admixed gastric cancer cohorts has been suggested as supporting Homo-Helicobacter coevolution, but reliable experimental data that control both the bacterium and the host ancestries are lacking. Here, we conducted the first in vitro coinfection assays with dual human- and bacterium-matched and -mismatched ancestries, in African and European backgrounds, to evaluate the genome-wide gene expression host response to H. pylori. Our results showed that: (1) the host response to H. pylori infection was greatly shaped by the human ancestry, with variability on innate immune system and metabolism; (2) African human ancestry showed signs of coevolution with H. pylori while European ancestry appeared to be maladapted; and (3) mismatched ancestry did not seem to be an important differentiator of gene expression at the initial stages of infection as assayed here. © 2021 by the authors.

2021

Evaluating the impact of sampling strategies and bioinformatics on ethanol-based DNA metabarcoding

Authors
Martins, FM; Fonseca, NA; Egeter, B; Pinto, J; Assunção, T; Chaves, C; Sousa, P; Jesus, J; Beja, P;

Publication
ARPHA Conference Abstracts

Abstract
Recent developments on ethanol-based DNA (etDNA) metabarcoding have shown that it is possible to extract meaningful information about macroinvertebrate community diversity and composition from the ethanol used to preserve bulk samples. The major advantages of this molecular approach are the reduced processing time and costs, and the possibility to keep specimens intact for other experiments. Yet, organisms with a highly sclerotised exoskeleton or that are rare in the sample have been found to release a lower amount of DNA into solution and tend to be consistently missed by etDNA metabarcoding, thereby compromising the viability of the method. A few studies have shown that the first steps of the metabarcoding workflow are crucial for the good performance of etDNA-based assays, such as the decision on storage time before sampling and the ethanol phase to be analysed, the inclusion of pre-treatment strategies (e.g., freezing), and the choice of the DNA extraction protocol. In this study, we aimed to evaluate the combined effect of various technical choices on the performance of etDNA metabarcoding, considering factors such as sample volume, ethanol phase of sorted and unsorted samples, pre-capture treatments (evaporation vs filtration) and bioinformatic pipelines. Through the application of decision-tree models, our preliminary data revealed that the increase of volume (by itself) is enough to improve PCR amplification yields and the proportion of families matching the morphological identifications, with great impact on the detection of hard-bodied and cased taxa. Also, no major differences among phases with or without a sorting step nor among bioinformatic pipelines were detected, particularly at higher volumes. Our results suggest that the higher performance (with lower observed variation) in taxonomic detection at higher volumes is likely a consequence of a higher availability of longer fragments of DNA in solution.
This study highlights the importance of understanding the impact of technical choices to improve the efficiency of a DNA-based method, and reinstates etDNA metabarcoding as a potential method in the context of biomonitoring.

2021

On the Implementation of Memory Reclamation Methods in a Lock-Free Hash Trie Design

Authors
Moreno, P; Areias, M; Rocha, R;

Publication
Journal of Parallel and Distributed Computing

Abstract

2021

Exposing Manipulated Photos and Videos in Digital Forensics Analysis

Authors
Ferreira, S; Antunes, M; Correia, ME;

Publication
Journal of Imaging

Abstract
Tampered multimedia content is being increasingly used in a broad range of cybercrime activities. The spread of fake news, misinformation, digital kidnapping, and ransomware-related crimes are amongst the most recurrent crimes in which manipulated digital photos and videos are the perpetrating and disseminating medium. Criminal investigation has been challenged in applying machine learning techniques to automatically distinguish between fake and genuine seized photos and videos. Despite the pertinent need for manual validation, easy-to-use platforms for digital forensics are essential to automate and facilitate the detection of tampered content and to help criminal investigators with their work. This paper presents a machine learning method based on Support Vector Machines (SVM) to distinguish between genuine and fake multimedia files, namely digital photos and videos, which may indicate the presence of deepfake content. The method was implemented in Python and integrated as new modules in the widely used digital forensics application Autopsy. The implemented approach extracts a set of simple features resulting from the application of a Discrete Fourier Transform (DFT) to digital photos and video frames. The model was evaluated with a large dataset of classified multimedia files containing both legitimate and fake photos and frames extracted from videos. Regarding deepfake detection in videos, the Celeb-DFv1 dataset was used, featuring 590 original videos collected from YouTube, and covering different subjects. The results obtained with 5-fold cross-validation outperformed those of the SVM-based methods documented in the literature, achieving an average F1-score of 99.53%, 79.55%, and 89.10%, respectively, for photos, videos, and a mixture of both types of content. A benchmark with state-of-the-art methods was also done, by comparing the proposed SVM method with deep learning approaches, namely Convolutional Neural Networks (CNN).
Despite CNN having outperformed the proposed DFT-SVM compound method, the competitiveness of the results attained by DFT-SVM and the substantially reduced processing time make it appropriate to be implemented and embedded into Autopsy modules, by predicting the level of fakeness calculated for each analyzed multimedia file.
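To make the "simple features from a DFT" idea more concrete, here is a minimal sketch that computes the average spectral magnitude in concentric frequency bands of a tiny grayscale image. The abstract does not specify the feature definition, so the radial banding, the `bins` parameter, and the function names `dft2` and `radial_profile` are assumptions made for illustration. The DFT is written naively in pure Python so the block is self-contained; a practical version would use a fast transform such as `numpy.fft.fft2` and pass the resulting vectors to an SVM classifier.

```python
import cmath
import math


def dft2(image):
    """Naive 2-D discrete Fourier transform of a list-of-lists image."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = -2 * math.pi * (u * x / rows + v * y / cols)
                    total += image[x][y] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def radial_profile(image, bins=4):
    """Average spectral magnitude in `bins` concentric frequency bands.

    Band 0 holds the lowest frequencies (including the DC term) and the
    last band holds the highest ones.
    """
    rows, cols = len(image), len(image[0])
    spectrum = dft2(image)
    sums = [0.0] * bins
    counts = [0] * bins
    max_r = math.hypot(rows / 2, cols / 2)
    for u in range(rows):
        for v in range(cols):
            # Distance from the DC term, accounting for frequency wrap-around.
            r = math.hypot(min(u, rows - u), min(v, cols - v))
            band = min(int(bins * r / max_r), bins - 1)
            sums[band] += abs(spectrum[u][v])
            counts[band] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Two toy 4x4 images: a flat patch and a checkerboard.
flat = [[1.0] * 4 for _ in range(4)]
checker = [[float((x + y) % 2) for y in range(4)] for x in range(4)]

flat_features = radial_profile(flat)
checker_features = radial_profile(checker)
```

The flat patch concentrates all of its energy in the lowest band, while the checkerboard also has energy in the highest band. A classifier trained on such vectors can pick up differences in frequency content between images, which is the general intuition behind frequency-domain detection of manipulated content.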

2021

FGPE Gamification Service: A GraphQL Service to Gamify Online Education

Authors
Paiva, JC; Haraszczuk, A; Queirós, R; Leal, JP; Swacha, J; Kosta, S;

Publication
Trends and Applications in Information Systems and Technologies - Volume 4, WorldCIST 2021, Terceira Island, Azores, Portugal, 30 March - 2 April, 2021.

Abstract
Keeping students engaged while learning programming is becoming more and more imperative. Of the several proposed techniques, gamification is presumably the most widely studied and has already proven to be an effective means to engage students. However, there is a complete lack of public and customizable solutions to gamified programming education that can be reused with personalized rules and learning material. FGPE Gamification Service (FGPE GS) is an open-source GraphQL service that transforms a package containing the gamification layer – adhering to a dedicated open-source language, GEdIL – into a game. The game provides students with a gamified experience that leverages the automatically assessable activities referenced by the challenges. This paper presents FGPE GS, its architecture, data model, and validation. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
