2021
Authors
Bahri, M; Bifet, A; Gama, J; Gomes, HM; Maniu, S;
Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
The significant growth of interconnected Internet-of-Things (IoT) devices, the use of social networks, along with the evolution of technology in different domains, lead to a rise in the volume of data generated continuously from multiple systems. Valuable information can be derived from these evolving data streams by applying machine learning. In practice, several critical issues emerge when extracting useful knowledge from these potentially infinite data, mainly because of their evolving nature and high arrival rate which implies an inability to store them entirely. In this work, we provide a comprehensive survey that discusses the research constraints and the current state-of-the-art in this vibrant framework. Moreover, we present an updated overview of the latest contributions proposed in different stream mining tasks, particularly classification, regression, clustering, and frequent patterns. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Fundamental Concepts of Data and Knowledge > Motivation and Emergence of Data Mining
2021
Authors
Jesus, SM; Belém, C; Balayan, V; Bento, J; Saleiro, P; Bizarro, P; Gama, J;
Publication
FAccT '21: 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event / Toronto, Canada, March 3-10, 2021
Abstract
There have been several research works proposing new Explainable AI (XAI) methods designed to generate model explanations having specific properties, or desiderata, such as fidelity, robustness, or human-interpretability. However, explanations are seldom evaluated based on their true practical impact on decision-making tasks. Without that assessment, explanations might be chosen that, in fact, hurt the overall performance of the combined system of ML model + end-users. This study aims to bridge this gap by proposing XAI Test, an application-grounded evaluation methodology tailored to isolate the impact of providing the end-user with different levels of information. We conducted an experiment following XAI Test to evaluate three popular XAI methods - LIME, SHAP, and TreeInterpreter - on a real-world fraud detection task, with real data, a deployed ML model, and fraud analysts. During the experiment, we gradually increased the information provided to the fraud analysts in three stages: Data Only, i.e., just transaction data without access to the model score or explanations; Data + ML Model Score; and Data + ML Model Score + Explanations. Using strong statistical analysis, we show that, in general, these popular explainers have a worse impact than desired. Some of the conclusion highlights include: i) showing Data Only results in the highest decision accuracy and the slowest decision time among all variants tested; ii) all the explainers improve accuracy over the Data + ML Model Score variant but still result in lower accuracy when compared with Data Only; iii) LIME was the least preferred by users, probably due to its substantially lower variability of explanations from case to case. © 2021 ACM.
2021
Authors
Cavadas, B; Leite, M; Pedro, N; Magalhaes, AC; Melo, J; Correia, M; Maximo, V; Camacho, R; Fonseca, NA; Figueiredo, C; Pereira, L;
Publication
MICROORGANISMS
Abstract
The continuous characterization of genome-wide diversity in population and case-cohort samples, allied to the development of new algorithms, is shedding light on the impact of host ancestry and selection events on various infectious diseases. Especially interesting are the long-standing associations between humans and certain bacteria, such as Helicobacter pylori, which could have been strong drivers of adaptation leading to coevolution. Some evidence from admixed gastric cancer cohorts has been suggested as supporting Homo-Helicobacter coevolution, but reliable experimental data that control both the bacterium and the host ancestries are lacking. Here, we conducted the first in vitro coinfection assays with dual human- and bacterium-matched and -mismatched ancestries, in African and European backgrounds, to evaluate the genome-wide gene expression host response to H. pylori. Our results showed that: (1) the host response to H. pylori infection was greatly shaped by the human ancestry, with variability in the innate immune system and metabolism; (2) African human ancestry showed signs of coevolution with H. pylori while European ancestry appeared to be maladapted; and (3) mismatched ancestry did not seem to be an important differentiator of gene expression at the initial stages of infection as assayed here.
2021
Authors
Egeter, B; Veríssimo, J; Lopes-Lima, M; Chaves, C; Pinto, J; Riccardi, N; Beja, P; Fonseca, NA;
Publication
ARPHA Conference Abstracts
Abstract
2021
Authors
Garg, M; Couturier, DL; Nsengimana, J; Fonseca, NA; Wongchenko, M; Yan, YB; Lauss, M; Jonsson, GB; Newton Bishop, J; Parkinson, C; Middleton, MR; Bishop, DT; McDonald, S; Stefanos, N; Tadross, J; Vergara, IA; Lo, S; Newell, F; Wilmott, JS; Thompson, JF; Long, GV; Scolyer, RA; Corrie, P; Adams, DJ; Brazma, A; Rabbie, R;
Publication
NATURE COMMUNICATIONS
Abstract
Adjuvant systemic therapies are now routinely used following resection of stage III melanoma; however, accurate prognostic information is needed to better stratify patients. We use differential expression analyses of primary tumours from 204 RNA-sequenced melanomas within a large adjuvant trial, identifying a 121-gene metastasis-associated signature. This signature, strongly associated with progression-free (HR=1.63, p=5.24 x 10^-5) and overall survival (HR=1.61, p=1.67 x 10^-4), was validated in 175 regional lymph node metastases as well as two externally ascertained datasets. The machine learning classification models trained using the signature genes performed significantly better in predicting metastases than models trained with clinical covariates (p(AUROC) = 7.03 x 10^-4), or published prognostic signatures (p(AUROC) < 0.05). The signature score negatively correlated with measures of immune cell infiltration (
2021
Authors
Fernandes, C; Martins, L; Teixeira, M; Blom, J; Pothier, JE; Fonseca, NA; Tavares, F;
Publication
MICROORGANISMS
Abstract
The recent report of distinct Xanthomonas lineages of Xanthomonas arboricola pv. juglandis and Xanthomonas euroxanthea within the same walnut tree revealed that this consortium of walnut-associated Xanthomonas includes both pathogenic and nonpathogenic strains. As the implications of this co-colonization are still poorly understood, in order to unveil niche-specific adaptations, the genomes of three X. euroxanthea strains (CPBF 367, CPBF 424(T), and CPBF 426) and of an X. arboricola pv. juglandis strain (CPBF 427) isolated from a single walnut tree in Loures (Portugal) were sequenced with two different technologies, Illumina and Nanopore, to provide consistent single-scaffold chromosomal sequences. General genomic features showed that CPBF 427 has a genome similar to other X. arboricola pv. juglandis strains regarding its size and the number and content of CDSs, while X. euroxanthea strains show a reduction in these features compared with X. arboricola pv. juglandis strains. Whole genome comparisons revealed remarkable genomic differences between X. arboricola pv. juglandis and X. euroxanthea strains, which translate into different pathogenicity and virulence features, namely regarding the type 3 secretion system and its effectors and other secretory systems, chemotaxis-related proteins, and extracellular enzymes. Altogether, the distinct genomic repertoire of X. euroxanthea may be particularly useful to address pathogenicity emergence and evolution in walnut-associated Xanthomonas.