About

João Gama is a Full Professor at the Faculty of Economics of the University of Porto. He is a researcher and vice-director of LIAAD, INESC TEC. He completed his PhD at the University of Porto in 2000. He is a Senior Member of the IEEE. He has worked on several national and European projects on incremental and adaptive learning systems, real-time knowledge discovery, and learning from massive and structured data. He served as PC chair of ECML 2005, DS 2009, ADMA 2009, IDA 2011 and ECML/PKDD 2015, and as ACM SAC track chair from 2007 to 2018. He has organised a series of workshops on knowledge discovery from data streams at ECML PKDD, ICML and ACM SIGKDD. He is the author of several books on data mining and of a monograph on knowledge discovery from data streams. He has authored more than 250 peer-reviewed papers in areas related to machine learning, real-time data learning and data streams. He is a member of the editorial boards of the international journals ML, DMKD, TKDE, IDA, NGC and KAIS. He has supervised more than 15 PhD students and 50 master's students.

Topics of interest
Details

  • Name

    João Gama
  • Position

    Coordinating Researcher
  • Since

    1 April 2009
Publications

2024

Classification of Pulmonary Nodules in 2-[18F]FDG PET/CT Images with a 3D Convolutional Neural Network

Authors
Alves, VM; Cardoso, JD; Gama, J;

Publication
NUCLEAR MEDICINE AND MOLECULAR IMAGING

Abstract
Purpose: 2-[F-18]FDG PET/CT plays an important role in the management of pulmonary nodules. Convolutional neural networks (CNNs) automatically learn features from images and have the potential to improve the discrimination between malignant and benign pulmonary nodules. The purpose of this study was to develop and validate a CNN model for classification of pulmonary nodules from 2-[F-18]FDG PET images. Methods: One hundred thirteen participants were retrospectively selected, with one nodule per participant. The 2-[F-18]FDG PET images were preprocessed and annotated with the reference standard. The deep learning experiment entailed random data splitting into five sets. A test set was held out for evaluation of the final model. Four-fold cross-validation was performed on the remaining sets for training and evaluating a set of candidate models and for selecting the final model. Three types of 3D CNN architecture (Stacked 3D CNN, VGG-like and Inception-v2-like models) were trained from random weight initialization on both the original and augmented datasets. Transfer learning from ImageNet with ResNet-50 was also used. Results: The final model (Stacked 3D CNN model) obtained an area under the ROC curve of 0.8385 (95% CI: 0.6455-1.0000) in the test set. The model had a sensitivity of 80.00%, a specificity of 69.23% and an accuracy of 73.91% in the test set, for an optimised decision threshold that assigns a higher cost to false negatives. Conclusion: A 3D CNN model was effective at distinguishing benign from malignant pulmonary nodules in 2-[F-18]FDG PET images.
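
The three percentages reported for the test set can be cross-checked with simple arithmetic. The sketch below is an illustration only: the abstract does not give the test-set counts, so the confusion matrix used here (8 true positives, 2 false negatives, 9 true negatives, 4 false positives) is an assumption, chosen because it reproduces the reported figures. The function name `rates` is also hypothetical.

```python
from fractions import Fraction


def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as exact fractions."""
    sensitivity = Fraction(tp, tp + fn)   # share of malignant nodules detected
    specificity = Fraction(tn, tn + fp)   # share of benign nodules cleared
    accuracy = Fraction(tp + tn, tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract): 10 malignant, 13 benign nodules.
sens, spec, acc = rates(tp=8, fn=2, tn=9, fp=4)
print(f"{float(sens):.2%} {float(spec):.2%} {float(acc):.2%}")
# prints: 80.00% 69.23% 73.91%
```

Under these assumed counts, the lower specificity reflects the threshold choice described in the abstract: accepting more false positives in exchange for fewer false negatives.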

2024

Forecasting financial market structure from network features using machine learning

Authors
Castilho, D; Souza, TTP; Kang, SM; Gama, J; de Carvalho, ACPLF;

Publication
KNOWLEDGE AND INFORMATION SYSTEMS

Abstract
We propose a model that forecasts market correlation structure from link- and node-based financial network features using machine learning. To this end, market structure is modeled as a dynamic asset network by quantifying time-dependent co-movement of asset price returns across company constituents of major global market indices. We provide empirical evidence using three different network filtering methods to estimate market structure, namely Dynamic Asset Graph, Dynamic Minimal Spanning Tree and Dynamic Threshold Networks. Experimental results show that the proposed model can forecast market structure with high predictive performance, with up to 40% improvement over a time-invariant correlation-based benchmark. Non-pair-wise correlation features proved important compared to traditionally used pair-wise correlation measures for all markets studied, particularly in the long-term forecasting of stock market structure. Evidence is provided for stock constituents of the DAX30, EUROSTOXX50, FTSE100, HANGSENG50, NASDAQ100 and NIFTY50 market indices. Findings can be useful to improve portfolio selection and risk management methods, which commonly rely on a backward-looking covariance matrix to estimate portfolio risk.
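
To make the idea of a minimal spanning tree over asset correlations concrete, here is a minimal sketch for a single time window. It is not the authors' pipeline: the distance transform d = sqrt(2 (1 - rho)), the asset names and the return values are all assumptions made for illustration, and a dynamic version would repeat the construction over successive windows.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def minimum_spanning_tree(returns):
    """Kruskal's algorithm on correlation distances (assumed transform)."""
    names = sorted(returns)
    edges = sorted(
        (math.sqrt(max(0.0, 2 * (1 - pearson(returns[a], returns[b])))), a, b)
        for a, b in combinations(names, 2)
    )
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    tree = []
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # edge joins two components: keep it
            parent[root_a] = root_b
            tree.append((a, b))
    return tree


# Hypothetical returns for three assets in one window.
returns = {
    "A": [0.010, -0.020, 0.030, -0.010, 0.020],
    "B": [0.012, -0.018, 0.028, -0.012, 0.021],
    "C": [-0.010, 0.015, -0.020, 0.012, -0.018],
}
tree = minimum_spanning_tree(returns)
print(tree)
```

The tree keeps only the n - 1 strongest links needed to connect all assets, which is what makes it a filtering method: here the closely co-moving pair A and B is linked first.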

2024

SWINN: Efficient nearest neighbor search in sliding windows using graphs

Authors
Mastelini, SM; Veloso, B; Halford, M; de Carvalho, ACPLF; Gama, J;

Publication
INFORMATION FUSION

Abstract
Nearest neighbor search (NNS) is one of the main concerns in data stream applications since similarity queries can be used in multiple scenarios. Online NNS is usually performed on a sliding window by lazily scanning every element currently stored in the window. This paper proposes Sliding Window-based Incremental Nearest Neighbors (SWINN), a graph-based online search index algorithm for speeding up NNS in potentially never-ending and dynamic data stream tasks. Our proposal broadens the application of online NNS-based solutions, as even moderately large data buffers become impractical to handle when a naive NNS strategy is selected. SWINN enables efficient handling of large data buffers by using an incremental strategy to build and update a search graph supporting any distance metric. Vertices can be added and removed from the search graph. To keep the graph reliable for search queries, lightweight graph maintenance routines are run. According to experimental results, SWINN is significantly faster than performing a naive complete scan of the data buffer while keeping competitive search recall values. We also apply SWINN to online classification and regression tasks and show that our proposal is effective against popular online machine learning algorithms.
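
For context, the naive baseline that the abstract describes (scanning every element stored in the sliding window) can be sketched in a few lines. This is not SWINN and involves no graph index; the class name `NaiveWindowSearch` and the sample points are hypothetical, and Euclidean distance is only a default choice.

```python
import math
from collections import deque


class NaiveWindowSearch:
    """Brute-force nearest neighbor search over a fixed-size sliding window."""

    def __init__(self, window_size, distance=math.dist):
        # A bounded deque drops the oldest point once the window is full.
        self.window = deque(maxlen=window_size)
        self.distance = distance

    def add(self, point):
        self.window.append(point)

    def query(self, point, k=1):
        # Full scan: every query costs one distance call per stored point.
        return sorted(self.window, key=lambda p: self.distance(p, point))[:k]


index = NaiveWindowSearch(window_size=3)
for p in [(0, 0), (1, 1), (5, 5), (6, 6)]:
    index.add(p)

# (0, 0) has already left the window, so its nearest stored neighbor is (1, 1).
print(index.query((0, 0), k=1))
```

Each query costs time proportional to the window size, which is the cost the abstract says becomes impractical for larger buffers.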

2024

Improving hyper-parameter self-tuning for data streams by adapting an evolutionary approach

Authors
Moya, AR; Veloso, B; Gama, J; Ventura, S;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Hyper-parameter tuning of machine learning models has become a crucial task in achieving optimal performance. Several researchers have explored this optimisation task over the last decades in search of a state-of-the-art method. However, most of them focus on batch or offline learning, where data distributions do not change arbitrarily over time. Dealing with data streams and online learning, on the other hand, is a challenging problem. As technology advances, sophisticated techniques for processing these data streams become increasingly important. Thus, improving hyper-parameter self-tuning during online learning of these machine learning models is crucial. To this end, in this paper, we present MESSPT, an evolutionary algorithm for self-hyper-parameter tuning for data streams. We apply Differential Evolution to dynamically-sized samples, requiring a single pass over the data to train and evaluate models and choose the best configurations. We limit the number of configurations to be evaluated, which necessarily has to be reduced, thus making this evolutionary approach a micro-evolutionary one. Furthermore, we control how our evolutionary algorithm deals with concept drift. Experiments on different learning tasks and over well-known datasets show that our proposed MESSPT outperforms the state-of-the-art on hyper-parameter tuning for data streams.
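
As background, the textbook Differential Evolution loop (the DE/rand/1/bin variant) with a deliberately small population can be sketched as follows. This is not MESSPT: it does not model dynamically-sized samples, single-pass evaluation or concept drift handling, and the objective `toy_loss`, the bounds and all parameter values are hypothetical stand-ins for a real validation loss over two hyper-parameters.

```python
import random


def differential_evolution(objective, bounds, pop_size=6, f=0.8, cr=0.9,
                           generations=50, seed=0):
    """Classic DE/rand/1/bin with greedy selection; returns (best, score)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated dimension
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # keep the trial only if not worse
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_loss(config):
    """Hypothetical loss with its minimum at lr=0.1, reg=0.01."""
    lr, reg = config
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2


bounds = [(0.0, 1.0), (0.0, 0.1)]
best_config, best_score = differential_evolution(toy_loss, bounds)
print(best_config, best_score)
```

Because selection is greedy, the best score in the population can never get worse from one generation to the next, which is why even a very small population makes steady progress on a simple objective like this one.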

2024

Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning

Authors
Salazar, T; Gama, J; Araújo, H; Abreu, PH;

Publication
CoRR

Supervised theses

2023

Determinants of political participation: A machine learning approach

Author
Rita Allen Valente Guedes de Pinho

Institution
UP-FEP

2023

Applied Machine Learning Fairness in Business to Consumer Services Industry

Author
Nuno Filipe Loureiro Paiva

Institution
UP-FEP

2023

Customers' revenue fluctuation in a Telecommunication company: Data Warehouse Construction and Visualization

Author
Cândido Rafael Toledo Rocha

Institution
UP-FEP

2023

Causal Reasoning in Data

Author
Ana Rita Dias Nogueira

Institution
UP-FEP

2023

Text mining of companies annual reports in PDF format

Author
Svetlana Zamyatina

Institution
UP-FEP