About

I am from the district of Porto. I obtained my degree (Licenciatura) in Electrical and Computer Engineering in 2001, my Master's degree in Communication Networks and Services in 2004, and my PhD in Electrical and Computer Engineering in 2012, all from the Faculdade de Engenharia da Universidade do Porto (FEUP). I have been with INESC TEC since 2001, where I am a Senior Researcher at the Centre for Telecommunications and Multimedia. I am also an Invited Adjunct Professor at the Department of Electrical Engineering of the Instituto Superior de Engenharia do Porto (ISEP). My current research interests include image and video processing, multimedia systems, and computer vision.

Topics of interest
Details

  • Name

    Pedro Miguel Carvalho
  • Position

    Senior Researcher
  • Since

    01 September 2001
014
Publications

2025

Automatic Visual Inspection for Industrial Application

Authors
Ribeiro, AG; Vilaça, L; Costa, C; da Costa, TS; Carvalho, PM;

Publication
JOURNAL OF IMAGING

Abstract
Quality control represents a critical function in industrial environments, ensuring that manufactured products meet strict standards and remain free from defects. In highly regulated sectors such as the pharmaceutical industry, traditional manual inspection methods remain widely used. However, these are time-consuming and prone to human error, and they lack the reliability required for large-scale operations, highlighting the urgent need for automated solutions. This is crucial for industrial applications, where environments evolve and new defect types can arise unpredictably. This work proposes an automated visual defect detection system specifically designed for pharmaceutical bottles, with potential applicability in other manufacturing domains. Various methods were integrated to create robust tools capable of real-world deployment. A key strategy is the use of incremental learning, which enables machine learning models to incorporate new, unseen data without full retraining, thus enabling adaptation to new defects as they appear, allowing models to handle rare cases while maintaining stability and performance. The proposed solution incorporates a multi-view inspection setup to capture images from multiple angles, enhancing accuracy and robustness. Evaluations in real-world industrial conditions demonstrated high defect detection rates, confirming the effectiveness of the proposed approach.
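To illustrate the incremental-learning strategy mentioned in the abstract, the sketch below shows one common way to update a classifier with new defect examples without retraining from scratch. It is not the system described in the paper; the feature extractor, class labels, and data are placeholder assumptions.

```python
# Minimal sketch of the incremental-learning idea (not the authors' system):
# a linear classifier is updated with partial_fit as new, previously unseen
# defect examples arrive, avoiding full retraining. The feature extractor is
# a stub; in practice it could be a CNN embedding of each camera view.
import numpy as np
from sklearn.linear_model import SGDClassifier

CLASSES = np.array([0, 1])  # assumed labels: 0 = good bottle, 1 = defective

def extract_features(images: np.ndarray) -> np.ndarray:
    """Placeholder feature extractor: flatten images to vectors."""
    return images.reshape(len(images), -1).astype(np.float32)

model = SGDClassifier(random_state=0)

def update_with_new_batch(images: np.ndarray, labels: np.ndarray) -> None:
    """Incorporate a newly collected batch without retraining from scratch."""
    model.partial_fit(extract_features(images), labels, classes=CLASSES)

def inspect(images: np.ndarray) -> np.ndarray:
    """Classify a batch of bottle images (1 = defect detected)."""
    return model.predict(extract_features(images))

# Usage with synthetic data standing in for multi-view bottle crops.
rng = np.random.default_rng(0)
update_with_new_batch(rng.random((32, 64, 64)), rng.integers(0, 2, size=32))
print(inspect(rng.random((4, 64, 64))))
```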

2025

Exploring Motion Information in Homography Calculation for Football Matches With Moving Cameras

Authors
Gomes, C; Mastralexi, C; Carvalho, P;

Publication
IEEE ACCESS

Abstract
In football, where minor differences can significantly affect outcomes and performance, automatic video analysis has become a critical tool for analyzing and optimizing team strategies. However, many existing solutions require expensive and complex hardware comprising multiple cameras, sensors, or GPS devices, limiting accessibility for many clubs, particularly those with limited resources. Using images and video from a moving camera can help a wider audience benefit from video analysis, but it introduces new challenges related to motion. To address this, we explore an alternative homography estimation in moving camera scenarios. Homography plays a crucial role in video analysis, but presents challenges when keypoints are sparse, especially in dynamic environments. Existing techniques predominantly rely on visible keypoints and apply homography transformations on a frame-by-frame basis, often lacking temporal consistency and facing challenges in areas with sparse keypoints. This paper explores the use of estimated motion information for homography computation. Our experimental results reveal that integrating motion data directly into homography estimations leads to reduced errors in keypoint-sparse frames, surpassing state-of-the-art methods, filling a current gap in moving camera scenarios.
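As an illustration of how motion information can complement keypoint-based homography estimation, the snippet below composes a frame-to-frame motion homography with the previous frame's pitch homography whenever pitch keypoints are too sparse. This is a generic sketch under assumed conventions, not the method of the paper.

```python
# Illustrative sketch: propagate the image-to-pitch homography using camera
# motion estimated from sparse optical flow when keypoints are insufficient.
import cv2
import numpy as np

def keypoint_homography(frame_points, pitch_points):
    """Direct estimate from detected pitch keypoints (image -> pitch plane)."""
    if frame_points is None or len(frame_points) < 4:
        return None
    H, _ = cv2.findHomography(frame_points, pitch_points, cv2.RANSAC, 3.0)
    return H

def motion_homography(prev_gray, curr_gray):
    """Frame-to-frame homography from tracked corners (camera motion)."""
    prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                       qualityLevel=0.01, minDistance=8)
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   prev_pts, None)
    ok = status.ravel() == 1
    H, _ = cv2.findHomography(curr_pts[ok], prev_pts[ok], cv2.RANSAC, 3.0)
    return H  # maps current-frame coordinates to previous-frame coordinates

def update_pitch_homography(prev_H_pitch, prev_gray, curr_gray,
                            frame_points=None, pitch_points=None):
    """Prefer the keypoint estimate; otherwise propagate via motion."""
    H_kp = keypoint_homography(frame_points, pitch_points)
    if H_kp is not None:
        return H_kp
    # current frame -> previous frame -> pitch plane
    return prev_H_pitch @ motion_homography(prev_gray, curr_gray)
```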

2025

Correction to: A Review of Recent Advances and Challenges in Grocery Label Detection and Recognition (Applied Sciences, (2023), 13, 5, (2871), 10.3390/app13052871)

Authors
Guimarães, V; Nascimento, J; Viana, P; Carvalho, P;

Publication
Applied Sciences (Switzerland)

Abstract
There was an error in the original publication [1]. The statement in the Acknowledgments section is incorrect and should be removed because the official start of the project WATSON was after the paper’s publication date. The authors state that the scientific conclusions are unaffected. This correction was approved by the Academic Editor. The original publication has also been updated. © 2025 by the authors.

2025

Enhancing Weakly-Supervised Video Anomaly Detection With Temporal Constraints

Authors
Caetano, F; Carvalho, P; Mastralexi, C; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
Anomaly Detection has been a significant field in Machine Learning since it began gaining traction. In the context of Computer Vision, interest has grown markedly, as it enables the development of video processing models for different tasks without the cumbersome effort of annotating possible events, which may be underrepresented. Of the predominant strategies, weakly- and semi-supervised, the former has demonstrated the potential to achieve higher performance, in addition to its flexibility. This work shows that using temporal ranking constraints for Multiple Instance Learning can increase the performance of these models, allowing them to focus on the most informative instances. Moreover, the results suggest that altering the ranking process to include information about adjacent instances yields the best-performing models.
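The following minimal sketch illustrates a Multiple Instance Learning ranking loss with a temporal smoothness term, in the spirit of the temporal constraints discussed in the abstract; the exact formulation, margin, and weighting are assumptions, not the paper's loss.

```python
# Each video is a bag of per-segment anomaly scores. The top score of the
# anomalous bag should rank above the top score of the normal bag (hinge
# ranking), and adjacent segment scores are encouraged to vary smoothly.
import torch

def mil_ranking_loss(anomalous_scores: torch.Tensor,
                     normal_scores: torch.Tensor,
                     margin: float = 1.0,
                     lambda_smooth: float = 8e-5) -> torch.Tensor:
    """anomalous_scores, normal_scores: (num_segments,) per-segment scores."""
    ranking = torch.relu(margin - anomalous_scores.max() + normal_scores.max())
    # Temporal constraint: penalise abrupt changes between adjacent segments.
    smooth = ((anomalous_scores[1:] - anomalous_scores[:-1]) ** 2).sum()
    return ranking + lambda_smooth * smooth

# Usage with toy per-segment scores produced by some scoring model.
anom = torch.tensor([0.1, 0.2, 0.9, 0.8], requires_grad=True)
norm = torch.tensor([0.2, 0.1, 0.3, 0.2], requires_grad=True)
loss = mil_ranking_loss(anom, norm)
loss.backward()
print(float(loss))
```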

2024

A Transition Towards Virtual Representations of Visual Scenes

Authors
Pereira, A; Carvalho, P; Côrte Real, L;

Publication
Advances in Internet of Things & Embedded Systems

Abstract
We propose a unified architecture for visual scene understanding, aimed at overcoming the limitations of traditional, fragmented approaches in computer vision. Our work focuses on creating a system that accurately and coherently interprets visual scenes, with the ultimate goal to provide a 3D virtual representation, which is particularly useful for applications in virtual and augmented reality. By integrating various visual and semantic processing tasks into a single, adaptable framework, our architecture simplifies the design process, ensuring a seamless and consistent scene interpretation. This is particularly important in complex systems that rely on 3D synthesis, as the need for precise and semantically coherent scene descriptions keeps on growing. Our unified approach addresses these challenges, offering a flexible and efficient solution. We demonstrate the practical effectiveness of our architecture through a proof-of-concept system and explore its potential in various application domains, proving its value in advancing the field of computer vision.
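Purely as an illustration of the unified-architecture idea (a skeleton under assumed interfaces, not the authors' implementation), the sketch below chains independent visual and semantic tasks behind a single pipeline that incrementally enriches one shared scene description, which could then feed a 3D virtual representation.

```python
# Skeleton of a task pipeline that builds up one shared scene description.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class SceneDescription:
    """Shared, incrementally enriched description of the observed scene."""
    data: Dict[str, Any] = field(default_factory=dict)

Task = Callable[[Any, SceneDescription], SceneDescription]

@dataclass
class ScenePipeline:
    """Runs registered tasks (detection, segmentation, depth, ...) in order."""
    tasks: List[Task] = field(default_factory=list)

    def register(self, task: Task) -> None:
        self.tasks.append(task)

    def process(self, frame: Any) -> SceneDescription:
        description = SceneDescription()
        for task in self.tasks:
            description = task(frame, description)
        return description

# Example task stubs; real modules would wrap actual models.
def detect_objects(frame, desc):
    desc.data["objects"] = []   # e.g. bounding boxes + labels
    return desc

def estimate_depth(frame, desc):
    desc.data["depth"] = None   # e.g. per-pixel depth map
    return desc

pipeline = ScenePipeline()
pipeline.register(detect_objects)
pipeline.register(estimate_depth)
print(pipeline.process(frame=None).data)
```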