Publications

Publications by João Tiago Pinto

2021

My Eyes Are Up Here: Promoting Focus on Uncovered Regions in Masked Face Recognition

Authors
Neto, PC; Boutros, F; Pinto, JR; Saffari, M; Damer, N; Sequeira, AF; Cardoso, JS;

Publication
PROCEEDINGS OF THE 20TH INTERNATIONAL CONFERENCE OF THE BIOMETRICS SPECIAL INTEREST GROUP (BIOSIG 2021)

Abstract
The recent Covid-19 pandemic, and the fact that wearing masks in public is now mandatory in several countries, created challenges in the use of face recognition systems (FRS). In this work, we address the challenge of masked face recognition (MFR) and focus on evaluating the verification performance in FRS when verifying masked vs unmasked faces compared to verifying only unmasked faces. We propose a methodology that combines the traditional triplet loss and the mean squared error (MSE), intending to improve the robustness of an MFR system in the masked-unmasked comparison mode. The results obtained by our proposed method show improvements in a detailed step-wise ablation study. The conducted study showed significant performance gains induced by our proposed training paradigm and modified triplet loss on two evaluation databases.
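
As a rough illustration of the idea in the abstract above, the sketch below adds an MSE term to a standard triplet loss. The abstract does not say how the two terms are combined, so everything specific here is an assumption made for illustration: the unweighted sum, the `mse_weight` parameter, the use of squared Euclidean distance, and the choice of applying the MSE to the anchor (unmasked) and positive (masked) embeddings. It is not the authors' implementation.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin: float = 1.0) -> float:
    """Standard triplet loss: pull the positive closer than the negative by a margin."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two embeddings."""
    return squared_distance(a, b) / len(a)


def combined_loss(anchor, positive, negative,
                  margin: float = 1.0, mse_weight: float = 1.0) -> float:
    """Hypothetical combination: triplet loss plus a weighted MSE term.

    Assumption: the MSE is taken between the anchor (e.g. unmasked face)
    and the positive (e.g. masked face of the same identity).
    """
    return (triplet_loss(anchor, positive, negative, margin)
            + mse_weight * mse(anchor, positive))


if __name__ == "__main__":
    unmasked = [0.0, 0.0]   # anchor embedding
    masked = [1.0, 0.0]     # same identity, wearing a mask
    impostor = [3.0, 0.0]   # different identity
    print(combined_loss(unmasked, masked, impostor))
```

In this toy example the triplet term is already satisfied, so only the MSE term still penalises the gap between the masked and unmasked embeddings of the same person.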

2021

AUTOMOTIVE: A Case Study on AUTOmatic multiMOdal Drowsiness detecTIon for smart VEhicles

Authors
Esteves, T; Pinto, JR; Ferreira, PM; Costa, PA; Rodrigues, LA; Antunes, I; Lopes, G; Gamito, P; Abrantes, AJ; Jorge, PM; Lourenco, A; Sequeira, AF; Cardoso, JS; Rebelo, A;

Publication
IEEE ACCESS

Abstract
As technology and artificial intelligence conquer a place under the spotlight in the automotive world, driver drowsiness monitoring systems have sparked much interest as a way to increase safety and avoid sleepiness-related accidents. Such technologies, however, stumble upon the observation that each driver presents a distinct set of behavioral and physiological manifestations of drowsiness, thus rendering its objective assessment a non-trivial process. The AUTOMOTIVE project studied the application of signal processing and machine learning techniques for driver-specific drowsiness detection in smart vehicles, enabled by immersive driving simulators. More broadly, comprehensive research on biometrics using the electrocardiogram (ECG) and face enables the continuous learning of subject-specific models of drowsiness for more efficient monitoring. This paper aims to offer a holistic but comprehensive view of the research and development work conducted for the AUTOMOTIVE project across the various addressed topics and how it ultimately brings us closer to the target of improved driver drowsiness monitoring.

2021

Optimizing Person Re-Identification Using Generated Attention Masks

Authors
Capozzi, L; Pinto, JR; Cardoso, JS; Rebelo, A;

Publication
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers

Abstract
The task of person re-identification has important applications in security and surveillance systems. It is a challenging problem since there can be many differences between pictures belonging to the same person, such as lighting, camera position, variation in poses and occlusions. The use of Deep Learning has contributed greatly towards more effective and accurate systems. Many works use attention mechanisms to force the models to focus on less distinctive areas, in order to improve performance in situations where important information may be missing. This paper proposes a new, more flexible method for calculating these masks, using a U-Net which receives a picture and outputs a mask representing the most distinctive areas of the picture. Results show that the method achieves an accuracy comparable or superior to that of state-of-the-art methods.

2021

End-to-End Deep Sketch-to-Photo Matching Enforcing Realistic Photo Generation

Authors
Capozzi, L; Pinto, JR; Cardoso, JS; Rebelo, A;

Publication
Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Porto, Portugal, May 10-13, 2021, Revised Selected Papers

Abstract
The traditional task of locating suspects using forensic sketches posted in public spaces, news, and social media can be difficult. Recent methods that use computer vision to improve this process present limitations, as they either do not use end-to-end networks for sketch recognition in police databases (which generally improve performance) and/or do not offer a photo-realistic representation of the sketch that could be used as an alternative if the automatic matching process fails. This paper proposes a method that combines these two properties, using a conditional generative adversarial network (cGAN) and a pre-trained face recognition network that are jointly optimised as an end-to-end model. While the model can identify a short list of potential suspects in a given database, the cGAN offers an intermediate realistic face representation to support an alternative manual matching process. Evaluation on sketch-photo pairs from the CUFS, CUFSF and CelebA databases reveals that the proposed method outperforms the state-of-the-art in most tasks, and that forcing an intermediate photo-realistic representation only results in a small performance decrease.

2021

FocusFace: Multi-task Contrastive Learning for Masked Face Recognition

Authors
Neto, PC; Boutros, F; Pinto, JR; Damer, N; Sequeira, AF; Cardoso, JS;

Publication
2021 16TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2021)

Abstract
SARS-CoV-2 has presented direct and indirect challenges to the scientific community. One of the most prominent indirect challenges arises from the mandatory use of face masks in a large number of countries. Face recognition methods struggle to perform identity verification with similar accuracy on masked and unmasked individuals. It has been shown that the performance of these methods drops considerably in the presence of face masks, especially if the reference image is unmasked. We propose FocusFace, a multi-task architecture that uses contrastive learning to accurately perform masked face recognition. The proposed architecture is designed to be trained from scratch or to work on top of state-of-the-art face recognition methods without sacrificing the capabilities of existing models in conventional face recognition tasks. We also explore different approaches to design the contrastive learning module. Results are presented in terms of masked-masked (M-M) and unmasked-masked (U-M) face verification performance. For both settings, the results are on par with published methods, but for M-M specifically, the proposed method was able to outperform all the solutions that it was compared to. We further show that when using our method on top of already existing methods the training computational costs decrease significantly while retaining similar performances. The implementation and the trained models are available at GitHub.

2022

Streamlining Action Recognition in Autonomous Shared Vehicles with an Audiovisual Cascade Strategy

Authors
Pinto, JR; Carvalho, P; Pinto, C; Sousa, A; Capozzi, L; Cardoso, JS;

Publication
PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5

Abstract
With the advent of self-driving cars, and big companies such as Waymo or Bosch pushing forward into fully driverless transportation services, the in-vehicle behaviour of passengers must be monitored to ensure safety and comfort. The use of audio-visual information is attractive for its spatio-temporal richness as well as non-invasive nature, but faces the likely constraints posed by available hardware and energy consumption. Hence new strategies are required to improve the usage of these scarce resources. We propose the processing of audio and visual data in a cascade pipeline for in-vehicle action recognition. The data is processed by modality-specific sub-modules, with subsequent ones being used when a confident classification is not reached. Experiments show an interesting accuracy-acceleration trade-off when compared with a parallel pipeline with late fusion, presenting potential for industrial applications on embedded devices.
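
The cascade strategy described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: the `cascade_classify` function, the `(label, confidence)` stage interface, the 0.8 threshold, the audio-before-video ordering, and the fallback to the most confident prediction are all hypothetical choices.

```python
from typing import Any, Callable, Sequence, Tuple

# Each stage is a modality-specific sub-module returning (label, confidence).
Stage = Callable[[Any], Tuple[str, float]]


def cascade_classify(sample: Any, stages: Sequence[Stage],
                     threshold: float = 0.8) -> Tuple[str, float, int]:
    """Run stages in order and stop at the first confident prediction.

    Returns (label, confidence, number_of_stages_run). If no stage is
    confident, falls back to the most confident prediction seen
    (an assumption; the abstract does not specify a fallback).
    """
    if not stages:
        raise ValueError("at least one stage is required")
    best_label, best_conf = "", -1.0
    stages_run = 0
    for stage in stages:
        label, conf = stage(sample)
        stages_run += 1
        if conf > best_conf:
            best_label, best_conf = label, conf
        if conf >= threshold:
            break  # confident enough: skip the remaining, costlier stages
    return best_label, best_conf, stages_run


if __name__ == "__main__":
    # Toy stand-ins for real audio and video classifiers.
    def audio_stage(sample):
        return ("talking", sample["audio_conf"])

    def video_stage(sample):
        return ("fighting", sample["video_conf"])

    clip = {"audio_conf": 0.4, "video_conf": 0.85}
    print(cascade_classify(clip, [audio_stage, video_stage]))
```

The potential saving comes from the early exit: when the first (presumably cheaper) stage is confident, later stages never run, whereas a parallel late-fusion pipeline always evaluates every modality.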
