About

I obtained my BSc and MSc in Computer Science at the Faculty of Science in the University of Porto.

Since 2014 I have been working at INESC TEC, mainly on computer vision, and I am currently also a PhD student at the Faculty of Engineering of the University of Porto.

My research focuses mainly on Computer Vision, with an additional emphasis on Machine Learning and Virtual Reality.

Publications

2022

Towards vehicle occupant-invariant models for activity characterisation

Authors
Capozzi, L; Barbosa, V; Pinto, C; Pinto, JR; Pereira, A; Carvalho, PM; Cardoso, JS;

Publication
IEEE ACCESS

2022

Boosting color similarity decisions using the CIEDE2000_PF Metric

Authors
Pereira, A; Carvalho, P; Corte Real, L;

Publication
SIGNAL IMAGE AND VIDEO PROCESSING

2021

Automatic TV Logo Identification for Advertisement Detection without Prior Data

Authors
Carvalho, P; Pereira, A; Viana, P;

Publication
APPLIED SCIENCES-BASEL

Abstract
Advertisements are often inserted in multimedia content, and this is particularly relevant in TV broadcasting as they have a key financial role. In this context, the flexible and efficient processing of TV content to identify advertisement segments is highly desirable as it can benefit different actors, including the broadcaster, the contracting company, and the end user. In this context, detecting the presence of the channel logo has been seen in the state-of-the-art as a good indicator. However, the difficulty of this challenging process increases as less prior data is available to help reduce uncertainty. As a result, the literature proposals that achieve the best results typically rely on prior knowledge or pre-existent databases. This paper proposes a flexible method for processing TV broadcasting content aiming at detecting channel logos, and consequently advertising segments, without using prior data about the channel or content. The final goal is to enable stream segmentation identifying advertisement slices. The proposed method was assessed over available state-of-the-art datasets as well as additional and more challenging stream captures. Results show that the proposed method surpasses the state-of-the-art.

2020

Efficient CIEDE2000-based Color Similarity Decision for Computer Vision

Authors
Pereira, A; Carvalho, P; Coelho, G; Corte Real, L;

Publication
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

2016

Cognition inspired format for the expression of computer vision metadata

Authors
Castro, H; Monteiro, J; Pereira, A; Silva, D; Coelho, G; Carvalho, P;

Publication
MULTIMEDIA TOOLS AND APPLICATIONS

Abstract
Over the last decade noticeable progress has occurred in automated computer interpretation of visual information. Computers running artificial intelligence algorithms are growingly capable of extracting perceptual and semantic information from images, and registering it as metadata. There is also a growing body of manually produced image annotation data. All of this data is of great importance for scientific purposes as well as for commercial applications. Optimizing the usefulness of this, manually or automatically produced, information implies its precise and adequate expression at its different logical levels, making it easily accessible, manipulable and shareable. It also implies the development of associated manipulating tools. However, the expression and manipulation of computer vision results has received less attention than the actual extraction of such results. Hence, it has experienced a smaller advance. Existing metadata tools are poorly structured, in logical terms, as they intermix the declaration of visual detections with that of the observed entities, events and comprising context. This poor structuring renders such tools rigid, limited and cumbersome to use. Moreover, they are unprepared to deal with more advanced situations, such as the coherent expression of the information extracted from, or annotated onto, multi-view video resources. The work here presented comprises the specification of an advanced XML based syntax for the expression and processing of Computer Vision relevant metadata. This proposal takes inspiration from the natural cognition process for the adequate expression of the information, with a particular focus on scenarios of varying numbers of sensory devices, notably, multi-view video.