2025
Authors
Caetano, F; Carvalho, P; Mastralexi, C; Cardoso, JS;
Publication
IEEE ACCESS
Abstract
Anomaly Detection has been a significant field in Machine Learning since it began gaining traction. The interest is especially evident in Computer Vision, where it enables the development of video processing models for different tasks without the cumbersome effort of annotating possible events, which may be under-represented. Of the two predominant strategies, weakly and semi-supervised learning, the former has demonstrated the potential to achieve higher scores, in addition to its flexibility. This work shows that using temporal ranking constraints for Multiple Instance Learning can increase the performance of these models by focusing on the most informative instances. Moreover, the results suggest that altering the ranking process to include information about adjacent instances yields better-performing models.
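The abstract does not give the loss formulation, but a common instantiation of a temporally constrained MIL ranking objective (in the spirit of ranking losses with smoothness and sparsity terms) is sketched below; the hyperparameter values and the windowed variant are illustrative assumptions, not the authors' exact method.

```python
# Minimal sketch of a MIL ranking loss with temporal constraints (PyTorch).
# Weights and the windowed variant are assumptions, not the paper's formulation.
import torch

def mil_ranking_loss(pos_scores, neg_scores,
                     lambda_smooth=8e-5, lambda_sparse=8e-5):
    """pos_scores / neg_scores: (num_segments,) anomaly scores for one
    anomalous (positive) and one normal (negative) video bag."""
    # Hinge ranking: the top-scored positive segment should outrank
    # the top-scored negative segment.
    rank = torch.relu(1.0 - pos_scores.max() + neg_scores.max())
    # Temporal smoothness: adjacent segments should score similarly.
    smooth = ((pos_scores[1:] - pos_scores[:-1]) ** 2).sum()
    # Sparsity: only a few segments of an anomalous video are anomalous.
    sparse = pos_scores.sum()
    return rank + lambda_smooth * smooth + lambda_sparse * sparse

# One plausible reading of "including adjacent instances in the ranking" is to
# rank a local window average instead of a single segment, e.g.:
#   pos_scores.unfold(0, 3, 1).mean(1).max()
```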
2024
Authors
Pereira, A; Carvalho, P; Côrte Real, L;
Publication
Advances in Internet of Things & Embedded Systems
Abstract
2024
Authors
Vilça, L; Viana, P; Carvalho, P; Andrade, MT;
Publication
IEEE ACCESS
Abstract
It is well known that the performance of Machine Learning techniques, notably when applied to Computer Vision (CV), depends heavily on the amount and quality of the training data. However, large datasets lead to time-consuming training loops and, in many situations, are difficult or even impossible to create. There is therefore a need for solutions that reduce their size while ensuring good levels of performance, i.e., that obtain the best trade-off between the amount/quality of training data and the model's performance. This paper proposes a dataset reduction approach for training data used in Deep Learning methods for Facial Recognition (FR) problems. We focus on maximizing the variability of representations for each subject (person) in the training data, thus favoring quality over size. The main research questions are: 1) Which facial features better discriminate different identities? 2) Is it possible to significantly reduce the training time without compromising performance? 3) Should we favor quality over quantity for very large datasets in FR? The analysis uses a pipeline that discriminates a set of features suitable for capturing diversity, and cluster-based sampling to select the best images for each training subject, i.e., person. Results were obtained using VGGFace2 and Labeled Faces in the Wild (for benchmarking) and show that, with the proposed approach, data reduction is possible while ensuring similar levels of accuracy.
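The pipeline is described only at a high level in the abstract; a minimal sketch of cluster-based sampling over per-subject face embeddings, assuming the embeddings have already been extracted and using an illustrative cluster count, might look like this:

```python
# Hypothetical sketch of cluster-based training-set reduction per subject.
# Feature extraction and k are assumptions, not the paper's exact pipeline.
import numpy as np
from sklearn.cluster import KMeans

def select_diverse_images(embeddings, k=10, seed=0):
    """embeddings: (n_images, d) feature vectors for one subject.
    Returns indices of k images, one per cluster, to maximize variability."""
    k = min(k, len(embeddings))
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(embeddings)
    selected = []
    for c in range(k):
        members = np.flatnonzero(km.labels_ == c)
        # Keep the member closest to the cluster centre as its representative.
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c],
                               axis=1)
        selected.append(int(members[np.argmin(dists)]))
    return selected
```

Selecting one representative per cluster favors intra-subject variability over raw volume, which matches the paper's quality-over-quantity premise.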
2023
Authors
Romero, A; Carvalho, P; Corte-Real, L; Pereira, A;
Publication
JOURNAL OF IMAGING
Abstract
Gathering data sufficiently representative of human actions, shapes, and facial expressions, as required to train robust models, is costly and time-consuming. This has led to techniques such as transfer learning and data augmentation, which are, however, often insufficient. To address this, we propose a semi-automated mechanism for generating and editing visual scenes with synthetic humans performing various actions, with features such as background modification and manual adjustment of the 3D avatars, allowing users to create data with greater variability. We also propose a two-fold methodology for evaluating the results obtained with our method: (i) running an action classifier on the output data produced by the mechanism and (ii) generating masks of the avatars and the actors and comparing them through segmentation. The avatars were robust to occlusion, and their actions were recognizable and faithful to their respective input actors. The results also showed that even though the action classifier concentrates on the pose and movement of the synthetic humans, it strongly depends on contextual information to recognize the actions precisely. Generating avatars for complex activities also proved problematic, both for action recognition and for the clean, precise formation of the masks.
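The segmentation-based comparison is not specified in detail; one plausible way to score avatar-versus-actor mask agreement is plain intersection-over-union, as in this sketch (binary masks are assumed):

```python
# Illustrative mask-agreement metric; not necessarily the paper's exact score.
import numpy as np

def mask_iou(avatar_mask, actor_mask):
    """Binary (H, W) masks of the synthetic avatar and the real actor.
    Returns intersection-over-union in [0, 1]."""
    inter = np.logical_and(avatar_mask, actor_mask).sum()
    union = np.logical_or(avatar_mask, actor_mask).sum()
    return float(inter) / float(union) if union else 0.0
```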
2007
Authors
Ciobanu, G; Andrade, MT; Carvalho, P; Carrapatoso, E;
Publication
NOVAS PERSPECTIVAS EM SISTEMAS E TECNOLOGIAS DE INFORMACAO, VOL II
Abstract
MPEG-21 enables content consumers to access and interoperate with a large variety of multimedia resources and their descriptions in a flexible manner. Considering the great heterogeneity that presently exists across the entire multimedia content chain, and the growing importance of open standards in facilitating interoperation across environments, applications, and formats, an MPEG-21 Peer was developed to process and present complex multimedia content represented as MPEG-21 Digital Items. The novelty of the work lies essentially in the adoption of a Web Services architecture, based on a single Digital Item processing core available to all types of terminal devices.
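The paper does not expose its interface; purely as an illustration of the design choice, a single processing core shared by all terminal types behind one web service endpoint could look like the following sketch (the endpoint name and payload are hypothetical, not the MPEG-21 Peer's actual API):

```python
# Illustrative only: one Digital Item processing core exposed as a web service
# so heterogeneous terminals share a single implementation.
from flask import Flask, jsonify, request

app = Flask(__name__)

def process_digital_item(di_xml: str, terminal: str) -> dict:
    """Placeholder for the shared Digital Item processing core."""
    return {"terminal": terminal, "items": di_xml.count("<Item")}

@app.post("/mpeg21/process")
def process():
    payload = request.get_json()
    return jsonify(process_digital_item(payload["digital_item"],
                                        payload["terminal"]))
```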
2008
Authors
Shao, BL; Mattavelli, M; Renzi, D; Andrade, MT; Battista, S; Keller, S; Ciobanu, G; Carvalho, P;
Publication
2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4
Abstract
This paper addresses multimedia end-user system design for content distribution over heterogeneous networks and terminals, with a particular focus on end-to-end quality of service (QoS) control. A multimedia terminal has been conceived and implemented comprising a content-related metadata processor, a usage environment characteristics provider, an end-user QoS monitor, and an audio-visual player for Scalable Video Coding (SVC), the scalable extension of H.264, coordinated under a terminal middleware. This terminal enables end-to-end QoS control for content adaptation, at both the semantic and physical levels, to maximize the end user's perceptual experience while minimizing resource usage. The design illustrates a possible architecture for next-generation multimedia end-user terminals supporting the MPEG-21 and H.264 SVC standards.
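The abstract describes the architecture rather than the adaptation logic; one simple terminal-side realization is to let the QoS monitor pick the highest SVC layer whose bitrate fits the measured bandwidth, as in this illustrative sketch (the layer table and bitrates are assumptions, not from the paper):

```python
# Illustrative terminal-side SVC layer selection driven by a QoS monitor.
LAYERS = [  # (layer description, required kbit/s) -- assumed values
    ("base", 256),
    ("base+temporal", 512),
    ("base+temporal+spatial", 1024),
    ("full", 2048),
]

def select_layer(measured_kbps):
    """Return the highest SVC layer whose bitrate fits the measured bandwidth."""
    chosen = LAYERS[0][0]  # always keep at least the base layer
    for name, required in LAYERS:
        if required <= measured_kbps:
            chosen = name
    return chosen

print(select_layer(800))  # -> "base+temporal"
```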