2024
Authors
Vilaça, L; Viana, P; Carvalho, P; Andrade, MT;
Publication
IEEE ACCESS
Abstract
It is well known that the performance of Machine Learning techniques, notably when applied to Computer Vision (CV), depends heavily on the amount and quality of the training data. However, large datasets lead to time-consuming training loops and, in many situations, are difficult or even impossible to create. There is therefore a need for solutions that reduce dataset size while ensuring good levels of performance, i.e., that achieve the best tradeoff between the amount/quality of training data and the model's performance. This paper proposes a dataset reduction approach for the training data used by Deep Learning methods in Facial Recognition (FR) problems. We focus on maximizing the variability of representations for each subject (person) in the training data, thus favoring quality over size. The main research questions are: 1) Which facial features better discriminate different identities? 2) Is it possible to significantly reduce the training time without compromising performance? 3) Should we favor quality over quantity for very large FR datasets? The analysis uses a pipeline to identify a set of features suitable for capturing diversity, and cluster-based sampling to select the best images for each training subject. Results were obtained using VGGFace2 and Labeled Faces in the Wild (for benchmarking) and show that, with the proposed approach, the training data can be reduced while maintaining similar levels of accuracy.
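The cluster-based sampling idea described in this abstract can be sketched as follows. This is an illustrative sketch, not the authors' code: for one subject, a small k-means is run over that subject's face embeddings and only the image nearest each centroid is kept, trading dataset size for per-subject variability. The embeddings here are random stand-ins; a real pipeline would extract them with a face-recognition network such as one trained on VGGFace2.

```python
import numpy as np

def kmeans(x, k, iters=50, seed=0):
    """Minimal k-means: returns cluster labels and centroids."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # assign each embedding to its nearest centroid
        labels = np.argmin(np.linalg.norm(x[:, None] - centers[None], axis=2), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = x[labels == c].mean(axis=0)
    return labels, centers

def select_diverse_images(embeddings, k):
    """Indices of the images closest to each of the k cluster centroids."""
    labels, centers = kmeans(embeddings, k)
    keep = []
    for c in range(k):
        members = np.where(labels == c)[0]
        if len(members) == 0:
            continue  # empty cluster: nothing to keep for it
        d = np.linalg.norm(embeddings[members] - centers[c], axis=1)
        keep.append(int(members[np.argmin(d)]))  # most central image of cluster c
    return sorted(set(keep))

rng = np.random.default_rng(1)
emb = rng.normal(size=(40, 128))       # 40 images of one subject, 128-D features
keep = select_diverse_images(emb, k=5)
print(len(keep), keep)                 # at most 5 image indices retained
```

Keeping one image per cluster approximates "maximizing the variability of representations" because each retained image represents a distinct mode of the subject's appearance.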
2023
Authors
da Costa, TS; Andrade, MT; Viana, P; Silva, NC;
Publication
DATA IN BRIEF
Abstract
The Data2MV dataset contains gaze fixation data obtained through experimental procedures from a total of 45 participants, using an Intel RealSense F200 camera module and seven different video playlists. Each playlist had an approximate duration of 20 minutes and was viewed at least 17 times, with raw tracking data recorded at 0.05-second intervals. The Data2MV dataset encompasses a total of 1,000,845 gaze fixations gathered across 128 experiments. It also comprises 68,393 image frames, extracted from each of the 6 videos selected for these experiments, and an equal number of saliency maps generated from aggregate fixation data. Software tools to obtain saliency maps and generate complementary plots are also provided as an open-source software package. The Data2MV dataset was publicly released to the research community on Mendeley Data and constitutes an important contribution to reducing the current scarcity of such data, particularly in immersive, multi-view streaming scenarios.
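A common way to build a saliency map from aggregate fixations, as described here, is to splat a Gaussian around each fixation point and normalise the result. The sketch below is independent of the released Data2MV tools; the frame size and sigma are illustrative values, not taken from the dataset.

```python
import numpy as np

def saliency_map(fixations, width, height, sigma=20.0):
    """Build a normalised saliency map from (x, y) pixel fixations."""
    ys, xs = np.mgrid[0:height, 0:width]      # per-pixel coordinate grids
    smap = np.zeros((height, width))
    for fx, fy in fixations:
        # add an isotropic Gaussian centred on the fixation point
        smap += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2 * sigma ** 2))
    return smap / smap.max() if smap.max() > 0 else smap

fix = [(100, 60), (110, 70), (300, 150)]       # hypothetical fixation points
m = saliency_map(fix, width=320, height=180)
print(m.shape, round(float(m.max()), 2))       # (180, 320) 1.0
```

Summing per-fixation Gaussians and normalising yields a heat map in [0, 1] whose peaks mark the regions most often fixated, which is what the complementary plotting tools visualise.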
2023
Authors
Costa, TS; Viana, P; Andrade, MT;
Publication
IEEE ACCESS
Abstract
Quality of Experience (QoE) in multi-view streaming systems is known to be severely affected by the latency associated with view-switching procedures. Anticipating the navigation intentions of the viewer on the multi-view scene can greatly reduce such latency. The research work presented in this article builds on this premise by proposing a new predictive view-selection mechanism. A VGG16-inspired Convolutional Neural Network (CNN) is used to identify the viewer's focus of attention and determine which views are most likely to be needed in the near term, i.e., the viewer's near-term viewing intentions. This way, those views can be locally buffered before they are actually needed. Two datasets were used to evaluate the prediction performance and the impact on latency, in particular when compared to the solution implemented in the previous version of our multi-view streaming system. The results translate into a generalized improvement in perceived QoE: a significant reduction in latency during view-switching procedures was achieved, and the user's visual interest was predicted with a high level of accuracy. An experimental platform was also established on which future predictive models can be integrated and compared with previously implemented ones.
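The buffering policy such a predictor enables can be sketched simply. This is an illustrative sketch, not the paper's implementation: given per-view probabilities (e.g., CNN outputs) of each view being the viewer's next focus, the client prefetches the current view plus the k most likely switch targets, so a switch can be served from the local buffer instead of incurring network latency.

```python
import numpy as np

def views_to_buffer(probs, current_view, k=2):
    """probs[i] = predicted probability that view i is the next focus.

    Returns the current view plus the k most likely switch targets.
    """
    order = np.argsort(probs)[::-1]                   # most likely view first
    picks = [int(v) for v in order if v != current_view][:k]
    return [current_view] + picks

p = np.array([0.05, 0.60, 0.10, 0.20, 0.05])          # hypothetical 5-view scene
print(views_to_buffer(p, current_view=2, k=2))        # [2, 1, 3]
```

Choosing k trades buffer/bandwidth cost against the chance that the viewer switches to an unbuffered view and experiences the full switching latency.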
2023
Authors
da Costa, TS; Andrade, MT; Viana, P; Silva, NC;
Publication
PROCEEDINGS OF THE 14TH ACM MULTIMEDIA SYSTEMS CONFERENCE, MMSYS 2023
Abstract
Immersive video applications impose impractical bandwidth requirements on best-effort networks. With Multi-View (MV) streaming, these can be minimized by resorting to view-prediction techniques. SmoothMV is a multi-view system that uses a non-intrusive head-tracking mechanism to detect the viewer's interest and select appropriate views. By coupling it with Neural Networks (NNs) that anticipate the viewer's interest, a reduction in view-switching latency can be obtained. The objective of this paper is twofold: 1) present a solution for acquiring gaze data from users viewing MV content; 2) describe a dataset, collected with a large-scale testbed, that can be used to train NNs to predict the user's viewing interest. Tracking data from head movements was obtained from 45 participants using an Intel RealSense F200 camera and 7 video playlists, each viewed a minimum of 17 times. This dataset is publicly available to the research community and constitutes an important contribution to reducing the current scarcity of such data. Tools to obtain saliency/heat maps and generate complementary plots are also provided as an open-source software package.
2012
Authors
Andrade, MT; Dogan, S; Carreras, A; Barbosa, V; Arachchi, HK; Delgado, J; Kondoz, AM;
Publication
MULTIMEDIA TOOLS AND APPLICATIONS
Abstract
A major challenge when accessing protected multimedia content in heterogeneous usage environments is to provide acceptable levels of quality of experience to all involved users. Additionally, it should be possible to address different levels of protection when manipulating the content to maximize the quality of experience. This paper describes the use of a context-aware and Digital Rights Management (DRM)-enabled content adaptation platform to meet these challenges. The platform was conceived to deliver advanced content adaptation within different application scenarios, among which Virtual Collaboration (VC) was central. Descriptions of use cases implemented by the platform in heterogeneous VC environments are provided. The experiments conducted highlight the benefits to users when compared to operation without the platform. Results of different adaptations suited to sensed context conditions are also provided and analyzed. A brief description of the platform's functionality is included, together with pointers to additional information.