Publications

Publications by Paula Viana

2021

Automatic TV Logo Identification for Advertisement Detection without Prior Data

Authors
Carvalho, P; Pereira, A; Viana, P;

Publication
APPLIED SCIENCES-BASEL

Abstract
Advertisements are often inserted in multimedia content, and this is particularly relevant in TV broadcasting, where they play a key financial role. In this context, the flexible and efficient processing of TV content to identify advertisement segments is highly desirable, as it can benefit different actors, including the broadcaster, the contracting company, and the end user. Detecting the presence of the channel logo is regarded in the state of the art as a good indicator. However, this process becomes harder as less prior data is available to help reduce uncertainty. As a result, the proposals in the literature that achieve the best results typically rely on prior knowledge or pre-existing databases. This paper proposes a flexible method for processing TV broadcasting content that aims to detect channel logos, and consequently advertising segments, without using prior data about the channel or content. The final goal is to enable stream segmentation that identifies advertisement slices. The proposed method was assessed on available state-of-the-art datasets as well as additional, more challenging stream captures. Results show that the proposed method surpasses the state of the art.


2020

Context-Based Cultural Visits

Authors
Assis, M; Andrade, MT; Viana, P;

Publication
IBICA

Abstract
Mobile Augmented Reality (MAR) systems have emerged and greatly evolved in the last two decades. They have applications in many domains, most notably in the field of Cultural Heritage (CH) and tourism, where people tend to rely on smartphones when visiting a new city to obtain additional information on the city landmarks. Visitors expect to obtain precise information tailored to their needs. Therefore, researchers have started to investigate innovative approaches for presenting and suggesting digital content related to cultural and historical places. This article presents a novel MAR application, NearHeritage, which uses emergent technologies to assist visitors in finding and exploring Cultural Heritage. The research focuses on combining context-awareness with Augmented Reality (AR). By sensing the context surrounding the user, the NearHeritage app discloses not only the list of nearby points-of-interest (POIs) but also detailed information about the POIs in the form of AR content adapted to the user context. The solution presented uses built-in sensors of Android devices and takes advantage of various APIs (Foursquare API, Google Maps API and IntelContextSensing SDK) to retrieve information about the landmarks and the visitor context. Results from initial experimentation indicate that the concept of a context-aware MAR application can improve the user experience in discovering and learning more about Cultural Heritage, creating an interactive, enjoyable and unforgettable adventure.


2021

Reviving Direct Observation Methods for Physical Activity Behavior

Authors
Pedro Miguel Ribeiro da Silva; Sérgio Hélder da Silva Soares Soares; Jorge Augusto Pinto Silva Mota; Paula Maria Marques Moura Gomes Viana; Pedro Miguel Machado Soares Carvalho;

Publication
Journal of Sports Science

Abstract

2021

FiM's DE-the communication package for the creative pipeline

Authors
Castro, H; Andrade, MT; Viana, P;

Publication
MULTIMEDIA TOOLS AND APPLICATIONS

Abstract
The FotoInMotion (FiM) project is building a novel media creation platform, leveraging semi-automated analysis and editing tools to empower creators to easily transform static visual acquisitions of real-world events into rich, animated and engaging objects, distributable through common channels. FiM transforms the content creative chain into an integrated pipeline across which media and metadata seamlessly flow and are exploited to produce more complex media objects. One of the challenges addressed is the need for seamless and efficient communication across such a pipeline, and for preserving, in a structured manner, all of the media and metadata involved. Existing standardized metadata tools and content wrappers are limited in expressivity and scope and incapable of fully supporting the needs of the content creative pipeline. This paper describes FiM's new structured data object, the Digital Event (DE), which acts as a universal vehicle for media and metadata. It builds on well-established and emergent MPEG standards (MPEG-21, MPEG-V, MPEG-7 and MPEG HEIF) to support data diversity, interoperability, packaging and sharing within complex, Machine Learning enhanced creative pipelines. Our solution has been validated by creative professionals (photojournalism, fashion marketing and festivals), who have conducted experiments within the context of different creative workflows in real-world scenarios. Employing the DE proved advantageous, particularly in homogenizing the representation and packaging of media and metadata and in normalizing the interaction between different pipeline components.


2022

Photo2Video: Semantic-Aware Deep Learning-Based Video Generation from Still Content

Authors
Viana, P; Andrade, MT; Carvalho, P; Vilaça, L; Teixeira, IN; Costa, T; Jonker, P;

Publication
JOURNAL OF IMAGING

Abstract
Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. It is true that ML is already popular for content creation, but the progress achieved so far addresses essentially textual content or the identification and selection of specific types of content. A wealth of possibilities is yet to be explored by bringing ML into the multimedia creative process, allowing the knowledge it infers to automatically influence how new multimedia content is created. The work presented in this article contributes towards this goal in three distinct ways: firstly, it proposes a methodology to re-train popular neural network models to identify new thematic concepts in static visual content and attach meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be automatically called upon to apply such effects to a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow that offers the user the possibility of manipulating the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. The final result, an intelligent content- and context-aware video, strongly contrasts with current standard approaches, which create random movements.


2022

Symbolic Music Generation Conditioned on Continuous-Valued Emotions

Authors
Sulun, S; Davies, MEP; Viana, P;

Publication
IEEE ACCESS

Abstract
In this paper, we present a new approach for the generation of multi-instrument symbolic music driven by musical emotion. The principal novelty of our approach centres on conditioning a state-of-the-art transformer based on continuous-valued valence and arousal labels. In addition, we provide a new large-scale dataset of symbolic music paired with emotion labels in terms of valence and arousal. We evaluate our approach quantitatively in two ways: first by measuring its note prediction accuracy, and second via a regression task in the valence-arousal plane. Our results demonstrate that our proposed approaches outperform conditioning using control tokens, which is representative of the current state of the art.
