About

I am a Coordinator Professor at the Polytechnic of Porto and a Researcher at INESC TEC, where I lead the Multimedia Communications Technology Area. I obtained my PhD from the University of Porto in the area of multimedia content management. I have been responsible for the participation of INESC TEC in several national and European projects involving universities and media industries. Author of several publications, I am also an active reviewer for journals and conferences and am engaged in the organization of workshops and program committees in the area of Multimedia. Recently, I co-chaired the Immersive Media Experiences workshop series (2013-2015) at ACM MM. I am also often engaged in the evaluation of European and Portuguese research proposals and projects. My main research activities and interests are in the field of networked audiovisual systems, including digital television and new services, content management, personalization and recommendation, new media formats, and immersive and interactive media.

Details

  • Name

    Paula Viana
  • Role

    Area Manager
  • Since

    1st January 1993
Publications

2025

Video Soundtrack Generation by Aligning Emotions and Temporal Boundaries

Authors
Sulun, S; Viana, P; Davies, MEP;

Publication
CoRR

2025

Converge: towards an efficient multi-modal sensing research infrastructure for next-generation 6G networks

Authors
Teixeira, FB; Ricardo, M; Coelho, A; Oliveira, HP; Viana, P; Paulino, N; Fontes, H; Marques, P; Campos, R; Pessoa, L;

Publication
EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING

Abstract
Telecommunications and computer vision solutions have evolved significantly in recent years, allowing a huge advance in the functionalities and applications offered. However, these two fields have been making their way as separate areas, not exploring the potential benefits of merging the innovations brought from each of them. In challenging environments, for example, combining radio sensing and computer vision can strongly contribute to solving problems such as those introduced by obstructions or limited lighting. Machine learning algorithms, able to fuse heterogeneous and multi-modal data, are also a key element for understanding and inferring additional knowledge from raw and low-level data, able to create a new abstracting level that can significantly enhance many applications. This paper introduces the CONVERGE vision-radio concept, a new paradigm that explores the benefits of integrating two fields of knowledge towards the vision of View-to-Communicate, Communicate-to-View. The main concepts behind this vision, including supporting use cases and the proposed architecture, are presented. CONVERGE introduces a set of tools integrating wireless communications and computer vision to create a novel experimental infrastructure that will provide open datasets to the scientific community of both experimental and simulated data, enabling new research addressing various 6G verticals, including telecommunications, automotive, manufacturing, media, and health.

2025

A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

Authors
Vilaça, L; Yu, Y; Viana, P;

Publication
ACM COMPUTING SURVEYS

Abstract
Audio-visual correlation learning aims at capturing and understanding natural phenomena between audio and visual data. The rapid growth of Deep Learning has propelled the development of proposals that process audio-visual data, as can be observed in the number of proposals in the past years, thus encouraging the development of a comprehensive survey. Besides analyzing the models used in this context, we also discuss some tasks of definition and paradigm applied in AI multimedia. In addition, we investigate objective functions frequently used and discuss how audio-visual data is exploited in the optimization process, i.e., the different methodologies for representing knowledge in the audio-visual domain. In fact, we focus on how human-understandable mechanisms, i.e., structured knowledge that reflects comprehensible knowledge, can guide the learning process. Most importantly, we provide a summarization of the recent progress of Audio-Visual Correlation Learning (AVCL) and discuss the future research directions.

2025

Correction to: A Review of Recent Advances and Challenges in Grocery Label Detection and Recognition (Applied Sciences, (2023), 13, 5, (2871), 10.3390/app13052871)

Authors
Guimarães, V; Nascimento, J; Viana, P; Carvalho, P;

Publication
Applied Sciences (Switzerland)

Abstract
There was an error in the original publication [1]. The statement in the Acknowledgments section is incorrect and should be removed because the official start of the project WATSON was after the paper’s publication date. The authors state that the scientific conclusions are unaffected. This correction was approved by the Academic Editor. The original publication has also been updated. © 2025 by the authors.

2024

VEMOCLAP: A video emotion classification web application

Authors
Sulun, S; Viana, P; Davies, MEP;

Publication
IEEE International Symposium on Multimedia, ISM 2024, Tokyo, Japan, December 11-13, 2024

Abstract
We introduce VEMOCLAP: Video EMOtion Classifier using Pretrained features, the first readily available and open-source web application that analyzes the emotional content of any user-provided video. We improve our previous work, which exploits open-source pretrained models that work on video frames and audio, and then efficiently fuse the resulting pretrained features using multi-head cross-attention. Our approach increases the state-of-the-art classification accuracy on the Ekman-6 video emotion dataset by 4.3% and offers an online application for users to run our model on their own videos or YouTube videos. We invite the readers to try our application at serkansulun.com/app.