Presentation

Telecommunications and Multimedia

At CTM, our vision is to promote a lively and sustainable world where networked intelligence enables ubiquitous interaction with sensory-rich content. Our mission is to develop advanced systems and technologies that enable high-capacity, efficient, and secure communications, media knowledge extraction, and immersive ubiquitous multimedia applications.

We work in four main research areas: Optical and Electronic Technologies, Wireless Networks, Multimedia and Communications Technologies, and VCMI (Visual Computing and Machine Intelligence).

Latest News

Can we be sure that Douro wine really comes from the Douro? INESC TEC has the answer

Portuguese wine, Spanish honey, Greek olive oil, German meat, Nordic dairy and fish – what do they all have in common? They’re all part of WATSON, a project bringing blockchain, Artificial Intelligence, computer vision, sensors, and geolocation systems to the table to improve the traceability of food products and to strengthen information sharing and prevention in tackling fraud.

29th May 2025

Communications

Safer and smarter factories? INESC TEC at the forefront of developing digital transformation technologies for the industry sector

A project named MechEye is currently developing technologies aimed at improving safety in industrial environments, particularly in the use and operation of equipment.

26th May 2025

INESC TEC with five FCT exploratory projects approved in four R&D areas

Telecommunications and Multimedia, Applied Photonics, High-assurance Software and Advanced Computing Systems – these are the four domains that INESC TEC researchers will explore within the scope of the five projects that were approved through the Call for Exploratory Projects promoted by the Foundation for Science and Technology (FCT).

2nd October 2024

Artificial Intelligence

The first European project led by INESC TEC in the health area has kicked off

It is called AI4Lungs and aims to develop computational tools and models based on Artificial Intelligence to optimise the diagnosis and treatment of lung diseases. Through a holistic and multimodal approach, the researchers will create a personalised healthcare solution for respiratory diseases. At the end of February, representatives of the project's 18 partner entities, from 10 countries, met at INESC TEC to mark the kick-off of AI4Lungs.

1st April 2024

Communications

Europe discusses collaboration opportunities in high-frequency wireless communications

Smart propagation environments, improvements in signal processing for the sixth generation of mobile communications, and 6G-centred network and location developments were some of the topics discussed at an event organised by the European projects TERRAMETA (coordinated by INESC TEC), 6G-SHINE and TIMES, in collaboration with RESTART-IN – an Italian PRR.

6th March 2024

Featured Projects

PFAI4_5eD

Advanced Training Programme Industry 4 - 5th edition

2024-2024

Team

Laboratories

Laboratory of Sound and Music Computing

Optical and Electronic Technologies Research Laboratory

Publications

CTM Publications

View all Publications

2025

A Review of Voicing Decision in Whispered Speech: From Rules to Machine Learning

Authors
da Silva, JMPP; Duarte Nunes, G; Ferreira, A;

Publication

Abstract

2025

Neural network models for whisper to normal speech conversion

Authors
Yamamura, F; Scalassara, R; Oliveira, A; Ferreira, JS;

Publication
U.Porto Journal of Engineering

Abstract
Whispers are common and essential for secondary communication. Nonetheless, individuals with aphonia, including laryngectomees, rely on whispers as their primary means of communication. Due to the distinct features between whispered and regular speech, debates have emerged in the field of speech recognition, highlighting the challenge of effectively converting between them. This study investigates the characteristics of whispered speech and proposes a system for converting whispered vowels into normal ones. The system is developed using multilayer perceptron networks and two types of generative adversarial networks. Three metrics are analyzed to evaluate the performance of the system: mel-cepstral distortion, root mean square error of the fundamental frequency, and accuracy with f1-score of a vowel classifier. Overall, the perceptron networks demonstrated better results, with no significant differences observed between male and female voices or the presence/absence of speech silence, except for improved accuracy in estimating the fundamental frequency during the conversion process. © 2025, Universidade do Porto - Faculdade de Engenharia. All rights reserved.

2025

A Vision-aided Open Radio Access Network for Obstacle-aware Wireless Connectivity

Authors
Simões, C; Coelho, A; Ricardo, M;

Publication
20th Wireless On-Demand Network Systems and Services Conference, WONS 2025, Hintertux, Austria, January 27-29, 2025

Abstract
High-frequency radio networks, including those operating in the millimeter-wave bands, are sensitive to Line-of-Sight (LoS) obstructions. Computer Vision (CV) algorithms can be leveraged to improve network performance by processing and interpreting visual data, enabling obstacle avoidance and ensuring LoS signal propagation. We propose a vision-aided Radio Access Network (RAN) based on the O-RAN architecture and capable of perceiving the surrounding environment. The vision-aided RAN consists of a gNodeB (gNB) equipped with a video camera that employs CV techniques to extract critical environmental information. An xApp is used to collect and process metrics from the RAN and receive data from a Vision Module (VM). This enhances the RAN's ability to perceive its surroundings, leading to better connectivity in challenging environments. © 2025 IFIP.

2025

A Framework to Develop and Validate RL-Based Obstacle-Aware UAV Positioning Algorithms

Authors
Shafafi, K; Ricardo, M; Campos, R;

Publication
CoRR

Abstract

2025

A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

Authors
Vilaça, L; Yu, Y; Viana, P;

Publication
ACM Computing Surveys

Abstract
Audio-visual correlation learning aims to capture and understand natural phenomena between audio and visual data. The rapid growth of Deep Learning has propelled the development of proposals that process audio-visual data, as can be observed in the number of proposals in the past years, thus encouraging the development of a comprehensive survey. Besides analyzing the models used in this context, we also discuss some task definitions and paradigms applied in AI multimedia. In addition, we investigate objective functions frequently used and discuss how audio-visual data is exploited in the optimization process, i.e., the different methodologies for representing knowledge in the audio-visual domain. In fact, we focus on how human-understandable mechanisms, i.e., structured knowledge that reflects comprehensible knowledge, can guide the learning process. Most importantly, we provide a summarization of the recent progress of Audio-Visual Correlation Learning (AVCL) and discuss the future research directions.

Facts & Figures

28 Senior Researchers

2016

2 R&D Employees

2020

19 Papers in indexed journals

2020

Contacts