Details
Name
Serkan Sulun
Position
Research Assistant
Since
11 March 2019
Nationality
Turkey
Centre
Telecommunications and Multimedia
Contacts
+351222094000
serkan.sulun@inesctec.pt
2025
Authors
Sulun, S; Viana, P; Davies, MEP;
Publication
CoRR
2024
Authors
Sulun, S; Viana, P; Davies, MEP;
Publication
IEEE International Symposium on Multimedia, ISM 2024, Tokyo, Japan, December 11-13, 2024
Abstract
We introduce VEMOCLAP: Video EMOtion Classifier using Pretrained features, the first readily available and open-source web application that analyzes the emotional content of any user-provided video. We improve our previous work, which exploits open-source pretrained models that work on video frames and audio, and then efficiently fuse the resulting pretrained features using multi-head cross-attention. Our approach increases the state-of-the-art classification accuracy on the Ekman-6 video emotion dataset by 4.3% and offers an online application for users to run our model on their own videos or YouTube videos. We invite the readers to try our application at serkansulun.com/app.
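The fusion step described in this abstract can be pictured with a short sketch. The following is a minimal illustration, not the released VEMOCLAP code: pretrained video-frame features act as queries and pretrained audio features as keys and values in a multi-head cross-attention layer, followed by a classifier over the six Ekman emotions. All feature dimensionalities, sequence lengths, and model sizes are assumptions made for illustration.

```python
# Minimal sketch of pretrained-feature fusion via multi-head cross-attention.
# Feature dimensions (768 visual, 512 audio) and layer sizes are assumed,
# not taken from the paper.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_classes=6):
        super().__init__()
        self.video_proj = nn.Linear(768, d_model)  # project frame features
        self.audio_proj = nn.Linear(512, d_model)  # project audio features
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, video_feats, audio_feats):
        q = self.video_proj(video_feats)           # (B, T_video, d_model)
        kv = self.audio_proj(audio_feats)          # (B, T_audio, d_model)
        fused, _ = self.cross_attn(q, kv, kv)      # video queries attend to audio
        return self.classifier(fused.mean(dim=1))  # temporal mean, then classify

model = CrossAttentionFusion()
logits = model(torch.randn(2, 30, 768), torch.randn(2, 100, 512))  # (2, 6)
```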
2024
Authors
Sulun, S; Viana, P; Davies, MEP;
Publication
EXPERT SYSTEMS WITH APPLICATIONS
Abstract
We introduce a novel method for movie genre classification, capitalizing on a diverse set of readily accessible pretrained models. These models extract high-level features related to visual scenery, objects, characters, text, speech, music, and audio effects. To fuse these pretrained features intelligently, we train small classifier models with low time and memory requirements. Employing the transformer model, our approach utilizes all video and audio frames of movie trailers without performing any temporal pooling, efficiently exploiting the correspondence between all elements, in contrast to the fixed, small number of frames typically used by traditional methods. Unlike current approaches, our method fuses features originating from different tasks and modalities, with different dimensionalities, different temporal lengths, and complex dependencies. Our method outperforms state-of-the-art movie genre classification models in terms of precision, recall, and mean average precision (mAP). To foster future research, we make the pretrained features for the entire MovieNet dataset, along with our genre classification code and the trained models, publicly available.
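A hedged sketch of the fusion idea in this abstract: per-modality pretrained features with different dimensionalities and temporal lengths are projected into a shared space, tagged with a learned modality embedding, concatenated along time without any temporal pooling, and fed to a small transformer with a multi-label genre head. All sizes, the number of modalities, and the 21-genre output are illustrative assumptions, not values from the paper.

```python
# Illustrative fusion of variable-length, variable-dimension pretrained features
# with a small transformer and no temporal pooling before the encoder.
import torch
import torch.nn as nn

class TrailerGenreClassifier(nn.Module):
    def __init__(self, feat_dims=(768, 512, 128), d_model=256, n_genres=21):
        super().__init__()
        self.projs = nn.ModuleList(nn.Linear(d, d_model) for d in feat_dims)
        self.modality_emb = nn.Embedding(len(feat_dims), d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_genres)   # multi-label genre logits

    def forward(self, feats):  # feats: list of (B, T_i, feat_dims[i]) tensors
        tokens = [p(x) + self.modality_emb.weight[i]
                  for i, (p, x) in enumerate(zip(self.projs, feats))]
        z = self.encoder(torch.cat(tokens, dim=1))  # all frames, no pooling
        return self.head(z.mean(dim=1))

model = TrailerGenreClassifier()
out = model([torch.randn(2, 40, 768), torch.randn(2, 200, 512),
             torch.randn(2, 10, 128)])              # (2, 21)
```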
2023
Authors
Sulun, S; Oliveira, P; Viana, P;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT II
Abstract
We present a new large-scale emotion-labeled symbolic music dataset consisting of 12k MIDI songs. To create this dataset, we first trained emotion classification models on the GoEmotions dataset, achieving state-of-the-art results with a model half the size of the baseline. We then applied these models to lyrics from two large-scale MIDI datasets. Our dataset covers a wide range of fine-grained emotions, providing a valuable resource to explore the connection between music and emotions and, especially, to develop models that can generate music based on specific emotions. Our inference code, trained models, and datasets are available online.
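The labeling pipeline the abstract describes can be sketched as follows: a text emotion classifier in the GoEmotions label space is run over a song's lyrics, and the top-scoring emotion is attached to the corresponding MIDI file. The checkpoint below is a publicly available GoEmotions model used as a stand-in, not the paper's released model, and the lyrics line is invented for illustration.

```python
# Hedged sketch: label a MIDI song's lyrics with a GoEmotions-style classifier.
# The model checkpoint is a placeholder, not the paper's trained model.
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="SamLowe/roberta-base-go_emotions",
                      top_k=None)  # return a score for every fine-grained emotion

lyrics = "I walk alone down an empty street, the rain keeps falling on me"
scores = classifier(lyrics)[0]                 # list of {label, score} dicts
best = max(scores, key=lambda s: s["score"])
print(best["label"], round(best["score"], 3))  # emotion label for the MIDI song
```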
2022
Authors
Sulun, S; Davies, MEP; Viana, P;
Publication
IEEE ACCESS
Abstract
In this paper we present a new approach for the generation of multi-instrument symbolic music driven by musical emotion. The principal novelty of our approach centres on conditioning a state-of-the-art transformer on continuous-valued valence and arousal labels. In addition, we provide a new large-scale dataset of symbolic music paired with emotion labels in terms of valence and arousal. We evaluate our approach quantitatively in two ways, first by measuring its note prediction accuracy, and second via a regression task in the valence-arousal plane. Our results demonstrate that our proposed approach outperforms conditioning using control tokens, which is representative of the current state of the art.
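One way to picture conditioning on continuous-valued labels, as opposed to discrete control tokens, is sketched below: the two scalars (valence, arousal) are projected to an embedding and prepended to the token sequence of an autoregressive music transformer. This is a minimal illustration under assumed vocabulary and model sizes, not the paper's released code.

```python
# Sketch: condition an autoregressive symbolic-music model on continuous
# valence-arousal values via a learned projection, prepended to the sequence.
import torch
import torch.nn as nn

class EmotionConditionedMusicModel(nn.Module):
    def __init__(self, vocab_size=512, d_model=256):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.cond_proj = nn.Linear(2, d_model)   # (valence, arousal) -> embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, valence_arousal):
        cond = self.cond_proj(valence_arousal).unsqueeze(1)   # (B, 1, d)
        x = torch.cat([cond, self.token_emb(tokens)], dim=1)  # condition first
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.decoder(x, mask=mask)                        # causal attention
        return self.out(h[:, 1:])                             # next-token logits

model = EmotionConditionedMusicModel()
logits = model(torch.randint(0, 512, (2, 64)),
               torch.rand(2, 2) * 2 - 1)  # valence/arousal in [-1, 1)
```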