Details
Name
Serkan Sulun
Role
Research Assistant
Since
11 March 2019
Nationality
Turkey
Centre
Centre for Telecommunications and Multimedia
Contacts
+351222094000
serkan.sulun@inesctec.pt
2023
Authors
Sulun, S; Oliveira, P; Viana, P;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT II
Abstract
We present a new large-scale emotion-labeled symbolic music dataset consisting of 12k MIDI songs. To create this dataset, we first trained emotion classification models on the GoEmotions dataset, achieving state-of-the-art results with a model half the size of the baseline. We then applied these models to lyrics from two large-scale MIDI datasets. Our dataset covers a wide range of fine-grained emotions, providing a valuable resource to explore the connection between music and emotions and, especially, to develop models that can generate music based on specific emotions. Our code for inference, trained models, and datasets are available online.
2022
Authors
Sulun, S; Davies, MEP; Viana, P;
Publication
IEEE ACCESS
Abstract
In this paper we present a new approach for the generation of multi-instrument symbolic music driven by musical emotion. The principal novelty of our approach centres on conditioning a state-of-the-art transformer based on continuous-valued valence and arousal labels. In addition, we provide a new large-scale dataset of symbolic music paired with emotion labels in terms of valence and arousal. We evaluate our approach in a quantitative manner in two ways, first by measuring its note prediction accuracy, and second via a regression task in the valence-arousal plane. Our results demonstrate that our proposed approaches outperform conditioning using control tokens which is representative of the current state of the art.
2021
Authors
Sulun, S; Davies, MEP;
Publication
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING
Abstract
In this paper, we address a subtopic of the broad domain of audio enhancement, namely musical audio bandwidth extension. We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network, with the goal of reconstructing a full-bandwidth output. Our main contribution centers on the impact of the choice of low-pass filter when training and subsequently testing the network. For two different state-of-the-art deep architectures, ResNet and U-Net, we demonstrate that when the training and testing filters are matched, improvements in signal-to-noise ratio (SNR) of up to 7 dB can be obtained. However, when these filters differ, the improvement falls considerably and under some training conditions results in a lower SNR than the band-limited input. To circumvent this apparent overfitting to filter shape, we propose a data augmentation strategy which utilizes multiple low-pass filters during training and leads to improved generalization to unseen filtering conditions at test time.
2020
Authors
Sulun, S; Tekalp, AM;
Publication
Signal, Image and Video Processing
Abstract