Publications

Publications by CTM

2016

Bio-inspired Boosting for Moving Objects Segmentation

Authors
Martins, I; Carvalho, P; Corte Real, L; Luis Alba Castro, JL;

Publication
IMAGE ANALYSIS AND RECOGNITION (ICIAR 2016)

Abstract
Developing robust and universal methods for unsupervised segmentation of moving objects in video sequences has proved to be a hard and challenging task. State-of-the-art methods show good performance in a wide range of situations, but systematically fail when facing more challenging scenarios. Lately, a number of image processing modules inspired by biological models of the human visual system have been explored in different areas of application. This paper proposes a bio-inspired boosting method for unsupervised segmentation of moving objects in video that is able to overcome some of the limitations of widely used state-of-the-art methods. An exhaustive set of experiments was conducted and a detailed analysis of the results, using different metrics, revealed that the boosting is most significant in challenging scenarios, where state-of-the-art methods tend to fail.

2016

Cognition inspired format for the expression of computer vision metadata

Authors
Castro, H; Monteiro, J; Pereira, A; Silva, D; Coelho, G; Carvalho, P;

Publication
MULTIMEDIA TOOLS AND APPLICATIONS

Abstract
Over the last decade noticeable progress has occurred in automated computer interpretation of visual information. Computers running artificial intelligence algorithms are increasingly capable of extracting perceptual and semantic information from images and registering it as metadata. There is also a growing body of manually produced image annotation data. All of this data is of great importance for scientific purposes as well as for commercial applications. Optimizing the usefulness of this manually or automatically produced information implies its precise and adequate expression at its different logical levels, making it easily accessible, manipulable and shareable. It also implies the development of associated manipulation tools. However, the expression and manipulation of computer vision results has received less attention than the actual extraction of such results, and has therefore advanced less. Existing metadata tools are poorly structured in logical terms, as they intermix the declaration of visual detections with that of the observed entities, events and encompassing context. This poor structuring renders such tools rigid, limited and cumbersome to use. Moreover, they are unprepared to deal with more advanced situations, such as the coherent expression of the information extracted from, or annotated onto, multi-view video resources. The work presented here comprises the specification of an advanced XML-based syntax for the expression and processing of Computer Vision relevant metadata. This proposal takes inspiration from the natural cognition process for the adequate expression of the information, with a particular focus on scenarios with varying numbers of sensory devices, notably multi-view video.
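As a rough illustration of the separation principle described in the abstract, the sketch below builds a hypothetical metadata fragment in which per-view visual detections are declared apart from the observed entities that reference them. All element names (cvMetadata, detections, entities, observedBy) are illustrative assumptions, not the actual schema proposed in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure, only to illustrate keeping detections and
# observed entities in separate layers (not the paper's actual schema).
root = ET.Element("cvMetadata")

# Layer 1: raw visual detections, one per view and frame.
detections = ET.SubElement(root, "detections")
det = ET.SubElement(detections, "detection", id="d1", view="camera2", frame="120")
ET.SubElement(det, "boundingBox", x="34", y="58", w="40", h="90")

# Layer 2: observed entities, which reference detections instead of
# embedding them, so one entity can link to detections from several views.
entities = ET.SubElement(root, "entities")
person = ET.SubElement(entities, "entity", id="e1", type="person")
ET.SubElement(person, "observedBy", detection="d1")

print(ET.tostring(root, encoding="unicode"))
```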

2016

Video Based Group Tracking and Management

Authors
Pereira, A; Familiar, A; Moreira, B; Terroso, T; Carvalho, P; Corte Real, L;

Publication
IMAGE ANALYSIS AND RECOGNITION (ICIAR 2016)

Abstract
Tracking objects in video is a very challenging research topic, particularly when people in groups are tracked, with partial and full occlusions and group dynamics being common difficulties. Hence, it is necessary to deal with group tracking, formation and separation, while ensuring the overall consistency of the individuals. This paper proposes enhancements to a group management and tracking algorithm that receives information about the persons in the scene, detects the existing groups and keeps track of the persons that belong to them. Since the input to group management algorithms is typically provided by a tracking algorithm and is affected by noise, mechanisms for handling such noisy input tracking information were also successfully included. The experiments performed demonstrated that the described algorithm outperformed state-of-the-art approaches.
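As a generic illustration of the kind of grouping step involved, the sketch below clusters tracked persons whose ground-plane positions are close, using a small union-find over pairwise distances. It is only a proximity grouping under assumed coordinates and an arbitrary threshold, not the group management algorithm described in the paper.

```python
import numpy as np

def detect_groups(positions, max_distance=1.5):
    """Group person detections whose pairwise distance is below a threshold.

    positions: dict mapping person id -> (x, y) ground-plane coordinates.
    Returns a list of sets of person ids (connected components of proximity).
    The 1.5 m threshold is an arbitrary illustrative value.
    """
    ids = list(positions)
    parent = {i: i for i in ids}

    def find(i):                      # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            pa, pb = np.array(positions[ids[a]]), np.array(positions[ids[b]])
            if np.linalg.norm(pa - pb) <= max_distance:
                union(ids[a], ids[b])

    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())

# Example: persons 1 and 2 walk together, person 3 is isolated.
print(detect_groups({1: (0.0, 0.0), 2: (1.0, 0.2), 3: (6.0, 5.0)}))
```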

2016

A multi-level tonal interval space for modelling pitch relatedness and musical consonance

Authors
Bernardes, G; Cocharro, D; Caetano, M; Guedes, C; Davies, MEP;

Publication
JOURNAL OF NEW MUSIC RESEARCH

Abstract
In this paper we present a 12-dimensional tonal space in the context of the Tonnetz, Chew's Spiral Array, and Harte's 6-dimensional Tonal Centroid Space. The proposed Tonal Interval Space is calculated as the weighted Discrete Fourier Transform of normalized 12-element chroma vectors, which we represent as six circles covering the set of all possible pitch intervals in the chroma space. By weighting the contribution of each circle (and hence pitch interval) independently, we can create a space in which angular and Euclidean distances among pitches, chords, and regions concur with music theory principles. Furthermore, the Euclidean distance of pitch configurations from the centre of the space acts as an indicator of consonance.
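A minimal sketch of the computation described in the abstract, assuming placeholder weights rather than the calibrated values from the paper: a normalized 12-element chroma vector is mapped through the DFT, the first six coefficients are weighted, and distances and norms in the resulting space serve as relatedness and consonance indicators.

```python
import numpy as np

# Illustrative weights for the six DFT coefficients (placeholders, not the
# calibrated values used in the paper).
WEIGHTS = np.array([2.0, 11.0, 17.0, 16.0, 19.0, 7.0])

def tonal_interval_vector(chroma):
    """Map a 12-element chroma vector to six weighted complex DFT coefficients."""
    chroma = np.asarray(chroma, dtype=float)
    chroma = chroma / chroma.sum()            # normalize the chroma vector
    return WEIGHTS * np.fft.fft(chroma)[1:7]  # keep coefficients k = 1..6

def distance(tiv_a, tiv_b):
    """Euclidean distance between two pitch configurations in the space."""
    return np.linalg.norm(tiv_a - tiv_b)

def consonance(tiv):
    """Distance from the centre of the space, used as a consonance indicator."""
    return np.linalg.norm(tiv)

# Example: a major triad lies farther from the centre (more consonant)
# than a chromatic cluster under these weights.
c_major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]   # pitch classes C, E, G
cluster = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # pitch classes C, C#, D
print(consonance(tonal_interval_vector(c_major)),
      consonance(tonal_interval_vector(cluster)))
```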

2016

Conchord: An Application for Generating Musical Harmony by Navigating in the Tonal Interval Space

Authors
Bernardes, G; Cocharro, D; Guedes, C; Davies, MEP;

Publication
Music, Mind, and Embodiment

Abstract
We present Conchord, a system for real-time automatic generation of musical harmony through navigation in a novel 12-dimensional Tonal Interval Space. In this tonal space, angular and Euclidean distances among vectors representing multi-level pitch configurations equate with music theory principles, and the vector norm acts as an indicator of consonance. Building upon these attributes, users can intuitively and dynamically define a collection of chords based on their relation to a tonal center (or key) and their consonance level. Furthermore, two algorithmic strategies grounded in principles from function and root-motion harmonic theories allow the generation of chord progressions characteristic of Western tonal music.
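A hedged sketch of the navigation idea: candidate chords can be ranked by combining their distance to a tonal centre with their consonance (vector norm) in the same DFT-based space. The chord templates, weights and scoring function below are illustrative assumptions, not Conchord's actual implementation.

```python
import numpy as np

WEIGHTS = np.array([2.0, 11.0, 17.0, 16.0, 19.0, 7.0])   # illustrative weights

def tiv(chroma):
    """Weighted DFT coefficients k = 1..6 of a normalized chroma vector."""
    chroma = np.asarray(chroma, dtype=float)
    return WEIGHTS * np.fft.fft(chroma / chroma.sum())[1:7]

def chroma(pitch_classes):
    """Binary 12-element chroma vector for a set of pitch classes."""
    vec = np.zeros(12)
    vec[list(pitch_classes)] = 1.0
    return vec

# Candidate chords: the diatonic triads of C major, as pitch-class sets.
CHORDS = {"C": [0, 4, 7], "Dm": [2, 5, 9], "Em": [4, 7, 11],
          "F": [5, 9, 0], "G": [7, 11, 2], "Am": [9, 0, 4]}

def rank_chords(key_pitch_classes, alpha=1.0, beta=1.0):
    """Rank candidates: small distance to the key centre, large consonance."""
    key_tiv = tiv(chroma(key_pitch_classes))
    def score(name):
        c_tiv = tiv(chroma(CHORDS[name]))
        return alpha * np.linalg.norm(c_tiv - key_tiv) - beta * np.linalg.norm(c_tiv)
    return sorted(CHORDS, key=score)

# Example: order the candidate triads relative to a C major scale profile.
print(rank_chords([0, 2, 4, 5, 7, 9, 11]))
```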

2016

Harmony Generation Driven by a Perceptually Motivated Tonal Interval Space

Authors
Bernardes, G; Cocharro, D; Guedes, C; Davies, MEP;

Publication
COMPUTERS IN ENTERTAINMENT

Abstract
We present D'accord, a generative music system for creating harmonically compatible accompaniments of symbolic and musical audio inputs with any number of voices, instrumentation, and complexity. The main novelty of our approach centers on offering multiple ranked solutions between a database of pitch configurations and a given musical input based on tonal pitch relatedness and consonance indicators computed in a perceptually motivated Tonal Interval Space. Furthermore, we detail a method to estimate the key of symbolic and musical audio inputs based on attributes of the space, which underpins the generation of key-related pitch configurations. The system is controlled via an adaptive interface implemented for Ableton Live, MAX, and Pure Data, which facilitates music creation for users regardless of music expertise and simultaneously serves as a performance, entertainment, and learning tool. We perform a threefold evaluation of D'accord, which assesses the level of accuracy of our key-finding algorithm, the user enjoyment of generated harmonic accompaniments, and the usability and learnability of the system.
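For the key-finding component, a hedged sketch of one plausible approach in such a space: map binary major and minor scale templates for all twelve transpositions into the space and choose the key whose template lies closest to the input chroma. The templates and weights are placeholder assumptions, not the key-finding algorithm evaluated in the paper.

```python
import numpy as np

WEIGHTS = np.array([2.0, 11.0, 17.0, 16.0, 19.0, 7.0])   # illustrative weights

def tiv(chroma):
    """Weighted DFT coefficients k = 1..6 of a normalized chroma vector."""
    chroma = np.asarray(chroma, dtype=float)
    return WEIGHTS * np.fft.fft(chroma / chroma.sum())[1:7]

# Binary scale templates (real systems typically use perceptual key profiles).
MAJOR = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1], dtype=float)
MINOR = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0], dtype=float)
NOTES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def estimate_key(chroma):
    """Return the key whose template lies closest to the input in the space."""
    input_tiv = tiv(chroma)
    best, best_dist = None, np.inf
    for shift in range(12):                      # all twelve transpositions
        for mode, template in (("major", MAJOR), ("minor", MINOR)):
            dist = np.linalg.norm(input_tiv - tiv(np.roll(template, shift)))
            if dist < best_dist:
                best, best_dist = f"{NOTES[shift]} {mode}", dist
    return best

# Example usage with an aggregated chroma vector dominated by C, E and G.
print(estimate_key([4, 0, 1, 0, 3, 1, 0, 3, 0, 1, 0, 1]))
```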
