2024
Authors
Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;
Publication
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.
Abstract
Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, a collection of news articles in European Portuguese. The first dataset consists of 357 news articles and the second comprises a subset of 117 densely manually annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available in adherence to FAIR principles, thereby enhancing their utility within the research community.
2024
Authors
de Jesus G.; Nunes S.;
Publication
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Abstract
This paper proposes Labadain Crawler, a data collection pipeline tailored to automate and optimize the construction of textual corpora from the web, with a specific focus on low-resource languages. The system is built on top of Nutch, an open-source web crawler and data extraction framework, and incorporates language processing components such as a tokenizer and a language identification model. The pipeline's efficacy is demonstrated through successful testing with Tetun, one of Timor-Leste's official languages, resulting in a high-quality Tetun text corpus comprising 321.7k sentences extracted from over 22k web pages. The contributions of this paper include the development of a Tetun tokenizer, a Tetun language identification model, and a Tetun text corpus, marking an important milestone in Tetun text information retrieval.
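A minimal sketch of the kind of post-crawl filtering stage the abstract describes, assuming extracted page text is split into sentences and kept only when a language-identification step labels it as Tetun. The `split_sentences` and `identify_language` functions below are hypothetical placeholders, not the authors' tokenizer or LID model.

```python
# Illustrative sketch (assumptions only) of filtering crawled text for a
# target low-resource language before adding it to a corpus.
import re

def split_sentences(text: str) -> list[str]:
    # Naive splitter on terminal punctuation; the paper's Tetun tokenizer
    # is more elaborate.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def identify_language(sentence: str) -> str:
    # Placeholder: a real pipeline would call a trained language-ID
    # classifier here and return an ISO code such as "tet".
    return "tet"

def filter_corpus(pages: list[str], target_lang: str = "tet") -> list[str]:
    corpus = []
    for page_text in pages:
        for sentence in split_sentences(page_text):
            if identify_language(sentence) == target_lang:
                corpus.append(sentence)
    return corpus

if __name__ == "__main__":
    pages = ["Timor-Leste iha lian ofisiál rua. Tetun mak ida husi sira."]
    print(filter_corpus(pages))
```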
2024
Authors
Santos, T; Bispo, J; Cardoso, JMP;
Publication
PROCEEDINGS OF THE 25TH ACM SIGPLAN/SIGBED INTERNATIONAL CONFERENCE ON LANGUAGES, COMPILERS, AND TOOLS FOR EMBEDDED SYSTEMS, LCTES 2024
Abstract
Modern hardware accelerators, such as FPGAs, allow offloading large regions of C/C++ code to improve the execution time and/or the energy consumption of software applications. An outstanding challenge with this approach, however, is solving the Hardware/Software (Hw/Sw) partitioning problem. Given the increasing complexity of both the accelerators and the potential code regions, one needs to adopt a holistic approach when selecting an offloading region, exploring the interplay between communication costs, data usage patterns, and target-specific optimizations. To this end, we propose representing a C application as an extended task graph (ETG) with flexible granularity, which can be manipulated through the merging and splitting of tasks. This approach generates a task graph overlay on the program's Abstract Syntax Tree (AST) that maps tasks to functions and the flexible granularity operations onto inlining/outlining operations. It maintains the integrity and readability of the original source code, which is paramount for targeting different accelerators and enabling code optimizations, while allowing the offloading of code regions of arbitrary complexity based on the data patterns of their tasks. To evaluate the ETG representation and its compiler, we use the compiler to generate ETGs for the programs in the Rosetta and MachSuite benchmark suites, and extract several metrics regarding data communication, task-level parallelism, and dataflow patterns between pairs of tasks. These metrics provide important information that can be used by Hw/Sw partitioning methods.
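The following is an illustrative sketch, not the paper's compiler, of how a task graph with flexible granularity might be represented: tasks map to functions, merging two tasks mirrors inlining a callee into its caller, and splitting a task mirrors outlining a region into a new function. All class and function names are assumptions for illustration.

```python
# Hypothetical sketch of an extended task graph (ETG) with granularity
# adjusted by merging (inlining-like) and splitting (outlining-like) tasks.
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str                                          # function the task maps to
    callees: list[str] = field(default_factory=list)   # successor tasks

class ExtendedTaskGraph:
    def __init__(self):
        self.tasks: dict[str, Task] = {}

    def add_task(self, name: str, callees: list[str] | None = None):
        self.tasks[name] = Task(name, callees or [])

    def merge(self, caller: str, callee: str):
        """Coarsen granularity: absorb `callee` into `caller` (inlining-like)."""
        merged = self.tasks.pop(callee)
        self.tasks[caller].callees.remove(callee)
        self.tasks[caller].callees.extend(merged.callees)

    def split(self, task: str, new_task: str, moved_callees: list[str]):
        """Refine granularity: outline part of `task` into `new_task`."""
        for c in moved_callees:
            self.tasks[task].callees.remove(c)
        self.add_task(new_task, moved_callees)
        self.tasks[task].callees.append(new_task)

# Example: coarsen "main -> load -> compute" by merging "load" into "main".
etg = ExtendedTaskGraph()
etg.add_task("main", ["load"])
etg.add_task("load", ["compute"])
etg.add_task("compute")
etg.merge("main", "load")
print(etg.tasks["main"].callees)   # ['compute']
```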
2024
Authors
Josipovic, L; Zhou, P; Shanker, S; Cardoso, JMP; Anderson, J; Yuichiro, S;
Publication
HEART
Abstract
2024
Authors
Ferreira, PJS; Moreira, JM; Cardoso, JMP;
Publication
10th IEEE World Forum on Internet of Things, WF-IoT 2024, Ottawa, ON, Canada, November 10-13, 2024
Abstract
Self-adaptive Systems (SaS) are becoming increasingly important for adapting to dynamic environments and for optimizing performance on resource-constrained devices. A practical approach to achieving self-adaptability involves using a Pareto-Front (PF) to store the system's hyper-parameters and the outcomes of hyper-parameter combinations. This paper proposes a novel method to approximate a PF, offering a configurable number of solutions that can be adapted to the device's limitations. We conducted extensive experiments across various scenarios in which all PF solutions were replaced, as well as real-world scenarios using actual measurements from a Human Activity Recognition (HAR) system. Our results show that our method consistently outperforms previous methods, particularly when the maximum number of PF solutions is on the order of hundreds. The effectiveness of our method is most apparent in real-case scenarios, where, executed on a Raspberry Pi 5, it achieves up to an 87% reduction in energy consumption and lower execution times than the second-best algorithm. Additionally, our method ensures a more even distribution of solutions across the PF, preventing high concentrations of solutions.
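A minimal sketch, under assumed objectives (energy, latency), of the general idea of a size-bounded Pareto front: non-dominated candidates are kept, and when the configurable limit is exceeded the most crowded interior point is pruned so the remaining solutions stay evenly spread. This is illustrative only and not the paper's algorithm.

```python
# Illustrative bounded Pareto front over two minimization objectives.
def dominates(a, b):
    """a dominates b if it is no worse in both objectives and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def insert(front, candidate, max_size):
    if any(dominates(p, candidate) for p in front):
        return front                                   # candidate is dominated
    front = [p for p in front if not dominates(candidate, p)] + [candidate]
    while len(front) > max_size:                       # prune to the size budget
        front.sort()
        # Drop the interior point with the smallest gap to its neighbours
        # along the first objective (a simple crowding heuristic; the
        # endpoints of the front are always kept).
        gaps = [(front[i + 1][0] - front[i - 1][0], i)
                for i in range(1, len(front) - 1)]
        _, worst = min(gaps)
        front.pop(worst)
    return front

front = []
for point in [(5.0, 1.0), (4.0, 2.0), (3.0, 3.0), (4.5, 1.5), (2.0, 9.0)]:
    front = insert(front, point, max_size=3)
print(front)
```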
2024
Authors
Assaf, R; Mendes, D; Rodrigues, R;
Publication
COMPUTER GRAPHICS FORUM
Abstract
Collaboration in extended reality (XR) environments presents complex challenges that revolve around how users perceive the presence, intentions, and actions of their collaborators. This paper delves into the intricate realm of group awareness, focusing specifically on workspace awareness and the innovative visual cues designed to enhance user comprehension. The research begins by identifying a spectrum of collaborative situations drawn from an analysis of XR prototypes in the existing literature. Then, we describe and introduce a novel classification for workspace awareness, along with an exploration of visual cues recently employed in research endeavors. Lastly, we present the key findings and shine a spotlight on promising yet unexplored topics. This work not only serves as a reference for experienced researchers seeking to inform the design of their own collaborative XR applications but also extends a welcoming hand to newcomers in this dynamic field.