Publications

Publications by HumanISE

2024

A Community-Driven Data-to-Text Platform for Football Match Summaries

Authors
Fernandes, P; Nunes, S; Santos, L;

Publication
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.

Abstract
Data-to-text systems offer a transformative approach to generating textual content in data-rich environments. This paper describes the architecture and deployment of Prosebot, a community-driven data-to-text platform tailored for generating textual summaries of football matches derived from match statistics. The system enhances the visibility of lower-tier matches, traditionally accessible only through data tables. Prosebot uses a template-based Natural Language Generation (NLG) module to generate initial drafts, which are subsequently refined by the reading community. Comprehensive evaluations, encompassing both human-mediated and automated assessments, were conducted to assess the system's efficacy. Analysis of the community-edited texts reveals that significant segments of the initial automated drafts are retained, suggesting their high quality and acceptance by the collaborators. Preliminary surveys conducted among platform users highlight a predominantly positive reception within the community.

2024

Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles

Authors
Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;

Publication
LREC/COLING

Abstract
Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which consist of a collection of news articles in European Portuguese. The first dataset consists of 357 news articles and the second dataset comprises a subset of 117 manually densely annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the research community.

2024

Indexing Portuguese NLP Resources with PT-Pump-Up

Authors
Almeida, R; Campos, R; Jorge, A; Nunes, S;

Publication
PROPOR (2)

Abstract

2024

Exploring Large Language Models for Relevance Judgments in Tetun

Authors
Jesus, Gd; Nunes, S;

Publication
CoRR

Abstract

2024

Network-based Approach for Stopwords Detection

Authors
António Ali, FDM; Jesus, Gd; Cardoso, HL; Nunes, SS; Silva, RS;

Publication
Proceedings of the 16th International Conference on Computational Processing of Portuguese, PROPOR 2024, Santiago de Compostela, Galicia/Spain, March 12-15, 2024, Volume 2

Abstract

2024

Adding human values on the deepfake: co-designing fact-checking solutions to combat misinformation

Authors
C. H. Maia; P. Ariel; S. Nunes;

Publication
AI and Ethics

Abstract
The proliferation of misinformation poses a significant challenge to societies, and fact-checking emerges as a critical tool to combat this issue. In this work, we conduct an innovation impact assessment to question the use of technology to combat misinformation, specifically examining the ethical implications of this choice. To address this, we organized a workshop using the value sensitive design (VSD) methodology to explore questions in this context. The workshop introduced participants to the VSD framework, enabling them to critically assess whether specific scenarios align with human values, norms, and requirements. Real-world scenarios were discussed, including approaches implemented by legitimate news outlets and the use of 3D virtual characters by a Brazilian television broadcaster employing deep learning. Participants analyzed how technology impacts journalism values, norms, and practices, focusing on aligning synthetic media technologies with automated fact-checking dissemination. In conclusion, the authors prepared recommendations drawing on valuable insights into the complex ethical considerations surrounding synthetic media technologies for automatic fact-checking dissemination. The workshop also facilitated cross-border discussions, with 11 participants from seven countries engaging in fruitful dialogue on this vital topic. The study proposed evaluation criteria for AI-generated content in this diversity, including privacy protection, inclusiveness, transparency, beauty standards conformity, engagement, meaningfulness, and effortlessness.
