Publications - INESC TEC

Publications

Publications by Vitor Rocio

2024

Does Fake News have Feelings?

Authors
Laroca, H; Rocio, V; Cunha, A

Publication
Procedia Computer Science

Abstract
Fake news spreads rapidly, creating problems and making detection harder. The purpose of this study is to determine whether fake news carries sentiment polarity (positive or negative), to identify the polarity of the sentiment present in its textual content, and to determine whether sentiment polarity is a reliable indicator of fake news. For this, we use a deep learning model called BERT (Bidirectional Encoder Representations from Transformers), trained on a sentiment polarity dataset, to classify the sentiment polarity of items in a dataset of true and fake news. The findings show that sentiment polarity alone is not a reliable feature for recognizing fake news correctly and must be combined with other parameters to improve classification accuracy. © 2024 The Author(s). Published by Elsevier B.V.


2023

Chatbots Scenarios for Education

Authors
Virkus, S; Mamede, HS; Ramos Rocio, VJ; Dickel, J; Zubikova, O; Butkiene, R; Vaiciukynas, E; Ceponiene, L; Gudoniene, D

Publication
ICIST

Abstract
Educational chatbots are digital tools designed to assist learners in various educational settings. These chatbots use natural language processing (NLP) and machine learning algorithms to simulate human conversation and respond to user queries in a way that facilitates learning. They can be integrated into educational platforms such as learning management systems, educational apps, and websites to provide learners with a personalized and interactive learning experience. Our paper discusses different ways of using chatbots for educational purposes and proposes four scenarios in total for educational needs.


2012

Building and Exploring Semantic Equivalences Resources

Authors
Carvalho, G; de Matos, DM; Rocio, V

Publication
LREC 2012 - Eighth International Conference on Language Resources and Evaluation

Abstract
Language resources that include semantic equivalences at word level are common, and their usefulness is well established in text processing applications such as search. Named entities also play an important role in text-based applications, but are not usually covered by these resources. The present work describes the WES base (Wikipedia Entity Synonym base), a freely available resource based on Wikipedia. The WES base was built for the Portuguese language, in the same format as another freely available thesaurus for the same language, the TeP base, which allows integration of equivalences at both word level and entity level. The resource has been built in a language-independent way, so that it can be extended to different languages. The WES base was used in a Question Answering system, significantly enhancing its performance.


2009

IdSay: Question Answering for Portuguese

Authors
Carvalho, G; de Matos, DM; Rocio, V

Publication
Evaluating Systems for Multilingual and Multimodal Information Access

Abstract
IdSay is an open-domain Question Answering (QA) system for Portuguese. Its current version can be considered a baseline version, using mainly techniques from the area of Information Retrieval (IR). The only external information it uses besides the text collections is lexical information for Portuguese. It was submitted to the monolingual Portuguese task of the QA track of the Cross-Language Evaluation Forum 2008 (QA@CLEF) for the first time, and it answered 65 of the 200 questions correctly in the first answer, and 85 when considering the three answers that could be returned per question. Generally, the types of questions that the IdSay system answers best are measure factoids, count factoids and definitions, but there is still work to be done in these areas, as well as in the treatment of time. List questions and location and people/organization factoids are the question types with the most room for improvement.


2012

Searching a Mixed Corpus in the Light of the New Portuguese Orthographic Norm

Authors
Carvalho, G; Falé, I; de Matos, DM; Rocio, V

Publication
Computational Processing of the Portuguese Language - 10th International Conference, PROPOR 2012, Coimbra, Portugal, April 17-20, 2012. Proceedings

Abstract
A mixed corpus of Portuguese is one in which texts of different origins produce different spelling variants for the same word. A new norm, which will bring together the written texts produced in both Portugal and Brazil, giving them a more uniform orthography, has been in effect since 2009. But what happens, from the perspective of search, to corpora created before the norm came into practice, or within the transition period? Is the information they contain outdated and worthless? Do they need to be converted to the new norm? In the present work we analyse these questions. © 2012 Springer-Verlag.


2010

Improving IdSay: A Characterization of Strengths and Weaknesses in Question Answering Systems for Portuguese

Authors
Carvalho, G; de Matos, DM; Rocio, V

Publication
Computational Processing of the Portuguese Language, Proceedings

Abstract
IdSay is a Question Answering system for Portuguese that participated in QA@CLEF 2008 with a baseline version (IdSayBL). Despite the encouraging results, there was still much room for improvement. The participation of six systems in the Portuguese task, with very good results either individually or in a hypothetical combination run, provided a valuable source of information. We analysed all the answers submitted by all systems to identify their strengths and weaknesses. We used the conclusions of that analysis to guide our improvements, keeping in mind the two key characteristics we want for the system: efficiency in terms of response time and robustness in handling different types of data. As a result, an improved version of IdSay was developed, whose most important enhancement is the introduction of semantic information. We obtained significantly better results, with first-answer accuracy rising from 32.5% in IdSayBL to 50.5% in IdSay, without degradation of response time.
