Publications

Publications by LIAAD

2020

YAKE! Keyword extraction from single documents using multiple local features

Authors
Campos, R; Mangaravite, V; Pasquali, A; Jorge, A; Nunes, C; Jatowt, A;

Publication
Information Sciences

Abstract
As the amount of generated information grows, reading and summarizing texts of large collections turns into a challenging task. Many documents do not come with descriptive terms, thus requiring humans to generate keywords on-the-fly. The need to automate this kind of task demands the development of keyword extraction systems with the ability to automatically identify keywords within the text. One approach is to resort to machine-learning algorithms. These, however, depend on large annotated text corpora, which are not always available. An alternative solution is to consider an unsupervised approach. In this article, we describe YAKE!, a light-weight unsupervised automatic keyword extraction method which rests on statistical text features extracted from single documents to select the most relevant keywords of a text. Our system does not need to be trained on a particular set of documents, nor does it depend on dictionaries, external corpora, text size, language, or domain. To demonstrate the merits and significance of YAKE!, we compare it against ten state-of-the-art unsupervised approaches and one supervised method. Experimental results carried out on top of twenty datasets show that YAKE! significantly outperforms other unsupervised methods on texts of different sizes, languages, and domains. © 2019 Elsevier Inc.

2020

The 3rd International Workshop on Narrative Extraction from Texts: Text2Story 2020

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S;

Publication
Lecture Notes in Computer Science - Advances in Information Retrieval

Abstract

2020

Proceedings of Text2Story - Third Workshop on Narrative Extraction From Texts co-located with 42nd European Conference on Information Retrieval, Text2Story@ECIR 2020, Lisbon, Portugal, April 14th, 2020 [online only]

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S;

Publication
Text2Story@ECIR

Abstract

2020

Incremental Approach for Automatic Generation of Domain-Specific Sentiment Lexicon

Authors
Muhammad, SH; Brazdil, P; Jorge, A;

Publication
Lecture Notes in Computer Science - Advances in Information Retrieval

Abstract

2020

MedLinker: Medical Entity Linking with Neural Representations and Dictionary Matching

Authors
Loureiro, D; Jorge, AM;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract

2020

Sentence Compression for Portuguese

Authors
Asevedo Nóbrega, FA; Jorge, AM; Brazdil, P; Pardo, TAS;

Publication
Computational Processing of the Portuguese Language - 14th International Conference, PROPOR 2020, Evora, Portugal, March 2-4, 2020, Proceedings

Abstract
The task of Sentence Compression aims at producing a shorter version of a given sentence. This task may assist many other applications, such as Automatic Summarization and Text Simplification. In this paper, we investigate methods for Sentence Compression for Portuguese. We focus on machine learning-based algorithms and propose new strategies. We also create reference corpora/datasets for the area, which allow the methods of interest to be trained and tested. Our results show that some of our methods outperform previous initiatives for Portuguese and produce results competitive with a state-of-the-art method in the area. © Springer Nature Switzerland AG 2020.
