Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by LIAAD

2020

Transfer Learning in urban object classification: Online images to recognize point clouds

Authors
Balado, J; Sousa, R; Diaz Vilarino, L; Arias, P;

Publication
AUTOMATION IN CONSTRUCTION

Abstract
The application of Deep Learning techniques to point clouds for urban object classification is limited by the large number of samples needed. Acquiring and tagging point clouds is more expensive and tedious labour than its image equivalent process. Point cloud online datasets contain few samples for Deep Learning or not always the desired classes This work focuses on minimizing the use of point cloud samples for neural network training in urban object classification. The method proposed is based on the conversion of point clouds to images (pc-images) because it enables: the use of Convolutional Neural Networks, the generation of several samples (images) per object (point clouds) by means of multi-view, and the combination of pc-images with images from online datasets (ImageNet and Google Images). The study is conducted with ten classes of objects extracted from two street point clouds from two different cities. The network selected for the job is the InceptionV3. The training set consists of 5000 online images with a variable percentage (0% to 10%) of pc-images. The validation and testing sets are composed exclusively of pc-images. Although the network trained only with online images reached 47% accuracy, the inclusion of a small percentage of pc-images in the training set improves the classification to 99.5% accuracy with 6% pc-images. The network is also applied at IQmulus & TerraMobilita Contest dataset and it allows the correct classification of elements with few samples.

2020

YAKE! Keyword extraction from single documents using multiple local features

Authors
Campos, R; Mangaravite, V; Pasquali, A; Jorge, A; Nunes, C; Jatowt, A;

Publication
INFORMATION SCIENCES

Abstract
As the amount of generated information grows, reading and summarizing texts of large collections turns into a challenging task. Many documents do not come with descriptive terms, thus requiring humans to generate keywords on-the-fly. The need to automate this kind of task demands the development of keyword extraction systems with the ability to automatically identify keywords within the text. One approach is to resort to machine-learning algorithms. These, however, depend on large annotated text corpora, which are not always available. An alternative solution is to consider an unsupervised approach. In this article, we describe YAKE!, a light-weight unsupervised automatic keyword extraction method which rests on statistical text features extracted from single documents to select the most relevant keywords of a text. Our system does not need to be trained on a particular set of documents, nor does it depend on dictionaries, external corpora, text size, language, or domain. To demonstrate the merits and significance of YAKE!, we compare it against ten state-of-the-art unsupervised approaches and one supervised method. Experimental results carried out on top of twenty datasets show that YAKE! significantly outperforms other unsupervised methods on texts of different sizes, languages, and domains.

2020

The 3rd International Workshop on Narrative Extraction from Texts: Text2Story 2020

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract
The Third International Workshop on Narrative Extraction from Texts (Text2Story’20) [text2story20.inesctec.pt] held in conjunction with the 42nd European Conference on Information Retrieval (ECIR 2020) gives researchers of IR, NLP and other fields, the opportunity to share their recent advances in extraction and formal representation of narratives. This workshop also presents a forum to consolidate the multi-disciplinary efforts and foster discussions around the narrative extraction task, a hot topic in recent years. © Springer Nature Switzerland AG 2020.

2020

Proceedings of Text2Story - Third Workshop on Narrative Extraction From Texts co-located with 42nd European Conference on Information Retrieval, Text2Story@ECIR 2020, Lisbon, Portugal, April 14th, 2020 [online only]

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S;

Publication
Text2Story@ECIR

Abstract

2020

Incremental Approach for Automatic Generation of Domain-Specific Sentiment Lexicon

Authors
Muhammad, SH; Brazdil, P; Jorge, A;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract
Sentiment lexicon plays a vital role in lexicon-based sentiment analysis. The lexicon-based method is often preferred because it leads to more explainable answers in comparison with many machine learning-based methods. But, semantic orientation of a word depends on its domain. Hence, a general-purpose sentiment lexicon may gives sub-optimal performance compare with a domain-specific lexicon. However, it is challenging to manually generate a domain-specific sentiment lexicon for each domain. Still, it is impractical to generate complete sentiment lexicon for a domain from a single corpus. To this end, we propose an approach to automatically generate a domain-specific sentiment lexicon using a vector model enriched by weights. Importantly, we propose an incremental approach for updating an existing lexicon to either the same domain or different domain (domain-adaptation). Finally, we discuss how to incorporate sentiment lexicons information in neural models (word embedding) for better performance. © Springer Nature Switzerland AG 2020.

2020

MedLinker: Medical Entity Linking with Neural Representations and Dictionary Matching

Authors
Loureiro, D; Jorge, AM;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract
Progress in the field of Natural Language Processing (NLP) has been closely followed by applications in the medical domain. Recent advancements in Neural Language Models (NLMs) have transformed the field and are currently motivating numerous works exploring their application in different domains. In this paper, we explore how NLMs can be used for Medical Entity Linking with the recently introduced MedMentions dataset, which presents two major challenges: (1) a large target ontology of over 2M concepts, and (2) low overlap between concepts in train, validation and test sets. We introduce a solution, MedLinker, that addresses these issues by leveraging specialized NLMs with Approximate Dictionary Matching, and show that it performs competitively on semantic type linking, while improving the state-of-the-art on the more fine-grained task of concept linking (+4 F1 on MedMentions main task). © Springer Nature Switzerland AG 2020.

  • 121
  • 469