Details

  • Name

    Jorge Miguel Silva
  • Cluster

    Computer Science
  • Role

    Research Assistant
  • Since

    25th January 2016
Publications

2018

Parallel Asynchronous Strategies for the Execution of Feature Selection Algorithms

Authors
Silva, J; Aguiar, A; Silva, F;

Publication
International Journal of Parallel Programming

2018

Hierarchical Expert Profiling Using Heterogeneous Information Networks

Authors
Silva, JMB; Ribeiro, P; Silva, FMA;

Publication
Discovery Science - Lecture Notes in Computer Science

2018

OTARIOS: OpTimizing Author Ranking with Insiders/Outsiders Subnetworks

Authors
Silva, JMB; Aparício, DO; Silva, FMA;

Publication
Studies in Computational Intelligence - Complex Networks and Their Applications VII

2017

Feature extraction for the author name disambiguation problem in a bibliographic database

Authors
Silva, JMB; Silva, FMA;

Publication
Proceedings of the Symposium on Applied Computing, SAC 2017, Marrakech, Morocco, April 3-7, 2017

Abstract
Author name disambiguation in bibliographic databases has been, and still is, a challenging research task because of the high uncertainty involved in matching a publication author with a concrete researcher. Common approaches either resort to clustering to group an author's publications, or use a binary classifier to decide whether a given publication was written by a specific author. Both approaches benefit from authors publishing similar works (e.g. subject areas and venues), from the previous publication history of an author (the longer, the better), and from validated publication-author associations for model creation. However, whenever such an algorithm is confronted with different works from an author, or with an author without publication history, it often makes wrong identifications. In this paper, we describe a feature extraction method that aims to avoid these problems. Instead of characterizing an author in general, it selectively uses features that associate the author with a certain publication. We build a Random Forest model to assess the quality of our set of features. Its goal is to predict whether a given author is the true author of a certain publication. We use a bibliographic database named Authenticus, with more than 250,000 validated author-publication associations, to test model quality. Our model achieved a top result of 95.37% accuracy in predicting matches and 91.92% in a real test scenario. Furthermore, in the latter case the model correctly predicted 61.86% of the cases where authors had no previous publication history. Copyright 2017 ACM.
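
The idea of describing an author-publication pair, rather than the author alone, can be illustrated with a minimal sketch. The record layout (`Publication`), the function `pair_features`, and the four features below are assumptions made for illustration only; the abstract does not list the features actually used, and this is not the Authenticus schema.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    # Hypothetical record layout, chosen for this sketch only.
    title: str
    venue: str
    coauthors: set = field(default_factory=set)


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(candidate_history, publication):
    """Illustrative features for one (author, publication) pair.

    candidate_history: publications already validated for the candidate author.
    Each feature relates the candidate to this specific publication instead of
    summarizing the candidate in general.
    """
    known_coauthors = set()
    known_venues = set()
    known_words = set()
    for pub in candidate_history:
        known_coauthors |= pub.coauthors
        known_venues.add(pub.venue)
        known_words |= set(pub.title.lower().split())

    title_words = set(publication.title.lower().split())
    return {
        "coauthor_overlap": jaccard(known_coauthors, publication.coauthors),
        "venue_seen": 1.0 if publication.venue in known_venues else 0.0,
        "title_overlap": jaccard(known_words, title_words),
        # Flags the no-history case so a classifier can treat it separately.
        "has_history": 1.0 if candidate_history else 0.0,
    }
```

Feature dictionaries of this kind could then be turned into vectors and passed to a binary classifier, such as the Random Forest mentioned in the abstract, which would label each pair as a match or a non-match.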

2015

A Parallel Computing Hybrid Approach for Feature Selection

Authors
Silva, J; Aguiar, A; Silva, F;

Publication
2015 IEEE 18th International Conference on Computational Science and Engineering (CSE)

Abstract
The ultimate goal of feature selection is to select, from an original set of features, the smallest subset that yields the minimum generalization error. This effectively reduces the feature space, and thus the complexity of classifiers. Though several algorithms have been proposed, no single one outperforms all the others in all scenarios, and the problem is still an active research field. This paper proposes a new hybrid parallel approach to feature selection. The idea is to use a filter metric to reduce the feature space, and then use an innovative wrapper method to search extensively for the best solution. The proposed strategy is implemented in a shared-memory parallel environment to speed up the process. We evaluated its parallel performance using up to 32 cores, and our results show a 30-fold speedup. To test the performance of feature selection we used five datasets from the well-known NIPS challenge and obtained an average score of 95.90% across all solutions.
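
The two-stage structure described above (a filter to shrink the feature space, then a wrapper search evaluated in parallel) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's method: absolute Pearson correlation stands in for the filter metric, an exhaustive subset search scored by a nearest-centroid classifier stands in for the wrapper, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def filter_stage(X, y, keep):
    """Rank features by the (assumed) filter metric; keep the top `keep` indices."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on `subset`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            if rows:
                centroids[label] = [mean(r[j] for r in rows) for j in subset]
        point = [X[i][j] for j in subset]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def wrapper_stage(X, y, candidates, workers=4):
    """Score every non-empty subset of `candidates` concurrently."""
    subsets = [
        s for r in range(1, len(candidates) + 1) for s in combinations(candidates, r)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda s: loo_accuracy(X, y, s), subsets))
    # Prefer higher accuracy, then fewer features.
    best = max(range(len(subsets)), key=lambda i: (scores[i], -len(subsets[i])))
    return subsets[best], scores[best]
```

A call such as `wrapper_stage(X, y, filter_stage(X, y, keep=2))` runs both stages. The thread pool here only shows how independent subset evaluations can share the same in-memory data; in CPython the global interpreter lock limits the speedup for pure-Python work, so a real implementation would need processes or native threads.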