Publications

Publications by David Oliveira Aparício

2018

OTARIOS: OpTimizing Author Ranking with Insiders/Outsiders Subnetworks

Authors
Silva, JMB; Aparício, DO; Silva, FMA;

Publication
Complex Networks and Their Applications VII - Volume 1 Proceedings The 7th International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2018, Cambridge, UK, December 11-13, 2018.

Abstract
Evaluating scientists based on their scientific production is often a controversial topic. Nevertheless, bibliometrics and algorithmic approaches can assist traditional peer review in numerous tasks, such as attributing research grants, deciding scientific committees, or choosing faculty promotions. Traditional bibliometrics focus on individual measures, disregarding the whole data (i.e., the whole network). Here we put forward OTARIOS, a graph-ranking method which combines multiple publication/citation criteria to rank authors. OTARIOS divides the original network into two subnetworks, insiders and outsiders, which is an adequate representation of citation networks with missing information. We evaluate OTARIOS on a set of five real networks, each with publications in distinct areas of Computer Science. When matching a metric’s produced ranking with best paper awards received, we observe that OTARIOS is >20% more accurate than traditional bibliometrics. We obtain the best results when OTARIOS considers (i) the author’s publication volume and publication recency, (ii) how recently his work is being cited by outsiders, and (iii) how recently his work is being cited by insiders and how individual he is. © 2019, Springer Nature Switzerland AG.


2019

Temporal network alignment via GoT-WAVE

Authors
Aparicio, D; Ribeiro, P; Milenkovic, T; Silva, F;

Publication
BIOINFORMATICS

Abstract
Motivation: Network alignment (NA) finds conserved regions between two networks. NA methods optimize node conservation (NC) and edge conservation. Dynamic graphlet degree vectors are a state-of-the-art dynamic NC measure, used within the fastest and most accurate NA method for temporal networks: DynaWAVE. Here, we use graphlet-orbit transitions (GoTs), a different graphlet-based measure of temporal node similarity, as a new dynamic NC measure within DynaWAVE, resulting in GoT-WAVE. Results: On synthetic networks, GoT-WAVE improves DynaWAVE's accuracy by 30% and speed by 64%. On real networks, when optimizing only dynamic NC, the methods are complementary. Furthermore, only GoT-WAVE supports directed edges. Hence, GoT-WAVE is a promising new temporal NA algorithm, which efficiently optimizes dynamic NC. We provide a user-friendly user interface and source code for GoT-WAVE.


2019

Feature-enriched author ranking in incomplete networks

Authors
Silva, J; Aparicio, D; Silva, F;

Publication
APPLIED NETWORK SCIENCE

Abstract
Evaluating scientists based on their scientific production is a controversial topic. Nevertheless, bibliometrics and algorithmic approaches can assist traditional peer review in numerous tasks, such as attributing research grants, deciding scientific committees, or choosing faculty promotions. Traditional bibliometrics rank individual entities (e.g., researchers, journals, faculties) without looking at the whole data (i.e., the whole network). Network algorithms, such as PageRank, have been used to measure node importance in a network, and have been applied to author ranking. However, traditional PageRank only uses network topology and ignores relevant features of scientific collaborations. Multiple extensions of PageRank have been proposed, more suited for author ranking. These methods enrich the network with information about the author’s productivity or the venue and year of the publication/citation. Most state-of-the-art (STOA) feature-enriched methods either ignore this information or do not combine it effectively. Furthermore, STOA algorithms typically disregard that the full network is not known for most real-world cases. Here we describe OTARIOS, an author ranking method recently developed by us, which combines multiple publication/citation criteria (i.e., features) to evaluate authors. OTARIOS divides the original network into two subnetworks, insiders and outsiders, which is an adequate representation of citation networks with missing information. We evaluate OTARIOS on a set of five real networks, each with publications in distinct areas of Computer Science, and compare it against STOA methods. When matching OTARIOS’ produced ranking with a ground-truth ranking (comprised of best paper award nominations), we observe that OTARIOS is >30% more accurate than traditional PageRank (i.e., topology-based method) and >20% more accurate than STOA (i.e., competing feature-enriched methods). We obtain the best results when OTARIOS considers (i) the author’s publication volume and publication recency, (ii) how recently the author’s work is being cited by outsiders, and (iii) how recently the author’s work is being cited by insiders and how individual he is. Our results showcase (a) the importance of efficiently combining relevant features and (b) how to adequately perform author ranking in incomplete networks. © 2019, The Author(s).


2019

Finding Dominant Nodes Using Graphlets

Authors
Aparício, D; Ribeiro, P; Silva, F; Silva, JMB;

Publication
Complex Networks and Their Applications VIII - Volume 1 Proceedings of the Eighth International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2019, Lisbon, Portugal, December 10-12, 2019.

Abstract
Finding important nodes is a classic task in network science. Nodes are important depending on the context; e.g., they can be (i) nodes that, when removed, cause the network to collapse or (ii) influential spreaders (e.g., of information, or of diseases). Typically, central nodes are assumed to be important, and numerous network centrality measures have been proposed such as the degree centrality, the betweenness centrality, and the subgraph centrality. However, centrality measures are not tailored to capture one particular kind of important nodes: dominant nodes. We define dominant nodes as nodes that dominate many others and are not dominated by many others. We then propose a general graphlet-based measure of node dominance called graphlet-dominance (GD). We analyze how GD differs from traditional network centrality measures. We also study how certain parameters (namely the importance of dominating versus not being dominated and indirect versus direct dominances) influence GD. Finally, we apply GD to author ranking and verify that GD is superior to PageRank in four of the five citation networks tested. © 2020, Springer Nature Switzerland AG.


2020

FOCAS: Penalising friendly citations to improve author ranking

Authors
Silva, J; Aparicio, D; Ribeiro, P; Silva, F;

Publication
PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20)

Abstract
Scientific impact is commonly associated with the number of citations received. However, an author can easily boost his own citation count by (i) publishing articles that cite his own previous work (self-citations), (ii) having co-authors citing his work (co-author citations), or (iii) exchanging citations with authors from other research groups (reciprocated citations). Even though these friendly citations inflate an author's perceived scientific impact, author ranking algorithms do not normally address them. They, at most, remove self-citations. Here we present Friends-Only Citations AnalySer (FOCAS), a method that identifies friendly citations and reduces their negative effect in author ranking algorithms. FOCAS combines the author citation network with the co-authorship network in order to measure author proximity and penalises citations between friendly authors. FOCAS is general and can be regarded as an independent module applied while running (any) PageRank-like author ranking algorithm. FOCAS can be tuned to use three different criteria, namely authors' distance, citation frequency, and citation recency, or combinations of these. We evaluate and compare FOCAS against eight state-of-the-art author ranking algorithms. We compare their rankings with a ground-truth of best paper awards. We test our hypothesis on a citation and co-authorship network comprised of seven Information Retrieval top-conferences. We observed that FOCAS improved author rankings by 25% on average and, in one case, led to a gain of 46%.


2021

A Survey on Subgraph Counting: Concepts, Algorithms, and Applications to Network Motifs and Graphlets

Authors
Ribeiro, P; Paredes, P; Silva, MEP; Aparicio, D; Silva, F;

Publication
ACM COMPUTING SURVEYS

Abstract
Computing subgraph frequencies is a fundamental task that lies at the core of several network analysis methodologies, such as network motifs and graphlet-based metrics, which have been widely used to categorize and compare networks from multiple domains. Counting subgraphs is, however, computationally very expensive, and there has been a large body of work on efficient algorithms and strategies to make subgraph counting feasible for larger subgraphs and networks. This survey aims precisely to provide a comprehensive overview of the existing methods for subgraph counting. Our main contribution is a general and structured review of existing algorithms, classifying them on a set of key characteristics, highlighting their main similarities and differences. We identify and describe the main conceptual approaches, giving insight on their advantages and limitations, and we provide pointers to existing implementations. We initially focus on exact sequential algorithms, but we also do a thorough survey on approximate methodologies (with a trade-off between accuracy and execution time) and parallel strategies (that need to deal with an unbalanced search space).
