Detalhes
Nome
Miguel Gonçalves AreiasDesde
16 maio 2011
Nacionalidade
PortugalCentro
Centro de Sistemas de Computação AvançadaContactos
+351220402963
miguel.g.areias@inesctec.pt
2022
Autores
Alves, F; Martins, FMS; Areias, M; Munoz Merida, A;
Publicação
SCIENTIFIC REPORTS
Abstract
Analysis of intra- and inter-population diversity has become important for defining the genetic status and distribution patterns of a species and a powerful tool for conservation programs, as high levels of inbreeding could lead into whole population extinction in few generations. Microsatellites (SSR) are commonly used in population studies but discovering highly variable regions across species' genomes requires demanding computation and laboratorial optimization. In this work, we combine next generation sequencing (NGS) with automatic computing to develop a genomic-oriented tool for characterizing SSRs at the population level. Herein, we describe a new Python pipeline, named Micro-Primers, designed to identify, and design PCR primers for amplification of SSR loci from a multi-individual microsatellite library. By combining commonly used programs for data cleaning and microsatellite mining, this pipeline easily generates, from a fastq file produced by high-throughput sequencing, standard information about the selected microsatellite loci, including the number of alleles in the population subset, and the melting temperature and respective PCR product of each primer set. Additionally, potential polymorphic loci can be identified based on the allele ranges observed in the population, to easily guide the selection of optimal markers for the species. Experimental results show that Micro-Primers significantly reduces processing time in comparison to manual analysis while keeping the same quality of the results. The elapsed times at each step can be longer depending on the number of sequences to analyze and, if not assisted, the selection of polymorphic loci from multiple individuals can represent a major bottleneck in population studies.
2022
Autores
Areias, M; Rocha, R;
Publicação
COMPUTING
Abstract
A key aspect of any hash map design is the problem of dynamically resizing it in order to deal with hash collisions. Compression in tree-based hash maps is the ability of reducing the depth of the internal hash levels that support the hash map. In this context, elasticity refers to the ability of automatically resizing the internal data structures that support the hash map operations in order to meet varying workloads, thus optimizing the overall memory consumption of the hash map. This work extends a previous lock-free hash trie map design to support elastic hashing, i.e., expand saturated hash levels and compress unused hash levels, such that, at each point in time, the number of levels in a path is adjusted, as closely as possible, to the set of keys that is stored in the data structure. To materialize our design, we introduce a new compress operation for hash levels, which requires redesigning the existing search, insert, remove and expand operations in order to maintain the lock-freedom property of the data structure. Experimental results show that elasticity effectively improves the search operation and, in doing so, our design becomes very competitive when compared to other state-of-the-art designs implemented in Java.
2021
Autores
Areias, M; Rocha, R;
Publicação
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
Abstract
Hash tries are a trie-based data structure with nearly ideal characteristics for the implementation of hash maps. In this paper, we present a novel, simple and scalable hash trie map design that fully supports the concurrent search, insert and remove operations on hash maps. To the best of our knowledge, our proposal is the first that puts together the following characteristics: (i) be lock free; (ii) use fixed size data structures; and (iii) maintain the access to all internal data structures as persistent memory references. Our design is modular enough to allow different types of configurations aimed for different performances in memory usage and execution time and can be easily implemented in any type of language, library or within other complex data structures. We discuss in detail the key algorithms required to easily reproduce our implementation by others and we present a proof of correctness showing that our proposal is linearizable and lock-free for the search, insert and remove operations. Experimental results show that our proposal is quite competitive when compared against other state-of-the-art proposals implemented in Java.
2021
Autores
Moreno, P; Areias, M; Rocha, R;
Publicação
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
Abstract
Hash tries are a trie-based data structure with nearly ideal characteristics for the implementation of hash maps. Starting from a particular lock-free hash map data structure, named Lock-Free Hash Tries, we focus on solving the problem of memory reclamation without losing the lock-freedom property. To the best of our knowledge, outside garbage collected environments, there is no current implementation of hash maps that is able to reclaim memory in a lock-free manner. To achieve this goal, we propose an approach for memory reclamation specific to Lock-Free Hash Tries that explores the characteristics of its structure in order to achieve efficient memory reclamation with low and well-defined memory bounds. We present and discuss in detail the key algorithms required to easily reproduce our implementation by others. Experimental results show that our approach obtains better results when compared with other state-of-the-art memory reclamation methods and provides a competitive and scalable hash map implementation, if compared to lock-based implementations.
2021
Autores
Areias, M; Rocha, R;
Publicação
2021 20TH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING (ISPDC)
Abstract
A key aspect of any hash map design is the problem of dynamically resizing it in order to deal with hash collisions. In this context, elasticity refers to the ability to automatically resize the internal data structures that support the hash map operations in order to meet varying workloads, thus optimizing the overall memory consumption of the hash map. This work extends a previous lock-free hash trie design to support elastic hashing, i.e., expand saturated hash levels and compress unused hash levels, such that, at each point in time, the number of levels in a path matches the current demand as closely as possible. Experimental results show that elasticity effectively improves the search operation and, in doing so, our design becomes very competitive when compared to other state-of-the-art designs implemented in Java.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.