2019
Authors
Jorge, AM; Campos, R; Jatowt, A; Bhatia, S;
Publication
CEUR Workshop Proceedings
Abstract
2019
Authors
Jahromi, HN; Jorge, AM;
Publication
PROCEEDINGS OF THE INSTITUTION OF CIVIL ENGINEERS-ENERGY
Abstract
Low oil and gas prices have motivated petroleum executives to look more seriously into cost reduction in their supply chains. To this end, one technology considered in hydrocarbon exploration is data science. There are three exploration-related geoscientific domains in which data science is applied: surface geology, structural geology and reservoir property issues. This research provides an in-depth perspective on data science applications in these domains by considering a variety of work in each of them. The result is an understanding of the specific geoscientific problems studied in the literature along with the related data science models. In this way, the work tries to lay the groundwork for a mutual understanding of oil and gas exploration between data scientists and geoscientists.
2019
Authors
Nogueira, DM; Zarmehri, MN; Ferreira, CA; Jorge, AM; Antunes, L;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2019, PT I
Abstract
Cardiovascular disease is the leading cause of death around the world and its early detection is key to improving long-term health outcomes. To detect possible heart anomalies at an early stage, an automatic method enabling low-cost cardiac health screening for the general population would be highly valuable. By analyzing phonocardiogram (PCG) signals, it is possible to perform cardiac diagnosis and find possible anomalies at an early stage. Accordingly, the development of intelligent and automated analysis tools for the PCG is very relevant. In this work, PCG signals are studied with the main objective of determining whether a PCG signal corresponds to a “normal” or “abnormal” physiological state. The main contribution of this work is the evidence provided that time domain features can be combined with features extracted from a wavelet transformation of PCG signals to improve automatic cardiac disease classification. We empirically demonstrate that, from a pool of alternatives, the best classification results are achieved when both time and wavelet features are used by a Support Vector Machine with a linear kernel. Our approach has obtained better results than the ones reported by the challenge participants, which use large amounts of data and high computational power. © Springer Nature Switzerland AG 2019.
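The idea of combining time domain and wavelet features can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not name the wavelet family or the individual features, so the Haar wavelet, the three time domain statistics, and all function names below are assumptions. The resulting vector is what one would pass to a linear-kernel Support Vector Machine; the classifier itself is not shown.

```python
import math

def time_features(x):
    """Assumed time domain features: mean, standard deviation, zero-crossing rate."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    return [mean, math.sqrt(var), crossings / (n - 1)]

def haar_step(x):
    """One level of the orthonormal Haar transform (assumed wavelet choice)."""
    s = math.sqrt(2)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_features(x, levels=2):
    """Energy of the detail coefficients at each level, then the final approximation energy."""
    feats = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        feats.append(sum(d * d for d in detail))
    feats.append(sum(a * a for a in approx))
    return feats

def combined_features(x, levels=2):
    """Concatenate time domain and wavelet features into one vector."""
    return time_features(x) + wavelet_features(x, levels)

# Hypothetical toy signal, not PCG data; length is a multiple of 2**levels.
signal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
features = combined_features(signal)
```

The signal length should be a multiple of `2 ** levels`; otherwise the trailing sample at an odd-length level is dropped by this sketch.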
2019
Authors
Vinagre, J; Jorge, AM; Bifet, A; Al Ghossein, M;
Publication
RECSYS 2019: 13TH ACM CONFERENCE ON RECOMMENDER SYSTEMS
Abstract
The ever-growing nature of user generated data in online systems poses obvious challenges for how we process such data. Typically, this issue is regarded as a scalability problem and has been mainly addressed with distributed algorithms able to train on massive amounts of data in short time windows. However, data inevitably accumulates at high speed. Eventually one needs to discard or archive some of it. Moreover, the dynamic nature of data in user modeling and recommender systems, such as changes in user preferences and the continuous introduction of new users and items, makes it increasingly difficult to maintain up-to-date, accurate recommendation models. The objective of this workshop is to bring together researchers and practitioners interested in incremental and adaptive approaches to stream-based user modeling, recommendation and personalization, including algorithms, evaluation issues, incremental content and context mining, privacy and transparency, temporal recommendation or software frameworks for continuous learning.
2019
Authors
Figueiredo, F; Jorge, A;
Publication
INFORMATION SCIENCES
Abstract
Hashtags have become a crucial social media tool. The categorization of posts in a simple and informal way helps to spread the content through the web. At the same time, it enables users to easily find messages within a specific topic. However, the flexibility provided to use and create a hashtag carries some problems. Equivalent expressions, like synonyms, are handled like entirely different words. On the other hand, the same hashtag may refer to different topics. In this paper, we present TORHID (Topic Relevant Hashtag Identification), a method that employs topic modeling with the purpose of retrieving and identifying hashtags relevant to a specific topic in Twitter streams, starting from a seed hashtag and resorting to a classifier to remove non-relevant hashtags. The result is a network of hashtags related to the seed that we can use to deepen the initial search.
2019
Authors
Loureiro, D; Jorge, AM;
Publication
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019)
Abstract
Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full coverage of WordNet, and without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems using powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features, requiring disambiguation against the full sense inventory, and revealing shortcomings to be improved. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.
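The nearest-neighbour disambiguation step described in the abstract can be illustrated with a small sketch. It assumes that sense embeddings and a contextual embedding for the target word are already available; the three-dimensional vectors, the sense labels, and the function names are hypothetical and stand in for real WordNet sense embeddings. Cosine similarity is also an assumed choice of similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest_sense(context_vec, sense_vecs):
    """Return the sense whose embedding is most similar to the context embedding (1-NN)."""
    return max(sense_vecs, key=lambda sense: cosine(context_vec, sense_vecs[sense]))

# Hypothetical toy sense inventory; real sense embeddings would have far more dimensions.
SENSES = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.1, 0.9, 0.2],
}

financial_context = [0.8, 0.2, 0.1]   # stand-in for "bank" in a financial sentence
riverside_context = [0.0, 1.0, 0.3]   # stand-in for "bank" in a riverside sentence

predicted_a = nearest_sense(financial_context, SENSES)
predicted_b = nearest_sense(riverside_context, SENSES)
```

Restricting `sense_vecs` to the candidate senses of a given lemma and part of speech, or passing the full inventory instead, corresponds to the two settings the abstract contrasts.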