Details

  • Name

    João Cordeiro
  • Cluster

    Computer Science
  • Role

    Senior Researcher
  • Since

    1st December 2012
Publications

2022

NLP-based platform as a service: a brief review

Authors
Pais, S; Cordeiro, J; Jamil, ML;

Publication
JOURNAL OF BIG DATA

Abstract
Natural language processing (NLP) refers to the field of study that focuses on the interactions between human language and computers. It has recently gained much attention for analyzing human language computationally and has been applied to various tasks such as machine translation, information extraction, summarization, question answering, and others. With the rapid growth of cloud computing services, merging NLP in the cloud is a significant benefit. It allows researchers to conduct NLP-related experiments on large amounts of data handled by big data techniques while harnessing the cloud's vast, on-demand computing power. However, NLP tools and applications have not yet spread sufficiently as cloud services, and there is little literature available that discusses the scope of this interdisciplinary work. NLP, cloud computing, and big data are vast domains, each with its own challenges and potential. By overcoming those challenges and integrating these fields, great potential for NLP and its applications can be unleashed. This paper presents a survey of NLP in cloud computing with a key focus on the comparison of cloud-based NLP services and the challenges of NLP and big data, while emphasizing the necessity of viable cloud-based NLP services. In the first part of this paper, an overview of NLP is presented by discussing different levels of NLP and components of natural language generation (NLG), followed by the applications of NLP. In the second part, the concept of cloud computing is discussed, highlighting the architectural layers and deployment models of cloud computing and cloud-hosted NLP services. In the third part, the field of big data in the cloud is discussed with an emphasis on NLP. Furthermore, information extraction via NLP techniques within big data is introduced.

2022

Detection of extreme sentiments on social networks with BERT

Authors
Jamil, ML; Pais, S; Cordeiro, J; Dias, G;

Publication
SOCIAL NETWORK ANALYSIS AND MINING

Abstract
Online social networking platforms allow people to freely express their ideas, opinions, and emotions negatively or positively. Previous studies have examined sentiments on these platforms to study their behavior in different contexts and purposes. The mechanism of collecting public opinion information has attracted researchers to automatically classify the polarity of public opinions based on the use of concise language in messages, such as tweets, by analyzing social media data. In this paper, we extend the preceding work where an unsupervised approach to automatically detect extreme opinions/posts in social networks is proposed. The performance of the proposed approach is evaluated on five different social network and media datasets. In this work, we use a semi-supervised approach known as BERT to reevaluate the accuracy of our prior approach and the obtained classified dataset. The experiment shows that, in these datasets, posts previously classified as extreme negative or extreme positive are in many cases also classified as extremely negative or positive by BERT. Furthermore, BERT shows the capability to classify extreme sentiments when fine-tuned with an appropriate extreme sentiments dataset.

2021

A Comparative Study of Linguistic and Computational Features Based on a Machine Learning for Arabic Anaphora Resolution

Authors
Abolohom, A; Omar, N; Pais, S; Cordeiro, J;

Publication
AI IN COMPUTATIONAL LINGUISTICS

Abstract
Anaphora resolution is one of the problems in natural language processing. It is the process of disambiguating the antecedent of a referring expression from the set of entities in a discourse. The correct interpretation of pronouns plays an important role in the construction of meaning. Thus, the resolution of pronominal anaphors remains a very important task for many natural language processing applications. Additionally, it plays an increasingly significant role in computational linguistics. However, a significant amount of work on anaphora resolution is focused on English; anaphora resolution for other languages, including Arabic, is still limited. In this paper, we present a new set of computational and linguistic features to resolve Arabic anaphors using a machine learning approach. An in-depth study was conducted on this set of features to exploit their effectiveness and investigate their effect on anaphora resolution. The aim was to efficiently integrate different feature sets and classification algorithms to synthesize a more accurate classification procedure. Four well-known machine learning algorithms (k-nearest neighbor, maximum entropy, decision tree, and meta-classifier) were employed as base classifiers for each of the feature sets. A wide range of comparative experiments on Quran datasets was conducted, the discussion presented, and conclusions drawn. The experimental results show that our approach gives satisfactory results. © 2021 The Authors. Published by Elsevier B.V.

2019

Association and Temporality between News and Tweets

Authors
Moutinho, V; Brazdil, P; Cordeiro, J;

Publication
Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2019, Volume 1: KDIR, Vienna, Austria, September 17-19, 2019.

Abstract
With the advent of social media, the boundaries of mainstream journalism and social networks are becoming blurred. User-generated content is increasing, and hence, journalists dedicate considerable time searching platforms such as Facebook and Twitter to announce, spread, and monitor news and crowd check information. Many studies have looked at social networks as news sources, but the relationship and interconnections between this type of platform and news media have not been thoroughly investigated. In this work, we have studied a series of news articles and examined a set of related comments on a social network during a period of six months. Specifically, a sample of articles from generalist Portuguese news sources published in the first semester of 2016 was clustered, and the resulting clusters were then associated with tweets of Portuguese users using a similarity measure. Focusing on a subset of clusters, we have performed a temporal analysis by examining the evolution of the two types of documents (articles and tweets) and the timing of when they appeared. It appears that for some stories, namely Brexit and the European Football Cup, the publishing of news articles intensifies on key dates (event-oriented), while the discussion on social media is more balanced throughout the months leading up to those events.

2019

SocialNetCrawler: Online Social Network Crawler

Authors
Pais, S; Cordeiro, J; Martins, R; Albardeiro, M;

Publication
11th International Conference on Management of Digital EcoSystems, MEDES 2019, Limassol, Cyprus, November, 2019

Abstract
The emergence and popularization of online social networks suddenly made available a large amount of data on social organization, interaction and human behavior. All this information opens new perspectives and challenges to the study of social systems, being of interest to many fields. Although most online social networks are recent, a vast number of scientific papers have already been published on this topic, dealing with a broad range of analytical methods and applications. Therefore, the development of a tool capable of gathering tailored information from social networks can help many researchers in their work, especially in the area of Natural Language Processing (NLP). Nowadays, social networks are precisely the medium where people most often use written language on a daily basis. Therefore, the ubiquitous crawling of social networks is of the utmost importance for researchers. Such a tool will allow researchers to get the relevant information they need, enabling faster research on what really matters, without losing time developing their own crawler. In this paper, we present an extensive analysis of the existing social networks and their APIs, and also describe the conception and design of a social network crawler which will help NLP researchers. © 2019 Association for Computing Machinery.