Publications

Publications by João Gama

2019

Special Issue of DASFAA 2019

Authors
Li, G; Gama, J; Yang, J;

Publication
Data Sci. Eng.

Abstract

2020

A scalable saliency-based feature selection method with instance-level information

Authors
Cancela, B; Bolón Canedo, V; Alonso Betanzos, A; Gama, J;

Publication
KNOWLEDGE-BASED SYSTEMS

Abstract
Classic feature selection techniques remove irrelevant or redundant features to achieve a subset of relevant features in compact models that are easier to interpret and so improve knowledge extraction. Most such techniques operate on the whole dataset, but are unable to provide the user with useful information when only instance-level information is required; in other words, classic feature selection algorithms do not identify the most relevant information in a sample. We have developed a novel feature selection method, called saliency-based feature selection (SFS), based on deep-learning saliency techniques. Our algorithm works with any architecture that is trained using gradient descent techniques (Neural Networks, SVMs, ...), and can be used for classification or regression problems. Experimental results show our algorithm is robust, as it allows the feature ranking to be transferred between different architectures while achieving remarkable results. The versatility of our algorithm has also been demonstrated, as it works both in big data environments and with small datasets.

2020

BRIGHT-Drift-Aware Demand Predictions for Taxi Networks

Authors
Saadallah, A; Moreira Matias, L; Sousa, R; Khiari, J; Jenelius, E; Gama, J;

Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Abstract
Massive data broadcast by GPS-equipped vehicles provide unprecedented opportunities. One of the main tasks in optimizing our transportation networks is to build data-driven real-time decision support systems. However, the dynamic environments in which the networks operate violate the traditional assumptions required to put into practice many off-the-shelf supervised learning algorithms, such as finite training sets or stationary distributions. In this paper, we propose BRIGHT: a drift-aware supervised learning framework to predict demand quantities. BRIGHT aims to provide accurate predictions for short-term horizons through a creative ensemble of time series analysis methods that handles distinct types of concept drift. By selecting neighborhoods dynamically, BRIGHT reduces the likelihood of overfitting. By ensuring diversity among the base learners, BRIGHT ensures a high reduction of variance while keeping bias stable. Experiments were conducted using three large-scale heterogeneous real-world transportation networks in Porto (Portugal), Shanghai (China), and Stockholm (Sweden), as well as with controlled experiments using synthetic data where multiple distinct drifts were artificially induced. The obtained results illustrate the advantages of BRIGHT in relation to state-of-the-art methods for this task.

2019

Main Factors Driving the Open Rate of Email Marketing Campaigns

Authors
Conceição, A; Gama, J;

Publication
DS

Abstract
Email Marketing is one of the most important traffic sources in Digital Marketing. It yields a high return on investment for the company and offers a cheap and fast way to reach existing or potential clients. Getting the recipients to open the email is the first step for a successful campaign. Thus, it is important to understand how marketers can improve the open rate of a marketing campaign. In this work, we analyze the main factors driving the open rate of financial email marketing campaigns. For that purpose, we develop a classification algorithm that can accurately predict whether a campaign will be labeled as Successful or Failure. A campaign is classified as Successful if it has an open rate higher than the average; otherwise, it is labeled as Failure. To achieve this, we have employed and evaluated three different classifiers. Our results showed that it is possible to predict the performance of a campaign with approximately 82% accuracy, by using the Random Forest algorithm and the redundant filter selection technique. With this model, marketers will have the chance to correct earlier any potential problems in a campaign that could strongly affect its revenue. Additionally, a text analysis of the subject line and preheader was performed to discover which keywords and keyword combinations trigger a higher open rate. The results obtained were then validated in a real setting through A/B testing.

2020

Impact of Trust and Reputation Based Brokerage on the CloudAnchor Platform

Authors
Veloso, B; Malheiro, B; Burguillo, JC; Gama, J;

Publication
PAAMS

Abstract
This paper analyses the impact of trust and reputation modelling on CloudAnchor, a business-to-business brokerage platform for the transaction of single and federated resources on behalf of Small and Medium Sized Enterprises (SMEs). In CloudAnchor, businesses act as providers or consumers of Infrastructure as a Service (IaaS) resources. The platform adopts a multi-layered multi-agent architecture, where providers, consumers and virtual providers, representing provider coalitions, engage in trust & reputation-based provider look-up, invitation, acceptance and resource negotiations. The goal of this work is to assess the relevance of the distributed trust model and centralised fuzzified reputation service to the number of resources successfully transacted, the global turnover, brokerage fees, losses, expenses and response time. The results show that trust and reputation based brokerage has a positive impact on the CloudAnchor performance by reducing losses and the execution time for the provision of both single and federated resources and by considerably increasing the number of federated resources provided.

2020

REST framework: A modelling approach towards cooling energy stress mitigation plans for future cities in warming Global South

Authors
Bardhan, R; Debnath, R; Gama, J; Vijay, U;

Publication
SUSTAINABLE CITIES AND SOCIETY

Abstract
Future cities of the Global South will not only rapidly urbanise but will also get warmer from climate change and urbanisation-induced effects. This will trigger a multi-fold increase in cooling demand, especially at the residential level, the mitigation of which remains a policy and research gap. This study puts forward a novel residential energy stress mitigation framework called REST to estimate warming climate-induced energy stress in residential buildings using a GIS-driven urban heat island and energy modelling approach. REST further estimates rooftop solar potential to enable solar photo-voltaic (PV) based decentralised energy solutions and establishes an optimised routine for peer-to-peer energy sharing at a neighbourhood scale. The optimised network is classified through a decision tree algorithm to derive sustainability rules for mitigating energy stress at an urban planning scale. These sustainability rules establish distributive energy justice variables in an urban planning context. The REST framework is applied as a proof-of-concept to a future smart city of India, named Amaravati. Results show that cooling energy stress can be reduced by 80% in the study area through sensitive use of planning variables like Floor Space Index (FSI) and built-up density. This has crucial policy implications for the design and implementation of national-level cooling action plans in the future cities of the Global South to meet the UN SDG 7 (clean and affordable energy) and SDG 11 (sustainable cities and communities) targets.
