Presentation

Artificial Intelligence and Decision Support

At LIAAD, we work in the strategic area of Data Science, which is attracting growing interest worldwide and is critical to all areas of human activity. The huge amounts of collected data (Big Data) and the ubiquity of devices with sensors and/or processing power offer opportunities and challenges to scientists and engineers. Moreover, the demand for complex models for objective decision support is spreading in business, health, science, e-government and e-learning, which encourages us to invest in different approaches to modelling.

Our overall strategy is to take advantage of the data flood and diversification, and to invest in research lines that will help reduce the gap between collected and useful data, while offering diverse modelling solutions.

At LIAAD, our fundamental scientific principles are machine learning, statistics, optimisation and mathematics.

Latest News
Artificial Intelligence

Preventing environmental crimes in waste transportation: INESC TEC has the solution

The EnSafe project (Enhancing Environmental Protection: Anomaly Detection in Waste Transportation using Network Science) is developing AI-based solutions to tackle environmental crimes, focusing on waste transportation chains. EnSafe benefits from the active involvement of INESC TEC, which is developing technologies to detect irregular and suspicious behaviours in a sector that is vulnerable to fraud and environmental corruption.

26th May 2025

Computer Science and Engineering

Consulting Clinical Reports to Support Medical Decisions Made Easier with Award-Winning INESC TEC Tool

Supporting physicians in making complex and rare clinical decisions is the goal of MedLink, a tool developed by researchers at INESC TEC, which won the Best Demo Paper award at the European Conference on Information Retrieval—one of the most prestigious conferences in Europe in the field of information retrieval.

8th May 2025

Computer Science and Engineering

Less common language varieties also have a place in the era of AI, as demonstrated by two INESC TEC papers presented at a top conference

It's hard to think of current technologies or innovations that do not resort to Language Models (LM) or Natural Language Processing (NLP). Their presence in various domains of society - some with significant relevance, like the legal or healthcare sectors - raises issues (and concerns) that often end up focusing on the same question: are LM-based technologies reaching all communities? Recently, two scientific papers featuring INESC TEC - both accepted at AAAI, an A* conference - sought to address some of the challenges in this new era, which directly influence the Portuguese language.

28th February 2025

Computer Science and Engineering

Tell me what you're looking for and I'll tell you what you need. INESC TEC-Amazon collaboration optimises search engine results for special dates

The seasonality of search queries in search engines could be a factor for online businesses to consider if they seek to improve the ranking of their results. A new demo-paper featuring INESC TEC explored the creation of a database to present the Occasion-aware Recommender solution.

26th February 2025

Computer Science and Engineering

INESC TEC developed natural language processing resources for the Portuguese language

The main goal of the PTicola project was to expand and build new Natural Language Processing (NLP) capabilities for the Portuguese language. The results of this project - which include, for example, an English/European Portuguese translator and a PT-BR/PT-PT language variety identifier - address the gap in NLP resources available for PT-PT compared to PT-BR.

14th February 2025

Featured Projects

CitiLink

CitiLink - Enhancing municipal transparency and citizen engagement through AI: from unstructured to structured data

2025-2026

OBSERVA

Optical Signals and Acoustic Surveillance Observatory

2025-2026

EnSafe

Enhancing Environmental Protection: Anomaly Detection in Waste Transportation using Network Science

2025-2026

NEIS

Deteção de novas construções utilizando Inteligência Artificial e imagens de muito grande resolução espacial

2025-2025

TSP2Net

Time Series Privacy-Preserving: New Approaches via Complex Networks

2025-2026

NuClim

Nuclear observations to improve Climate research and GHG emission estimates

2024-2028

HALM

Humanitarian Accounting Logistics with Machine learning

2024-2024

AI4REALNET

AI for REAL-world NETwork operation

2023-2027

AIBOOST

Artificial intelligence for better opportunities and scientific progress towards trustworthy and human-centric digital environment

2023-2027

AzDIH

Azores Digital Innovation Hub on Tourism and Sustainability

2023-2025

PAPVI2

Previsão Avançada de Preços de Venda de Imóveis

2023-2025

PFAI4_4eD

Programa de Formação Avançada Industria 4 - 4a edição

2023-2023

StorySense

Reaching the Semantic Layers of Stories in Text

2023-2026

ATTRACT_DIH

Digital Innovation Hub for Artificial Intelligence and High-Performance Computing

2022-2025

Produtech_R3

Agenda Mobilizadora da Fileira das Tecnologias de Produção para a Reindustrialização

2022-2025

EMERITUS

Environmental crimes’ intelligence and investigation protocol based on multiple data sources

2022-2025

FAIST

Fábrica Ágil Inteligente Sustentável e Tecnológica

2022-2025

ADANET

Internet das Coisas Assistida por Drones

2022-2026

PFAI4_3ed

Programa de Formação Avançada Industria 4 - 3a edição

2022-2022

FORM_I40

Formação Indústria 4.0

2022-2022

DAnon

Supervised Deanonymization of Dark Web Traffic for Cybercrime Investigation

2022-2023

THEIA

Automated Perception Driving

2022-2023

City Analyser

An agnostic platform to analyse massive mobility patterns

2021-2023

HfPT

Health from Portugal

2021-2025

AgWearCare

Wearables para Monitorização das Condições de Trabalho no Agroflorestal

2021-2023

SADCoPQ

Sistema de Apoio à Decisão no Controlo Preditivo da Qualidade na Indústria Metalomecânica da Precisão

2021-2023

SIGIPRO

Sistema inteligente de gestão de processos habilitados espacialmente

2021-2023

DigitalBudget_VE

Aplicação computacional para orçamentação automática de postos de carregamento de VE

2021-2021

XPM

eXplainable Predictive Maintenance

2021-2024

SSPM

Student Success Prediction Model

2021-2022

OnlineAIOps

Online Artificial Intelligence for IT Operations

2021-2023

AI_Sov

AI Sovereignty

2021-2021

PORT XXI

Space Enabled Sustainable Port Services

2020-2022

Training4DS

Formação Avançada em Data Science - Altice Labs

2020-2020

PFAI4.0

Programa de Formação Avançada Industria 4.0

2020-2021

HumanE-AI-Net

HumanE AI Network

2020-2024

MetaFLow

A Meta Learning work-flow for a Low Code Platform

2020-2021

PAIQAFSR

Provision of advisory inputs and quality assurance of the final study report.

2020-2020

Continental FoF

Fábrica do Futuro da Continental Advanced Antenna

2020-2023

PAFML

Investigação e desenvolvimento para aplicação de Machine Learning a dados de pacientes com Paramiloidose

2020-2023

AIDA

Adaptive, Intelligent and Distributed Assurance Platform

2020-2023

SLSNA

Prestação de Serviços no âmbito do projeto SKORR

2020-2021

MINE4HEALTH

Text mining e clinical decision-making

2020-2021

Text2Story

Extracting journalistic narratives from text and representing them in a narrative modeling language

2019-2023

T4CDTKC

Training 4 Cotec, Digital Transformation Knowledge Challenge - Elaboração de Programa de Formação “CONHECER E COMPREENDER O DESAFIO DAS TECNOLOGIAS DE TRANSFORMAÇÃO DIGITAL”

2019-2021

PROMESSA

PROject ManagEment intelligent aSSistAnt

2019-2023

NDTECH

NDtech 4.0 - Smart and Connected - Estudo e Caderno de Encargos

2019-2019

RISKSENS

Market Risk Sensitivities

2019-2020

RAMnet

Risk Assessment for Microfinance

2019-2021

HOUSEVALUE

Estimativa de Valor de Avaliação de Imóveis

2019-2019

MLABA

Machine Learn Based Adaptive Business Assurance

2019-2019

Humane_AI

Toward AI Systems That Augment and Empower Humans by Understanding Us, our Society and the World Around Us

2019-2020

Moveo

Prestação de serviços de investigação e desenvolvimento relativos ao sistema MOVEO

2019-2019

FIN-TECH

A FINancial supervision and TECHnology compliance training programme

2019-2021

FailStopper

Early failure detection of public transport vehicles in operational context

2018-2021

TerraAlva

Terr@Alva

2018-2019

MDG

Modelling, dynamics and games

2018-2022

NITROLIMIT

Life at the edge: define the boundaries of the nitrogen cycle in the extreme Antarctic environments

2018-2022

RUTE

Randtech Update and Test Environment

2018-2020

MaLPIS

Aprendizagem Automática para Deteção de Ataques e Identificação de Perfis Segurança na Internet

2018-2022

SKORR

Advancing the Frontier of Social Media Management Tools

2018-2021

FAST-manufacturing

Flexible And sustainable manufacturing

2018-2022

FLOWTEE

Desenvolvimento de um programa que monitorize automaticamente os níveis de bem-estar (ou felicidade) dos funcionários, a partir de dados disponíveis online

2018-2019

MDIGIREC

Context Recommendation in Digital Marketing

2017-2018

NEXT-NET

Next generation Technologies for networked Europe

2017-2019

RECAP

Research on European Children and Adults born Preterm

2017-2021

SmartFarming

Ferramenta avançada para operacionalização da agricultura de precisão

2016-2018

PANACea

Perfis para Anomalias Consumo

2016-2019

BI4UP2

Business Intelligence (BI) Tool

2016-2017

Dynamics2

Dynamics, optimization and modelling

2016-2019

CORAL-TOOLS

CORAL – Sustainable Ocean Exploitation: Tools and Sensors

2016-2018

MarineEye

MarinEye - A prototype for multitrophic oceanic monitoring

2015-2017

FOUREYES

TEC4Growth - RL FourEyes - Intelligence, Interaction, Immersion and Innovation for media industries

2015-2019

NanoStima-RL5

NanoSTIMA - Advanced Methodologies for Computer-Aided Detection and Diagnosis

2015-2019

iMAN

iMAN - Intelligence for advanced Manufacturing systems

2015-2019

NanoStima-RL3

NanoSTIMA - Health data infrastructure

2015-2019

NanoStima-RL4

NanoSTIMA - Health Data Analysis & Decision

2015-2019

SMILES

SMILES - Smart, Mobile, Intelligent and Large scale Sensing and analytics

2015-2019

FOTOCATGRAF

Graphene-based semiconductor photocatalysis for a safe and sustainable water supply: an advanced technology for emerging pollutants removal

2015-2018

SEA

SEA-Sistema de ensino autoadaptativo

2015-2015

MAESTRA

Learning from Massive, Incompletely annotated, and Structured Data

2014-2017

BI4UP

Business Intelligence (BI) Tool

2014-2014

SIBILA

Towards Smart Interacting Blocks that Improve Learned Advice

2013-2015

SmartManufacturing

Smart Manufacturing and Logistics

2013-2015

SmartGrids

Smart Grids

2013-2015

Dynamics

Dynamics and Applications

2012-2015

e-Policy

Engineering for the Policy-making Life Cycle (ePolicy)

2011-2014

SIMULESP

Expert system to support network operator on real time decision

2011-2015

CRN

Trust-aware Automatic E-Contract Negotiation in Agent-based Adaptive Normative Environments

2010-2013

KDUS

Knowledge Discovery from Ubiquitous Data Streams

2010-2013

Palco3.0

Intelligent Web system to support the management of a social network on music

2008-2011

Argos

Wind power forecasting system

2008-2012

MOREWAQ

Monitoring and Forecasting of Water Quality Parameters

2008-2011

ORANKI

Resource-bounded outlier detection

2008-2011

Publications

LIAAD Publications

2025

Online boxplot derived outlier detection

Authors
Mazarei, A; Sousa, R; Mendes Moreira, J; Molchanov, S; Ferreira, HM;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Outlier detection is a widely used technique for identifying anomalous or exceptional events across various contexts. It has proven to be valuable in applications like fault detection, fraud detection, and real-time monitoring systems. Detecting outliers in real time is crucial in several industries, such as financial fraud detection and quality control in manufacturing processes. In the context of big data, the amount of data generated is enormous, and traditional batch mode methods are not practical since the entire dataset is not available. The limited computational resources further compound this issue. Boxplot is a widely used batch mode algorithm for outlier detection that involves several derivations. However, the lack of an incremental closed form for statistical calculations during boxplot construction poses considerable challenges for its application within the realm of big data. We propose an incremental/online version of the boxplot algorithm to address these challenges. Our proposed algorithm is based on an approximation approach that involves numerical integration of the histogram and calculation of the cumulative distribution function. This approach is independent of the dataset's distribution, making it effective for all types of distributions, whether skewed or not. To assess the efficacy of the proposed algorithm, we conducted tests using simulated datasets featuring varying degrees of skewness. Additionally, we applied the algorithm to a real-world dataset concerning software fault detection, which posed a considerable challenge. The experimental results underscored the robust performance of our proposed algorithm, highlighting its efficacy comparable to batch mode methods that access the entire dataset. Our online boxplot method, leveraging dataset distribution to define whiskers, consistently achieved exceptional outlier detection results. 
Notably, our algorithm demonstrated computational efficiency, maintaining constant memory usage with minimal hyperparameter tuning.
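
The histogram-and-CDF idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the class name OnlineBoxplot, the fixed value range [lo, hi], the equal-width bins and the classic Tukey 1.5*IQR whiskers are all assumptions made here for the sake of the example (the paper defines its whiskers from the data distribution).

```python
class OnlineBoxplot:
    """Streaming boxplot sketch backed by a fixed-size histogram.

    Assumptions (not from the paper): the value range [lo, hi] is known
    in advance, bins are equal-width, and whiskers use Tukey's 1.5*IQR rule.
    """

    def __init__(self, lo, hi, bins=100):
        if hi <= lo or bins < 1:
            raise ValueError("need hi > lo and bins >= 1")
        self.lo, self.hi, self.bins = lo, hi, bins
        self.width = (hi - lo) / bins
        self.counts = [0] * bins  # memory stays constant as data arrives
        self.n = 0

    def update(self, x):
        """Add one observation; values outside [lo, hi] fall into the edge bins."""
        i = int((x - self.lo) / self.width)
        i = min(max(i, 0), self.bins - 1)
        self.counts[i] += 1
        self.n += 1

    def quantile(self, q):
        """Approximate the q-quantile by accumulating the histogram (an
        empirical CDF) and interpolating linearly inside the crossing bin."""
        if self.n == 0:
            raise ValueError("no data yet")
        target = q * self.n
        cum = 0
        for i, c in enumerate(self.counts):
            if c > 0 and cum + c >= target:
                frac = (target - cum) / c
                return self.lo + (i + frac) * self.width
            cum += c
        return self.hi

    def fences(self, k=1.5):
        """Lower and upper whisker limits from the approximate quartiles."""
        q1, q3 = self.quantile(0.25), self.quantile(0.75)
        iqr = q3 - q1
        return q1 - k * iqr, q3 + k * iqr

    def is_outlier(self, x):
        low, high = self.fences()
        return x < low or x > high


# Usage: stream values one at a time, then query at any point.
bp = OnlineBoxplot(0, 100, bins=100)
for v in range(100):
    bp.update(v)
```

Only the bin counts are stored, so memory does not grow with the stream; the number of bins is the main parameter trading accuracy against memory.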

2025

KDBI special issue: Time-series pattern verification in CNC turning-A comparative study of one-class and binary classification

Authors
da Silva, JP; Nogueira, AR; Pinto, J; Curral, M; Alves, AC; Sousa, R;

Publication
EXPERT SYSTEMS

Abstract
Integrating Industry 4.0 and Quality 4.0 optimises manufacturing through IoT and ML, improving processes and product quality. The primary challenge involves identifying patterns in computer numerical control (CNC) machining time-series data to boost manufacturing quality control. The proposed solution involves an experimental study comparing one-class and binary classification algorithms. This study aims to classify time-series data from CNC turning machines, offering insight into monitoring and adjusting tool wear to maintain product quality. The methodology entails extracting spectral features from time-series data to train both one-class and binary classification algorithms, assessing their effectiveness and computational efficiency. Although certain models consistently outperform others, no single best-performing model can be identified, as a trade-off between classification and computational performance is observed, with gradient boosting standing out for effectively balancing both aspects. Thus, the choice between one-class and binary classification ultimately relies on the dataset's features and the task objectives.
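
As a rough illustration of what "extracting spectral features from time-series data" can look like, the sketch below computes a magnitude spectrum with a direct discrete Fourier transform and derives three simple features. The feature set (dominant bin, energy, centroid), the function names and the synthetic signal are assumptions chosen for this example; the abstract does not say which spectral features the study used.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitudes of DFT bins 0..N//2, computed with the direct O(N^2) formula."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def spectral_features(signal):
    """Hypothetical feature set; expects at least two samples."""
    mags = magnitude_spectrum(signal)
    total = sum(mags)
    return {
        # Bin with the largest magnitude, ignoring the DC component (bin 0).
        "dominant_bin": max(range(1, len(mags)), key=lambda k: mags[k]),
        # Sum of squared magnitudes over the non-DC bins.
        "energy": sum(m * m for m in mags[1:]),
        # Magnitude-weighted mean bin index.
        "centroid": sum(k * m for k, m in enumerate(mags)) / total if total > 0 else 0.0,
    }


# Synthetic stand-in for a sensor trace: 4 full sine cycles over 32 samples.
trace = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
features = spectral_features(trace)
```

Feature vectors of this kind, one per time-series window, would then be the inputs to either a one-class or a binary classifier.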

2025

Anomaly Detection in Pet Behavioural Data

Authors
Silva, I; Ribeiro, RP; Gama, J;

Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II

Abstract
Pet owners are increasingly becoming conscious of their pet's necessities and are paying more attention to their overall wellness. The well-being of their pets is intricately linked to their own emotional and physical well-being. Some veterinary system solutions are emerging to provide proactive healthcare options for pets. One such solution offers the continuous monitoring of a pet's activity through accelerometer tracking devices. Based on data collected by this application, in this paper, we study different time aggregations and three unsupervised machine learning techniques to identify anomalies in pet behaviour data. Specifically, we apply three algorithms, Isolation Forest, Local Outlier Factor, and K-Nearest Neighbour, with various thresholds to differentiate between normal and abnormal events. Experiments conducted on ten pets (five cats and five dogs) show that the most effective approach is to use daily data divided into periods. Moreover, the Local Outlier Factor is the best algorithm for detecting anomalies when prioritizing the identification of true positives. However, it also produces a high false positive ratio.
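
To make the role of the threshold concrete, here is a small sketch of distance-based anomaly scoring in the spirit of the K-Nearest Neighbour approach named above. It is not the paper's pipeline: the data are invented, the three "periods" per day are a hypothetical aggregation, and the score (mean distance to the k nearest neighbours) is one common choice among several.

```python
import math


def knn_scores(points, k=3):
    """Anomaly score of each point: mean Euclidean distance to its k nearest neighbours."""
    if k < 1 or k >= len(points):
        raise ValueError("need 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_anomalies(scores, threshold):
    """Indices of the points whose score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical activity levels per day, split into three periods
# (morning, afternoon, night). Values are invented for illustration.
days = [
    (30.0, 50.0, 10.0),
    (32.0, 48.0, 11.0),
    (29.0, 52.0, 9.0),
    (31.0, 49.0, 10.0),
    (5.0, 8.0, 40.0),  # unusual day: little daytime activity, restless night
]
scores = knn_scores(days, k=2)
anomalous_days = flag_anomalies(scores, threshold=10.0)
```

With a threshold of 10.0 only the last day is flagged; lowering it to 3.0 also flags the third day, which shows how a more sensitive threshold catches more events at the cost of more false alarms.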

2025

Data Science for Fighting Environmental Crime

Authors
Barbosa, M; Ribeiro, C; Gomes, F; Ribeiro, RP; Gama, J;

Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II

Abstract
The rise of environmental crimes has become a major concern globally as they cause significant damage to ecosystems and public health and result in economic losses. The availability of vast sensor data provides an opportunity to analyze environmental data proactively. This helps to detect irregularities and uncover potential criminal activities. This paper highlights the critical role played by machine learning (ML) and remote sensing technologies in the continuously evolving scenarios of environmental crime. By examining some case studies on detecting illegal fishing, illegal oil spills, illegal landfills, and illegal logging, we delve into the practical implementation of data-driven approaches for environmental crime detection. Our goal with this study is to provide an overview of the existing research in this area and foster the use of ML and data science techniques to enhance environmental crime detection.

2025

Evaluating Short Text Stream Clustering on Large E-commerce Datasets

Authors
Andrade, C; Ribeiro, RP; Gama, J;

Publication
INTELLIGENT SYSTEMS, BRACIS 2024, PT III

Abstract
Latent Dirichlet Allocation (LDA) is a fundamental method for clustering short text streams. However, when applied to large datasets, it often faces significant challenges, and its performance is typically evaluated in domain-specific datasets such as news and tweets. This study aims to fill this gap by evaluating the effectiveness of short text clustering methods in a large and diverse e-commerce dataset. We specifically investigate how well these clustering algorithms adapt to the complex dynamics and larger scale of e-commerce text streams, which differ from their usual application domains. Our analysis focuses on the impact of high homogeneity scores on the reported Normalized Mutual Information (NMI) values. We particularly examine whether these scores are inflated due to the prevalence of single-element clusters. To address potential biases in clustering evaluation, we propose using the Akaike Information Criterion (AIC) as an alternative metric to reduce the formation of single-element clusters and provide a more balanced measure of clustering performance. We present new insights for applying short text clustering methodologies in real-world situations, especially in sectors like e-commerce, where text data volumes and dynamics present unique challenges.
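
The concern about single-element clusters can be reproduced on a toy example. The sketch below implements homogeneity and NMI from their standard entropy-based definitions (NMI normalised by the arithmetic mean of the two entropies, which is an assumption about the variant used) and scores a "clustering" that puts every item in its own cluster. The labels are invented; the paper's AIC-based alternative is not implemented here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(truth, pred):
    """Mutual information between two labelings of the same items."""
    n = len(truth)
    joint = Counter(zip(truth, pred))
    pt, pp = Counter(truth), Counter(pred)
    return sum((c / n) * math.log(c * n / (pt[t] * pp[p]))
               for (t, p), c in joint.items())


def homogeneity(truth, pred):
    """1 - H(truth | pred) / H(truth), written as I(truth; pred) / H(truth)."""
    h = entropy(truth)
    return 1.0 if h == 0 else mutual_information(truth, pred) / h


def nmi(truth, pred):
    """NMI with arithmetic-mean normalisation: 2 * I / (H(truth) + H(pred))."""
    denom = entropy(truth) + entropy(pred)
    return 1.0 if denom == 0 else 2 * mutual_information(truth, pred) / denom


# Toy data: four true topics with two texts each.
truth = ["a", "a", "b", "b", "c", "c", "d", "d"]
singletons = list(range(8))  # every text placed in its own cluster

h_single = homogeneity(truth, singletons)
nmi_single = nmi(truth, singletons)
```

Here the all-singleton output reaches a homogeneity of 1.0 and an NMI of 0.8 even though it groups nothing, which is the kind of inflation the abstract refers to.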

Facts & Figures

72 Researchers (2016)

29 Senior Researchers (2016)

14 Proceedings in indexed conferences (2020)