Presentation

Artificial Intelligence and Decision Support

At LIAAD, we work on Data Science, a strategic area that is attracting growing interest worldwide and is critical to all areas of human activity. The huge amounts of collected data (Big Data) and the ubiquity of devices with sensors and/or processing power offer both opportunities and challenges to scientists and engineers. Moreover, demand for complex models for objective decision support is spreading across business, health, science, e-government and e-learning, which encourages us to invest in different approaches to modelling.

Our overall strategy is to take advantage of the data flood and diversification, and to invest in research lines that will help reduce the gap between collected and useful data, while offering diverse modelling solutions.

At LIAAD, our fundamental scientific principles are machine learning, statistics, optimisation and mathematics.

Latest News
Artificial Intelligence

Preventing environmental crimes in waste transportation: INESC TEC has the solution

The EnSafe project (Enhancing Environmental Protection: Anomaly Detection in Waste Transportation using Network Science) is developing AI-based solutions to tackle environmental crimes, focusing on waste transportation chains. EnSafe benefits from the active involvement of INESC TEC, which is developing technologies to detect irregular and suspicious behaviours in a sector that is vulnerable to fraud and environmental corruption.

26th May 2025

Computer Science and Engineering

Consulting Clinical Reports to Support Medical Decisions Made Easier with Award-Winning INESC TEC Tool

Supporting physicians in making complex and rare clinical decisions is the goal of MedLink, a tool developed by researchers at INESC TEC, which won the Best Demo Paper award at the European Conference on Information Retrieval—one of the most prestigious conferences in Europe in the field of information retrieval.

8th May 2025

Computer Science and Engineering

Less common language varieties also have a place in the era of AI, as demonstrated by two INESC TEC papers presented at a top conference

It's hard to think of current technologies or innovations that do not resort to Language Models (LM) or Natural Language Processing (NLP). Their presence in various domains of society - some with significant relevance, like the legal or healthcare sectors - raises issues (and concerns) that often end up focusing on the same question: are LM-based technologies reaching all communities? Recently, two scientific papers featuring INESC TEC - both accepted at AAAI, an A* conference - sought to address some of the challenges of this new era, which directly affect the Portuguese language.

28th February 2025

Computer Science and Engineering

Tell me what you're looking for and I'll tell you what you need. INESC TEC-Amazon collaboration optimises search engine results for special dates

The seasonality of search queries in search engines could be a factor for online businesses to consider if they seek to improve the ranking of their results. A new demo-paper featuring INESC TEC explored the creation of a database to present the Occasion-aware Recommender solution.

26th February 2025

Computer Science and Engineering

INESC TEC developed natural language processing resources for the Portuguese language

The main goal of the PTicola project was to expand and build new Natural Language Processing (NLP) capabilities for the Portuguese language. The results of this project - which include, for example, an English/European Portuguese translator and a PT-BR/PT-PT language variety identifier - address the gap in NLP resources available for PT-PT compared to PT-BR.

14th February 2025


Featured Projects

PROD_AI

Predictive AI/ML solution applied to procurement and production management

2025-2027

Doc2FraudDetection

Automated Detection of Fraudulent Documents

2025-2026

Easy4ALL

AI Assistant for No-Code Platform

2024-2026

PTPumpup

Building Portuguese Language Resources through machine learning and limited human interaction

2021-2024


LIAAD Publications


2025

Online boxplot derived outlier detection

Authors
Mazarei, A; Sousa, R; Mendes Moreira, J; Molchanov, S; Ferreira, HM;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Outlier detection is a widely used technique for identifying anomalous or exceptional events across various contexts. It has proven to be valuable in applications like fault detection, fraud detection, and real-time monitoring systems. Detecting outliers in real time is crucial in several industries, such as financial fraud detection and quality control in manufacturing processes. In the context of big data, the amount of data generated is enormous, and traditional batch mode methods are not practical since the entire dataset is not available. The limited computational resources further compound this issue. Boxplot is a widely used batch mode algorithm for outlier detection that involves several derivations. However, the lack of an incremental closed form for statistical calculations during boxplot construction poses considerable challenges for its application within the realm of big data. We propose an incremental/online version of the boxplot algorithm to address these challenges. Our proposed algorithm is based on an approximation approach that involves numerical integration of the histogram and calculation of the cumulative distribution function. This approach is independent of the dataset's distribution, making it effective for all types of distributions, whether skewed or not. To assess the efficacy of the proposed algorithm, we conducted tests using simulated datasets featuring varying degrees of skewness. Additionally, we applied the algorithm to a real-world dataset concerning software fault detection, which posed a considerable challenge. The experimental results underscored the robust performance of our proposed algorithm, highlighting its efficacy comparable to batch mode methods that access the entire dataset. Our online boxplot method, leveraging dataset distribution to define whiskers, consistently achieved exceptional outlier detection results. 
Notably, our algorithm demonstrated computational efficiency, maintaining constant memory usage with minimal hyperparameter tuning.
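The histogram-and-CDF idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the value range is known in advance, uses a fixed number of equal-width bins, and falls back on the classic 1.5 x IQR Tukey fences rather than the distribution-based whiskers the paper describes. The class name `OnlineBoxplot` and all of its parameters are hypothetical.

```python
class OnlineBoxplot:
    """Streaming boxplot approximation backed by a fixed-bin histogram.

    Assumption (not from the paper): the range [lo, hi] and the number of
    bins are chosen up front, so memory use stays constant.
    """

    def __init__(self, lo, hi, bins=100):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.width = (hi - lo) / bins
        self.counts = [0] * bins
        self.n = 0

    def update(self, x):
        # Out-of-range values are clamped into the edge bins.
        i = int((x - self.lo) / self.width)
        i = min(max(i, 0), self.bins - 1)
        self.counts[i] += 1
        self.n += 1

    def quantile(self, q):
        # Accumulate bin counts (a discrete CDF) until q * n is reached,
        # then interpolate linearly inside the bin that crosses the target.
        if self.n == 0:
            raise ValueError("no data seen yet")
        target = q * self.n
        cum = 0
        for i, c in enumerate(self.counts):
            if c > 0 and cum + c >= target:
                frac = (target - cum) / c
                return self.lo + (i + frac) * self.width
            cum += c
        return self.hi

    def fences(self, k=1.5):
        # Classic Tukey fences, used here as a stand-in for the paper's whiskers.
        q1, q3 = self.quantile(0.25), self.quantile(0.75)
        iqr = q3 - q1
        return q1 - k * iqr, q3 + k * iqr

    def is_outlier(self, x, k=1.5):
        low, high = self.fences(k)
        return x < low or x > high
```

Each observation is passed to `update` once and then discarded; `is_outlier` can be called at any point in the stream using only the stored bin counts.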

2025

KDBI special issue: Time-series pattern verification in CNC turning-A comparative study of one-class and binary classification

Authors
da Silva, JP; Nogueira, AR; Pinto, J; Curral, M; Alves, AC; Sousa, R;

Publication
EXPERT SYSTEMS

Abstract
Integrating Industry 4.0 and Quality 4.0 optimises manufacturing through IoT and ML, improving processes and product quality. The primary challenge involves identifying patterns in computer numerical control (CNC) machining time-series data to boost manufacturing quality control. The proposed solution involves an experimental study comparing one-class and binary classification algorithms. This study aims to classify time-series data from CNC turning machines, offering insight into monitoring and adjusting tool wear to maintain product quality. The methodology entails extracting spectral features from time-series data to train both one-class and binary classification algorithms, assessing their effectiveness and computational efficiency. Although certain models consistently outperform others, determining the best-performing one is not possible, as a trade-off between classification and computational performance is observed, with gradient boosting standing out for effectively balancing both aspects. Thus, the choice between one-class and binary classification ultimately relies on the dataset's features and task objectives.
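The pipeline outlined in this abstract (spectral features feeding a one-class and a binary classifier) can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's method: the features are normalised band energies from a naive discrete Fourier transform, and both classifiers are simple centroid-based stand-ins rather than the algorithms the paper compares. All function names and the "ok"/"worn" labels are hypothetical.

```python
import cmath
import math

def band_energies(signal, n_bands=4):
    """Normalised spectral energy in n_bands equal-width frequency bands."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    # Naive DFT over the non-redundant half of the spectrum (k = 1 .. n // 2).
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        power.append(abs(coeff) ** 2)
    energy = [0.0] * n_bands
    for i, p in enumerate(power):
        energy[i * n_bands // len(power)] += p
    total = sum(energy)
    return [e / total for e in energy] if total > 0 else energy

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_one_class(normal_rows):
    # One-class stand-in: learns only from normal examples.
    c = centroid(normal_rows)
    radius = max(math.dist(c, r) for r in normal_rows)
    return c, radius

def is_normal(model, row, slack=1e-6):
    c, radius = model
    return math.dist(c, row) <= radius + slack

def fit_binary(ok_rows, worn_rows):
    # Binary stand-in: nearest-centroid, needs labelled examples of both classes.
    return centroid(ok_rows), centroid(worn_rows)

def predict_binary(model, row):
    c_ok, c_worn = model
    return "worn" if math.dist(row, c_worn) < math.dist(row, c_ok) else "ok"
```

The one-class model needs only examples of normal operation, while the binary model needs labelled examples of both conditions. That difference in data requirements is one of the dataset features the abstract points to when choosing between the two approaches.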

2025

Preface

Authors
Campos, R; Jorge, M; Jatowt, A; Bhatia, S; Litvak, M;

Publication
CEUR Workshop Proceedings

Abstract
[No abstract available]

2025

The 8th International Workshop on Narrative Extraction from Texts: Text2Story 2025

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Litvak, M;

Publication
Advances in Information Retrieval - 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6-10, 2025, Proceedings, Part V

Abstract
For seven years, the Text2Story Workshop series has fostered a vibrant community dedicated to understanding narrative structure in text, resulting in significant contributions to the field and developing a shared understanding of the challenges in this domain. While traditional methods have yielded valuable insights, the advent of Transformers and LLMs has ignited a new wave of interest in narrative understanding. The previous iteration of the workshop also witnessed a surge in LLM-based approaches, demonstrating the community’s growing recognition of their potential. In this eighth edition we propose to go deeper into the role of LLMs in narrative understanding. While LLMs have revolutionized the field of NLP and are the go-to tools for any NLP task, the ability to capture, represent and analyze contextual nuances in longer texts is still an elusive goal, let alone the understanding of consistent fine-grained narrative structures in text. Consequently, this iteration of the workshop will explore the issues involved in using LLMs to unravel narrative structures, while also examining the characteristics of narratives generated by LLMs. By fostering dialogue on these emerging areas, we aim to continue the workshop's tradition of driving innovation in narrative understanding research. Text2Story encompasses sessions covering full research papers, work-in-progress, demos, resources, position and dissemination papers, along with one keynote talk. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2025

Enhancing Portuguese Variety Identification with Cross-Domain Approaches

Authors
Sousa, HO; Almeida, R; Silvano, P; Cantante, I; Campos, R; Jorge, AM;

Publication
AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25 - March 4, 2025, Philadelphia, PA, USA

Abstract
Recent advances in natural language processing have raised expectations for generative models to produce coherent text across diverse language varieties. In the particular case of the Portuguese language, the predominance of Brazilian Portuguese corpora online introduces linguistic biases in these models, limiting their applicability outside of Brazil. To address this gap and promote the creation of European Portuguese resources, we developed a cross-domain language variety identifier (LVI) to discriminate between European and Brazilian Portuguese. Motivated by the findings of our literature review, we compiled the PtBrVarId corpus, a cross-domain LVI dataset, and study the effectiveness of transformer-based LVI classifiers for cross-domain scenarios. Although this research focuses on two Portuguese varieties, our contribution can be extended to other varieties and languages. We open source the code, corpus, and models to foster further research in this task. © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Facts & Figures

19 papers in indexed journals (2020)

72 researchers (2016)

14 proceedings in indexed conferences (2020)