Publications - INESC TEC

Publications

2025

Robotic Process Automation Comparative Analysis of Market Solutions

Authors
Silva, A; Mamede, HS; Santos, V; Santos, A; Silveira, C;

Publication
MARKETING AND SMART TECHNOLOGIES, ICMARKTECH 2024, VOL 1

Abstract
Numerous Robotic Process Automation (RPA) market solutions with widely disparate capabilities and business models are being put forth. RPA is still in its infancy, and its technology framework is continually evolving. There are very few comparative studies of RPA systems, and they do not make it simple to tailor the solution to the needs of the business choosing it. Thus, the research question is: is it feasible to design a procedure that enables the choice of the most appropriate RPA tool while accounting for a particular business domain, reality, and set of requirements? To answer it, this study uses the methodological approach of Design Science Research to build an artifact that comprises a collection of indicators to enable the long-term selection of the best RPA solution for each organization and/or business process. The artifact offers a methodology to categorize the level of adaptability of each solution for automating business processes, performs a comparative analysis of existing RPA solutions using a particular framework, and provides an overview of the features of currently available solutions on the market. The viability of the artifact is demonstrated using a real-world case. This test demonstrated the artifact's capacity to meet the goals.


2025

Enhancing carsharing pricing and operations through integrated choice models

Authors
Oliveira, BB; Ahipasaoglu, SD;

Publication
TRANSPORTATION RESEARCH PART E-LOGISTICS AND TRANSPORTATION REVIEW

Abstract
Balancing supply and demand in free-floating one-way carsharing systems is a critical operational challenge. This paper presents a novel approach that integrates a binary logit model into a mixed integer linear programming framework to optimize short-term pricing and fleet relocation. Demand modeling, based on a binary logit model, aggregates different trips under a unified utility model and improves estimation by incorporating information from similar trips. To speed up the estimation process, a categorizing approach is used, where variables such as location and time are classified into a few categories based on shared attributes. This is particularly beneficial for trips with limited observations as information gained from similar trips can be used for these trips effectively. The modeling framework adopts a dynamic structure where the binary logit model estimates demand using accumulated observations from past iterations at each decision point. This continuous learning environment allows for dynamic improvement in estimation and decision-making. At the core of the framework is a mathematical program that prescribes optimal levels of promotion and relocation. The framework then includes simulated market responses to the decisions, allowing for real-time adjustments to effectively balance supply and demand. Computational experiments demonstrate the effectiveness of the proposed approach and highlight its potential for real-world applications. The continuous learning environment, combining demand modeling and operational decisions, opens avenues for future research in transportation systems.
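
As a rough illustration of the binary logit idea mentioned in the abstract above, the following minimal sketch shows how a promotion (discount) can raise the probability that a customer accepts a trip. The utility form, the parameter names (`base`, `price_sensitivity`, `discount`), and the numeric values are assumptions made for this example; they are not taken from the paper, and the optimization and relocation parts of the framework are not shown.

```python
import math


def logit_accept_probability(utility: float) -> float:
    """Binary logit: probability of taking the trip versus a zero-utility outside option."""
    return 1.0 / (1.0 + math.exp(-utility))


def trip_utility(base: float, price_sensitivity: float, price: float, discount: float) -> float:
    """Hypothetical linear utility: a base attractiveness minus a penalty on the discounted price."""
    return base - price_sensitivity * price * (1.0 - discount)


# Illustrative values only (assumed, not from the paper).
p_full = logit_accept_probability(trip_utility(1.0, 0.5, 4.0, 0.0))
p_promo = logit_accept_probability(trip_utility(1.0, 0.5, 4.0, 0.25))
print(f"full price: {p_full:.3f}, with 25% promotion: {p_promo:.3f}")
```

In a sketch like this, the promotion level is the decision variable that an optimization model would tune; here it is simply fixed by hand.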


2025

Does Every Computer Scientist Need to Know Formal Methods?

Authors
Broy, M; Brucker, AD; Fantechi, A; Gleirscher, M; Havelund, K; Kuppe, MA; Mendes, A; Platzer, A; Ringert, JO; Sullivan, A;

Publication
FORMAL ASPECTS OF COMPUTING

Abstract
We focus on the integration of Formal Methods as a mandatory theme in any Computer Science University curriculum. In particular, when considering the ACM Curriculum for Computer Science, the inclusion of Formal Methods as a mandatory Knowledge Area requires arguing why and how every computer science graduate benefits from such knowledge. We do not agree with the sentence "While there is a belief that formal methods are important and they are growing in importance, we cannot state that every computer science graduate will need to use formal methods in their career." We argue that formal methods are and have to be an integral part of every computer science curriculum. Not all graduates will need to know how to work with databases either, yet it is still important for students to have a basic understanding of how data is stored and managed efficiently. In the same way, students have to understand why and how formal methods work, what their formal background is, and how they are justified. No engineer should be ignorant of the foundations of their subject and the formal methods based on these. In this article, we aim at highlighting why every computer scientist needs to be familiar with formal methods. We argue that education in formal methods plays a key role by shaping students' programming mindset, fostering an appreciation for underlying principles, and encouraging the practice of thoughtful program


2025

Clustering source code from automated assessment of programming assignments

Authors
Paiva, JC; Leal, JP; Figueira, A;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Clustering of source code is a technique that can help improve feedback in automated program assessment. Grouping code submissions that contain similar mistakes can, for instance, facilitate the identification of students' difficulties to provide targeted feedback. Moreover, grouping solutions with similar functionality but possibly different coding styles or progress levels can enable personalized feedback to students stuck at some point, based on a more developed source code, or even help detect potential cases of plagiarism. However, existing clustering approaches for source code are mostly inadequate for automated feedback generation or assessment systems in programming education. They either give too much emphasis to syntactical program features, rely on expensive computations over pairs of programs, or require previously collected data. This paper introduces an online approach and an implemented tool, AsanasCluster, to cluster source code submissions to programming assignments. The proposed approach relies on program attributes extracted from semantic graph representations of source code, including control and data flow features. The obtained feature vector values are fed into an incremental k-means model. Such a model aims to determine the closest cluster of solutions promptly as they enter the system, since clustering is an intermediate step for feedback generation in automated assessment. We have conducted a twofold evaluation of the tool to assess (1) its runtime performance and (2) its precision in separating different algorithmic strategies. To this end, we have applied our clustering approach on a public dataset of real submissions from undergraduate students to programming assignments, measuring the runtimes for the distinct tasks involved: building a model, identifying the closest cluster to a new observation, and recalculating partitions. As for the precision, we partition two groups of programs collected from GitHub. One group contains implementations of two searching algorithms, while the other has implementations of several sorting algorithms. AsanasCluster matches and, in some cases, improves on the state-of-the-art clustering tools in terms of runtime performance and precision in identifying different algorithmic strategies. It does so without requiring the execution of the code. Moreover, it is able to start the clustering process from a dataset with only two submissions and continuously partition the observations as they enter the system.
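
To make the notion of an incremental k-means model more concrete, here is a minimal sketch of one common online variant, in which each incoming feature vector is assigned to its nearest centroid and that centroid is moved toward it by a running mean. This is not the AsanasCluster implementation: the class name, the seeding rule (the first k observations become centroids), and the toy two-dimensional vectors are assumptions for illustration, and the extraction of control and data flow features is not shown.

```python
from typing import List, Sequence


class IncrementalKMeans:
    """Online k-means sketch: centroids are updated one observation at a time."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def _nearest(self, x: Sequence[float]) -> int:
        # Index of the centroid with the smallest squared Euclidean distance to x.
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in self.centroids]
        return dists.index(min(dists))

    def add(self, x: Sequence[float]) -> int:
        """Assign x to a cluster, update that cluster's centroid, and return its index."""
        if len(self.centroids) < self.k:
            # Assumed seeding rule: the first k observations start their own clusters.
            self.centroids.append(list(x))
            self.counts.append(1)
            return len(self.centroids) - 1
        j = self._nearest(x)
        self.counts[j] += 1
        step = 1.0 / self.counts[j]  # running-mean step size
        self.centroids[j] = [c + step * (a - c) for a, c in zip(x, self.centroids[j])]
        return j


# Toy feature vectors standing in for per-submission program attributes (hypothetical).
model = IncrementalKMeans(k=2)
labels = [model.add(v) for v in ([0.0, 0.0], [10.0, 10.0], [1.0, 1.0], [9.0, 9.0])]
print(labels, model.centroids)
```

Because each update touches only one centroid, assigning a new submission costs one distance computation per cluster, which is the property that makes this kind of model attractive when observations arrive one at a time.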


2025

Streamlining Acceptance Test Generation for Mobile Applications Through Large Language Models: An Industrial Case Study

Authors
Fonseca, PL; Lima, B; Faria, JP;

Publication
2025 40TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE

Abstract
Mobile acceptance testing remains a bottleneck in modern software development, particularly for cross-platform mobile development using frameworks like Flutter. While developers increasingly rely on automated testing tools, creating and maintaining acceptance test artifacts still demands significant manual effort. To help tackle this issue, we introduce AToMIC, an automated framework leveraging specialized Large Language Models to generate Gherkin scenarios, Page Objects, and executable UI test scripts directly from requirements (JIRA tickets) and recent code changes. Applied to BMW's MyBMW app, covering 13 real-world issues in a 170+ screen codebase, AToMIC produced executable test artifacts in under five minutes per feature on standard hardware. The generated artifacts were of high quality: 93.3% of Gherkin scenarios were syntactically correct upon generation, 78.8% of PageObjects ran without manual edits, and 100% of generated UI tests executed successfully. In a survey, all practitioners reported time savings (often a full developer-day per feature) and strong confidence in adopting the approach. These results confirm AToMIC as a scalable, practical solution for streamlining acceptance test creation and maintenance in industrial mobile projects.


2025

PAP900: A dataset of semantic relationships between affective words in Portuguese

Authors
dos Santos, AF; Leal, JP; Alves, RA; Jacques, T;

Publication
DATA IN BRIEF

Abstract
The PAP900 dataset centers on the semantic relationship between affective words in Portuguese. It contains 900 word pairs, each annotated by at least 30 human raters for both semantic similarity and semantic relatedness. In addition to the semantic ratings, the dataset includes the word categorization used to build the word pairs and detailed sociodemographic information about annotators, enabling the analysis of the influence of personal factors on the perception of semantic relationships. Furthermore, this article describes in detail the dataset construction process, from word selection to agreement metrics. Data was collected from Portuguese university psychology students, who completed two rounds of questionnaires. In the first round, annotators were asked to rate word pairs on either semantic similarity or relatedness. The second round switched the relation type for most annotators, with a small percentage being asked to repeat the same relation. The instructions given emphasized the differences between semantic relatedness and semantic similarity, and provided examples of expected ratings of both. There are few semantic relations datasets in Portuguese, and none focusing on affective words. PAP900 is distributed in distinct formats to be easy to use for both researchers just looking for the final averaged values and for researchers looking to take advantage of the individual ratings, the word categorization and the annotator data. This dataset is a valuable resource for researchers in computational linguistics, natural language processing, psychology, and cognitive science. (c) 2025 The Authors.
