2025
Authors
Jesus, G; Singh, SAK; Nunes, S; Yates, A;
Publication
PROCEEDINGS OF THE 2025 INTERNATIONAL ACM SIGIR CONFERENCE ON INNOVATIVE CONCEPTS AND THEORIES IN INFORMATION RETRIEVAL, ICTIR 2025
Abstract
Dense retrieval models are generally trained using supervised learning approaches for representation learning, which require a labeled dataset (i.e., query-document pairs). However, training such models from scratch is not feasible for most languages, particularly under-resourced ones, due to data scarcity and computational constraints. As an alternative, pretrained dense retrieval models can be fine-tuned for specific downstream tasks or applied directly in zero-shot settings. Given the lack of labeled data for Tetun and the fact that existing dense retrieval models do not currently support the language, this study investigates their application in zero-shot, out-of-distribution scenarios. We adapted these models to Tetun documents, producing zero-shot embeddings, to evaluate their performance across various document representations and retrieval strategies for the ad-hoc text retrieval task. The results show that most pretrained monolingual dense retrieval models outperformed their multilingual counterparts in various configurations. Given the lack of dense retrieval models specialized for Tetun, we combine Hiemstra LM with ColBERTv2 in a hybrid strategy, achieving a relative improvement of +2.01% in P@10, +4.24% in MAP@10, and +2.45% in NDCG@10 over the baseline, based on evaluations using 59 queries and up to 2,000 retrieved documents per query. We propose dual tuning parameters for the score fusion approach commonly used in hybrid retrieval and demonstrate that enriching document titles with summaries generated by a large language model (LLM) from the documents' content significantly enhances the performance of hybrid retrieval strategies in Tetun. To support reproducibility, we publicly release the LLM-generated document summaries and run files.
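The abstract does not give the fusion formula, so the following is only a minimal sketch of what score fusion with two independent tuning parameters could look like. The min-max normalisation, the function names, and the weights `alpha` (sparse scores, e.g. Hiemstra LM) and `beta` (dense scores, e.g. ColBERTv2) are assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score list maps to all zeros."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(sparse: Dict[str, float], dense: Dict[str, float],
         alpha: float = 1.0, beta: float = 1.0) -> List[Tuple[str, float]]:
    """Weighted sum of normalised sparse and dense scores.

    alpha and beta are tuned independently (hypothetical parameters);
    a document missing from one run contributes 0 for that run.
    """
    s_norm, d_norm = min_max(sparse), min_max(dense)
    docs = set(s_norm) | set(d_norm)
    fused = {d: alpha * s_norm.get(d, 0.0) + beta * d_norm.get(d, 0.0)
             for d in docs}
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy scores for one query (made-up numbers, not from the paper).
sparse_run = {"d1": 10.0, "d2": 5.0, "d3": 0.0}
dense_run = {"d2": 0.9, "d3": 0.1, "d4": 0.5}
ranking = fuse(sparse_run, dense_run, alpha=0.6, beta=0.4)
```

With a single interpolation parameter the two weights are tied (`beta = 1 - alpha`); keeping them separate, as sketched here, lets each retriever's contribution be scaled on its own.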
2025
Authors
Sousa, H; Almeida, R; Silvano, P; Cantante, I; Campos, R; Jorge, A;
Publication
THIRTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, AAAI-25, VOL 39 NO 24
Abstract
Recent advances in natural language processing have raised expectations for generative models to produce coherent text across diverse language varieties. In the particular case of the Portuguese language, the predominance of Brazilian Portuguese corpora online introduces linguistic biases in these models, limiting their applicability outside of Brazil. To address this gap and promote the creation of European Portuguese resources, we developed a cross-domain language variety identifier (LVI) to discriminate between European and Brazilian Portuguese. Motivated by the findings of our literature review, we compiled the PtBrVarId corpus, a cross-domain LVI dataset, and study the effectiveness of transformer-based LVI classifiers for cross-domain scenarios. Although this research focuses on two Portuguese varieties, our contribution can be extended to other varieties and languages. We open source the code, corpus, and models to foster further research in this task.
2025
Authors
Sousa, J; Brandau, B; Darabi, R; Sousa, A; Brueckner, F; Reis, A; Reis, LP;
Publication
IEEE ACCESS
Abstract
Laser-based additive manufacturing (LAM) offers the ability to produce near-net-shape metal parts with unparalleled energy efficiency and flexibility in both geometry and material selection. Despite these advantages, the processes are inherently complex, as they are characterized by multiphysics interactions, multiscale phenomena, and highly dynamic behaviors, making their modeling and optimization particularly challenging. Artificial intelligence (AI) has emerged as a promising tool for enhancing the monitoring and control of additive manufacturing. This paper presents a systematic review of AI applications for real-time control of laser-based manufacturing processes, analyzing 16 relevant articles sourced from the Scopus, IEEE Xplore, and Web of Science databases. The primary objective of this work is to contribute to the advancement of autonomous manufacturing systems capable of self-monitoring and self-correction, ensuring optimal part quality, enhanced efficiency, and reduced human intervention. Our findings indicate that 62.5% of the 16 analyzed studies have deployed AI-driven controllers in real-world scenarios, with over 56% using AI for control strategies, such as Reinforcement Learning. Furthermore, 62.5% of the studies employed AI for process modeling or monitoring, which was integral to the development or data pipelines of the controllers. By laying the groundwork for future developments, this review not only highlights current advancements but also hints at future innovations that will likely include AI-based controllers.
2025
Authors
Veiga, A; Gomes, AM; Remiao, F;
Publication
JOURNAL OF APPLIED RESEARCH IN HIGHER EDUCATION
Abstract
Purpose: The present study aims to analyse the presumed relationship between VLC use and students' grades.
Design/methodology/approach: The research strategy unfolds as a case study (Yin, 1994), framed by how undergraduate students of pharmaceutical sciences used video lecture capture (VLC) and the impact of VLC on pedagogic differentiation. Looking at the course of Mechanistic Toxicology (MecTox), the objective is to describe this case of pharmaceutical sciences in depth.
Findings: The findings reveal that over 90% of students engaged with VLC videos, with the average viewing time exceeding the total available video minutes, indicating strong student engagement. The study particularly highlights VLC's positive impact on students with lower academic performance (grades D and E), suggesting that VLC can help reduce the performance gap and support a more inclusive educational environment.
Research limitations/implications: The findings may have limited generalisability beyond the specific context and sample used. However, this study allows the research findings to be compared with previous research (Remião et al., 2022), contributing to the debate on how pedagogic research can promote evidence-based decisions regarding innovative strategies. The meaning of educational inclusion processes and diversity is, thus, contingent on the institutionalisation of research as a practice of teaching and learning.
Practical implications: The results of this study thus provide interesting insights for the design of strategic action, considering the diversity of students as seen in parents' academic qualifications and students' conditions (e.g. student-workers, living away from home, holding a grant of economic and social support).
Social implications: The implications of research findings for society bring the issue of equity in education to the fore. By addressing the diverse needs of students, HEIs can contribute to greater educational equity.
Originality/value: Using VLC as a differentiated pedagogic device might give diversity real content insofar as institutional and national policies can mitigate the possible negative effects of parents' low academic qualifications and the students' conditions of living away from their residence area and holding a grant of economic and social support.
2025
Authors
Venkatesan, V; Blunt, S; Wang, JJ; Lacour, S; Marleau, GD; Coleman, GAL; Guerrero, L; Balmer, WO; Pueyo, L; Stolker, T; Kammerer, J; Pourré, N; Nowak, M; Rickman, E; Sivaramakrishnan, A; Sing, D; Wagner, K; Lagrange, AM; Abuter, R; Amorim, A; Asensio-Torres, R; Berger, JP; Beust, H; Boccaletti, A; Bonnefoy, M; Bonnet, H; Bordoni, MS; Bourdarot, G; Brandner, W; Cantalloube, F; Caselli, P; Charnay, B; Chauvin, G; Chavez, A; Chomez, A; Choquet, E; Christiaens, V; Clénet, Y; du Foresto, VC; Cridland, A; Davies, R; Dembet, R; Dexter, J; Drescher, A; Duvert, G; Eckart, A; Eisenhauer, F; Schreiber, NMF; Garcia, P; Lopez, RG; Gendron, E; Genzel, R; Gillessen, S; Girard, JH; Grant, S; Haubois, X; Heissel, G; Henning, T; Hinkley, S; Hippler, S; Houllé, M; Hubert, Z; Jocou, L; Keppler, M; Kervella, P; Kreidberg, L; Kurtovic, NT; Lapeyrère, V; Le Bouquin, JB; Lutz, D; Maire, AL; Mang, F; Mérand, A; Mordasini, C; Mouillet, D; Nasedkin, E; Ott, T; Otten, GPPL; Paladini, C; Paumard, T; Perraut, K; Perrin, G; Petrus, S; Pfuhl, O; Ribeiro, DC; Rustamkulov, Z; Shangguan, J; Shimizu, T; Shields, A; Stadler, J; Straub, O; Straubmeier, C; Sturm, E; Tacconi, LJ; Vigan, A; Vincent, F; von Fellenberg, SD; Widmann, F; Winterhalder, TO; Woillez, J; Yazici, S;
Publication
ASTROPHYSICAL JOURNAL
Abstract
Understanding the orbits of giant planets is critical for testing planet formation models, particularly at wide separations (>10 au) where traditional core accretion becomes inefficient. However, constraining orbits at these separations has historically been challenging due to sparse orbital coverage and related degeneracies in the orbital parameters. In this work, we use existing high-resolution ($R \sim 100{,}000$) spectroscopic measurements from CRIRES+ and astrometric data from SPHERE, NACO, and the Atacama Large Millimeter/submillimeter Array, and combine them with new high-precision GRAVITY astrometry data to refine the orbit of GQ Lup B, a $\sim 30\,M_\mathrm{J}$ companion at $\sim 100$ au, in a system that also hosts a circumstellar disk and a wide companion, GQ Lup C. Including radial velocity (RV) data significantly improves orbital constraints by breaking the degeneracy between inclination and eccentricity that plagues astrometry-only fits for long-period companions. Our work is one of the first to combine high-precision astrometry with the companion's relative radial velocity measurements to achieve significantly improved orbital constraints. The eccentricity is refined from $e = 0.47^{+0.14}_{-0.16}$ (GRAVITY only) to $e = 0.35^{+0.10}_{-0.09}$ when RVs and GRAVITY data are combined. We also compute the mutual inclinations between the orbit of GQ Lup B, the circumstellar disk, the stellar spin axis, and the disk of GQ Lup C. The orbit is misaligned by $63^{+6}_{-14}$ degrees relative to the circumstellar disk and by $52^{+19}_{-24}$ degrees relative to the host star's spin axis, but appears more consistent ($34^{+6}_{-13}$ degrees) with the inclination of the wide tertiary companion GQ Lup C's disk. These results support a formation scenario for GQ Lup B consistent with cloud fragmentation. They highlight the power of combining companion RV constraints with interferometric astrometry to probe the dynamics and formation of wide-orbit substellar companions.
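The abstract reports mutual inclinations without stating how they are computed. As a hedged illustration, the sketch below uses the standard spherical-trigonometry relation between two planes, each described by an inclination $i$ and a position angle of the ascending node $\Omega$: $\cos\Phi = \cos i_1 \cos i_2 + \sin i_1 \sin i_2 \cos(\Omega_1 - \Omega_2)$. The function name and the input angles are assumptions for illustration; they are not values or code from the paper.

```python
import math


def mutual_inclination_deg(i1: float, omega1: float,
                           i2: float, omega2: float) -> float:
    """Angle in degrees between two planes given (inclination, node) in degrees."""
    i1_r, i2_r = math.radians(i1), math.radians(i2)
    d_omega = math.radians(omega1 - omega2)
    cos_phi = (math.cos(i1_r) * math.cos(i2_r)
               + math.sin(i1_r) * math.sin(i2_r) * math.cos(d_omega))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_phi = max(-1.0, min(1.0, cos_phi))
    return math.degrees(math.acos(cos_phi))


# Arbitrary example angles (not measurements from the paper).
phi_same = mutual_inclination_deg(30.0, 40.0, 30.0, 40.0)   # identical planes
phi_perp = mutual_inclination_deg(0.0, 0.0, 90.0, 0.0)      # face-on vs edge-on
phi_anti = mutual_inclination_deg(90.0, 0.0, 90.0, 180.0)   # opposite nodes
```

In practice such an angle would be evaluated over many samples of the fitted orbital parameters to obtain an asymmetric uncertainty interval like those quoted above; that sampling step is omitted here.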
2025
Authors
Pereira, RR; Bono, J; Ferreira, HM; Ribeiro, P; Soares, C; Bizarro, P;
Publication
ECML/PKDD (9)
Abstract
When the available data for a target domain is limited, transfer learning (TL) methods leverage related data-rich source domains to train and evaluate models, before deploying them on the target domain. However, most TL methods assume fixed levels of labeled and unlabeled target data, which contrasts with real-world scenarios where both data and labels arrive progressively over time. As a result, evaluations based on these static assumptions may not reflect how methods perform in practice. To support a more realistic assessment of TL methods in dynamic settings, we propose an evaluation framework that (1) simulates varying data availability over time, (2) creates multiple domains via resampling of a given dataset and (3) introduces inter-domain variability through controlled transformations, e.g., including time-dependent covariate and concept shifts. These capabilities enable the systematic simulation of a large number of variants of the experiments, providing deeper insights into how algorithms may behave when deployed. We demonstrate the usefulness of the proposed framework by performing a case study on a proprietary real-world suite of card payment datasets. To support reproducibility, we also apply the framework on the publicly available Bank Account Fraud (BAF) dataset. By providing a methodology for evaluating TL methods over time and in different data availability conditions, our framework supports a better understanding of model behavior in real-world environments, which enables more informed decisions when deploying models in new domains.
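Point (1) of the framework, varying data availability over time, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the record layout, the fixed `label_delay`, and the function name are hypothetical and do not describe the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (arrival_time, feature_value, label).
Record = Tuple[int, float, int]


def availability_snapshots(records: List[Record],
                           checkpoints: List[int],
                           label_delay: int) -> Dict[int, Dict[str, List[Record]]]:
    """For each checkpoint t, split the target data seen so far.

    A record is visible once arrival_time <= t, and its label is assumed
    to become available label_delay time units after arrival.
    """
    snapshots = {}
    for t in checkpoints:
        seen = [r for r in records if r[0] <= t]
        labeled = [r for r in seen if r[0] + label_delay <= t]
        unlabeled = [r for r in seen if r[0] + label_delay > t]
        snapshots[t] = {"labeled": labeled, "unlabeled": unlabeled}
    return snapshots


# Six toy records arriving at times 0..5 (made-up values).
toy = [(t, float(t), t % 2) for t in range(6)]
snaps = availability_snapshots(toy, checkpoints=[1, 3, 5], label_delay=2)
```

A transfer learning method could then be trained and scored at each checkpoint using only the data in that snapshot, which is one way to see how its performance changes as labeled target data accumulates.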