2023
Authors
Clemente, F; Ribeiro, GM; Quemy, A; Santos, MS; Pereira, RC; Barros, A;
Publication
NEUROCOMPUTING
Abstract
ydata-profiling is an open-source Python package for advanced exploratory data analysis that enables users to generate data profiling reports in a simple, fast, and efficient manner, fostering a standardized and visual understanding of the data. Beyond traditional descriptive properties and statistics, ydata-profiling follows a Data-Centric AI approach to exploratory analysis, as it focuses on the automatic detection and highlighting of complex data characteristics often associated with potential data quality issues, such as high ratios of missing or imbalanced data, infinite, unique, or constant values, skewness, high correlation, high cardinality, non-stationarity, seasonality, duplicate records, and other inconsistencies. The source code, documentation, and examples are available in the GitHub repository: https://github.com/ydataai/ydata-profiling.
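The kinds of data quality alerts listed in the abstract can be illustrated with a minimal sketch. This is not ydata-profiling's implementation or API: the function `profile_alerts`, the `missing_threshold` parameter, and the sample rows are hypothetical, and only the Python standard library is used. It flags a high missing ratio, infinite values, constant or unique columns, and duplicate records.

```python
import math
from collections import Counter


def profile_alerts(rows, missing_threshold=0.5):
    """Return (column, alert) pairs for a list of dict rows.

    Hypothetical helper for illustration; None marks a missing value.
    """
    alerts = []
    columns = sorted({key for row in rows for key in row})
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        # Share of rows in which this column has no value.
        missing_ratio = 1 - len(present) / len(values)
        if missing_ratio > missing_threshold:
            alerts.append((col, "high missing ratio"))
        if any(isinstance(v, float) and math.isinf(v) for v in present):
            alerts.append((col, "infinite values"))
        distinct = set(present)
        if present and len(distinct) == 1:
            alerts.append((col, "constant"))
        elif len(present) > 1 and len(distinct) == len(present):
            alerts.append((col, "unique"))
    # A record is a duplicate when all of its (column, value) pairs repeat.
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    if any(n > 1 for n in counts.values()):
        alerts.append(("<rows>", "duplicate records"))
    return alerts


if __name__ == "__main__":
    sample = [
        {"id": 1, "country": "PT", "score": 0.5, "note": None},
        {"id": 2, "country": "PT", "score": float("inf"), "note": None},
        {"id": 3, "country": "PT", "score": 0.7, "note": None},
        {"id": 3, "country": "PT", "score": 0.7, "note": None},
    ]
    for column, alert in profile_alerts(sample):
        print(f"{column}: {alert}")
```

The package itself covers far more than this sketch (for example correlation, skewness, and time-series properties) and renders the findings in a visual report.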
2023
Authors
Silvano, P; Cordeiro, J; Leal, A; Pais, S;
Publication
LDK
Abstract
The main objective of this paper is to introduce a new language resource for some varieties of Portuguese (European, Brazilian, Mozambican, and Angolan) and for British English, called DRIPPS (Discourse Relations In Perfect Participial Sentences). The DRIPPS corpus currently comprises 993 adverbial perfect participial sentences annotated with Discourse Relations and with the following Discourse Relational Devices: connectors, ordering of the clauses, temporal relations, tenses, and aspectual types. Additionally, an application with a Graphical User Interface (GUI) has been developed not only to browse and manipulate the corpus but also to allow the activation of specific Discourse Relation constraints, thereby selecting specific cases from the data set that can be analyzed separately. Besides calculating simple counts and percentages, the application can generate and visualize insightful statistical graphs on the fly from the combination of the user-selected constraints and the loaded corpora. The application is pre-loaded with Portuguese and English cases and allows users to import or load further cases from different languages/varieties.
2023
Authors
Silvano, P; Damova, M;
Publication
LDK
Abstract
2023
Authors
Damova, M; Mishev, K; Oleskeviciene, GV; Liebeskind, C; Silvano, P; Trajanov, D; Truica, CO; Apostol, ES; Chiarcos, C; Baczkowska, A;
Publication
LDK
Abstract
2023
Authors
Torres, Ana; Fonseca, Leonor; Ferreira, Marta; Silvano, Maria da Purificação; Lobo, José Manuel Sousa; Almeida, Isabel F.;
Publication
Abstract
Nowadays, the information regarding cosmetic products available to the community is vast, although not always trustworthy. The Pharmaceutical Technology Laboratory of the Faculty of Pharmacy of the University of Porto (FFUP) launched the Portal infoCosméticos, aiming to provide professionals involved in cosmetic advice with reliable information, supported by up-to-date scientific evidence, while empowering Portuguese-speaking consumers to make better-informed choices. Pre- and post-graduates of the master's degree in Pharmaceutical Sciences are responsible for developing the contents, which are submitted to a linguistic review by students of the Faculty of Arts and Humanities. Firstly, a relevant question is identified, followed by a comprehensive search on the topic and the creation of an infographic. The scientific validation is carried out by national and international scholars and by the national regulatory authority, INFARMED. Since its release in 2017, the website has received more than 170,000 views, covering topics related to regulatory affairs, safety and efficacy, cosmetic ingredients, and cosmetic products. The topics most accessed by digital users were identified by monitoring the views of each question with Google Analytics, taking the publication date into account. According to the records, consumers seem to be more concerned about the safety of cosmetics and interested in knowing more about their composition.
2023
Authors
Chiarcos, C; Silvano, P; Damova, M; Oleskeviciene, GV; Liebeskind, C; Trajanov, D; Truica, CO; Apostol, ES; Baczkowska, A;
Publication
RASPRAVE
Abstract
Linguistic Linked Open Data (LLOD) are technologies that provide a powerful instrument for representing and interpreting language phenomena on a web scale. The main objective of this paper is to demonstrate how LLOD technologies can be applied to represent and annotate a corpus composed of multiword discourse markers, and what the effects of this are. In particular, it is our aim to apply semantic web standards such as RDF and OWL for publishing and integrating data. We present a novel scheme for discourse annotation that combines ISO standards describing discourse relations and dialogue acts, ISO DR-Core (ISO 24617-8) and ISO-Dialogue Acts (ISO 24617-2), in 9 languages (cf. Silvano and Damova 2022; Silvano et al. 2022). We develop an OWL ontology to formalize that scheme, provide a newly annotated dataset, and link its RDF edition with the ontology. Consequently, we describe the conjoint querying of the ontology and the annotations by means of SPARQL, the standard query language for the web of data. The ultimate result is that we are able to perform queries over multiple, interlinked datasets with complex internal structure. This is a first but essential step in developing novel, powerful, and groundbreaking means for the corpus-based study of multilingual discourse, communication analysis, or attitudes discovery.
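The idea of querying an ontology and its annotations together can be sketched with plain triple-pattern matching. This is a simplified stand-in for a SPARQL basic graph pattern, not SPARQL itself and not the paper's ontology or data: every identifier with the `ex:` prefix, the sample triples, and the helpers `match` and `query` are hypothetical and chosen only for illustration.

```python
def match(triples, pattern, binding=None):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; other terms must match exactly.
    """
    binding = binding or {}
    for triple in triples:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def query(triples, patterns):
    """Join several triple patterns over a single merged set of triples."""
    results = [{}]
    for pattern in patterns:
        results = [b for r in results for b in match(triples, pattern, r)]
    return results


# Hypothetical ontology triples (class hierarchy).
ontology = [
    ("ex:Concession", "rdfs:subClassOf", "ex:DiscourseRelation"),
    ("ex:Cause", "rdfs:subClassOf", "ex:DiscourseRelation"),
]
# Hypothetical annotation triples (discourse markers in a corpus).
annotations = [
    ("ex:marker1", "ex:hasRelation", "ex:Concession"),
    ("ex:marker1", "ex:language", "en"),
    ("ex:marker2", "ex:hasRelation", "ex:Cause"),
    ("ex:marker2", "ex:language", "pt"),
]

# Find Portuguese markers whose relation is a kind of discourse relation.
hits = query(
    ontology + annotations,
    [
        ("?m", "ex:hasRelation", "?rel"),
        ("?rel", "rdfs:subClassOf", "ex:DiscourseRelation"),
        ("?m", "ex:language", "pt"),
    ],
)
print(hits)
```

Because the ontology and the annotations share identifiers, one query can constrain both at once, which is the effect the abstract attributes to linking the RDF edition of the dataset with the ontology.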