2013
Authors
da Silva, JR; Ribeiro, C; Lopes, JC;
Publication
INNOVATIONS IN XML APPLICATIONS AND METADATA MANAGEMENT: ADVANCING TECHNOLOGIES
Abstract
This chapter presents a solution for the management of research data at a higher education and research institution. The chapter is based on a small-scale data audit study, which included contacts with researchers and yielded some preliminary requirements and use cases. These requirements led to the design of a data curation workflow involving the researcher, the curator, and a data repository. The authors describe the features of the data repository prototype, which is an extension to the widely used DSpace repository platform and introduces a set of features mentioned by the majority of the interviewed researchers as relevant for a data repository. The data repository platform contributes to the curation workflow at the university, with XML technology at its core: data is stored in XML documents, which can be systematically processed and queried, unlike their original-format counterparts. This system is capable of indexing, querying, and retrieving, in whole or in part, datasets represented in tabular form. There is also the possibility of using elements from domain-specific XML schemas for the cataloguing process, improving the interoperability and quality of the deposited data. Copyright (C) 2013, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
2013
Authors
Lopes, CT; Ribeiro, C;
Publication
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY
Abstract
English is by far the most used language on the web. In some domains, the existence of less content in the users' native language may not be problematic and may even help users cope with information overload. Yet, in domains such as health, where information quality is critical, a larger quantity of information may mean easier access to higher quality content. Query translation may be a good strategy to access content in other languages, but the presence of medical terms in health queries makes the translation process more difficult, even for users with very good language proficiency. In this study, we evaluate how translating a health query affects users with different language proficiencies. We chose English as the non-native language because it is a widely spoken language and it is the most used language on the web. Our findings suggest that non-English-speaking users having at least elementary English proficiency can benefit from a system that suggests English alternatives for their queries, or automatically retrieves English content from a non-English query. This awareness of the user profile results in higher precision, more accurate medical knowledge, and better access to high-quality content. Moreover, the suggestions of English-translated queries may also trigger new health search strategies.
2013
Authors
Coelho, F; Devezas, JL; Ribeiro, C;
Publication
Open research Areas in Information Retrieval, OAIR '13, Lisbon, Portugal, May 15-17, 2013
Abstract
2013
Authors
Lopes, CT; Dias, D; Ribeiro, C;
Publication
ADVANCES IN INFORMATION SYSTEMS AND TECHNOLOGIES
Abstract
In this paper we propose a multilingual method to identify health-related queries and classify them into health categories. Our method uses a consumer health vocabulary and the Unified Medical Language System semantic structure to compute the association degree of a query to medical concepts and categories. This method can be applied in different languages with translated versions of the health vocabulary. To evaluate its efficacy and applicability in two languages, we used two manually classified sets of queries, each in a different language. Results are better for the English sample, where a distance of 0.38 to the ROC optimal point (0,1) was obtained. This shows some influence of the translation on the method's performance.
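For readers unfamiliar with the metric, the distance to the ROC optimal point (0,1) is commonly computed as the Euclidean distance between a classifier's (false positive rate, true positive rate) pair and the corner (0,1). The sketch below illustrates that calculation only. The function name is hypothetical, the rates in the usage note are illustrative, and none of these values come from the paper.

```python
import math


def distance_to_roc_optimum(false_positive_rate: float, true_positive_rate: float) -> float:
    """Euclidean distance from (FPR, TPR) to the ideal ROC corner (0, 1).

    Assumption: this is the usual geometric definition, not code from the paper.
    A perfect classifier (FPR = 0, TPR = 1) has distance 0.
    """
    return math.hypot(false_positive_rate, 1.0 - true_positive_rate)
```

As an illustration, a classifier with a false positive rate of 0.3 and a true positive rate of 0.6 sits at a distance of about 0.5 from the optimal point. Smaller values are better.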
2013
Authors
Lopes, CT; Ribeiro, C;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
We conducted a user study to analyze how health literacy, topic familiarity and the terminology used in past queries affect query behavior in health searches. We found that users with inadequate health literacy have less success in web searches and show more difficulties in query formulation. These users and the ones not familiar with the topic use medico-scientific terminology less often than users with more health literacy and topic familiarity. We conclude that search engines should help these groups of users in query formulation and, since technical documents stimulate the use of medico-scientific terminology in query reformulation, mechanisms like query suggestion can have long-term benefits. © 2013 Springer-Verlag.
2013
Authors
Toledo, FMB; Carravilla, MA; Ribeiro, C; Oliveira, JF; Gomes, AM;
Publication
INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS
Abstract
The nesting problem, also known as irregular packing problem, belongs to the generic class of cutting and packing (C&P) problems. It differs from other 2-D C&P problems in the irregular shape of the pieces. This paper proposes a new mixed-integer model in which binary decision variables are associated with each discrete point of the board (a dot) and with each piece type. It is much more flexible than previously proposed formulations and solves to optimality larger instances of the nesting problem, at the cost of having its precision dependent on board discretization. To date no results have been published concerning optimal solutions for nesting problems with more than 7 pieces. We ran computational experiments on 45 problem instances with the new model, solving to optimality 34 instances with a total number of pieces ranging from 16 to 56, depending on the number of piece types, grid resolution and the size of the board. A strong advantage of the model is its insensitivity to piece and board geometry, making it easy to extend to more complex problems such as non-convex boards, possibly with defects. Additionally, the number of binary variables does not depend on the total number of pieces but on the number of piece types, making the model particularly suitable for problems with few piece types. The discrete nature of the model requires a trade-off between grid resolution and problem size, as the number of binary variables grows with the square of the selected grid resolution and with board size.
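The scaling behaviour described in the abstract can be made concrete with a rough count. The sketch below assumes one binary variable per (dot, piece type) pair on a rectangular board and ignores boundary dots and any pruning of infeasible positions. The function and parameter names are hypothetical and are not taken from the paper.

```python
def count_binary_variables(
    board_length: int, board_width: int, dots_per_unit: int, piece_types: int
) -> int:
    """Rough upper bound on binary variables in a dotted-board style model.

    Assumptions (not from the paper): rectangular board, a uniform grid with
    `dots_per_unit` dots per unit of length in each direction, and one
    variable for every (dot, piece type) pair.
    """
    dots = (board_length * dots_per_unit) * (board_width * dots_per_unit)
    return dots * piece_types


# Doubling the grid resolution quadruples the count; the total number of
# pieces does not appear in the formula, only the number of piece types.
coarse = count_binary_variables(10, 5, 1, 3)
fine = count_binary_variables(10, 5, 2, 3)
```

Under these assumptions, a 10 by 5 board with 3 piece types gives 150 variables at one dot per unit and 600 at two dots per unit, which matches the quadratic growth in grid resolution noted in the abstract.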