Publications

Publications by HumanISE

2010

Studying blog features over link popularity

Authors
Devezas, Jose Luis; Ribeiro, Cristina; Nunes, Sergio;

Publication
Proceedings of the 3rd Workshop on Social Network Mining and Analysis, SNAKDD 2009, Paris, France, June 28, 2009

Abstract
The study of the blogosphere can provide sociologically relevant data. We analyze the links between blogs in the Portuguese blogosphere, in order to understand how they group and interact, to identify clusters and to characterize them. Our data set contains post data for more than 70,000 blogs, with over 400,000 links. The linkage data is represented as a blog graph and partitioned into several slices, according to their in-degree. We then study the evolution of blog features, and observe a consistent pattern of decrease in posting frequency, number of out-links, and post length, as we move from the highly-cited blogs to the less cited ones. Copyright 2010 ACM.

2010

Context effect on query formulation and subjective relevance in health searches

Authors
Teixeira Lopes, C; Ribeiro, C;

Publication
IIiX 2010 - Proceedings of the 2010 Information Interaction in Context Symposium

Abstract
It is recognized by the Information Retrieval community that context affects the retrieval process. Query formulation and relevance assessment are stages where the user role is central. The first determines what the system will search for and the second is frequently used to evaluate how the system behaved. With a large human involvement, these stages are expected to be largely influenced by user and task characteristics. To analyze the influence of these context features on the specified stages of health information retrieval, we conducted a user study in which we collected user features through two questionnaires. User characteristics include features like age, gender, web search experience, health search experience and familiarity with the medical topic. Task features include the medical specialty, the question type, the task's clarity and the task's easiness. Besides user and task features, the relevance assessment analysis also covered features related to the query and document. We found that many variables do indeed affect query formulation and relevance judgment. Some of our results question evaluations using test collections and call for evaluation models that incorporate other kinds of success measures. Copyright 2010 ACM.

2010

Using Local Precision to Compare Search Engines in Consumer Health Information Retrieval

Authors
Lopes, CT; Ribeiro, C;

Publication
SIGIR 2010: Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

Abstract
We have conducted a user study to evaluate several generalist and health-specific search engines on health information retrieval. Users evaluated the relevance of the top 30 documents of 4 search engines in two different health information needs. We introduce the concepts of local and global precision and analyze how they affect the evaluation. Results show that Google surpasses the precision of all other engines, including the health-specific ones, and that precision differs with the type of clinical question and its medical specialty.

2010

Evaluation of global descriptors for multimedia retrieval in medical applications

Authors
Coelho, F; Ribeiro, C;

Publication
Proceedings - 21st International Workshop on Database and Expert Systems Applications, DEXA 2010

Abstract
In this paper, global descriptors from MPEG7, GIST and Compact Composite Descriptors are evaluated for image retrieval in the IRMA-2007 medical collection. This evaluation tests descriptors using every image from each class instead of a small group of representative images. The evaluation results obtained by Mean-Average Precision (MAP) and precision@N indicate that MPEG7 EH, GIST and Fuzzy BTDH outperform the other global descriptors analyzed by a large margin, even more so when combined by late-fusion rank aggregation. A multimedia retrieval evaluation system was developed to support the experiment and offers the possibility of textual, visual and combined searches over the medical collection. © 2010 IEEE.
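For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of precision@N and Mean Average Precision using their standard textbook definitions. It is not the authors' evaluation system; the function names and the toy document identifiers are assumptions made for illustration only.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def precision_at_n(ranked_ids: Sequence[str], relevant_ids: Set[str], n: int) -> float:
    """Fraction of the top-n ranked items that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for doc_id in ranked_ids[:n] if doc_id in relevant_ids)
    return hits / n


def average_precision(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Mean of the precision values taken at each rank where a relevant item appears."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of average_precision over all queries (ranking, relevant set)."""
    runs_list: List[Tuple[Sequence[str], Set[str]]] = list(runs)
    if not runs_list:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs_list) / len(runs_list)


# Hypothetical toy ranking: relevant images "a" and "c" appear at ranks 1 and 3.
ranking = ["a", "b", "c", "d"]
relevant = {"a", "c"}
p_at_2 = precision_at_n(ranking, relevant, 2)   # 1 relevant item in the top 2
ap = average_precision(ranking, relevant)       # (1/1 + 2/3) / 2
```

In this toy case precision@2 is 0.5 and the average precision is roughly 0.833; MAP is simply that second value averaged over all queries.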

2010

FEUP at TREC 2010 blog track: Using h-index for blog ranking

Authors
Devezas, JL; Nunes, S; Ribeiro, C;

Publication
NIST Special Publication

Abstract
This paper describes the participation of FEUP, from the University of Porto, in the TREC 2010 Blog Track. FEUP participated in the baseline blog distillation task with work focused on the use of link features available in the TREC Blogs08 collection. The approach presented in this paper uses the link information available in most individual posts to amplify each post's score. Blog scores, and subsequent ranks, are obtained by combining individual post scores. We boost post scores using the in-degree of each post and the h-index of each blog. This results in an improvement of P@10, over our baseline, for the in-degree and the h-index runs. When compared to the in-degree, the h-index run results in higher performance values for each of the applied evaluation metrics.
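The abstract does not spell out how the h-index of a blog is computed or how it enters the score. As an illustration only, the sketch below assumes the usual bibliometric definition transferred to blogs: the largest h such that at least h of the blog's posts each have at least h in-links. The function name and the sample counts are hypothetical, and the boosting formula itself is deliberately left out because the abstract does not give it.

```python
from typing import Iterable


def h_index(post_in_link_counts: Iterable[int]) -> int:
    """Largest h such that at least h posts have h or more in-links each."""
    ranked = sorted(post_in_link_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` posts all have at least `rank` in-links
        else:
            break         # counts only decrease from here, so h cannot grow
    return h


# Hypothetical blog with six posts and these in-link counts.
example_counts = [10, 8, 5, 4, 3, 0]
example_h = h_index(example_counts)  # four posts have at least 4 in-links
```

Under this definition a blog needs many well-linked posts to score highly, whereas a single heavily linked post raises the h-index by at most one.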

2010

An Architecture for Collaborative Data Mining

Authors
Correia, F; Camacho, R; Lopes, JC;

Publication
KDIR 2010: Proceedings of the International Conference on Knowledge Discovery and Information Retrieval

Abstract
Collaborative Data Mining (CDM) develops techniques to solve complex problems of data analysis requiring sets of experts in different domains that may be geographically separate. An important issue in CDM is the sharing of experience among the different experts. In this paper we report on a framework that enables users with different expertise to perform data analysis activities and profit, in a collaborative fashion, from the expertise and results of other researchers. The collaborative process is supported by web services that search for relevant knowledge available among the collaborative web sites. We have successfully designed and deployed a prototype for collaborative Data Mining in the domains of Molecular Biology and Chemoinformatics.
