2011
Authors
Ferreira, P; Dutra, I; Fonseca, NA; Woods, R; Burnside, E;
Publication
HEALTHINF 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON HEALTH INFORMATICS
Abstract
Breast screening is the regular examination of a woman's breasts to find breast cancer at an initial stage. The sole exam approved for this purpose is mammography, which, despite the existence of more advanced technologies, is considered the cheapest and most efficient method to detect cancer in a preclinical stage. We investigate, using machine learning techniques, how attributes obtained from mammographies can relate to malignancy. In particular, this study focuses on how mass density can influence malignancy, using a data set of 348 patients containing, among other information, results of biopsies. To this end, we applied different learning algorithms on the data set using the WEKA tools, and performed significance tests on the results. The conclusions are threefold: (1) automatic classification of a mammography can reach equal or better results than the ones annotated by specialists, which can help doctors to quickly concentrate on some specific mammogram for a more thorough study; (2) mass density seems to be a good indicator of malignancy, as previous studies suggested; (3) we can obtain classifiers that can predict mass density with a quality as good as the specialist blind to biopsy.
2009
Authors
Fonseca, NA; Dutra, I;
Publication
IBERGRID: 3RD IBERIAN GRID INFRASTRUCTURE CONFERENCE PROCEEDINGS
Abstract
From an application point of view, Grid computing, with its processing power and large amounts of data storage, offers the possibility to process large quantities of data, to run computationally-intensive operations, or both. For instance, in computational biological pipelines, one often has to process large quantities of data in individually computationally-intensive operations. To process this data in the Grid, hundreds, or even thousands, of jobs need to be submitted and their results processed. Obviously, performing these tasks manually is infeasible. On the other hand, developing software to this end, specifically for a single application, is unproductive because if the application changes, or the Grid submission engine changes, then the code needs to be rewritten. In this paper we present a middleware that facilitates the submission of jobs to grids (or clusters) and helps handle their results. The middleware, which we call UbiDis (Ubiquitous Distribution), copies all files necessary for running the program to the UI or front-end host (in a Grid or cluster), compiles programs on the UI or front-end (if necessary), generates and submits the jobs, and copies the outputs to the local machine. Furthermore, UbiDis transparently generates jobs for different job managers, allowing the user to easily and quickly change the location to which the jobs are submitted. Finally, we illustrate the usefulness of UbiDis using two applications.
2008
Authors
Costa, VS; Fonseca, NA; Camacho, R;
Publication
2008 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, PROCEEDINGS
Abstract
One of the most well-known successes of Inductive Logic Programming (ILP) is on Structure-Activity Relationship (SAR) problems. In such problems, ILP has proved several times to be capable of constructing expert-comprehensible models that help to explain the activity of chemical compounds based on their structure and properties. However, despite its successes on SAR problems, ILP has severe scalability problems that prevent its application on larger datasets. In this paper we present LogCHEM, an ILP-based tool for discriminative interactive mining of chemical fragments. LogCHEM tackles ILP's scalability issues in the context of SAR applications. We show that LogCHEM benefits from the flexibility of ILP both by its ability to quickly extend the original mining model, and by its ability to interface with external tools. Furthermore, we demonstrate that LogCHEM can be used to effectively mine large chemoinformatics datasets, namely, several datasets from EPA's DSSTox database and a dataset based on the DTP AIDS anti-viral screen.
2012
Authors
Camacho, R; Ferreira, R; Rosa, N; Guimaraes, V; Fonseca, NA; Costa, VS; de Sousa, M; Magalhaes, A;
Publication
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS
Abstract
The functions of proteins in living organisms are related to their 3D structure, which is known to be ultimately determined by the linear sequence of amino acids that together form these macromolecules. It is, therefore, of great importance to be able to understand and predict how the protein 3D structure arises from a particular linear sequence of amino acids. In this paper we report the application of Machine Learning methods to predict, with high accuracy, the secondary structure of proteins, namely alpha-helices and beta-sheets, which are intermediate levels of the local structure.
1998
Authors
Fonseca, N; Costa, VS; Dutra, ID;
Publication
LOGIC PROGRAMMING - PROCEEDINGS OF THE 1998 JOINT INTERNATIONAL CONFERENCE AND SYMPOSIUM ON LOGIC PROGRAMMING
Abstract
One of the most important advantages of logic programming systems is that they allow the transparent exploitation of parallelism. The different forms of parallelism available and the complex nature of logic programming applications present interesting problems to both the users and the developers of these systems. Graphical visualisation tools can make a particularly important contribution, as they are easier to understand than text-based tools, and allow both for a general overview of an execution and for focusing on its important details. Towards these goals, we propose VisAll, a new tool to visualise the parallel execution of logic programs. VisAll benefits from a modular design centered on a graph that represents a parallel execution. A main graphical shell commands the different modules and presents VisAll as a unified system. Several input components, or translators, support the well-known VisAndor and VACE trace formats, plus a new format designed for independent and-parallel plus or-parallel execution in the SEA. Several output components, or visualisers, allow for different visualisations of the same execution.
2011
Authors
Camacho, Rui; Pereira, Max; Costa, VitorSantos; Fonseca, NunoA.; Gonçalves, CarlosAdriano; Simões, CarlosJ.V.; Brito, RuiM.M.;
Publication
J. Integrative Bioinformatics
Abstract
It has been recognized that the development of new therapeutic drugs is a complex and expensive process. A large number of factors affect the activity in vivo of putative candidate molecules, and the propensity for causing adverse and toxic effects is recognized as one of the major hurdles behind the current "target-rich, lead-poor" scenario. Structure-Activity Relationship (SAR) studies, using relational Machine Learning (ML) algorithms, have already been shown to be very useful in the complex process of rational drug design. Despite the ML successes, human expertise is still of the utmost importance in the drug development process. An iterative process and tight integration between the models developed by ML algorithms and the know-how of medicinal chemistry experts would be a very useful symbiotic approach. In this paper we describe a software tool that achieves that goal: iLogCHEM. The tool allows the use of Relational Learners in the task of identifying molecules or molecular fragments with potential to produce toxic effects, and thus helps in streamlining drug design in silico. It also allows the expert to guide the search for useful molecules without the need to know the details of the algorithms used. The models produced by the algorithms may be visualized using a graphical interface that is in common use amongst researchers in structural biology and medicinal chemistry. The graphical interface enables the expert to provide feedback to the learning system. The developed tool also has facilities to handle the similarity bias typical of large chemical databases. For that purpose the user can filter out similar compounds when assembling a data set. Additionally, we propose ways of providing background knowledge for Relational Learners using the results of Graph Mining algorithms. Copyright 2011 The Author(s). Published by Journal of Integrative Bioinformatics.