Cookies
O website necessita de alguns cookies e outros recursos semelhantes para funcionar. Caso o permita, o INESC TEC irá utilizar cookies para recolher dados sobre as suas visitas, contribuindo, assim, para estatísticas agregadas que permitem melhorar o nosso serviço. Ver mais
Aceitar Rejeitar
  • Menu
Publicações

Publicações por CRACS

1998

VisAll: A universal tool to visualise the parallel execution of logic programs

Autores
Fonseca, N; Costa, VS; Dutra, ID;

Publicação
LOGIC PROGRAMMING - PROCEEDINGS OF THE 1998 JOINT INTERNATIONAL CONFERENCE AND SYMPOSIUM ON LOGIC PROGRAMMING

Abstract
One of the most important advantages of logic programming systems is that they allow the transparent exploitation of parallelism. The different forms of parallelism available and the complex nature of logic programming applications present interesting problems to both the users and the developers of these systems. Graphical visualisation tools can give a particularly important contribution, as they are easier to understand than text based tools, and allow both for a general overview of an execution and for focusing on its important details. Towards these goals, we propose VisAll, anew tool to visualise the parallel execution of logic programs. VisAll benefits from a modular design centered in a graph that represents a parallel execution. A main graphical shell commands the different modules and presents VisAll as an unified system. Several input components, or translators, support the well-known VisAndor and VACE trace formats, plus a new format designed for independent and-parallel plus or-parallel execution in the SEA. Several output components, or visualisers, allow for different visualisations of the same execution.

1998

Optimising parallel logic programming systems for scalable machines

Autores
Costa, VS; Bianchini, R;

Publicação
EURO-PAR '98 PARALLEL PROCESSING

Abstract
Parallel logic programming (PLP) systems have obtained good performance on traditional bus-based shared-memory architectures. However, the scalable multiprocessors being developed today pose new challenges. Our experience with a sophisticated PLP system, Andorra-I, demonstrates that indeed performance suffers greatly on modern architectures. In order to improve performance, we perform a detailed analysis of the cache behaviour of all Andorra-I data structures via execution-driven simulation of a DASH-like multiprocessor. Based on this analysis we optimise the Andorra-I code using 5 different techniques. Our results show that the techniques provide significant performance improvements, leading to the conclusion that PLP systems can and should perform well on modern scalable multiprocessors.

1998

Distance: A New Metric for Controlling Granularity for Parallel Execution

Autores
Shen, K; Costa, VS; King, A;

Publicação
Proceedings of the 1998 Joint International Conference and Symposium on Logic Programming, Manchester, UK, June 15-19, 1998

Abstract

1997

Thread- and process-based implementations of the pSystem parallel programming environment

Autores
Lopes, LMB; Silva, FMA;

Publicação
SOFTWARE-PRACTICE & EXPERIENCE

Abstract
Run-time work distribution in parallel programming systems is usually accomplished through the use of dynamic scheduling heuristics. Their sensitivity to run-time information such as global work-load, task granularity, data dependencies, locality of information, among others, is essential when trying to optimize performance. Adaptive schedulers that base their decisions on feed-back from the system are therefore of special importance. We have developed and used a general purpose parallel programming system, the pSystem, that also served as a test-bed environment on which we have experimented and studied the performance of distinct scheduling heuristics. Currently, we have two versions of the system: one based on Unix processes; and the other on Solaris threads. Threads (particularly user-level threads) are usually associated with low execution overheads, since they require minimal interaction with the operating system kernel This suggests that lower grain parallelism may be more effectively exploited with a thread-based parallel programming system. Performance analysis of both implementations over a Set of well known benchmarks, with various schedulers, shows that threads scale better under higher system loads and/or when the granularity of the tasks being executed is below a given threshold value. This paper starts with a description of the design and implementation of the pSystem computational model, followed by a detailed description of several experiments and the analysis of their results. (C) 1997 John Wiley & Sons, Ltd.

1997

The SBA: Exploiting orthogonality in AND-OR parallel systems

Autores
Correia, ME; Silva, F; Costa, VS;

Publicação
LOGIC PROGRAMMING - PROCEEDINGS OF THE 1997 INTERNATIONAL SYMPOSIUM

Abstract
One of the advantages of logic programming in the fact that it offers many sources of implicit parallelism, such as and-parallelism and or-parallelism Recently, research has been concentrated on integrating the different forms of parallelism into a single combined system. In this work we concentrate on the problem of integrating or-parallelism and independent and-parallelism for parallel Prolog systems. We contend that previous data structures require pure recomputation and therefore do not allow for orthogonality between and parallelism and or-parallelism. In contrast, we submit that a simpler solution, the sparse binding array, does guarantee this goal, and explain in detail how independent and-parallelism and or-parallelism can thus be efficiently combined.

1997

Evaluating parallel logic programming systems on scalable multiprocessors

Autores
Costa, VS; Bianchini, R; de, CDI;

Publicação
International Symposium on Parallel Symbolic Computation, Proceedings, PASCO

Abstract
Parallel logic programming systems are sophisticated examples of symbolic computing systems. They address problems such as dynamic memory allocation, scheduling irregular execution patterns, and managing different types of implicit parallelism. Most parallel logic programming systems have been developed for bus-based shared-memory architectures. The complexity of parallel logic programming systems and the large amount of data they process raises the question of whether logic programming systems can still obtain good performance on scalable architectures, such as distributed shared-memory systems. In this work we use execution-driven simulation to investigate the access patterns and caching behaviour exhibited by a parallel logic programming system, Andorra-I. We show that the system obtains reasonable performance, but that it does not scale well. By studying the behaviour of the major data structures in Andorra-I in detail, we conclude that this result is largely a consequence of the scheduling and work manipulation implementation used in the system. We also show that the Andorra-I's data structures exhibit widely-varying memory access patterns and caching behaviour, which not only depend on the number of processors, but also on the amount and type of parallelism available in the application program. Some of these data structures clearly favour invalidate-based cache coherence protocols, while others favour update-based protocols. Since most of Andorra-I's data structures are common to other parallel logic programming systems, we believe that these systems can greatly benefit from flexible coherence schemes where either the compiler can specify the protocol to be used for each data structure or the protocol can adapt to varying memory access patterns.

  • 195
  • 201